Multi-Level Model Reduction and Data-Driven Identification of the Lithium-Ion Battery

Abstract: The lithium-ion battery is a complicated nonlinear system with multiple electrochemical processes, including mass and charge conservation as well as electrochemical kinetics. Computing the electrochemical model requires an in-depth understanding of the physicochemical characteristics and parameters, which can be costly and time-consuming to obtain. We investigated the electrochemical modeling, reduction, and identification methods of the lithium-ion battery from the electrode level to the system level. A reduced 9th order linear model was proposed using electrode-level physicochemical modeling and a cell-level mathematical reduction method. The data-driven predictor-based subspace identification algorithm was presented for estimating the lithium-ion battery model at the system level. The effectiveness of the proposed modeling and identification methods was validated in an experimental study based on LiFePO4 cells. The accuracy and dynamic characteristics of the identified model were found to be closely related to the operating State of Charge (SOC) range. Experimental results showed that the proposed methods perform well, with high precision and good robustness, in the SOC range of 90% to 10%, and that the tracking error increases significantly within the higher (100–90%) or lower (10–0%) SOC ranges. Moreover, statistical analysis revealed that, to achieve an optimal balance between high precision and low complexity, the 6th, 3rd, and 5th order battery models are the optimal choices in the SOC ranges of 90% to 100%, 90% to 10%, and 10% to 0%, respectively.


Introduction
The lithium-ion battery is currently one of the most promising energy storage devices due to its high energy density, high power density, and long cycle life, and it has been widely used in many fields, such as electric vehicles, energy storage, and industrial electronics [1]. To ensure the safe, reliable, and efficient operation of lithium-ion batteries, researchers need to understand the electrochemical and thermodynamic characteristics of the battery, which is usually costly and time-consuming through experimental and model-based approaches [2]. In addition, the performance of the battery can be strongly affected by operating conditions. These facts increase the difficulty of building the battery model and of reducing computational complexity in applications with high real-time requirements. From an application-oriented perspective, the battery model needs to be simple and easy to calculate, while still being complex enough to provide valid and accurate results. This creates a problem where the models are often either too simplified and approximated to provide high-precision analysis or too complicated for low-complexity processing. However, little research to date has focused [...]
This paper is organized as follows. Section 2 introduces the electrode-level electrochemical model of the lithium-ion battery. Section 3 surveys the cell-level model reduction methods. Section 4 outlines the methods and experiments for the system-level model identification. In Section 5, the model validation tests as well as the statistical analysis of the test results are presented. Sections 6 and 7 contain the discussion and conclusions, respectively.

Electrode-Level Modeling of Lithium-Ion Battery
In this section, the electrochemical fundamentals of lithium-ion batteries are introduced, and the electrode-level electrochemical modeling process is then studied. The symbols employed are listed below in Nomenclature. It should be noted that this research is based on the ideal assumption that side reactions are neglected; we focus on the main reactions within the lithium-ion battery, i.e., the lithium insertion/extraction reactions. Figure 1 shows a schematic diagram of the lithium-ion battery with three main domains: a negative electrode (width $\delta_n$), a separator (width $\delta_{sep}$), and a positive electrode (width $\delta_p$). We can treat the lithium-ion battery as a one-dimensional model at the electrode level, from the negative electrode (x = 0) to the positive electrode (x = L). During discharge, lithium ions diffuse to the surface of the active material particles in the negative electrode and migrate to the positive electrode through the electrolyte via diffusion and ionic conduction. Meanwhile, electrons are released through the external circuit to supply power to the load. The exact opposite process happens during charge.

Energies 2020, 13, 3791

Governing Equations
To start, we summarize the four PDEs governing the charge and discharge dynamics of the lithium-ion battery in Equations (1) to (8). The governing equations were first developed by Doyle [21] and have been widely applied in many research studies [22][23][24][25]. They mainly describe the charge and species conservation both in the solid and electrolyte phases.
Conservation of species in the solid phase is given by Equation (1), with the boundary condition in Equation (2). Conservation of species in the electrolyte phase is given by Equation (3), with the boundary condition in Equation (4). Conservation of charge in the solid phase is given by Equation (5), with the boundary condition in Equation (6). Conservation of charge in the electrolyte phase is given by Equation (7), with the boundary condition in Equation (8).
The four PDEs are coupled by the Butler-Volmer equation [31], Equation (9), which describes the reaction kinetics at the solid/electrolyte interface; the symbols and their meanings are listed below in Nomenclature. The overpotential $\eta$ is defined in Equation (10), where $U$ is the thermodynamic equilibrium potential. Rearranging Equation (10) gives Equation (11). The voltage across the cell terminals is determined by Equation (12), where $R_f$ is the contact resistance. According to Equations (11) and (12), the terminal voltage can be rewritten as Equation (13), where $U_p$ and $U_n$ are the equilibrium potentials of the positive and negative electrodes, respectively.
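For reference, the governing equations described above are commonly written in the following Doyle-Fuller-Newman form. This is a sketch under stated assumptions, not a transcription of the original equations: the numbering follows the order in which the text introduces them, the symbols $D_s$, $D_e^{eff}$, $t_+^0$, $\sigma^{eff}$, $\kappa^{eff}$, and $\kappa_D^{eff}$ are assumed notation, and the signs are chosen so that $I > 0$ denotes discharge, which agrees with Equations (26) and (27) and with the definitions of $\xi_p$ and $\xi_n$ given later.

```latex
% Assumed standard form; I > 0 on discharge, x = 0 at the negative collector.
\begin{align}
\frac{\partial c_s}{\partial t} &= \frac{D_s}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial c_s}{\partial r}\right) \tag{1}\\
\left.\frac{\partial c_s}{\partial r}\right|_{r=0} &= 0, \qquad
D_s\left.\frac{\partial c_s}{\partial r}\right|_{r=R_s} = -\frac{j^{Li}}{a_s F} \tag{2}\\
\frac{\partial(\varepsilon_e c_e)}{\partial t} &= \frac{\partial}{\partial x}\!\left(D_e^{eff}\frac{\partial c_e}{\partial x}\right) + \frac{1-t_+^0}{F}\,j^{Li} \tag{3}\\
\left.\frac{\partial c_e}{\partial x}\right|_{x=0} &= \left.\frac{\partial c_e}{\partial x}\right|_{x=L} = 0 \tag{4}\\
\frac{\partial}{\partial x}\!\left(\sigma^{eff}\frac{\partial \varphi_s}{\partial x}\right) &= j^{Li} \tag{5}\\
-\sigma^{eff}\left.\frac{\partial \varphi_s}{\partial x}\right|_{x=0} &= -\sigma^{eff}\left.\frac{\partial \varphi_s}{\partial x}\right|_{x=L} = \frac{I}{A}, \qquad
\left.\frac{\partial \varphi_s}{\partial x}\right|_{x=\delta_n} = \left.\frac{\partial \varphi_s}{\partial x}\right|_{x=\delta_n+\delta_{sep}} = 0 \tag{6}\\
\frac{\partial}{\partial x}\!\left(\kappa^{eff}\frac{\partial \varphi_e}{\partial x}\right) &+ \frac{\partial}{\partial x}\!\left(\kappa_D^{eff}\frac{\partial \ln c_e}{\partial x}\right) + j^{Li} = 0 \tag{7}\\
\left.\frac{\partial \varphi_e}{\partial x}\right|_{x=0} &= \left.\frac{\partial \varphi_e}{\partial x}\right|_{x=L} = 0 \tag{8}\\
j^{Li} &= a_s i_0\left[\exp\!\left(\frac{\alpha_a F}{RT}\,\eta\right) - \exp\!\left(-\frac{\alpha_c F}{RT}\,\eta\right)\right] \tag{9}\\
\eta &= \varphi_s - \varphi_e - U \tag{10}\\
\varphi_s &= \eta + \varphi_e + U \tag{11}\\
V &= \varphi_s(x=L) - \varphi_s(x=0) - \frac{R_f}{A}\,I \tag{12}\\
V &= (U_p - U_n) + (\eta_p - \eta_n) + (\varphi_{e,p} - \varphi_{e,n}) - \frac{R_f}{A}\,I \tag{13}
\end{align}
```

With these signs, integrating Equation (5) over each electrode and applying Equation (6) gives volume-averaged reaction current densities of $I/(A\delta_n)$ in the negative electrode and $-I/(A\delta_p)$ in the positive electrode, which matches the signs of $\xi_n$ and $\xi_p$ used in the simplification that follows.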

Simplification of Model Parameters
In this subsection, we mainly focus on the simplification of the electrochemical parameters by neglecting the non-uniform distribution of $c_s$, $c_e$, and $j^{Li}$. According to Equations (5) and (6), we obtain Equation (14), where $j^{Li}_{n,avg}$ and $j^{Li}_{p,avg}$ are the volume-averaged reaction current densities in the negative and positive solid phase, respectively. Furthermore, we replace the real values of the reaction current density with the volume-averaged ones, Equation (15). Suppose that $c_e$ is evenly distributed along the x-axis; then $\partial^2 \ln c_e/\partial x^2 = 0$, and Equation (7) can be simplified to Equation (16), which for the negative electrode reads $\kappa^{eff}\,\partial^2 \varphi_{e,n}/\partial x_n^2 + j^{Li}_{n,avg} = 0$, and analogously for the positive electrode. According to Equations (15) and (16) and the boundary conditions in Equation (8), we can obtain Equation (17). In Equation (17), $x_n = x$ and $x_p = x - \delta_{sp}$; $\varphi_{e,n}$ and $\varphi_{e,p}$ are the electrical potentials in the negative and positive electrolyte phase, respectively.
Since the electric potential is equal at both ends of the separator, Equation (18) follows. Based on Equations (17) and (18), we obtain the approximation in Equation (19). Furthermore, according to Equations (9) and (14), we can obtain Equation (20), where $\alpha_a$ and $\alpha_c$ are the anodic and cathodic transfer coefficients, respectively. Assuming that $\alpha_a = \alpha_c = \alpha = 0.5$, $\eta_p$ and $\eta_n$ can be approximated by Equation (21), where $\xi_p = -I/(2A\delta_p a_{s,p} i_{0,p})$ and $\xi_n = I/(2A\delta_n a_{s,n} i_{0,n})$. Finally, according to Equations (19) and (21), the terminal voltage in Equation (13) can be calculated by Equation (22). In Equation (22), $U_p$ and $U_n$ can be considered as functions of the lithium-ion concentration in the solid phase.
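The overpotential approximation above can be illustrated numerically. With equal transfer coefficients, the Butler-Volmer equation reduces to a hyperbolic sine, so it can be inverted in closed form as $\eta = \frac{RT}{\alpha F}\,\mathrm{asinh}(\xi)$. The sketch below assumes this closed form; the function names and the default temperature are illustrative choices, not taken from the paper.

```python
import math

R_GAS = 8.314462618      # universal gas constant, J/(mol K)
FARADAY = 96485.33212    # Faraday constant, C/mol


def overpotential(xi, temperature=298.15, alpha=0.5):
    """Invert Butler-Volmer for alpha_a = alpha_c = alpha.

    xi is the dimensionless ratio j_avg / (2 a_s i_0), e.g. xi_n or xi_p.
    """
    return R_GAS * temperature / (alpha * FARADAY) * math.asinh(xi)


def butler_volmer_xi(eta, temperature=298.15, alpha=0.5):
    """Forward Butler-Volmer relation, returning j / (2 a_s i_0)."""
    k = alpha * FARADAY / (R_GAS * temperature)
    return 0.5 * (math.exp(k * eta) - math.exp(-k * eta))


# Hypothetical operating point: xi_n = 0.3 on discharge.
eta_n = overpotential(0.3)
```

Because $\xi_p$ and $\xi_n$ have opposite signs for a given current, the two overpotentials also have opposite signs, so the term $\eta_p - \eta_n$ lowers the terminal voltage during discharge.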

Cell-level Model Reduction
The coupled governing partial differential equations (PDEs) in Equations (1) to (8) are complex and difficult to calculate, which makes their use in control-oriented applications impractical. In this section, our goal is to find a low-dimensional approximation for the nonlinear electrochemical model using cell-level mathematical theories including discretization and linearization methods.

Model Discretization
The cell terminal voltage in Equation (22) mainly describes the steady-state behavior of the cell; it can be written as a function of the current $I$ and the lithium-ion concentrations in the solid phase, $c_{s,p}$ and $c_{s,n}$. In this subsection, we aim to build the transient-state model of the cell in state-space form. In the state-space model structure, the lithium-ion concentrations in both the solid and electrolyte phases are modeled as state variables, which are highly correlated with the transient characteristics of the lithium-ion cell. Therefore, the emphasis of this subsection is on the transient modeling of the lithium-ion concentration.
Comparing Equation (7) with Equation (16), we find that Equation (16) fails to model the influence of $c_e$ on the electrical potential, which may result in the loss of model information and accuracy. Alexander et al. [32] give an approximate solution to Equation (7) that includes the effects of $c_e$; Equation (19) can then be approximated by:
$$\varphi_{e,p} - \varphi_{e,n} = \frac{RT\beta}{F}\left(\ln c_{e,p} - \ln c_{e,n}\right) - \frac{I\left(\delta_n + 2\delta_{sep} + \delta_p\right)}{2A\kappa^{eff}}, \tag{23}$$
where $R$ is the gas constant, $T$ is the temperature in Kelvin, $F$ is Faraday's constant, and $\beta$ is a constant. Consequently, the terminal voltage of the cell in Equation (22) can be rewritten as Equation (24), which gives the output equation of the state-space model of the lithium-ion cell. The next goal is to establish the state equation with respect to the state variables $c_{s,p}$, $c_{s,n}$, $c_{e,p}$, and $c_{e,n}$.
Volume integration of Equation (1) yields Equation (25), where $V_s = 4\pi R_s^3/3$, $dV_s = 4\pi r^2\,dr$, and the subscript avg denotes the volume average. Further, substituting Equations (2) and (15) into Equation (25) yields:
$$\frac{\partial c_{s,n,avg}}{\partial t} = -\frac{I}{\delta_n A \varepsilon_{s,n} F}, \qquad \frac{\partial c_{s,p,avg}}{\partial t} = \frac{I}{\delta_p A \varepsilon_{s,p} F}, \tag{26}$$
where $c_{s,n,avg}$ and $c_{s,p,avg}$ are the volume-averaged lithium-ion concentrations in the negative and positive solid phase, respectively. However, input/output data are sampled in discrete form in real applications. We can rewrite Equation (26) in the discrete form as:
$$c_{s,n,avg}(k+1) = c_{s,n,avg}(k) - \frac{\Delta t\, I(k)}{\delta_n A \varepsilon_{s,n} F}, \qquad c_{s,p,avg}(k+1) = c_{s,p,avg}(k) + \frac{\Delta t\, I(k)}{\delta_p A \varepsilon_{s,p} F}, \tag{27}$$
where $\Delta t$ is the sampling period. Equation (27) shows that the lithium-ion concentration decreases in the negative solid phase during discharge (i.e., $I(k) > 0$), while it increases in the positive solid phase, which is consistent with the electrochemical reaction process.
Substituting Equations (4) and (15) into the volume integration of Equation (3) yields Equation (28). Similarly, we rewrite Equation (28) in the discrete form. The negative electrolyte phase yields Equation (29), of the form $c_{e,n,avg}(k+1) = c_{e,n,avg}(k) + [...]$, where $c_{e,n,avg}$ is the volume-averaged lithium-ion concentration in the negative electrolyte phase and $\nabla_x c_{e,n} = \partial c_{e,n}/\partial x$ is the gradient of $c_{e,n}$ in the x-direction. The positive electrolyte phase yields Equation (30), where $c_{e,p,avg}$ is the volume-averaged lithium-ion concentration in the positive electrolyte phase and $\nabla_x c_{e,p} = \partial c_{e,p}/\partial x$ is the gradient of $c_{e,p}$ in the x-direction. Due to the hysteresis of lithium-ion migration in the electrolyte phase, the values of $\nabla_x c_{e,n}$ and $\nabla_x c_{e,p}$ at time $k+1$ are not only affected by the reaction current $I(k)$ but also related to their own values at time $k$. Therefore, $\nabla_x c_{e,n}$ and $\nabla_x c_{e,p}$ can be approximated by Equation (31).
Finally, according to Equations (24), (27), and (29)-(31), we can rewrite the reduced-order lithium-ion battery model in discretized state-space form, $x_{k+1} = f(x_k, u_k)$, $y_{out,k} = g(x_k, u_k)$, as Equation (32). In Equation (32), $u_k = I(k)$ is the input, $y_{out,k} = V(k)$ is the output, and $x_k$ is the state vector, $x_k = [\,c_{s,n,avg}(k),\ c_{s,p,avg}(k),\ c_{e,n,avg}(k),\ c_{e,p,avg}(k),\ \nabla_x c_{e,n,avg}(x)|_{x=\delta_n}(k),\ \ldots\,]^T$; $g(x_k, u_k)$ is the output equation and $f(x_k, u_k)$ is the state equation, where $C_1$, $C_2$, $C_3$, $C_4$, $C_5$, $C_6$, and $C_7$ are all constants.
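The discrete update of the volume-averaged solid-phase concentrations in Equation (27) can be sketched as a single function. The parameter values in the usage example are hypothetical placeholders chosen only to exercise the code; they are not the parameters of the tested LiFePO4 cells.

```python
FARADAY = 96485.33212  # Faraday constant, C/mol


def step_solid_concentrations(c_n, c_p, current, dt, area,
                              delta_n, delta_p, eps_n, eps_p):
    """One sampling step of Equation (27).

    c_n, c_p  : volume-averaged solid concentrations at step k (mol/m^3)
    current   : I(k) in A, positive on discharge
    dt        : sampling period in s
    area      : electrode plate area A in m^2
    delta_*   : electrode widths in m
    eps_*     : solid-phase volume fractions
    """
    c_n_next = c_n - dt * current / (delta_n * area * eps_n * FARADAY)
    c_p_next = c_p + dt * current / (delta_p * area * eps_p * FARADAY)
    return c_n_next, c_p_next


# Hypothetical values for illustration only.
c_n1, c_p1 = step_solid_concentrations(
    c_n=20000.0, c_p=10000.0, current=6.0, dt=1.0, area=1.0,
    delta_n=50e-6, delta_p=40e-6, eps_n=0.6, eps_p=0.5)
```

A useful sanity check on this update is that the total lithium inventory in the solid phase, $\delta_n A \varepsilon_{s,n} c_{s,n,avg} + \delta_p A \varepsilon_{s,p} c_{s,p,avg}$, stays constant from step to step: whatever leaves the negative electrode arrives at the positive electrode.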

Model Linearization
The proposed reduced-order model in Equation (32) is still complex because of the nonlinear equations involved. In this subsection, our goal is to linearize the reduced-order battery model in Equation (32). To this end, we employ the small-signal analysis method to remove the nonlinearities of the battery model. The basic principle of the small-signal analysis method is to expand a nonlinear function into a Taylor series and retain only the constant and first-order terms of the series, which leads to a linear approximation of the nonlinear equation. More specifically, a nonlinear function $y = f(x)$ can be approximated by:
$$y \approx f(\bar{x}) + \left.\frac{df}{dx}\right|_{x=\bar{x}}(x - \bar{x}),$$
where $\bar{x}$ denotes the equilibrium working point and $(x - \bar{x})$ represents a small variation in $x$. Similarly, for a nonlinear system $y = f(x_1, x_2)$ with two input variables $x_1$ and $x_2$, a linear approximation is given by:
$$y \approx f(\bar{x}_1, \bar{x}_2) + \left.\frac{\partial f}{\partial x_1}\right|_{(\bar{x}_1, \bar{x}_2)}(x_1 - \bar{x}_1) + \left.\frac{\partial f}{\partial x_2}\right|_{(\bar{x}_1, \bar{x}_2)}(x_2 - \bar{x}_2).$$
A similar linearization process can be employed when the number of variables is larger than two.
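The small-signal principle described above can be checked numerically on any smooth function. The following sketch approximates the first-order Taylor coefficient with a central finite difference, which is an illustrative substitute for the analytical derivatives; the helper names and the step size `h` are assumptions introduced here.

```python
import math


def linearize(f, x_bar, h=1e-6):
    """Return (f(x_bar), slope) for the approximation f(x) ~ f0 + slope * (x - x_bar).

    The slope is estimated by a central finite difference with step h.
    """
    f0 = f(x_bar)
    slope = (f(x_bar + h) - f(x_bar - h)) / (2.0 * h)
    return f0, slope


def linear_approx(f, x_bar, x, h=1e-6):
    """Evaluate the first-order Taylor approximation of f around x_bar at x."""
    f0, slope = linearize(f, x_bar, h)
    return f0 + slope * (x - x_bar)


# Example: the logarithm, which appears in the output equation through ln(c_e).
f0, slope = linearize(math.log, 2.0)
approx = linear_approx(math.log, 2.0, 2.01)
```

For the logarithm around $\bar{x} = 2$, the slope is close to $1/\bar{x} = 0.5$, and the approximation error at $x = 2.01$ is second order in the deviation, which is why the method is restricted to small variations around the equilibrium point.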
The linearization of the function $f_1(x_5(k)) + f_2(I(k))$ using the small-signal analysis method yields Equation (38). When a lithium-ion cell operates at the equilibrium point, the reaction current $j = I = 0$. In this case, lithium ions are stored in the solid electrode with very few remaining in the electrolyte; therefore, we can approximate that $c_{e,n,avg} = 0$, so that $y = x_5(k+1) = x_5(k) = 0$. Consequently, Equation (38) can be simplified to Equation (39). Similar linearization results can be obtained for the other nonlinear functions with zero equilibrium variables (i.e., $f_3$, $f_4$, $f_5$, $f_6$, $f_7$).
For the functions with non-zero equilibrium variables (i.e., $f_8$, $f_9$), the linearization process yields Equation (40), with the terms given in Equation (41), where $c_{s,p,avg}$ and $c_{s,n,avg}$ are the volume-averaged lithium-ion concentrations at the equilibrium point, which are non-zero constants. $E(k)$ is a non-zero constant that denotes the equilibrium inner potential.
Finally, the battery model in Equation (32) can be linearized as a 9th order model:
$$x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k + D u_k, \tag{42}$$
where $u_k$ is the input current sequence and $y_k$ is the output voltage sequence, $y_k = V(k) - E(k)$. $A$ is the system matrix and $B$ is the input matrix, both consisting of constants (Equations (43) and (44)); $C$ is the observation matrix (Equation (45)); and $D$ is the feed-through matrix (Equation (46)).
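Once the matrices of the linear model are known, simulating it is a simple recursion. The sketch below assumes the standard single-input, single-output state-space form $x_{k+1} = A x_k + B u_k$, $y_k = C x_k + D u_k$; the first-order matrices in the example are hypothetical and serve only to make the recursion easy to follow by hand.

```python
import numpy as np


def simulate(A, B, C, D, u, x0=None):
    """Simulate x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k] for a SISO model.

    A: (n, n) array, B: (n,) array, C: (n,) array, D: float, u: input sequence.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    C = np.asarray(C, dtype=float)
    x = np.zeros(A.shape[0]) if x0 is None else np.asarray(x0, dtype=float)
    y = np.empty(len(u))
    for k, uk in enumerate(u):
        y[k] = C @ x + D * uk   # output uses the state before the update
        x = A @ x + B * uk      # state update
    return y


# Hypothetical first-order example driven by a unit impulse.
y_demo = simulate(A=[[0.5]], B=[1.0], C=[1.0], D=0.1, u=[1.0, 0.0, 0.0])
```

Since the model output is defined as $y_k = V(k) - E(k)$, the terminal voltage is recovered by adding the equilibrium potential $E(k)$ back to the simulated output.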

System-Level Model Identification
The linearized 9th order model in Equation (42) shows that there are numerous unknown parameters that need to be predetermined. Theoretically, the model parameters (i.e., the values of A, B, C, D) can be obtained from battery tests. However, the determination of these parameters is a non-trivial task. It is always the case that one must make empirical guesses about many parameters, which makes the model unsatisfactory. In this section, we aim to provide a complete and easily implementable parameter identification algorithm for the 9th order battery model in Equation (42) and to explore the relationship between model order and accuracy.

Subspace Identification Algorithm
Conventionally, a system is modeled by a transfer function and is identified using optimization methods such as nonlinear least squares. Subspace identification methods offer an alternative solution based on the state-space model structure. Subspace identification methods are non-iterative procedures and can achieve a globally optimal solution while avoiding local minima problems [33], which makes the algorithm particularly suitable for the identification of complex systems like lithium-ion battery models.
Rewriting Equation (42) in the predictor form gives Equation (47). Further, we define the system order $n$, a past window $p$, and a future window $f$, where $n \le f \le p$.
Based on the input and output sequences over a given time, we define the stacked matrices in Equation (48). Similarly, we can define the vectors $u_{k-p,p}$, $u_{k,f}$, $U$, and $U_p$ in the same way.
Supposing that $A^j = 0$ for all $j \ge p$, we rewrite the state equation of the system in Equation (47) as Equation (49), where $\Gamma L$ and $\Gamma K$ are defined in Equations (50) and (51). Similarly, with the assumption that $A^j = 0$ for all $j \ge p$, the output equation of the system in Equation (47) can be approximated by Equation (52), where $\Xi = [\,\Xi^{(u_{k-p})}\ \cdots\ \Xi^{(u_k)}\ \ \Xi^{(y_{k-p})}\ \cdots\ \Xi^{(y_{k-1})}\,]$ is known as the Markov matrix and $\phi_k$ is the input/output vector, $\phi_k = [\,u^T_{k-p,p},\ u^T_k,\ y^T_{k-p,p}\,]^T$. Finally, based on the above discrete and iterative processing of state-space systems, the detailed steps of the predictor-based subspace identification (PBSID) algorithm [34,35] are summarized in Table 1.
Table 1. Detailed scheme of the predictor-based subspace identification (PBSID) algorithm.

Detailed Scheme of PBSID Algorithm
Step 1. First, sample the input and output sequences $u_k$ and $y_k$. Second, set the initial values of $n$, $p$, $f$, and the weight matrix $W$, requiring $n \le f \le p$. Third, construct the matrices $Y$, $Y_p$, $U$, and $U_p$ given by Equation (48).
Step 2. First, construct the matrix $\Psi$. Second, solve the linear regression problem in Equation (52) using the least-squares method to estimate $\Xi$: $\Xi = Y\Psi^{\dagger}$.
Step 3. First, build the matrices $\Gamma L$ and $\Gamma K$ given by Equations (50) and (51). Second, calculate $\Gamma X = W\,\Gamma L\,U_p + \Gamma K\,Y_p$ given by Equation (49), and estimate the state sequence through singular value decomposition (SVD) of $\Gamma X$. Third, determine the system order $n$ by detecting a large gap between the singular values, and calculate the state sequence: $\hat{X} = \Sigma_n^{1/2} V_n$.
Step 4. First, estimate the system matrices $C$ and $D$ by solving a least-squares problem. Second, estimate the system matrices $A$, $B$, and $K$ in the same way.
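Step 2 of the scheme above, the least-squares estimate $\Xi = Y\Psi^{\dagger}$, can be sketched for a single-input, single-output data set as follows. This covers only the regression step, not the full PBSID algorithm, and the regressor ordering (past inputs, current input, past outputs) is an assumption based on the definition of $\phi_k$. The simulated first-order system and its coefficients are hypothetical test data.

```python
import numpy as np


def build_regressors(u, y, p):
    """Stack phi_k = [u[k-p:k], u[k], y[k-p:k]] as columns of Psi for k = p..N-1."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = [np.concatenate((u[k - p:k], [u[k]], y[k - p:k]))
            for k in range(p, len(u))]
    Psi = np.array(cols).T          # shape (2p + 1, N - p)
    Y = y[p:].reshape(1, -1)        # shape (1, N - p)
    return Y, Psi


def estimate_markov(u, y, p):
    """Least-squares estimate Xi = Y Psi^+ using the Moore-Penrose pseudoinverse."""
    Y, Psi = build_regressors(u, y, p)
    return Y @ np.linalg.pinv(Psi)


# Noise-free data from a hypothetical first-order system
# x[k+1] = a x[k] + b u[k], y[k] = c x[k] + d u[k].
rng = np.random.default_rng(0)
a, b, c, d = 0.8, 1.0, 0.5, 0.1
u = rng.standard_normal(200)
y = np.empty_like(u)
x = 0.0
for k in range(len(u)):
    y[k] = c * x + d * u[k]
    x = a * x + b * u[k]

xi = estimate_markov(u, y, p=1)     # columns: u[k-1], u[k], y[k-1]
```

For this first-order example with $p = 1$, eliminating the state gives $y_k = a\,y_{k-1} + (cb - ad)\,u_{k-1} + d\,u_k$, so the estimate should recover the coefficients $(cb - ad,\ d,\ a)$. With a longer past window on noise-free data the regressors become linearly dependent, and the pseudoinverse then returns the minimum-norm solution rather than a unique one.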

Identification Experiments
The test bench in Figure 2 was used to identify the parameters of the battery model. It consists of a high-precision battery test system for single-cell tests (BT2000, Arbin Instruments, College Station, TX, USA), a high-voltage battery test system for battery module tests (BTS2000, Techpow Electric Co., Xiangyang, China), a thermal chamber for temperature control (HLT2005P, Hardy Technology Co., Chongqing, China), and a computer for the user-machine interface and data analysis. In addition, a set of lithium iron phosphate cells was selected for testing and validation (LiFePO4, 11 Ah, Lishen Battery Co., Ltd., Tianjin, China). The detailed parameters of the LiFePO4 cells are shown in Table 2.
Given the 9th order battery model and the PBSID algorithm proposed above, the accuracy of the identified model depends mainly on the test method. To obtain a battery model with high accuracy, the test method should continuously excite most of the dynamic characteristics of the LiFePO4 cells. In this research, hybrid pulse tests at different SOC points were performed based on the methods described in our previous paper [36]. The current profile of the battery identification test procedure consists of a -6 A continuous discharge and a hybrid pulse test, separated by an 1800 s rest (Figure 3a). More specifically, the -6 A continuous discharge was used to bring the battery to fixed SOC points, including 95%, 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%, and 5%.
In addition, the hybrid pulse test is a pseudo-random charge and discharge sequence that was used to test the dynamic properties of the batteries. The hybrid pulse test has a wide frequency band, which can capture more detailed information about the battery dynamics. After performing one set of the identification test profile, we repeated it until the battery voltage reached the 2.5 V cut-off voltage. The corresponding variation in the battery voltage of the samples over time is shown in Figure 3b. The above tests were conducted at 25 °C.

Identification Results
In this subsection, the data sets of hybrid pulse tests at different SOC points were used to obtain the input and output sequences for the identification experiments. According to the 9th order battery model in Equation (42), the collected current data were used as the input sequence $u_k$, and the voltage data were used as the output sequence $y_k$. Before identification, we set the model order $n = 9$ and the initial values $f = p = 9$ in the PBSID algorithm.
When analyzing the identification results, we used the Bode diagrams in the frequency domain instead of parametric displays in the time domain as the main analysis method. The main reason is that the identified A, B, C, and D matrices contain a large number of elements, which are not suitable for intuitive analysis and comparison. Figure 4 shows the Bode diagrams of the identified 9th order battery model at different SOC points. The identification results suggest that the dynamic properties of the tested LiFePO4 batteries vary across different SOC points and show high consistency in the SOC range of 10% to 90%. In contrast, we found significant differences in dynamic properties when the batteries were tested at the 95% and 5% SOC points. Therefore, we can conclude that the dynamic characteristics of the LiFePO4 batteries will be significantly different when the SOC is high or low.
In Figure 5, the singular values of the matrix $\Gamma X$ obtained with the PBSID-based model identification are given. As expected, the singular values show clear gaps after the first three larger values in the SOC range of 10% to 90%. The fourth and subsequent values are less relevant during identification, since their contributions are small compared with the noise, which suggests that the 3rd order model may achieve a better balance between high precision and low complexity in the wide SOC range of 10% to 90%.
Furthermore, the singular values of the matrix $\Gamma X$ at the 5% and 95% SOC points are given in Figure 6. The results of the PBSID-based identification suggest that the 5th order model is the preferred choice for a battery model at the 5% SOC point (Figure 6a). In a similar way, we can infer that the 6th order model is a better choice for a battery model at the 95% SOC point (Figure 6b).
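The order selection described above relies on spotting a large gap in the singular values. One simple way to automate this, shown here as an assumed heuristic rather than the procedure used in the experiments, is to pick the position of the largest ratio between consecutive singular values. The numbers in the example are made up to mimic a gap after the third value; they are not measured data.

```python
import numpy as np


def select_order(singular_values):
    """Return the model order at the largest ratio between consecutive singular values."""
    s = np.sort(np.asarray(singular_values, dtype=float))[::-1]
    # Guard against division by zero for exactly vanishing singular values.
    ratios = s[:-1] / np.maximum(s[1:], np.finfo(float).tiny)
    return int(np.argmax(ratios)) + 1


# Hypothetical singular values with a clear gap after the third one.
example = [10.0, 8.0, 5.0, 0.05, 0.04, 0.03, 0.02, 0.01, 0.009]
order = select_order(example)
```

A ratio-based rule is scale invariant, which matters because the absolute size of the singular values depends on the signal amplitudes and window lengths. When no gap is pronounced, visual inspection of the plot remains the safer choice.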

Model Validation
To investigate the effectiveness and robustness of the proposed PBSID algorithm, two different model-based validation experiments were conducted for electric vehicle applications. The experiments were carried out on a new cell from the same group and included two different input current sequences: the hybrid pulse test and the real-time UDDS driving cycle test. Our validation method is highly effective and less expensive than other options currently available.
The validation results were analyzed in the time domain, which consists of computing the tracking error and the Variance-Accounted-For (VAF) on a data set that is different from the data set used for determining the model. The VAF represents the percentage of the output variation that is explained by the model. The VAF is defined as:
$$\mathrm{VAF} = \max\left\{0,\ \left(1 - \frac{\operatorname{var}(y_k - \hat{y}_k)}{\operatorname{var}(y_k)}\right) \times 100\%\right\},$$
where $y_k$ is the measured output voltage, $\hat{y}_k$ denotes the estimated output voltage obtained by the identified model, and $\operatorname{var}(\cdot)$ denotes the variance of a quasi-stationary signal.
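The VAF described above is straightforward to compute from measured and estimated voltage sequences. The sketch below assumes the common definition in which the result is expressed in percent and floored at zero; the sample values are hypothetical.

```python
import numpy as np


def vaf(y, y_hat):
    """Variance-Accounted-For in percent, floored at zero.

    y     : measured output sequence
    y_hat : output sequence estimated by the identified model
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return max(0.0, (1.0 - np.var(y - y_hat) / np.var(y)) * 100.0)


# Hypothetical measured voltages and a model output that is 10% too small.
y_meas = np.array([1.0, 2.0, 3.0, 4.0])
score = vaf(y_meas, 0.9 * y_meas)
```

Because the variance ignores the mean, a model output that differs from the measurement only by a constant offset still scores 100%. This is one reason to report the VAF together with the tracking error rather than on its own.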

Experimental Validation
For the hybrid pulse validation experiment, the tracking error at each sampling point and its upper envelope bound were plotted to depict the model quality (Figure 7a). The corresponding SOC variation of the samples over time and the measured voltage output are also shown in Figure 7b and Figure 7c, respectively. It can be observed that in the SOC range of 90% to 5%, the upper envelope of the tracking error varies from 0.88% to 2.99%, indicating good robustness and high accuracy. In addition, within the entire SOC range from 100% to 0%, the calculated value of the VAF is 96.11%, and the maximum tracking error is 6.40%. Importantly, the experimental results also show that the tracking error increases significantly within the higher (100-90%) or lower (5-0%) SOC ranges.
The second experiment deals with the model validation in the Urban Dynamometer Driving Schedule (UDDS) driving cycle. The tracking error at each sampling point was plotted to depict the model quality (Figure 8a). In addition, the current profile of the UDDS driving cycle test is shown in Figure 8b, and the corresponding measured voltage output response is plotted in Figure 8c. It can be observed that the tracking errors remain less than 2.83% in the SOC range of 100% to 3% and increase significantly in the lower SOC range (SOC < 3%). In addition, within the entire SOC range from 100% to 0%, the calculated value of the VAF is 94.12%, and the maximum tracking error is 5.77%. The experimental results also show that the identified 9th order battery model can achieve high precision in the UDDS driving cycle. Therefore, we can conclude that the PBSID algorithm performs very well, with high precision and good robustness, under random high-rate charging and discharging experimental conditions.

Statistical Analysis
The above tests show that the accuracy and dynamic characteristics of the model are related to the operating SOC range of the battery. In real-time applications, the higher the order of the battery model, the lower the computational efficiency. To strike a balance between model accuracy and computational complexity, a series of model identification experiments was carried out on the same data set from the hybrid pulse test. During the identification experiments, the model order n in the PBSID algorithm was varied from 2 to 9 to obtain the nth order battery models. All the identified models were validated on the same UDDS driving cycle, and their performance was evaluated by examining the vector-plots of the tracking errors (Figure 9). In addition, to address the question of whether the accuracy of the model depends on the operating SOC range, we used boxplots to analyze the distribution of the tracking errors in the SOC ranges of 90% to 100% (Figure 10a), 50% to 90% (Figure 10b), 10% to 50% (Figure 10c), and 0% to 10% (Figure 10d). The distribution of the tracking errors varies with both the model order and the operating SOC range. Meanwhile, the identification results demonstrate a significant variation of dynamic properties and model parameters across different SOC working points. The comparison plots in Figures 9 and 10 reveal that the model order should be further reduced to achieve an optimal balance between high precision and low complexity. Therefore, it is important to choose an appropriate model order according to the operating SOC range in real-time applications.
Figure 9. Vector-plots of tracking errors for identified models with different orders in UDDS driving cycle tests.
Figure 10a uses boxplots to show the distribution and skewness of the tracking errors in the SOC range of 90% to 100% by displaying the data percentiles and mean values. Each boxplot provides a visual summary of the tracking errors, including five main percentiles (the 5%, 15%, 50%, 85%, and 95% percentiles), outliers (the minimum, the maximum, and the 1% and 99% percentiles), and the mean value. As expected, the 95% and 50% percentiles decrease as the model order increases up to the 6th order. Beyond the 6th order, the dispersion and skewness of the data remain largely unchanged, which suggests that the 6th order battery model is the optimal choice in the SOC range of 90% to 100%.
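The boxplot statistics described above can also be reproduced numerically. The sketch below, in Python with NumPy, computes the same percentiles, extremes, and mean of the tracking errors within each SOC range. The bin-edge convention, the function names, and the synthetic error profile are assumptions made for illustration and are not taken from the experiments.

```python
import numpy as np

# SOC ranges (in percent) matching the four boxplot panels.
SOC_RANGES = [(90.0, 100.0), (50.0, 90.0), (10.0, 50.0), (0.0, 10.0)]


def boxplot_summary(errors):
    """Percentiles, extremes, and mean used to draw one boxplot."""
    errors = np.asarray(errors, dtype=float)
    p1, p5, p15, p50, p85, p95, p99 = np.percentile(
        errors, [1, 5, 15, 50, 85, 95, 99]
    )
    return {
        "n": errors.size,
        "min": errors.min(), "p1": p1, "p5": p5, "p15": p15, "p50": p50,
        "p85": p85, "p95": p95, "p99": p99, "max": errors.max(),
        "mean": errors.mean(),
    }


def summary_by_soc_range(soc, errors, ranges=SOC_RANGES):
    """Group tracking errors by SOC range and summarize each group."""
    soc = np.asarray(soc, dtype=float)
    errors = np.asarray(errors, dtype=float)
    result = {}
    for low, high in ranges:
        # Assumed convention: bins are (low, high], and SOC = 0 joins the lowest bin.
        if low == 0.0:
            mask = (soc >= low) & (soc <= high)
        else:
            mask = (soc > low) & (soc <= high)
        result[(low, high)] = boxplot_summary(errors[mask])
    return result


# Synthetic error profile for illustration only: larger errors near 0% and 100% SOC.
soc = np.arange(1000, -1, -1) / 10.0           # 100.0, 99.9, ..., 0.0
errors = 1.0 + 0.02 * np.abs(soc - 50.0)       # hypothetical tracking error (%)

by_range = summary_by_soc_range(soc, errors)
for (low, high), stats in by_range.items():
    print(f"SOC {low:.0f}-{high:.0f}%: median = {stats['p50']:.2f}%, "
          f"95% percentile = {stats['p95']:.2f}%")
```

Running the same summary once per model order gives one column of statistics per boxplot, so the plots for different orders can be compared number by number.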
In a similar way, we can conclude that the 3rd order battery model is the optimal choice in the wide SOC range of 10% to 90% (Figure 10b,c), and the 5th order battery model is optimal in the SOC range of 0% to 10% (Figure 10d). These results agree well with the findings of the singular value analysis in Section 4.3, and the selected orders achieve an optimal balance between high precision and low complexity.
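One possible way to make such an order choice reproducible is a simple plateau rule: pick the smallest order whose error statistic is already within a small relative tolerance of the best value reached at any higher order. The Python sketch below illustrates this rule. It is an assumed heuristic, not the procedure used in this study, where the orders were judged from the boxplots. The function name, the tolerance, and the numerical values are hypothetical and are not measured results.

```python
def select_order(stat_by_order, tol=0.05):
    """Return the smallest order whose statistic is within a relative
    tolerance `tol` of the best value obtained at that or any higher order."""
    orders = sorted(stat_by_order)
    for n in orders:
        best_at_or_above = min(stat_by_order[m] for m in orders if m >= n)
        if stat_by_order[n] - best_at_or_above <= tol * stat_by_order[n]:
            return n
    return orders[-1]  # not reached for non-negative statistics; kept for safety


# Hypothetical 95% percentile tracking errors (%) for model orders 2 to 9.
# These numbers are invented for illustration; they are not from the experiments.
p95_by_order = {2: 3.0, 3: 2.4, 4: 2.0, 5: 1.7, 6: 1.20, 7: 1.19, 8: 1.18, 9: 1.18}

chosen = select_order(p95_by_order)
print(f"Selected model order: {chosen}")
```

With these invented values the rule stops at the 6th order, because orders 7 to 9 reduce the statistic by less than 5%. Applying the rule separately to each SOC range would give one selected order per range.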

Discussion
In this study, we showed that the lithium-ion battery is a complicated non-linear electrochemical system with multiple physicochemical processes. The simplified electrochemical modeling analyses suggest that the 9th order linear model structure can provide a comprehensive cell-level description of the main electrochemical reaction processes with little loss in accuracy. In addition, the overall performance of the 9th order battery model depends not only on the electrolyte and electrode materials, but also on the operating conditions and the choice of a large number of physical parameters. However, the calculation process of the model depends on an in-depth understanding of the thermodynamic characteristics and physical parameters of the battery, which can be costly and time-consuming to obtain with experimental methods. While not all the results were significant, the overall direction of the investigation showed trends that could be helpful for further simplifying the battery model from the electrode-level to the system-level. Furthermore, the system-level experimental results of the LiFePO4 battery model identification demonstrate a significant variation of dynamic properties and model parameters across different SOC working points. Meanwhile, the validation experiments indicated that it is important to choose the appropriate model order according to the operating SOC range. The boxplots of the tracking error distribution suggested that the 6th, 3rd, and 5th order battery models are the optimal choices in the SOC ranges of 90% to 100%, 90% to 10%, and 10% to 0%, respectively. Further data analysis showed that increasing the model order beyond the optimal one barely improves the accuracy, possibly due to the limited sampling precision and the additional white noise in the collected data sets. One limitation of our research is that our data refer to only one kind of lithium-ion battery (LiFePO4). Clearly, the data sets of LiFePO4 batteries are not enough to make generalizations about all types of lithium-ion batteries. However, the innovative approaches proposed in our research are generic, and the same approaches can be used for conducting similar research on other types of lithium-ion batteries.
In contrast to other research studies [22][23][24][25][26], we used the data-driven PBSID identification algorithm instead of physicochemical parameter measurements to obtain the model parameters. This choice is motivated mainly by the time-efficient and application-oriented advantages of the PBSID algorithm: it allows researchers to identify battery models with multiple inputs and multiple outputs and does not require any physicochemical information.
In addition, it should be noted that the identification of the battery model was performed under random high-rate charging and discharging conditions over wide SOC ranges. It is expected that the accuracy of the identified models will improve considerably when they are applied to more realistic operating conditions, such as cyclic charging and discharging with 10-90% depth of discharge (DOD) in EV and PHEV applications.

Conclusions
To summarize, in this paper we investigated the electrode-level modeling, cell-level model reduction, and system-level model identification of the lithium-ion battery. This work demonstrated that the lithium-ion battery is a complicated electrochemical system with multiple electrode-level physicochemical processes, such as mass and charge conservation and electrochemical kinetics. We showed that it is possible to build a reduced 9th order battery model through cell-level physicochemical and mathematical theories, including the volume-average analysis method and the small-signal analysis method.
The system-level predictor-based subspace identification algorithm was presented, and its effectiveness for the estimation of the lithium-ion battery model was shown. This data-driven identification technique does not require any information about the physicochemical processes, which makes the proposed modeling and identification method generic and applicable to all types of batteries.
The effectiveness and robustness of the proposed methods were shown in an experimental study, where the algorithms were applied to the data sets of the hybrid pulse test and the UDDS driving cycle test. It can be concluded that the PBSID algorithm performs very well, with high precision and good robustness.
A statistical study of the tracking error distribution in different SOC ranges was conducted based on the identified battery models with different model orders. The comparison results revealed that the model order should be further reduced to achieve an optimal balance between high precision and low complexity. We showed that the 6th, 3rd, and 5th order battery models are the optimal choices in the SOC ranges of 90% to 100%, 90% to 10%, and 10% to 0%, respectively. Future work on this subject will involve the application of the proposed approaches to data from other types of lithium-ion batteries.

electrode plate area (cm²)
φ_s  electrical potential in solid phase (V)
φ_e  electrical potential in electrolyte phase (V)
η  overpotential (V)
U_p  thermodynamic equilibrium potential of positive electrode (V)
U_n  thermodynamic equilibrium potential of negative electrode (V)
α_a  anodic transfer coefficient
α_c  cathodic transfer coefficient
R_f  contact resistance (Ω·cm²)
δ_n  negative electrode width (cm)
δ_sep  separator width (cm)
δ_p  positive electrode width (cm)
T  absolute temperature (K)
avg  subscript related to volume average