Integrated Approach Based on Dual Extended Kalman Filter and Multivariate Autoregressive Model for Predicting Battery Capacity Using Health Indicator and SOC / SOH
Energies 2020, 13, 2138

Abstract: To enhance the efficiency of an energy storage system, it is important to predict and estimate the battery state, including the state of charge (SOC) and state of health (SOH). In general, the statistical approaches for predicting the battery state depend on historical data measured via experiments. The statistical methods based on experimental data may not be suitable for practical applications. After reviewing the various methodologies for predicting the battery capacity without measured data, it is found that a joint estimator that estimates the SOC and SOH is needed to compensate for the data shortage. Therefore, this study proposes an integrated model in which the dual extended Kalman filter (DEKF) and autoregressive (AR) model are combined for predicting the SOH via a statistical model in cases where the amount of measured data is insufficient. The DEKF is advantageous for estimating the battery state in real-time and the AR model performs better for predicting the battery state using previous data. Because the DEKF has limited performance for capacity estimation, the multivariate AR model is employed and a health indicator is used to enhance the performance of the prediction model. The results of the multivariate AR model are significantly better than those obtained using a single variable. The mean absolute percentage errors are 1.45% and 0.5183%, respectively.


Introduction
Under the Paris Agreement, which aims to prevent catastrophic climate change, the current energy system requires a rapid global shift toward decarbonization in all sectors, such as industry, transportation, and residential and commercial buildings, through the use of renewable energy [1]. The generation of renewable energy must be increased to achieve decarbonization of power and energy systems. Because renewable energy resources are intermittent, an energy storage system (ESS) is essential for the transition to a flexible and reliable sustainable energy system [2]. Among the different types of batteries, the lithium-ion battery is effective for applications requiring high power and high energy density and is widely used in applications ranging from small electronics to electric vehicles and utility-scale storage [3].
In the global market, the installation of ESSs is expected to increase exponentially, from 9 GW/17 GWh deployed as of 2018 to 1095 GW/2850 GWh expected by 2040 [4]. Studies have indicated that the ability to store and release electricity nearly instantaneously offers multiple benefits in the power system not only for the integration of diverse renewable energy sources but also for grid reliability [5].
To enhance grid efficiency, it is important to consider both the battery cost and the performance of the ESS over its lifespan [6]. The cost of battery packs is decreasing by an average of 13% per year and is expected to reach $156/kWh in 2019 and $100/kWh in 2023 [7]. Because improvements in battery performance can reduce operation and management costs, the battery management system (BMS) is becoming increasingly important for managing and monitoring the battery state more efficiently. Accurate estimation and prediction of the state of charge (SOC) and the state of health (SOH) help the BMS prolong the battery cycle life and reduce the cost of battery replacement [8].
The SOC represents the remaining capacity relative to the stored capacity [9]. The SOC varies between 0% and 100%, which correspond to the fully discharged and fully charged states, respectively. However, if the battery is aged, the maximum value of the SOC decreases because of the capacity degradation. When the capacity is reduced to 80% of the initial value, the usable SOC region of the battery is reduced by 20%. Thus, accurate capacity estimation is required for the efficient and safe operation of the battery [10]. Research on estimating the SOC and SOH according to the variability of the battery characteristics is crucial for developing an advanced BMS. Among the estimation strategies, the Kalman filter is advantageous because it adapts flexibly to changes in the battery characteristics [8,10].

Literature Review
The SOH is the ratio of a characterization parameter (e.g., capacity or internal resistance) in the present condition to its initial value. The remaining useful life (RUL) is the number of cycles available from the present SOH until the failure threshold is reached. For efficient utilization of a battery, it is necessary to identify the aging mechanism and to predict the capacity loss based on the SOH, which has been studied from different perspectives [11,12].
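As a small illustration of these two definitions, the sketch below computes a capacity-based SOH and counts the cycles remaining until a failure threshold is crossed. The 80% threshold, the 3300 mAh initial capacity, and the linear fade of 0.7 mAh per cycle are illustrative assumptions, not values taken from the aging test, and the function names are hypothetical.

```python
def soh(capacity_mah, initial_capacity_mah):
    """Capacity-based SOH: present capacity divided by initial capacity."""
    return capacity_mah / initial_capacity_mah


def remaining_useful_life(capacity_by_cycle, current_cycle,
                          initial_capacity_mah, threshold=0.8):
    """Cycles left after current_cycle until SOH first drops below threshold.

    capacity_by_cycle is a (predicted) capacity trajectory indexed by cycle.
    Returns None if the trajectory never crosses the threshold.
    """
    for cycle in range(current_cycle, len(capacity_by_cycle)):
        if soh(capacity_by_cycle[cycle], initial_capacity_mah) < threshold:
            return cycle - current_cycle
    return None


# Hypothetical trajectory: a 3300 mAh cell losing 0.7 mAh per cycle.
trajectory = [3300.0 - 0.7 * n for n in range(1200)]
rul = remaining_useful_life(trajectory, current_cycle=100,
                            initial_capacity_mah=3300.0)
```

In practice the trajectory would come from a prediction model rather than a fixed fade rate; the point here is only how the SOH ratio and the threshold crossing define the RUL.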
From an economic viewpoint, because the battery wear cost can be calculated according to the achievable cycle count, depth of discharge, battery size, and round-trip efficiency [13], the SOH can be a reference to determine the battery replacement cost and economic utilization cost for maximizing the profitability [14-16]. From a technical perspective, the battery cycle life and SOH can be utilized to optimize the energy management strategy by determining the threshold for energy capability [17]. In the long term, SOH management improves the lifecycle by assessing the impact of battery aging using a health indicator (HI), which tracks the degradation of batteries [17].
In recent studies, estimation methods of the battery capacity have employed the correlation between the characterization parameters and aging. An important issue is that the estimation method for the HI must be selected properly to reflect the nonlinear pattern of degradation. The integrated HI is defined in the aging model by incorporating a certain period of capacity and resistance. Xiong et al. [18] investigated the correlation between the partial charge capacity obtained in a certain region of the voltage and the capacity for extracting the effective HI in the charging mode. A linear aging model was constructed using the moving-window-based method considering the HI. Zheng et al. [19] utilized the HI to determine the optimal charge region of the fixed voltage window and proposed an online capacity estimation method that incorporates both the discrete Arrhenius aging model and extended Kalman filter. Liu et al. [20] proposed a relevance vector machine using the optimized HI to enhance the accuracy and stability of RUL prediction. Zhou et al. [21] optimized the HI considering the different times of voltage drops in the discharge curve and used the Box-Cox transformation to stabilize the variance. Even though these methods are widely used for estimating the SOH because the capacity is highly dependent on the charge/discharge region, the HI extraction is limited to the charge/discharge region. If the battery voltage does not reach a certain region, the HI cannot be updated and the prediction performance is unreliable. Therefore, aging characterization parameters based on real-time methods are needed to estimate the SOH accurately.
For selecting an appropriate HI in real time, the resistance is the most influential candidate for reflecting the aging condition of the battery. The resistance is related to the solid-electrolyte interphase (SEI) layer in the physicochemical mechanism of the battery. One of the most influential factors affecting the capacity loss with aging is the SEI layer growth [22]. Depending on the cell size, capacity, cell design, and types of active material in the battery, the thermal abuse tolerance of the battery can vary with the SEI formation [23]. Even though the SEI layer has a strong relationship with the capacity loss, it is difficult to identify its multi-physical behavior and thermodynamics in real-time applications because of the model's complexity and the large number of parameters [24]. The SEI growth for the prediction of the SOH can be simplified by using electrochemical impedance spectroscopy (EIS) and an equivalent circuit model (ECM). The resistance of the SEI layer can be extracted from the high-frequency region of the EIS results and the parameters from EIS can be applied to the ECM [25]. This method has the advantage of quickly extracting the parameters to be implemented in the BMS board [26]. However, fitting the model is computationally complex. Additionally, EIS is difficult to apply to real-time estimators and applications because the experiment is costly [27].
Due to this limitation, the method for estimating the SOH in real-time applications needs to define the relationship between the internal resistance based on the direct-current signal and the capacity. Pan et al. [28] proposed an online SOH estimation method using machine learning and the ECM parameters, such as the enlarged ohmic and polarized internal resistances. Because these parameters are intuitive values and are flexible for different types of dynamic load profiles, they can be measured by an online parameter identification algorithm. However, the machine learning method is difficult to apply to online applications because of complex computation.
In recent years, to enhance the flexibility of the capacity model, the prediction model has been merged with the real-time estimation method without measured data. Qie et al. [29] proposed the RUL prediction method based on the SOC/SOH joint estimator. The SOC/SOH was estimated using a multiscale hybrid Kalman filter and the estimated SOH was used to update the RUL prediction model. Xue et al. [30] presented an integrated algorithm. These studies revealed it is difficult to construct the prediction model and the relationship between the capacity loss and other factors because of the nonlinear characteristics of the battery capacity according to the cycle and historical conditions. The studies indicated the limitation of conventional prediction methods that use the measured capacity for constructing the aging model. The measured capacity may not apply to real-time applications because the capacity in a real application has a different trend. Therefore, researchers should consider methods for estimating the SOH or optimization methods of the HI.
To address the need for a capacity model and HI selection, an integrated model is proposed herein for predicting the battery capacity using a dual extended Kalman filter (DEKF) and a multivariate autoregressive (AR) model when the availability of measured capacity data is limited. The capacity model was developed using the DEKF based on the ECM and the SOC-open-circuit voltage (OCV) relationship, which reflects the aging conditions and battery characteristics. The multiple HIs are analyzed using the correlation coefficient and extracted using a parameter identification method based on the battery model. To verify the proposed method, this study examines the effect of the multiple HIs on the performance of the prediction model by dividing the cases into those with relatively better and relatively worse estimation performance of the DEKF.
The remainder of this paper is organized as follows. Section 3 presents the experimental setup, the algorithm for the battery parameters, state estimation, and the prediction model. Section 4 presents the simulation results for capacity prediction compared with the conventional method and the integrated model. Section 5 presents the conclusions and directions for future research.

Experimental Conditions for a Battery Aging Test
To validate the proposed method and accumulate the aging data, the experimental test bench shown in Figure 1 was used. The test bench was set up to collect the experimental data and consisted of a battery charge/discharge regulator (Maccor 4300K), a thermal chamber, and a personal computer (PC). The battery regulator charged and discharged the battery. The computer controlled the battery regulator and recorded the voltage and current data. The thermal chamber was used to maintain a constant temperature of 25 °C. The battery used in the experiment was a nickel manganese cobalt-oxide (NMC) battery with a rated discharge capacity of 3.3 Ah and a rated voltage of 3.56 V, as represented in Table 1.
The short-period profile is insufficient for capturing the change in capacity. To obtain the change in capacity over charge and discharge cycles, a cycling test was performed, as shown in Figure 2. The overall profile exhibited repeated charging and discharging, as shown in Figure 2a. Figure 2b indicates that the battery was aged in the experiment with charging and discharging at 3.3 A. In this study, since the coulombic efficiency of the NMC battery is almost 100% within the useful life, the discharge capacity is used for the prediction model [31].

Battery Equivalent Circuit Model
The ECM corresponding to the battery is essential for estimating the internal state. The ECM consists of the SOC, OCV, ohmic resistance (R0), diffusion resistance (R1), and diffusion capacitance (C1), as shown in Figure 3. The OCV depends on the SOC and on time, and it is defined as a function of the SOC. The SOC is the ratio of the remaining capacity of the battery to the nominal capacity (Cn). The SOC is calculated using the ampere-hour (Ah) counting method, as follows.

$$SOC_k = SOC_0 - \sum_{j=1}^{k} \frac{i_j \, \Delta t}{C_n} \tag{1}$$

where $SOC_0$ represents the initial SOC, $\Delta t$ represents the sampling time of the experimental setup, $i_k$ represents the current applied to the battery model (taken here as positive during discharge), and $C_n$ is the nominal capacity of the battery. The diffusion voltage ($V_1$) across the parallel circuit of the resistance and capacitance is represented as:

$$V_{1,k+1} = V_{1,k} \, e^{-\Delta t/\tau} + R_1 \left(1 - e^{-\Delta t/\tau}\right) i_k \tag{2}$$

where $\tau = R_1 C_1$ is the time constant of the battery. The terminal voltage ($V_{t,k}$) of the battery pack is represented as:

$$V_{t,k} = OCV(SOC_k) - V_{1,k} - R_0 \, i_k \tag{3}$$

Figure 3. Battery equivalent circuit model.
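The Ah-counting, diffusion-voltage, and terminal-voltage relations described above can be sketched as a short simulation. This is a minimal illustration under stated assumptions: discharge current is taken as positive, the nominal capacity is given in Ah and converted to ampere-seconds, the OCV curve is a hypothetical straight line, and the resistance and capacitance values are placeholders rather than identified parameters.

```python
import math


def simulate_ecm(currents_a, dt_s, soc0, cn_ah, r0, r1, c1, ocv_of_soc):
    """Simulate a first-order ECM; discharge current is positive (assumed).

    Returns a list of (SOC, V1, terminal voltage) tuples, one per sample.
    """
    tau = r1 * c1                          # time constant of the RC branch
    decay = math.exp(-dt_s / tau)
    soc, v1 = soc0, 0.0
    trace = []
    for i_k in currents_a:
        v_t = ocv_of_soc(soc) - v1 - r0 * i_k        # terminal voltage
        trace.append((soc, v1, v_t))
        soc -= i_k * dt_s / (cn_ah * 3600.0)         # Ah counting (Ah -> As)
        v1 = v1 * decay + r1 * (1.0 - decay) * i_k   # diffusion voltage update
    return trace


def linear_ocv(soc):
    """Hypothetical OCV curve, for illustration only."""
    return 3.0 + 1.2 * soc


# 3.3 A (1 C for a 3.3 Ah cell) applied for 360 s removes 10% SOC.
trace = simulate_ecm([3.3] * 361, dt_s=1.0, soc0=1.0, cn_ah=3.3,
                     r0=0.03, r1=0.02, c1=1500.0, ocv_of_soc=linear_ocv)
final_soc, final_v1, final_vt = trace[-1]
```

With a time constant of 30 s, the diffusion voltage has essentially settled at R1 times the current by the end of the pulse, so the final terminal voltage is the OCV minus the two resistive drops.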

The purpose of the OCV test is to identify the electrical potential capability and the parameters of the ECM according to the SOC, as shown in Figure 4. The OCV test is conducted to acquire the OCV according to the SOC. The fully charged battery (SOC 100%) is discharged with a constant current (CC) in steps of 5% SOC, and the OCV points are extracted at 5% intervals over the SOC range of 0-100%. The applied current of the pulse test is set to 1 C-rate, where the C-rate expresses the current as a multiple of the rated capacity. Figure 4 represents the method of parameter identification. The OCV is obtained in the rest region, where no current is applied, as shown in Figure 5. The ohmic resistance (R0), diffusion resistance (R1), and capacitance (C1) are obtained from the voltage response while the current is applied, using ΔV0, the voltage drop in OCV; ΔV10s, the voltage drop during the discharge at 10 s; and τ, the time constant, as shown in Figure 5.
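Because the OCV is sampled only at 5% SOC intervals, a model that needs the OCV at an arbitrary SOC has to interpolate between the test points. The sketch below shows one simple option, piecewise-linear interpolation with clamping at the ends of the table. The helper name and the OCV values are hypothetical placeholders, not measured data from the OCV test.

```python
from bisect import bisect_right


def make_ocv_lookup(soc_points, ocv_points):
    """Return a function that linearly interpolates OCV over a sorted SOC grid."""
    def ocv(soc):
        # Clamp to the tabulated range so the lookup never extrapolates.
        soc = min(max(soc, soc_points[0]), soc_points[-1])
        hi = min(bisect_right(soc_points, soc), len(soc_points) - 1)
        lo = hi - 1
        frac = (soc - soc_points[lo]) / (soc_points[hi] - soc_points[lo])
        return ocv_points[lo] + frac * (ocv_points[hi] - ocv_points[lo])
    return ocv


# 21 breakpoints at 5% spacing over 0-100% SOC.
soc_grid = [i / 20 for i in range(21)]
# Placeholder OCV values on a straight line; a real table comes from the test.
ocv_grid = [3.0 + 1.2 * s for s in soc_grid]
ocv_lookup = make_ocv_lookup(soc_grid, ocv_grid)
```

The slope of each segment also gives a simple estimate of dOCV/dSOC, which an EKF needs for its measurement Jacobian.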

Multiple Adaptive Forgetting Factor-Recursive Least Square (MAFF-RLS) Algorithm
During battery operation, the battery parameters vary with the number of cycles. The simplified battery model shown in Figure 3 is used. By identifying these parameters, it is possible to estimate the reduction in battery capacity indirectly. To estimate the battery parameters and merge them with the time-series model, the autoregressive exogenous (ARX) model is used to apply the recursive method to the battery terminal voltage, as indicated by Equation (5) [32]. The input vector (Φk) and parameter vector (εk) are defined as shown in Equations (6) and (7), respectively. The battery parameters can be rearranged as in Equation (8).
The conventional RLS uses a single forgetting factor for all variables. However, because the battery state changes and the model has multiple parameters, the optimal forgetting factor can differ from one parameter to another. In particular, parameters such as the OCV, resistance, and capacitance have different characteristics and change according to the aging condition. Therefore, a single, constant forgetting factor cannot maintain the optimal estimation performance. To reflect the aging condition, in this study, we estimated the parameters using the MAFF-RLS method reported in Reference [32].
In the first step, the initial values are defined. The multiple forgetting factors (λi,k) are calculated as follows.
where ζk is the constant parameter for the forgetting factor and Ei,k is the error covariance of the parameters. The multiple adaptive gains (Li,k) and the error covariance are calculated as in Equations (10) and (11), respectively. The adaptive gain is updated using Equation (12), and the parameter vector is calibrated as shown in Equation (13).
Figure 6 compares the SOC-OCV curves extracted from Figure 4. The fresh curve is the initial condition, and the aged curve is obtained after the battery has been charged and discharged for 700 cycles. The tendency of the two OCV-SOC curves is very similar, and the average difference is about 0.014 V. The OCV changes little in the SOC 100% region, whereas the OCV at SOC = 0% could be considered a candidate for the HI because the difference there is the largest in the entire SOC region. However, since the OCV depends far more on the SOC than on the number of cycles, it is not suitable to define the OCV as an HI.
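The recursive identification steps described earlier in this section (gain, error covariance, and parameter update with more than one forgetting factor) can be illustrated with a generic RLS that applies a fixed forgetting factor to each parameter. This is a sketch under stated assumptions and not the MAFF-RLS of Reference [32]: the adaptive rule that recomputes the forgetting factors at every step is not reproduced, the ARX coefficients are hypothetical, and the data are synthetic and noise-free.

```python
import math
import random


def rls_step(theta, P, phi, y, lambdas):
    """One RLS update with one forgetting factor per parameter.

    theta: parameter estimates, P: symmetric covariance (list of lists),
    phi: regressor vector, y: measured output, lambdas: factors in (0, 1].
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [p / denom for p in P_phi]
    error = y - sum(phi[i] * theta[i] for i in range(n))   # prediction error
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # Covariance update, then inflation of entry (i, j) by 1/sqrt(l_i * l_j),
    # so parameters with smaller factors forget old data faster.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / math.sqrt(lambdas[i] * lambdas[j])
          for j in range(n)] for i in range(n)]
    return theta, P


def identify_arx(true_theta, lambdas, steps=1000, seed=0):
    """Identify y_k = a1*y_{k-1} + b0*u_k + b1*u_{k-1} from synthetic data."""
    rng = random.Random(seed)
    a1, b0, b1 = true_theta
    theta = [0.0, 0.0, 0.0]
    P = [[1e4 if i == j else 0.0 for j in range(3)] for i in range(3)]
    y_prev, u_prev = 0.0, 0.0
    for _ in range(steps):
        u = rng.uniform(-3.3, 3.3)              # synthetic current excitation
        y = a1 * y_prev + b0 * u + b1 * u_prev  # noise-free ARX output
        theta, P = rls_step(theta, P, [y_prev, u, u_prev], y, lambdas)
        y_prev, u_prev = y, u
    return theta


# Hypothetical ARX coefficients and forgetting factors.
estimate = identify_arx(true_theta=(0.9, -0.03, 0.02),
                        lambdas=(0.985, 0.98, 0.98))
```

A smaller forgetting factor lets a parameter track changes faster at the cost of more noise sensitivity, which is the motivation for giving each parameter its own factor.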

Parameter Estimation Results and Relationship with Capacity
Using the Multiple Adaptive Forgetting Factor-Recursive Least Square (MAFF-RLS) method, R0 and R1 were identified from the time-series data. These parameters were utilized as HIs to supplement the nonlinear relationship of variables in the prediction model. As the cycles progressed, the battery capacity decreased while both the ohmic resistance (R0) and the diffusion resistance (R1) increased, as shown in Figure 7a,b. Therefore, the results show that the capacity and the two resistances are related to each other.
Figure panel (a): Relationship between ohmic resistance and capacity.
The Pearson correlation analysis (PCA) was used to confirm the linear relationship between the estimated resistance and capacity. The correlation coefficient (r) is calculated by the equation below.

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{14}$$

where $n$ is the sample size, $X_i$ and $Y_i$ are the individual sample points indexed by $i$, and $\bar{X}$ and $\bar{Y}$ are the sample mean values of the two variables. When the two variables are not normally distributed, the Spearman rank correlation (SRC) is used to evaluate the monotonic relationship between $X_i$ and $Y_i$ via the same procedure. The closer r is to 1 or −1, the more linear the relationship between the two variables. If the magnitude of r is larger than 0.8, the relationship between the two variables has strong linearity [33]. The results of the correlation analysis are presented in Table 2. R0 was almost linearly related to the capacity because the coefficient was close to −1. Even though the magnitude of the R1 coefficient was smaller than that of the R0 coefficient, it was larger than 0.8. This indicates a strong relationship between the capacity and the resistance, as shown in Figures 8 and 9. R0 had a stronger linear relationship with the capacity than R1. Therefore, the ohmic and diffusion resistances were selected as the HIs for developing the multivariate prediction model. With these two HIs, we examined whether the prediction performance improves, by compensating for the nonlinearity of the prediction model, in the case where the estimated capacity is inaccurate compared with the measured data.
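The correlation check described above can be sketched in a few lines. The Pearson coefficient follows the standard definition, and the Spearman coefficient is computed as the Pearson coefficient of the ranks. The ranking helper assumes no tied values to keep the sketch short, and the capacity and resistance samples are hypothetical numbers chosen only to show a decreasing capacity with an increasing resistance.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks from 1 to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_r(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical samples: capacity falls while ohmic resistance rises.
capacity_ah = [3.30, 3.25, 3.19, 3.10, 2.98]
r0_ohm = [0.030, 0.031, 0.033, 0.036, 0.041]
r_pearson = pearson_r(capacity_ah, r0_ohm)
r_spearman = spearman_r(capacity_ah, r0_ohm)
```

A coefficient whose magnitude exceeds 0.8 would, by the criterion cited above, qualify the resistance as strongly linearly related to the capacity.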

Dual Extended Kalman Filter
The dual extended Kalman filter (DEKF) is a recursive algorithm that estimates two states of the system by merging the previous state with the measured value. A single extended Kalman filter (EKF) has the advantages of a relatively simple mechanism and a small computational burden. Although the EKF is widely accepted, this approach has several limitations. Nikolaos et al. [34] analyzed the performance of the EKF and DEKF at different degrees of SOH. Although the DEKF shows weaker performance than the EKF in the fresh condition, the DEKF enhances the convergence ability of the observer as the SOH decreases. As a result, the DEKF performs relatively better than the EKF in the long term. Thus, for merging the prediction model with the state estimator, a single EKF is not suitable for maintaining the stability and accuracy of the observer under battery aging [34,35]. To identify the SOC and capacity of a battery simultaneously and improve the stability of the observer, the DEKF applies two EKFs to the ECM. The two EKF observers run in parallel to estimate the SOC and capacity.
Owing to the battery parameter observer, the DEKF does not need experimental data for composing the functions. The parameter observer calibrates the OCV, ohmic resistance (R0), and diffusion resistance (R1) according to the ECM voltage error. Because the observer reflects the nonlinear characteristics of the battery and thereby improves the accuracy of the ECM, the DEKF can estimate the SOC and capacity more accurately than the offline method.

SOC Estimation
To estimate the SOC, nonlinear state-space equations $f(x_k, u_k, \theta_k)$ and $z(x_k, u_k, \theta_k)$, which represent the process and measurement equations of the battery, are needed. The state-space matrix and the input are expressed as follows.
where $\hat{x}_k$ represents the estimated value of the battery state and $P_k$ represents the error covariance, which indicates the deviation of the estimated value from the true value. $P_k$ influences the state estimation performance because it affects the Kalman gain, as indicated by Equation (22). The system input is the current, which is represented by the equation below.
The process equation of the ECM, which yields $SOC_k$ and $V_1$, is expressed as follows.
where $w_x$ and $v_x$ are the state and measurement noise of the state filter, respectively. The measurement function of the ECM is defined as follows.
In the first EKF, the state and system matrices are defined as Jacobian matrices obtained by differentiation with respect to $x_k$ for linearization, as follows.
The state filter proceeds recursively as follows. In the first step, the initial values ($x_0$ and $P_0$) and noise parameters ($Q_x$ and $R_x$) of the system are set. The second step is the prediction step, in which the prior estimate ($\hat{x}^-_{k+1}$) and the error covariance ($P^-_{k+1}$) are calculated. This step relies on the system model corresponding to the ECM. The final step is the innovation step, in which the Kalman gain ($K^x_k$) is calculated using the system variables ($H_k$ and $R_x$) from the prediction step. The posterior estimate $\hat{x}^+_{k+1}$ is obtained by adding to the prior estimate the product of the Kalman gain and the measurement error. Because the Kalman gain adjusts the state, the error covariance, which indicates the difference between the estimate and the true value, is then updated. All the calculation steps are repeated at each sampling time.
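The three steps of the state filter can be illustrated with a minimal sketch. It assumes a first-order RC equivalent circuit with state $[SOC, V_1]$, a placeholder linear SOC-OCV curve, and illustrative parameter values; none of these numbers, nor the names `f`, `h`, and `ekf_step`, come from this study, and the equations used by the authors may differ in detail.

```python
import numpy as np

# Hypothetical first-order RC model parameters (illustrative, not from the study).
DT = 1.0                         # sampling time [s]
CAP = 2.0 * 3600.0               # capacity [A s] of an assumed 2 Ah cell
R0, R1, C1 = 0.05, 0.02, 2000.0  # ohmic resistance, RC branch
A_RC = np.exp(-DT / (R1 * C1))   # discrete-time decay of the RC branch

def ocv(soc):
    # Placeholder linear SOC-OCV curve; a real curve comes from test data.
    return 3.0 + 1.2 * soc

def d_ocv(soc):
    # Slope of the placeholder curve (constant because the curve is linear).
    return 1.2

def f(x, i):
    """Process model: coulomb counting plus RC relaxation (discharge i > 0)."""
    return np.array([x[0] - DT / CAP * i,
                     A_RC * x[1] + R1 * (1.0 - A_RC) * i])

def h(x, i):
    """Measurement model: terminal voltage of the ECM."""
    return ocv(x[0]) - x[1] - R0 * i

def ekf_step(x, P, i, v_meas, Q, R):
    """One prediction and innovation step of the state filter."""
    # Prediction: propagate the state and its error covariance.
    F = np.diag([1.0, A_RC])                   # Jacobian of f w.r.t. x
    x_prior = f(x, i)
    P_prior = F @ P @ F.T + Q
    # Innovation: Kalman gain from the linearized measurement model.
    H = np.array([[d_ocv(x_prior[0]), -1.0]])  # Jacobian of h w.r.t. x
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T / S
    x_post = x_prior + (K * (v_meas - h(x_prior, i))).ravel()
    P_post = (np.eye(2) - K @ H) @ P_prior
    return x_post, P_post
```

Calling `ekf_step` once per sample with the measured current and terminal voltage reproduces the recursion described above. The initial $P_0$ and the noise settings `Q` and `R` are tuning choices that the sketch leaves to the caller.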

Capacity Identification
The second EKF observer identifies the ohmic resistance and capacity of the battery. The weight vector is defined by the formula below.
where $\theta_k$ is the parameter vector containing the ohmic resistance and capacity of the battery. The process equation of the weight filter and the measurement equation of the second EKF are as follows.
where $w_\theta$ and $v_\theta$ are the state and measurement noise of the weight filter, respectively. For the successful operation of the DEKF, the linearization of the weight filter in the innovation step of $\mathrm{EKF}_\theta$ is essential. The Jacobian matrix is defined by the equation below.
Identifying $C_k$ is more complex than identifying $H_k$ because the relationship between the estimated terminal voltage of the ECM and $\theta_k$ is not determined directly. To obtain the Jacobian matrix, we use the relationship between the partial derivative and the total derivative, decomposing Equation (26) as follows.
The calculation of the Kalman gain and the calibration procedure are the same as in Equations (21) and (22), where $S_k$ is the error covariance of $\theta_k$. The DEKF procedure is implemented as follows.
1. Initial value setting and adjustment: According to the measured voltage and the SOC-OCV relationship, the initial values of the system variables ($OCV_0$, $H_0$, and $P_0$) are determined automatically. The OCV is applied to the ECM, and the initial value of $H_k$ is calculated using the SOC-OCV curve based on the experimental data, as shown in Figure 6. According to the OCV-SOC curve for the initial condition, the error covariance ($P_0$) is approximated as follows.
2. SOC and capacity estimation: The DEKF estimates the SOC and capacity in a parallel structure, as shown in Figure 10.
In this study, the DEKF was used not only for estimating the state of the battery (SOC and capacity) but also as a real-time input to the prediction model (except for the initial value). The proposed algorithm identifies the parameters according to the MAFF-RLS. These parameters are applied to the DEKF and the prediction model.
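To make the weight filter more concrete, the following is a minimal sketch of one update of $\theta = [R_0, C_n]$, with the Jacobian $C_k$ formed as a total derivative through the predicted state. It assumes the same kind of first-order RC model and placeholder linear SOC-OCV curve as a generic textbook dual EKF; the parameter values and the names `weight_jacobian` and `weight_filter_step` are hypothetical and are not taken from this study.

```python
import numpy as np

DT = 1.0                       # sampling time [s] (assumed)
R1, C1 = 0.02, 2000.0          # RC branch, held fixed in this sketch (assumed)
A_RC = np.exp(-DT / (R1 * C1))
D_OCV = 1.2                    # slope of the placeholder SOC-OCV curve

def ocv(soc):
    # Placeholder linear SOC-OCV curve, not measured data.
    return 3.0 + D_OCV * soc

def predict_state(x, theta, i):
    """State prediction; only the SOC equation depends on the capacity."""
    _, cap = theta
    return np.array([x[0] - DT * i / cap,
                     A_RC * x[1] + R1 * (1.0 - A_RC) * i])

def voltage(x, theta, i):
    """Terminal voltage; depends directly on R0 and indirectly on capacity."""
    r0, _ = theta
    return ocv(x[0]) - x[1] - r0 * i

def weight_jacobian(theta, i, dx_post_prev):
    """C_k = dh/dtheta, combining the direct and state-mediated dependence."""
    _, cap = theta
    F = np.diag([1.0, A_RC])                       # df/dx
    df_dtheta = np.array([[0.0, DT * i / cap**2],  # dSOC/d[R0, Cn]
                          [0.0, 0.0]])             # dV1/d[R0, Cn]
    dx_prior = df_dtheta + F @ dx_post_prev        # d(x prior)/dtheta
    dh_dx = np.array([[D_OCV, -1.0]])
    dh_dtheta = np.array([[-i, 0.0]])              # direct dependence on R0
    return dh_dtheta + dh_dx @ dx_prior, dx_prior

def weight_filter_step(theta, S, x_post_prev, dx_post_prev, i, v_meas, Q_th, R_th):
    # Prediction: the parameters follow a random walk, so only S grows.
    S_prior = S + Q_th
    # Innovation: gain and correction mirror the state filter.
    C, dx_prior = weight_jacobian(theta, i, dx_post_prev)
    resid = v_meas - voltage(predict_state(x_post_prev, theta, i), theta, i)
    K = S_prior @ C.T / (C @ S_prior @ C.T + R_th)
    theta_post = theta + (K * resid).ravel()
    S_post = (np.eye(2) - K @ C) @ S_prior
    # The caller would form dx_post = dx_prior - K_x @ C using the state
    # filter's gain K_x before the next step; that coupling is omitted here.
    return theta_post, S_post, dx_prior
```

In this sketch, the sensitivity of the voltage to the capacity is several orders of magnitude smaller than its sensitivity to $R_0$, which is consistent with the observation that capacity estimation by the DEKF alone has limited performance.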

Autoregressive (AR) Model
The time-series approach helps predict the capacity loss without complete aging data because a prediction model for time-series data assumes that the predicted data are related to past data. The AR model represents the current variable as a linear combination of its values at previous times, as follows.
$$x_t = a_1 x_{t-1} + a_2 x_{t-2} + \cdots + a_p x_{t-p} + \varepsilon_t$$
where $p$ represents the model order, i.e., the number of past data points used, $a_1, \ldots, a_p$ are the coefficients, and $\varepsilon_t$ represents white noise with zero mean. The conventional AR model cannot predict nonlinear trends. Long et al. [36] used an AR model based on particle swarm optimization to determine the optimal order. Because the AR model is affected by the previous trend, the prediction result for capacity fading is characterized by a linear trend. To overcome this limitation of the AR model, Liu et al. [37] proposed an AR model that considers two capacity degradation trends fitted to different aging curves. However, in conventional studies, the capacity values were not practical because they were measured under restricted experimental conditions [12,26]. These findings indicate the following. First, the statistical model requires a large amount of experimental data to fit the capacity fading and an additional nonlinear factor. Second, the values measured in experiments are not useful in real applications because the capacity changes according to the SOC, depth of discharge (DOD), cycle, C-rate, etc. Lastly, if the variation in the battery capacity is nonlinear, it is difficult to predict using a linear relationship.
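The AR recursion and its tendency to extrapolate the trend of the fitting window can be sketched as follows. This is a minimal illustration that estimates the coefficients by ordinary least squares; the function names `fit_ar` and `forecast_ar` are hypothetical, and the order selection by particle swarm optimization mentioned above is not reproduced.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares estimate of the AR(p) coefficients a_1..a_p."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Column j-1 holds x_{t-j} for every target index t = p..n-1.
    lags = np.column_stack([x[p - j : n - j] for j in range(1, p + 1)])
    coeffs, *_ = np.linalg.lstsq(lags, x[p:], rcond=None)
    return coeffs

def forecast_ar(series, coeffs, steps):
    """Iterate the AR recursion, feeding each prediction back as an input."""
    p = len(coeffs)
    history = list(np.asarray(series, dtype=float)[-p:])
    out = []
    for _ in range(steps):
        # Most recent value first, matching the order a_1..a_p.
        nxt = float(np.dot(coeffs, history[::-1][:p]))
        out.append(nxt)
        history.append(nxt)
    return np.array(out)
```

For a capacity series that fades linearly over the fitting window, an AR(2) fit recovers coefficients close to $a_1 = 2$ and $a_2 = -1$, so the forecast simply continues the straight line. This is the linear-trend behavior described above.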

Integrated Model
For the previously mentioned AR model, the estimation and capacity prediction results depend on the accuracy of the measured data and the selection of the health indicator (HI). In this study, the capacity was predicted using the multivariate AR model and the HI. The capacity for the next 500 cycles was predicted from the capacity in the previous 200 cycles. The multivariate AR model incorporated the prediction data of $R_0$ and $R_{\mathrm{diff}}$, which were expected to correlate strongly with the capacity. The equation for predicting the capacity was as follows.
where $t$ represents the prediction time, $C_{n,t-\tau}$ is the indexed value from the previous 200 cycles of data relative to the prediction time, and $p$ represents the number of previous load values used. The prediction for 500 cycles was performed by advancing the prediction time through the iterative multivariate regression model; when no previous measured load value was available, the previously predicted value was used instead.
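The iterative multivariate prediction can be sketched with a generic vector autoregression fitted by least squares. This is an assumption made for illustration: the regression used in this study is not reproduced here, the names `fit_var` and `forecast_var` are hypothetical, and the intercept term is a choice of the sketch.

```python
import numpy as np

def fit_var(data, p):
    """Least-squares VAR(p); data has shape (cycles, variables)."""
    y = np.asarray(data, dtype=float)
    n = len(y)
    # Each regressor row stacks the p previous observations plus a constant.
    rows = [np.concatenate([y[t - j] for j in range(1, p + 1)] + [[1.0]])
            for t in range(p, n)]
    B, *_ = np.linalg.lstsq(np.array(rows), y[p:], rcond=None)
    return B  # shape (p * variables + 1, variables)

def forecast_var(data, B, p, steps):
    """Roll the model forward; predictions stand in for missing measurements."""
    hist = [row for row in np.asarray(data, dtype=float)[-p:]]
    out = []
    for _ in range(steps):
        reg = np.concatenate([hist[-j] for j in range(1, p + 1)] + [[1.0]])
        nxt = reg @ B
        out.append(nxt)
        hist.append(nxt)
    return np.array(out)
```

As a usage example, stacking capacity and a resistance as columns of `data`, fitting on the first 200 cycles, and calling `forecast_var` for the remaining cycles mirrors the fit and prediction split described above. When the resistance grows linearly and the capacity fade rate scales with it, the linear model can follow a curved capacity trajectory that a capacity-only AR model would extrapolate as a straight line.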

Results and Discussion
The capacity prediction results were compared with the measured capacity and with the capacity estimated by the SOC and SOH joint estimator (DEKF). The simulations were implemented for two cases: the measured capacity and the capacity estimated by the DEKF. To validate the multivariate AR model and the integrated model, the capacity prediction results were compared according to the number of variables used for defining the AR model. Then, the performance for the two cases was evaluated using the mean absolute percentage error (MAPE), which was defined as follows.
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{k=1}^{n} \left| \frac{R_k - F_k}{R_k} \right|$$
where $R_k$ is the reference value, $F_k$ is the forecast value, and $n$ is the number of compared points. As shown in Figure 11, the simulation results depended on which variable was applied to the AR model. The red line represents the measured capacity over the 200 cycles of experimental data used for fitting. The blue line represents the prediction results and the green line represents the reference. Figure 11a shows the prediction results obtained using only capacity data for the AR model. The prediction results exhibited a linear trend because the capacity trend was almost linear in the 200 cycles. The MAPE for these results was 4.4006%. Figure 11b shows the results obtained using the capacity and ohmic resistance ($R_0$). Although the MAPE was 1.0606%, the prediction error increased exponentially with the cycle number. According to Figure 11c,d, the prediction results were more reliable, as the MAPE was 0.5324% and 0.4165%, respectively. This indicates that, in addition to the capacity and $R_1$, $R_{\mathrm{diff}}$ reflects the nonlinear characteristics of capacity prediction.
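The MAPE values quoted in this section follow the standard definition, which can be sketched in a few lines. The function name `mape` is hypothetical, and the sketch assumes that no reference value is zero.

```python
import numpy as np

def mape(reference, forecast):
    """Mean absolute percentage error, returned in percent."""
    r = np.asarray(reference, dtype=float)
    f = np.asarray(forecast, dtype=float)
    # Each error is normalized by its own reference value before averaging.
    return 100.0 * np.mean(np.abs((r - f) / r))
```

For example, forecasts of 1.9 Ah and 2.1 Ah against a reference of 2.0 Ah each deviate by 5%, so the MAPE is 5%.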
Figure 11. Capacity prediction results obtained using the autoregressive (AR) model and the measured capacity.
The prediction results of the AR model utilizing the estimated capacity from the DEKF are shown in Figure 12. The red and purple lines represent the estimated capacity. The average MAPE was under 1%. The prediction result in Figure 12a, obtained using the estimated capacity only, is similar to that in Figure 11a. The MAPE of this prediction was 1.4518%, and the prediction model performed better than with the measured capacity because, although the overall trend between 50 and 200 cycles was decreasing, the estimated capacity increased and decreased repeatedly. When $R_0$ and $R_1$ were applied to the AR model, the prediction results were superior to those obtained using the capacity only, as shown in Table 3.
Figure 12. Capacity prediction results obtained using the AR model and dual extended Kalman filter (DEKF) (case 1).
The underestimated capacity of the DEKF is selected to determine the HI, as shown in Figure 13. The underestimated capacity has a more nonlinear trend because the value fluctuates, and its accuracy is lower than that in Figure 12. In Figure 13a,b, the prediction results are almost linear, with MAPEs of 3.6067% and 3.0266%, respectively. Thus, another parameter appears to be required for predicting the nonlinear characteristics and enhancing the prediction performance. When the diffusion resistance is applied to the multivariate AR model, the prediction performance is worse than in the other cases, as shown in Figure 13c, where the MAPE is 9.3908%. If the two resistances ($R_0$ and $R_1$) are applied to the AR model at the same time, the MAPE improves to 2.55%. In particular, the prediction performance improves because the slope changes from approximately 500 cycles, as shown in Figure 13d.
As indicated by the foregoing simulation results, the capacity prediction results depend on the accuracy of the capacity data and on the multivariate HI, which captures the nonlinear trend of the prediction results. Thus, the prediction performance is highly dependent on selecting the appropriate experimental data and HI. The limitation of the AR model, namely that the prediction result is linear, can be resolved by applying another HI. However, the measured capacity cannot be used in real applications because the capacity is only measured under restricted experimental conditions. Therefore, in this study, the AR model was integrated with the joint estimator to estimate the SOC and SOH.

Conclusions
In practical applications, it is difficult to determine the appropriate HI and capacity because the battery parameters fluctuate as the battery ages. The battery cycle is also difficult to determine because it depends on the operating range of the application. Hence, in this study, capacity data were obtained from a real-time estimator for predicting the capacity loss. The multivariate AR model and DEKF were integrated for real-time applications. The HI was estimated via MAFF-RLS, which was synchronized with the DEKF. Two resistances of the battery were applied to the AR model to reflect the nonlinear characteristics of capacity reduction. These resistances were estimated by the ARX model and MAFF-RLS and used as the HI. Additionally, to overcome the limitations of existing methods that rely on measured data, the integration of the multivariate AR model and DEKF was necessary to estimate the capacity values in real-time.
According to the simulation results, the capacity prediction performance depends on aging factors related to the battery capacity. If the accuracy of the capacity value is increased, better performance can be achieved. Therefore, the capacity is the factor with the greatest effect on the prediction performance. However, for practical applications, the AR model should utilize the capacity estimated by a real-time estimator. Future research plans are as follows. First, a significant HI, such as the discharging rate or temperature, will be applied to the AR model because the lifespan is affected by these factors. Second, data-driven methods will be utilized for optimizing both the prediction model and the HI. Our technique can be applied to a wide range of data-driven methods by adopting the adaptive controller, which can estimate the optimal battery parameters in real-time. Third, because the proposed method utilizes the adaptive controller and parameter identification, it has the potential to construct a new model for next-generation batteries, which have different characteristics.
