Sensor Fault Detection and Isolation for Degrading Lithium-Ion Batteries in Electric Vehicles Using Parameter Estimation with Recursive Least Squares

With the increase in usage of electric vehicles (EVs), the demand for Lithium-ion (Li-ion) batteries is also on the rise. The battery management system (BMS) plays an important role in ensuring the safe and reliable operation of the battery in EVs. Sensor faults in the BMS can have significant negative effects on the system, hence it is important to diagnose these faults in real-time. Existing sensor fault detection and isolation (FDI) methods have not considered battery degradation. Degradation can affect the long-term performance of the battery and cause false fault detection. This paper presents a model-based sensor FDI scheme for a Li-ion cell undergoing degradation. The proposed scheme uses the recursive least squares (RLS) method to estimate the equivalent circuit model (ECM) parameters in real time. The estimated ECM parameters are put through weighted moving average (WMA) filters, and then cumulative sum control charts (CUSUM) are implemented to detect any significant deviation between unfiltered and filtered data, which would indicate a fault. The current and voltage faults are isolated based on the responsiveness of the parameters when each fault occurs. The proposed FDI scheme is then validated through conducting a series of experiments and simulations.


Introduction
Lithium-ion (Li-ion) batteries are the most popular form of energy storage in the world, amounting to 85.6% of energy storage systems utilized in 2015. Although they have the highest price, they show the lowest cost per cycle [1]. The substantial demand for Li-ion batteries is due to portable devices and electric vehicles (EVs). Li-ion batteries are used in EVs due to their high power and energy density, long life span, and low environmental impact. EVs require a battery system that consists of hundreds or thousands of single cells. In order to manage this large number of cells, the battery pack needs a battery management system (BMS). The BMS must be accurate and reliable to ensure performance and safety in EV applications. The functions of the BMS include state of charge (SOC) and state of health (SOH) estimation, and over-current and over-voltage protection [2]. These functions rely heavily on voltage and current sensor measurements [3]. The sensors can malfunction during the operation of the battery, due to manufacturing defects or environmental factors. The estimation of the SOC (similar to a fuel meter in conventional vehicles) and the SOH (similar to an odometer) would be affected by any sensor fault, leading to over-charge and/or over-discharge, which would degrade the battery faster. The current and voltage protection would also fail to work properly with faulty sensors. This can lead to more catastrophic failures, since the current and voltage can exceed their operational limits undetected because of incorrect sensor readings [4]. Even though a sensor fault with a small magnitude does not immediately affect the battery performance, it can have a significant effect over time. This can be prevented by detecting and resolving the sensor fault promptly after it develops.
Although the authors are not aware of any published data on the failure rates of BMS sensors in EVs, it is reasonable to anticipate some failures due to the nature of the application. The sensors are subject to vibration and physical damage from collisions, which can ultimately lead to disconnection or resistance build-up of the wires and cause deviations in the readings. Therefore, it is critical to develop an algorithm that can reliably and accurately diagnose any faulty operation of the voltage and current sensors in real time.
Reviews of fault mechanisms and diagnosis approaches for Li-ion batteries can be found in [2,5]. Desirable characteristics of a fault detection and isolation (FDI) scheme include quick detection and diagnosis, isolability, robustness, adaptability, low modelling requirements, and a reasonable balance between storage and computational requirements [6]. Several existing FDI methods accomplish some of these characteristics. An extended Kalman filter was used in [4] to diagnose sensor faults, but fault isolation was not achieved. That study confirms that sensor faults make the SOC estimate inaccurate, which can cause the battery to be over-charged or over-discharged. In [7], the nonlinear parity equation approach, coupled with sliding mode observers, was used to develop an FDI scheme to detect sensor faults for a single battery cell. A set of Luenberger and learning observers was used in [8] for simultaneous single-fault isolation and estimation of a faulty cell in a battery string. In [9], an FDI strategy using structural analysis theory and statistical inference residual evaluation was presented, but the computational effort was rather high. An FDI scheme using sliding mode observers with equivalent output error injection was introduced in [10], with findings that show the false detection rate is affected by the variation in model parameters. All of the methods mentioned above work under the assumption that the battery model parameters remain constant throughout the battery pack's life span. However, the parameters can be affected by degradation, a significant property of battery operation. Cell degradation has not been mentioned in any FDI works or literature.
There are a few models used to illustrate battery behavior, but the equivalent circuit model (ECM) is the most widely used in FDI works [5]. The parameters of the ECM were derived using conservation of species, conservation of charge, and reaction kinetics in [11]. The results show that the parameters have physical meanings and can be affected by the chemistry of the battery, as well as the environment of operation. Therefore, degradation of the battery would have some effect on the parameters. The existing FDI schemes can be improved by integrating degradation into the ECM. However, this has proven to be a difficult task. Currently, battery degradation models can be obtained by fitting experimental data under constant conditions. However, such a model is not appropriate for battery degradation in EV applications, because of the complex operating state of the battery [12]. Experimental models are also less accurate, time-consuming, and costly. Adaptive models are more accurate, but require training to estimate the parameters that correlate with degradation. Moreover, these models can demand a high computational effort, which is not suitable for real-time BMS applications [13]. Another approach is needed to effectively diagnose faults while considering the effect of degradation on ECM parameters, which this paper will present.
The key contribution of this paper is the proposal of a model-based sensor FDI scheme for Li-ion batteries in EVs that takes battery degradation into account. The ECM parameters are expected to change during battery operation due to the effect of degradation. The paper studies and confirms this effect through a series of experiments. The proposed FDI scheme uses the recursive least squares (RLS) method to estimate the ECM parameters in real time, then applies a weighted moving average (WMA) filter coupled with a cumulative sum control chart (CUSUM) to detect any voltage and current sensor faults. The use of RLS is suggested because of its low computational demand and easy implementation [14]. The implementation of the WMA filter eliminates the concern of battery degradation, in addition to the effect of SOC and temperature on ECM parameters. Furthermore, the sensor faults are isolated based on the responsiveness of the parameters when a specific fault occurs. Finally, the Urban Dynamometer Driving Schedule (UDDS) cycle with sensor fault simulation is applied to validate and evaluate the performance of the proposed FDI scheme for a lithium iron phosphate (LFP) cell.
The rest of this paper is organized as follows: Section 2 describes the battery model used for this work, while Section 3 outlines the details of the proposed FDI scheme. Section 4 provides the experimental design and analysis of the effect of degradation and various faults on the parameters. The evaluation of the proposed fault diagnosis scheme is presented in Section 5, and the resulting conclusions are given in Section 6.

Battery Modelling
The most common model used to describe battery behavior in EV applications is the equivalent circuit model. For an LFP battery running highly dynamic drive cycles, such as UDDS, an ECM with at least two RC pairs is recommended [15], because the first-order ECM neglects the effect of diffusion. However, the higher the model order, the more computational effort it demands, due to the larger number of model parameters. The proposed FDI scheme does not require a highly accurate model, since the extracted ECM parameters are used to monitor the state of battery operation rather than to model the battery performance. Therefore, to limit the computational complexity of the approach, the first-order ECM is used in this paper. The simplified ECM is shown in Figure 1, and its state-space equation is given in Equation (1).
In order to perform the proposed recursive approach on the model, an autoregressive exogenous model is needed. This is done by obtaining the transfer function of the battery impedance from Equation (1) in the s-domain, as shown in Equation (2). The transfer function is then discretized using the basic forward Euler transformation, in which s is replaced by (z − 1)/T, where T is the sampling period. The ECM parameters, including R1 and C1, can then be determined from the coefficients of the discretized transfer function, and the autoregressive exogenous model is obtained and rewritten in regression form. The values for OCV (open-circuit voltage) in Equation (12) are determined from the OCV-SOC relationship, established experimentally. This reduces the computational effort of the estimation and gives more accurate ECM parameter estimates. Equations (10)-(12) are used in the proposed RLS algorithm, and Equations (5), (7), and (8) are used to extract the ECM parameters for the purpose of fault diagnosis.
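For orientation, the derivation described in this section can be sketched for a generic first-order ECM. The block below is a standard textbook form under assumed notation (U_t for terminal voltage, U_1 for the voltage across the RC pair, I for current taken as positive on discharge, T for the sampling period, and a_1, a_2, a_3 for the regression coefficients). It is not a reproduction of the paper's Equations (1)-(12), and the equations are left unnumbered so as not to suggest a match.

```latex
% Sketch only: assumed notation, forward Euler substitution s = (z - 1)/T.
\begin{align*}
\dot{U}_1 &= -\frac{U_1}{R_1 C_1} + \frac{I}{C_1}, \qquad
U_t = OCV - R_0 I - U_1 \\
G(s) &= \frac{U_t(s) - OCV(s)}{I(s)} = -R_0 - \frac{R_1}{1 + R_1 C_1 s} \\
G(z) &= \left. G(s) \right|_{s = (z-1)/T}
      = \frac{a_2 + a_3 z^{-1}}{1 - a_1 z^{-1}} \\
a_1 &= 1 - \frac{T}{R_1 C_1}, \qquad a_2 = -R_0, \qquad
a_3 = a_1 R_0 - \frac{T}{C_1} \\
R_0 &= -a_2, \qquad
R_1 = -\frac{a_1 a_2 + a_3}{1 - a_1}, \qquad
C_1 = -\frac{T}{a_1 a_2 + a_3} \\
y_k &= U_{t,k} - OCV_k = a_1 y_{k-1} + a_2 I_k + a_3 I_{k-1}
     = \varphi_k^{\top} \theta, \\
\varphi_k &= \begin{bmatrix} y_{k-1} & I_k & I_{k-1} \end{bmatrix}^{\top}, \qquad
\theta = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}^{\top}
\end{align*}
```

Under these assumptions, the last line is the linear regression that a recursive estimator can track, and the fifth line maps the estimated coefficients back to the physical parameters. Because OCV is subtracted from the measured voltage using the OCV-SOC relationship, only three coefficients need to be estimated.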

Proposed Fault Diagnosis Scheme
In other industrial applications, parameter estimation is a common fault diagnosis method, due to its ability to be implemented online. The method involves the online estimation of the parameters, and the results are compared with a reference model [16]. For real-time identification of ECM parameters, the RLS method is selected because it has low computational demand, fast convergence speed, and can easily be implemented in an embedded system [14]. In this particular case, this method can estimate the model parameters, while adapting to their changes with the degradation and operational conditions of the battery [17]. The resulting estimations are in the form of a time series, for which a change point detection method can be used to diagnose faults [18]. The change point detection method proposed in this paper consists of a WMA filter and a CUSUM control chart.

Recursive Least Squares Estimation
The RLS algorithm used in this paper employs an optimal forgetting factor to give more weight to recent data and to avoid the saturation phenomenon [19]. The forgetting factor is applied to the parameter vector θ. The recursive algorithm for Equation (10) updates the estimated parameter vector θ̂ using the algorithm gain, the covariance matrix P, and the forgetting factor λ, which will be optimized in the range of [0.95, 1]. The initial values θ̂0 and P0 are guessed. The schematic diagram for the RLS algorithm is shown in Figure 2.
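As an illustration of the recursion described above, the following is a minimal sketch of RLS with an exponential forgetting factor, written in its common textbook form. The regression y_k = a1·y_{k-1} + a2·I_k + a3·I_{k-1} with y_k = U_t,k − OCV_k, the coefficient-to-parameter mapping, the class and function names, and the initial covariance value are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


class ForgettingRLS:
    """Recursive least squares with an exponential forgetting factor."""

    def __init__(self, n_params, lam=0.9999, p0=1e6):
        self.lam = lam                    # forgetting factor, 0 < lam <= 1
        self.theta = np.zeros(n_params)   # initial parameter guess (assumed zero)
        self.P = p0 * np.eye(n_params)    # initial covariance guess (assumed value)

    def update(self, phi, y):
        phi = np.asarray(phi, dtype=float)
        err = y - phi @ self.theta                            # prediction error
        gain = self.P @ phi / (self.lam + phi @ self.P @ phi)
        self.theta = self.theta + gain * err
        self.P = (self.P - np.outer(gain, phi) @ self.P) / self.lam
        return self.theta


def ecm_from_theta(theta, T=1.0):
    """Map assumed ARX coefficients (a1, a2, a3) back to (R0, R1, C1)."""
    a1, a2, a3 = theta
    r0 = -a2
    r1 = -(a1 * a2 + a3) / (1.0 - a1)
    c1 = -T / (a1 * a2 + a3)
    return r0, r1, c1


def simulate_arx(r0, r1, c1, current, T=1.0):
    """Noise-free y_k = U_t,k - OCV_k for a forward-Euler first-order ECM."""
    a1 = 1.0 - T / (r1 * c1)
    a2 = -r0
    a3 = a1 * r0 - T / c1
    y = np.zeros_like(current)
    y[0] = a2 * current[0]                # RC pair assumed relaxed at k = 0
    for k in range(1, len(current)):
        y[k] = a1 * y[k - 1] + a2 * current[k] + a3 * current[k - 1]
    return y


def identify(current, y, lam=0.9999, T=1.0):
    """Run the recursion over a record and return the final (R0, R1, C1)."""
    rls = ForgettingRLS(3, lam=lam)
    for k in range(1, len(current)):
        rls.update([y[k - 1], current[k], current[k - 1]], y[k])
    return ecm_from_theta(rls.theta, T)
```

In an online setting, `update` would be called once per sample and `ecm_from_theta` applied to each new estimate, which yields the time series of R0, R1, and C1 that the detection stage monitors.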

Sensor Faults
A fault is defined as a deviation of at least one property or parameter of the system from the standard condition. Faults are commonly classified as actuator faults, sensor faults, and component/parameter faults. They can affect the control action from the controller, produce measurement errors, or change the input/output properties of the system, which leads to degradation and damage of the system [20]. This paper focuses on sensor faults.
Readings from the sensors in the BMS have an important role in estimating other characteristics of the battery. For instance, the measurements from voltage and current sensors can affect the estimation of SOC. A ±1 mV voltage accuracy system used to calculate SOC in a lithium nickel manganese cobalt oxide (NMC) cell can have a base error of 0.2%. If the same accuracy is used to acquire a lithium iron phosphate (LFP) cell's SOC, then a base error of 5.9% can be expected [21].
The BMS current and voltage sensors used in EV applications can be affected by two types of fault: bias (offset) and gain (scaling) faults. A bias fault is a constant offset from the sensor signal during normal operation. A gain fault occurs when the measurement magnitudes are scaled by a factor, while the signal form itself does not change. The faults are considered additive [4]: the measured value of current or voltage from the sensors equals the actual current or voltage plus the sensor fault.
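The two fault types can be illustrated with a short sketch of how they might be injected into a recorded signal. The function names and the sample-index convention are hypothetical. A gain fault is written here in additive form, with the fault term equal to the gain fraction times the true signal, which is one way to reconcile it with the additive model above.

```python
import numpy as np


def inject_bias(signal, bias, start_idx):
    """Add a constant offset to every sample from start_idx onward."""
    faulty = np.array(signal, dtype=float)
    faulty[start_idx:] += bias
    return faulty


def inject_gain(signal, gain, start_idx):
    """Scale samples from start_idx onward; gain=0.10 means a +10% fault.

    Additive view: measured = actual + fault, with fault = gain * actual.
    """
    faulty = np.array(signal, dtype=float)
    faulty[start_idx:] += gain * faulty[start_idx:]
    return faulty
```

For example, `inject_bias(voltage, 0.5, 30000)` on a 1 Hz record would correspond to a +0.5 V voltage bias fault starting at 30,000 s.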

Online Fault Detection Using Weighted Moving Average Filter and Cumulative Sum Control Chart
WMA is a low-pass filter that is used for smoothing fluctuations, such as noise in a time series, to allow for more reliable trend analysis. Additionally, one can use WMA to compute short-term forecasts of time series [22]. The RLS-estimated ECM parameters are time series that contain noise and small fluctuations due to operational conditions (SOC and temperature) and degradation of the cells. A fault, however, is expected to affect the parameters more significantly. Therefore, the difference between WMA-filtered and unfiltered values of the ECM parameters during normal operation of the battery should be considerably smaller than when a fault first occurs. The WMA chosen for the proposed FDI is a two-term WMA to minimize storage requirement. The formula is presented in Equation (17).
where P_f,k is the kth WMA value, P_i,k is the kth unfiltered value obtained from RLS (P represents R0, R1, and C1), and λ_WMA is the weighting factor. The discrepancy between P_f,k and P_i,k is characterized by an absolute fractional error term, as shown in Equation (18).
The error is monitored using CUSUM, a common change-point detection algorithm, which accumulates deviations of data and signals when the cumulative sum exceeds a certain threshold. The algorithm is outlined in Equation (19) [23], where S is the cumulative sum value, with S(e(P_0)) = 0; e is the absolute fractional error from Equation (18); µ_0 and σ are the mean and standard deviation of the error population; and L is a specified constant.
In this paper, the λ_WMA value from Equation (17) is set to 0.01, since a small weighting factor gives a smooth filtered line that adapts to minor changes over a long period of time, such as noise or degradation effects. In Equation (19), the expected value for µ_0 is 0, and σ is estimated experimentally. During normal operation, the unfiltered values should not deviate from the smooth filtered line, because the amplitude of fluctuation is not significant. When a fault occurs, the unfiltered values diverge significantly from the smooth filtered series. The CUSUM algorithm detects this divergence by indicating a fault (F(P_k) = 1) when S(e(P_k)) exceeds an experimentally calibrated threshold J, as shown in Equation (20). When a fault is detected, the BMS produces an alarm, and appropriate actions, such as replacing the faulty sensor, are taken to resolve the fault.
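The detection logic described above can be sketched as follows. The exact forms of Equations (17)-(20) are not reproduced here, so the sketch assumes a recursive two-term weighted average, an error normalized by the filtered value, and a one-sided CUSUM with reference value µ_0 + Lσ. The function name and the default values of `sigma`, `L`, and `J` are placeholders, not the calibrated values from the experiments.

```python
def detect_fault(values, lam_wma=0.01, mu0=0.0, sigma=0.01, L=3.0, J=1.0):
    """Return the first index at which the CUSUM exceeds J, or None.

    values: time series of one RLS-estimated parameter (R0, R1, or C1).
    """
    filtered = values[0]          # start the filter at the first estimate
    s = 0.0                       # cumulative sum, S = 0 initially
    for k in range(1, len(values)):
        # Assumed two-term WMA: previous filtered value and current estimate.
        filtered = (1.0 - lam_wma) * filtered + lam_wma * values[k]
        # Absolute fractional error between unfiltered and filtered values.
        err = abs(values[k] - filtered) / abs(filtered)
        # Assumed one-sided CUSUM: accumulate only the excess over mu0 + L*sigma.
        s = max(0.0, s + err - (mu0 + L * sigma))
        if s > J:
            return k              # fault flag F = 1 at this sample
    return None
```

With λ_WMA = 0.01 the filtered line moves slowly, so slow drifts from SOC, temperature, or degradation keep the error near zero, while an abrupt jump in the estimate produces a large error for many consecutive samples and drives the sum over the threshold. Only one previous filtered value and one running sum are stored per parameter.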
The method outlined in this section can only be used for fault detection, not fault isolation. The full proposed FDI scheme will be shown in Section 4.5, after determining the effects of different sensor faults on ECM parameters. Since there has not been any work done in literature to determine fault effects on parameters, preliminary experiments will need to be performed to obtain this data before completing the full FDI scheme. The isolation will be based on the response time of the parameters when a certain fault occurs.

Effect of Degradation and Faults on ECM Parameters
In order to determine and validate the effect of degradation and faults on the ECM parameters, testing was done on an LFP pouch cell in a laboratory environment. The specifications of the cell at the initial state are listed in Table 1.

Experimental Setup
The experimental setup consists of a battery test system (Maccor 4200), connected to a testing station and a computer. The full setup is shown in Figure 3. All experiments were carried out at a room temperature of 23 °C. The computer has a software program that controls the battery test system to charge and discharge the cell. The current is assumed to be positive when discharging, and negative when charging. The data was collected at a frequency of 1 Hz, and then stored in the computer. Two test profiles were used: a set of multiple UDDS driving cycles, and a degradation cycle. The UDDS cycle is a velocity profile, and was translated and scaled into a current profile. It was run from the cell SOC of 95% to 20%. The degradation cycle involves charging and discharging multiple times between the extreme limits of the cell to degrade it quickly. Profiles of these cycles are shown in Figure 4. Characterization of the cell was also done through performing the OCV-SOC and capacity tests [24]. The sequence of tests began with cell characterization, then the testing cycle (UDDS and degradation), and all were repeated multiple times.

Cell Characterization Results
The cell capacity was captured at the beginning of each testing cycle, and it best represents the cell degradation since capacity decreases with degradation [12]. The results are presented in Table 2. The OCV-SOC relationship was also established and a look-up table was built, which was needed to estimate the cell OCV for the RLS algorithm. The OCV-SOC curve was found to change minimally with cell degradation, hence only one curve was used for all cell capacities in the RLS algorithm. The results can be seen in Figure 5.
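A look-up table of this kind is typically evaluated by interpolation. The sketch below shows one way to do that with linear interpolation. The grid values are placeholders chosen only to make the example runnable; they are not the measured OCV-SOC data of the tested cell, and the function name is hypothetical.

```python
import numpy as np

# Placeholder grid for illustration only (not measured data from this work).
SOC_GRID = np.array([0.0, 0.1, 0.2, 0.5, 0.8, 0.9, 1.0])        # fraction
OCV_GRID = np.array([2.80, 3.20, 3.25, 3.30, 3.33, 3.35, 3.45])  # volts


def ocv_from_soc(soc):
    """Linearly interpolate OCV at the given SOC, clamped to the table range."""
    soc = np.clip(soc, SOC_GRID[0], SOC_GRID[-1])
    return np.interp(soc, SOC_GRID, OCV_GRID)
```

Since a single curve is used for all cell capacities, the same table can serve every run of the RLS algorithm; only the SOC input changes.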

Effect of Degradation on ECM Parameters
The RLS algorithm was used to estimate the ECM parameters for the UDDS driving cycle at different cell capacities. The selected value for λ is 0.9999, as it gives optimal estimation accuracy for the LFP cell tested. Figure 6 shows how degradation affects these parameters. The effect of degradation on R0 does not show any clear trend. However, it can be clearly seen that R1 increases, while C1 decreases, with degradation. This makes sense as the RC pair represents the charge-transfer phenomenon, and degradation can affect the amount of available charge in the battery, which is simply capacity. The changes in these parameters are not significant over a short amount of time, i.e., a few drive cycles, but can be very prominent over the lifetime of the battery. These results confirm that the assumption of constant parameters in existing state observer FDI methods is not valid. Therefore, a reliable FDI scheme should take into consideration the changes in the ECM parameters due to cell degradation.

Effect of Faults on ECM Parameters
Bias and gain faults were injected into the UDDS driving cycles at various cell capacities, times, and sizes. The effects of the faults were found to be similar across fault types, regardless of the injection time and fault size. The changes in the parameters when a fault is injected are more significant than the changes with SOC and temperature [25]. An example is shown in Figure 7, where a voltage gain fault of +10% was injected at 30,000 s. When this fault occurs, as shown in Figure 7b,d,f, the parameters diverge from their original trends. It can also be seen from Figure 7a,c,e that the unfiltered values follow the WMA-filtered line closely during normal operation, while Figure 7b,d,f show that the two lines deviate significantly at the time the fault occurs. This confirms the workability of the proposed change-point detection method using WMA and CUSUM. It is noted that the ECM parameters estimated by RLS require some time to converge. This can be seen at the beginning of Figure 7a-f. Therefore, the proposed FDI scheme would not be able to detect sensor faults during the first hour of battery operation. Considering the long lifespan of Li-ion batteries and the unlikelihood of sensor faults happening within the first hour of operation, it is reasonable to assume there is no fault during the converging period of the RLS algorithm.

Isolation of Faults
Through multiple simulations, it was found that R0 responds the fastest to current sensor faults, while either R1 or C1 responds the fastest to voltage sensor faults. From these findings, it is possible to establish a fault isolation schematic to complement the proposed fault detection method. It is uncertain whether these faults would have the same effects on a different type of cell; this will be examined and further validated in future studies. For this paper, the FDI scheme is based on the observations from the tested LFP cell. The full FDI scheme is shown in Figure 8. This scheme will be used to diagnose faults, and validated through simulation in the next section.
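The isolation rule stated above can be sketched as a comparison of the first alarm times of the three parameters. This is an illustrative reading of the rule, not a transcription of Figure 8. The function name, the dictionary layout, and the handling of ties (a tie between R0 and another parameter is treated as a voltage sensor fault) are assumptions.

```python
def isolate_fault(first_alarm):
    """Classify a fault from the first CUSUM alarm time of each parameter.

    first_alarm: dict mapping "R0", "R1", "C1" to the time (s) at which that
    parameter's CUSUM first exceeded its threshold, or None if it never did.
    """
    alarms = {name: t for name, t in first_alarm.items() if t is not None}
    if not alarms:
        return "no fault"
    t_r0 = alarms.get("R0")
    others = [t for name, t in alarms.items() if name != "R0"]
    # R0 strictly first (or the only alarm): current sensor fault.
    if t_r0 is not None and all(t_r0 < t for t in others):
        return "current sensor fault"
    # Otherwise R1 or C1 responded first: voltage sensor fault.
    return "voltage sensor fault"
```

Because the rule only compares which alarm is raised first, it needs no additional model or residual beyond the three CUSUM charts already used for detection.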

Diagnostic Implementation and Evaluation
This section shows the validation results of the proposed FDI scheme. The UDDS was selected for use in validation, as it is a realistic daily driving cycle. The experimental runs consisted of multiple UDDS cycles. The experimental setup is described in Section 4.1. The same set of data obtained in Section 4 was also used for the simulations in this section. The simulations started at an SOC of 95% and ended at 20%, and were conducted at various decreasing cell capacities. Faults were injected at random time points. The FDI scheme was validated at all tested capacities to ensure faults can be diagnosed while the cell undergoes degradation.

Voltage Sensor Fault Detection
Multiple voltage sensor faults were injected at different cell capacities in simulation. One specific case is shown as an example. At a cell capacity of 16.47 Ah, a bias fault of +0.5 V was added to the voltage sensor at 30,000 s. The diagnostic results are plotted in Figure 9. Figure 9a,c,e show the deviation between the filtered and unfiltered data. As can be seen, the error increases significantly at the fault injection time. Figure 9b,d,f show the corresponding CUSUM values for the errors. The CUSUM values for both R1 and C1 exceed the threshold at 30,003 s, which is 3 s after the voltage sensor fault occurs. The CUSUM value for R0 takes longer to respond to the fault, which is expected for voltage sensor faults, and also helps to achieve correct fault isolation. The detected voltage sensor fault signal is plotted in Figure 9g. Table 3 presents the detection times of voltage sensor faults of different sizes at different cell capacities, for an injection time of 30,000 s.

Current Sensor Fault Detection
Similar to the simulation done for voltage sensor fault diagnosis validation, current sensor faults of various sizes were injected at different available cell capacities. The case shown as an example is at a cell capacity of 16.47 Ah, where a gain fault of +10% was injected at 30,000 s. The diagnostic results are plotted in Figure 10. The errors were also found to increase at the time of fault injection, as seen in Figure 10a,c,e. Figure 10b,d,f show that the CUSUM values all exceed their respective thresholds after the fault occurs. The CUSUM for the error of R0 is the fastest to exceed the threshold, at 30,165 s, while the CUSUM values for R1 and C1 exceed their thresholds afterward. This indicates a current sensor fault, according to the proposed FDI scheme. Figure 10g shows the detected and isolated current sensor fault signal. The detection time for current sensor faults suffers from a delay, as the CUSUM values take longer to pass their thresholds. Lowering these thresholds should give faster detection, but risks false detection, which is a common trade-off in practice [3]. Table 4 summarizes the detection times for the current sensor at an injection time of 30,000 s, with different fault sizes and at different cell capacities.
For both voltage and current sensors, more simulations were conducted at different injection times, sizes, and capacities to test the validity and effectiveness of the proposed FDI scheme. The results cannot all be shown individually, so a summary is presented. The injection times were set at 10,000 s, 20,000 s, and 30,000 s. It should be noted that faults were not added at the beginning of the runs, because the proposed FDI scheme cannot detect faults during the converging period of the RLS algorithm, which typically lasts an hour at the start of the battery operation.
The considered faults for the voltage sensor are [±0.1 V; ±0.5 V; ±10%], while the considered faults for the current sensor are [±4 A; ±7 A; ±10%]. Approximately 200 runs were simulated. Table 5 shows the results for maximum, minimum, and mean detection time (DT: time from fault occurrence to correct detection of the fault), false detection rate (FDR: fraction of tests where a fault is detected, but there is no fault), and missed detection rate (MDR: fraction of tests where a fault is not detected, but there is a fault). The isolation time depends on the fault size; the larger the fault, the faster the isolation. It is thus concluded that faults can be detected within a reasonable time using the proposed FDI scheme, with no false detection or missed detection.
Table 5. Summary of the performance evaluation metrics.
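The three metrics defined above can be computed from a list of run outcomes as in the sketch below. The data layout, with one (fault time, alarm time) pair per run, is hypothetical, and an alarm raised before the injected fault is counted as a false detection, which is an assumption rather than a rule stated in the text.

```python
from statistics import mean


def summarize(runs):
    """Compute DT statistics, FDR, and MDR from (fault_time, alarm_time) pairs.

    fault_time is None when no fault was injected; alarm_time is None when
    the scheme raised no alarm. Times are in seconds.
    """
    delays, false_det, missed = [], 0, 0
    for fault_t, alarm_t in runs:
        if alarm_t is not None and (fault_t is None or alarm_t < fault_t):
            false_det += 1                    # alarm without a fault present
        elif fault_t is not None and alarm_t is None:
            missed += 1                       # fault present, no alarm
        elif fault_t is not None:
            delays.append(alarm_t - fault_t)  # correct detection delay
    n = len(runs)
    return {
        "dt_min": min(delays) if delays else None,
        "dt_max": max(delays) if delays else None,
        "dt_mean": mean(delays) if delays else None,
        "fdr": false_det / n,
        "mdr": missed / n,
    }
```

Both rates are taken over all runs, following the "fraction of tests" wording of the definitions.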

Conclusions
This paper presented a model-based sensor FDI scheme for a Li-ion cell used in EVs with cell degradation consideration. The scheme uses the RLS algorithm to estimate the ECM parameters in real time, and the WMA filter coupled with CUSUM control chart to detect faults. Experiments and simulations were conducted on an LFP cell in a controlled environment, to verify that ECM parameters are affected by degradation and faults to different degrees; the latter having a more significant effect. It was also found that certain parameters respond faster to specific types of fault, enabling the isolation of faults. Finally, the UDDS driving cycles were used to validate the performance of the proposed FDI scheme. Various injection times, fault sizes, fault types, and cell capacities were considered. The validation results showed that the proposed scheme could detect and isolate voltage sensor faults and current sensor faults for an LFP cell within a reasonable time, with no false or missed detection.