A Review of Lithium-Ion Battery Fault Diagnostic Algorithms: Current Progress and Future Challenges

Abstract: The usage of lithium-ion (Li-ion) batteries has increased significantly in recent years due to their long lifespan, high energy density, high power density, and environmental benefits. However, various internal and external faults can occur during the battery operation, leading to performance issues and potentially serious consequences, such as thermal runaway, fires, or explosion. Fault diagnosis, hence, is an important function in the battery management system (BMS) and is responsible for detecting faults early and providing control actions to minimize fault effects, to ensure the safe and reliable operation of the battery system. This paper provides a comprehensive review of various fault diagnostic algorithms, including model-based and non-model-based methods. The advantages and disadvantages of the reviewed algorithms, as well as some future challenges for Li-ion battery fault diagnosis, are also discussed in this paper.


Introduction
Lithium-ion (Li-ion) batteries play a significant role in daily applications due to their important advantages over other energy storage technologies, such as high energy and power density, long lifespan, and low self-discharge performance factors under improper temperatures [1]. Li-ion batteries have gained a significant amount of attention in recent years, showing promise as an energy storage source in electric vehicles (EVs) due to the aforementioned advantages [2]. They are also widely used in many electronics and stationary applications. Although Li-ion batteries have had reported accidents causing public concern, the advent of safety features over time has decreased the associated risk factors and improved battery operation [3,4]. Li-ion batteries have gone through many modifications to improve their safety, and system management is required to guarantee the performance and reliability of these batteries [1]. A key component in monitoring the battery health and safety is the battery management system (BMS). Some of its main functions are data acquisition, state of charge (SOC) and state of health (SOH) estimation, cell balancing, charge management, and thermal management. An important function of the BMS is the diagnosis of faults which can come from extreme operating conditions, manufacturing flaws, or battery aging [5].
Many factors affect the Li-ion battery operation, such as collision and shock, vibration, deformation, metallic lithium plating, formation of a solid electrolyte interphase (SEI) layer, formation of lithium dendrite, etc. [1,6]. Some common external battery faults are sensor faults, including temperature, voltage and current sensor faults, as well as cell connection and cooling system faults. There are also internal battery faults that are caused by the above factors and external battery faults. Some common internal battery faults are overcharge, overdischarge, internal and external short circuit, overheating, accelerated degradation, and thermal runaway. These battery faults lead to potentially hazardous consequences, such as an increase in temperature and pressure, which could increase the risk of

Internal Battery Faults
Internal battery faults are difficult to detect since the operation within a Li-ion cell is still not fully understood [11]. Some examples of internal battery faults are overcharge, overdischarge, internal and external short circuit, overheating, accelerated degradation, and thermal runaway. All these faults affect the battery operation, but accelerated degradation and thermal runaway are the most dangerous since they can significantly affect the Li-ion battery application or directly harm the users [12]. Internal faults are often identified from abnormal responses from the battery operation, which include voltage drop, SOC drop, temperature rise, increase in internal resistance, and physical transformation, such as swelling. These responses are discussed further throughout this section.

Overcharge
Overcharge is a fault that can lead to more severe faults, such as accelerated degradation and thermal runaway. It may occur in Li-ion cells due to the capacity variation of cells in the pack, incorrect voltage and current measurement, or inaccurate SOC estimation from the BMS [13]. A normal battery pack can also become overcharged when the charger breaks down. Overcharging of Li-ion batteries leads to electrochemical reactions between battery components and the loss of active materials [14]. Additionally, in sealed batteries, the buildup of gases could cause the battery to burst [15]. Furthermore, the surface temperature of the battery increases significantly before it starts overcharging. This results in a thick SEI layer and also causes an internal short circuit inside the battery. The overcharged cathode suffers from electrolyte decomposition, metal dissolution, and phase transition, which could ultimately lead to thermal runaway and fires [16].

Overdischarge
Overdischarge, similar to overcharge, can be caused by incorrect voltage and current measurements as well as inaccurate SOC estimation [13]. In [17], scanning electron microscopy and X-ray diffraction showed that overdischarge might be caused by copper deposition on the electrodes of the battery. Electrochemical impedance spectroscopy (EIS) studies also indicate that during battery overdischarge, the impedance of the anode is much smaller than that of the cathode, which means that the SEI change on the anode is much larger than the cathode, leading to capacity loss and current collector corrosion [14]. Overdischarge can impact the lifespan and thermal stability of a Li-ion cell, and result in considerable swelling of the cell [16]. The anode potential also increases abnormally during overdischarge, which can cause the Cu in the cell to oxidize to Cu²⁺ ions. The dissolved Cu²⁺ ions may migrate through the separator and, ultimately, cause an internal short circuit [18].

Internal Short Circuit
A short circuit in Li-ion batteries can occur both externally and internally. An internal short circuit occurs when the insulating separator layer between the electrodes fails. This failure of the separator can be attributed to melting due to high temperature, cell deformation, the formation of dendrite, or compressive shock [7,19,20]. All of these can cause penetration through the separator layer, or cause the lithium ions and electrons to be released at the anode and travel across the electrolyte toward the cathode, which triggers contact between the anode and the cathode, leading to internal short-circuiting. When this phenomenon happens, the electrolyte tends to decompose by an exothermic reaction, causing thermal runaway [21]. Thermal runaway is mainly caused by heat build-up from the short circuit. In general, high capacity cells are at a higher risk of thermal runaway from internal short circuits than normal capacity cells [11].
Algorithms 2020, 13, 62

External Short Circuit
An external short circuit generally occurs when the tabs are connected by a low resistance path [11]. Another cause is electrolyte leakage from cell swelling due to gas generation from side reactions during overcharge [22]. It can also occur due to water immersion and collision deformation. An external short circuit specifically transpires when an external heat-conducting element makes contact with the positive and negative terminals simultaneously, causing an electrical connection between the electrodes to occur [23]. A study [22] concluded that due to external short circuit, the Li-ion diffusion in the negative electrode leads to limited current, and the heat generated by the electrolyte decomposition in the positive electrode is responsible for thermal runaway occurrence. An external short circuit also leads to excessive discharge of the energy that is being stored in a cell [11].

Overheating
A Li-ion battery can overheat if an alternator's voltage regulator fails and sends excessive voltage back to the battery [24]. Overheating can also be caused by external and internal short circuits [25]. Overheating of the battery accelerates the degradation of the cathode and leads to SEI growth at the anode, resulting in a significant capacity loss. Overheating can also cause the materials inside a Li-ion battery to break down and produce bubbles of gas, and, in most cases, the pressure build-up causes the battery to swell and possibly explode [26]. Another consequence of overheating is thermal runaway, which occurs because, at a critical temperature, a runaway reaction takes place as the heat cannot escape as rapidly as it is formed [15].

Accelerated Degradation
Cell degradation is a common characteristic in most batteries and occurs due to a variety of reasons, such as aging and self-discharging mechanisms. However, accelerated degradation is abnormal and can cause severe problems in Li-ion battery applications. The degradation process is accelerated during storage at elevated temperatures [27]. External degradation is also accelerated due to factors, such as impedance increase, higher frequency of cycle, change in SOC, and voltage rates [28,29]. Some mechanisms of accelerated degradation are corrosion of current collectors, changes in the electrode material, and reactions between electrodes and electrolyte. Accelerated degradation can cause the battery to have a shorter lifespan, which can be a major issue in applications such as EVs. It can also cause surface layer formation and contact deterioration, which results in electrode disintegration, material disintegration, and loss of lithium. These phenomena can lead to transport barriers, which result in penetration of the separator and cause an internal short circuit and, ultimately, a thermal runaway [30].

Thermal Runaway
All the above internal battery faults can cause thermal runaway. It can also occur during the charging of the battery under extreme charging currents and high temperatures. When the temperature reaches the melting point of the metallic lithium, it can cause a violent reaction [31]. Restricted air circulation is another cause of thermal runaway [32]. A study by Galushkin et al. [33] concluded that the probability of thermal runaway increases with the number of charge/discharge cycles. The study also found that thermal runaway is related to a variety of exothermic reactions in batteries. The first exothermic reaction to occur is SEI decomposition, and it considerably increases the heat release at the beginning of thermal runaway. During thermal runaway, the cathode releases oxygen by a phase transition, and the oxygen is consumed by the lithiated anode. A consequence of thermal runaway is a substantial increase in pressure and temperature of the Li-ion cell, which can lead to the destruction of the container and the release of a large amount of flammable and toxic gas [33]. Often in the case of a thermal runaway occurrence, the battery heats up and explodes. Hence, it is the most severe fault that can materialize in a Li-ion battery system.

External Battery Faults
External battery faults can have a significant effect on the other functions of the BMS and cause internal battery faults to occur. There are several types of external faults, which are temperature, voltage and current sensor faults, cell connection fault, and cooling system fault. The cooling system fault can be considered the most severe fault because it leads to a direct thermal failure, specifically thermal runaway, as the system fails to provide adequate cooling [34].

Sensor Fault
It is crucial to have a reliable sensor fault diagnostic scheme to ensure battery safety and performance. It also helps to prevent internal faults, such as overcharge, overdischarge, overheating, external and internal short circuit, and, most importantly, thermal runaway. Sensor faults include failure of temperature, voltage, and current sensors. Sensor faults are caused by vibration, collision, electrolyte leakage, and other physical factors [35]. They can also be attributed to loose battery terminals or corrosion around the battery sensor. A sensor fault can accelerate the degradation process of a battery, hinder the BMS functions due to incorrect state estimation, and cause other internal battery faults [10].
The temperature sensor is a key component in the Li-ion battery system, as it helps provide critical temperature data for the BMS to manage the battery operation effectively. A temperature sensor fault can cause it to send incorrect measurements to the BMS, which can cause further problems due to ineffective thermal management [36]. Inaccuracy in the BMS thermal management function can result in a significant decrease in battery life. Temperature sensor fault can also lead to short-circuiting, overheating, aging under high temperatures, capacity fade due to high temperature, and ultimately thermal runaway [33].
The voltage sensor is used to monitor the voltage of cells in the battery pack. A voltage sensor fault can cause the cell or the entire pack to exceed the upper and lower voltage limits that are specified by the manufacturers, which can result in overcharge and overdischarge [13]. A fault in the voltage sensor also leads to inaccurate SOC and SOH estimation, which can result in an internal fault as the battery suffers from overcharge and overdischarge [37].
The current sensor monitors the current that enters and exits the battery and sends the data to the BMS. It is important to detect a faulty current sensor as it can lead to further problems. The current can bypass the sensor, and the readings will be inaccurate [37]. A current sensor fault also leads to inaccurate estimation of SOC and other parameters, which impacts the control actions in the BMS, and causes the cell to overcharge, overdischarge, or overheat.

Cooling System Fault
The cooling system helps manage the thermal aspect of the battery. It transfers the heat away from the battery pack and ensures that the battery remains in the optimal temperature range. Cooling system faults occur when the cooling motor or fan fails to operate due to outdated fan wiring, faulty temperature sensor, or a broken fuse [2,38]. The temperature sensor and cooling system fault cannot be separated from each other as they both depend on a temperature range [34]. A cooling system fault is one of the more severe faults as it leads to a direct failure of the battery due to overheating, and ultimately thermal runaway. Therefore, it is important to diagnose it as early as possible.

Cell Connection Fault
Battery or cell connection fault is caused by the poor electrical connection between the cell terminals, as the terminals may become loose from vibration or corroded by impurities over time [39]. When this fault occurs, the cell resistance can increase drastically, leading to cell imbalance due to uneven current, or overheating of the faulty cell [40]. This type of fault is simple to detect with voltage and temperature sensors, but if left unresolved, it could lead to more severe consequences, such as external short circuit or thermal runaway.

The Role of BMS in Fault Diagnosis
One of the main functions of the BMS is to minimize the risks associated with the operation of a lithium-ion battery pack to protect both the battery and the users. Hazardous conditions are mostly caused by faults, and the safety functions of the BMS should minimize the likelihood of occurrence and the severity of these faults. Sensors, contactors, and insulation are common features added to the battery system to ensure its safety [13]. There are also operational limits for voltage, current, and temperature, which are monitored with sensors connected to the cells [41]. However, as the hardware and software implementation of the BMS becomes increasingly complex, battery faults can be more complicated, and these safety measures are often not adequate [42,43]. Fault diagnostic algorithms are, hence, a requirement for BMS. These algorithms serve the purpose of detecting faults early and providing appropriate and immediate control actions for the battery and the users [8]. Figure 2 illustrates the mechanism of fault diagnosis in the BMS.
In the battery system, the BMS plays a significant role in fault diagnosis because it houses all diagnostic subsystems and algorithms. It monitors the battery system through sensors and state estimation, with the use of modeling or data analysis to detect any abnormalities during the battery system operation [13]. Since there are many internal and external faults, it is difficult to carry out this task efficiently. Various fault diagnostic methods need to work in tandem to correctly detect and isolate a specific fault, to administer the correct control action. However, the fault diagnostic algorithms in the BMS have limited computing space and time. Because of the large number of cells in the battery system in some applications, these fault diagnostic algorithms need to have low computational effort, while maintaining accuracy and reliability [44]. In recent years, there has been extensive effort in the research and development of efficient fault diagnostic approaches for Li-ion battery, which will be discussed in the following section.

Fault Diagnostic Algorithms for the Li-Ion Battery System
Fault diagnosis, as discussed, is an important function in the BMS. Fault diagnosis includes fault detection, isolation, and estimation. There are many fault diagnostic methods in various industries. For Li-ion battery applications, faults can be internal and interconnected; thus, many common methods in other fields are not suitable. Fault diagnostic methods for Li-ion batteries fall into two categories: model-based and non-model-based [9]. Classification of the fault diagnostic algorithms to be discussed in this paper can be found in Figure 3. This section presents recent developments for internal and external Li-ion battery fault diagnosis.

Model-Based Methods
The main principle of model-based fault diagnosis is the use of battery models to generate residuals which are monitored and analyzed to detect faults. There are several types of battery models, including electrochemical, electrical, thermal, and combinations of interdisciplinary models (electro-thermal, etc.) [45]. Each model can be used to assist fault diagnosis, depending on the requirements of the Li-ion battery application. Model-based methods are often used in fault diagnosis for their simplicity and cost-efficiency. Model-based methods include state estimation, parameter estimation, parity equation, and structural analysis. A simplified schematic for state estimation fault diagnostic algorithms is shown in Figure 4.
In some earlier works, Alavi et al. [46] proposed a two-step state-estimation-based algorithm for detecting Li plating, which can lead to degradation and failure of the battery. The algorithm uses an electrochemical model: the first step estimates the insertion and extraction rates of Li ions using particle filtering, and the second step compares the estimated data with the boundary condition to generate alarms in a faulty state. This method is effective in detecting Li plating, but the electrochemical model is complicated and not fully understood; hence, it is unreliable and impractical in the BMS. Authors in [47] used impedance spectroscopy to identify the parameters in the equivalent circuit model (ECM) to construct fault models for overcharge and overdischarge conditions. Kalman filters and the multiple-model adaptive estimation (MMAE) technique were proposed to generate residuals from the voltage, and the residuals were evaluated in a probability-based approach to detect overcharge and overdischarge faults. The same authors then proposed the use of extended Kalman filters with MMAE to detect overcharge and overdischarge faults [48]. This method eliminates noise effectively, but the use of multiple models makes it too complex and computationally inefficient. Chen et al. [49] used a bank of Luenberger observers with the ECM to generate residuals from voltage to locate and isolate faults in a series of three Li-ion cells and a bank of learning observers to estimate the isolated fault. This method was able to detect faults caused by changed internal resistance in a cell, but it is not practical for applications with a large number of cells, such as EVs.
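As an illustration of residual generation, the following minimal Python sketch runs a single Luenberger observer on a first-order (1-RC) ECM and raises an alarm when the voltage residual exceeds a fixed threshold. It is not the scheme of [49]. The parameter values, the observer gain, the threshold, the injected step in ohmic resistance, and the function names simulate_residuals and first_alarm are all assumptions chosen for illustration.

```python
import math


def simulate_residuals(n_steps=200, fault_step=120, delta_r0=0.05):
    """Return the voltage residual of a Luenberger observer at each step."""
    # Hypothetical 1-RC ECM parameters (illustrative values only).
    ocv, r0, r1, c1, dt = 3.7, 0.01, 0.015, 2000.0, 1.0
    a = math.exp(-dt / (r1 * c1))   # discrete RC pole
    b = r1 * (1.0 - a)              # discrete input gain
    # With V = OCV - R0*I - v1 the output sensitivity to v1 is -1,
    # so a negative gain places the estimation-error pole at a + gain.
    gain = -0.05
    current = 2.0                   # constant discharge current, A
    v1 = 0.0                        # true RC voltage
    v1_hat = 0.0                    # observer estimate of the RC voltage
    residuals = []
    for k in range(n_steps):
        # Assumed fault: the ohmic resistance steps up at fault_step.
        r0_true = r0 + (delta_r0 if k >= fault_step else 0.0)
        v_meas = ocv - r0_true * current - v1   # simulated measurement
        v_hat = ocv - r0 * current - v1_hat     # model prediction
        residual = v_meas - v_hat
        residuals.append(residual)
        v1 = a * v1 + b * current
        v1_hat = a * v1_hat + b * current + gain * residual
    return residuals


def first_alarm(residuals, threshold=0.02):
    """Index of the first residual whose magnitude exceeds the threshold."""
    for k, r in enumerate(residuals):
        if abs(r) > threshold:
            return k
    return None
```

In this sketch the residual is largest at fault onset and then shrinks as the observer partly compensates for the model mismatch, which suggests why the choice of gain and threshold matters for residual evaluation.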
In more recent works, the internal short circuit fault has been the main focus. Ouyang et al. [50] proposed an internal short circuit detection method using recursive least squares (RLS) to estimate the parameters of the mean-difference model, which is derived from the ECM. The parameters of all the cells in the pack are statistically analyzed and compared to a set threshold to detect internal short circuit fault. Authors in [51] used an electrochemical-thermal coupled model to confirm that an internal short circuit causes abnormal responses in voltage, temperature, and SOC. The ECM and the energy balance equation (EBE) model were used in the fault diagnostic algorithm instead of the electrochemical-thermal coupled model to lower the computational cost. RLS was utilized to estimate the parameters of these two models, and fault detection was achieved by observing changes in these parameters. The two diagnostic methods above are further developed in [52], where extended Kalman filters and RLS were implemented for state and parameter estimation, specifically SOC, voltage, temperature, and internal resistance. The state estimation and parameter estimation are performed on all the cells, with output data belonging to the mean and worst cells. Different scenarios of deviation from the estimated data indicate different levels of fault, including extra capacity depletion, abnormal heat generation, and internal short circuit. In [53], a model-based switching model method was proposed to improve the accuracy of the internal resistance estimates during an internal short circuit fault, and RLS was also used for parameter estimation and fault detection. Gao et al. [54] proposed a fault detection method for the micro-short circuit, using the cell difference model and extended Kalman filters to estimate the cell SOC differences. The extra depleting current is identified with RLS to diagnose the short circuit resistance. 
All the above methods for internal short circuit detection use RLS for parameter estimation to keep the computational cost low, but they all require information from other cells in the pack, which can be influenced by online cell balancing, leading to unreliable fault detection.
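To make the role of RLS concrete, the sketch below estimates the open-circuit voltage and ohmic resistance of a zero-order ECM, V = OCV - R0·I, using RLS with a forgetting factor. This is a simplified stand-in, not the mean-difference model of [50] or the models of [51-54]. The synthetic current profile, the true parameter values, the forgetting factor, and the function names are assumptions.

```python
import math


def rls_update(theta, p, phi, y, lam=0.98):
    """One recursive least squares step for a two-parameter linear model."""
    # Gain vector K = P*phi / (lam + phi^T * P * phi)
    p_phi = [p[0][0] * phi[0] + p[0][1] * phi[1],
             p[1][0] * phi[0] + p[1][1] * phi[1]]
    denom = lam + phi[0] * p_phi[0] + phi[1] * p_phi[1]
    k_gain = [p_phi[0] / denom, p_phi[1] / denom]
    # Prediction error and parameter update
    err = y - (phi[0] * theta[0] + phi[1] * theta[1])
    theta = [theta[0] + k_gain[0] * err, theta[1] + k_gain[1] * err]
    # Covariance update P = (P - K * phi^T * P) / lam
    phi_p = [phi[0] * p[0][0] + phi[1] * p[1][0],
             phi[0] * p[0][1] + phi[1] * p[1][1]]
    p = [[(p[i][j] - k_gain[i] * phi_p[j]) / lam for j in range(2)]
         for i in range(2)]
    return theta, p


def estimate_ecm_parameters(n_steps=200, ocv=3.7, r0=0.02):
    """Estimate [OCV, R0] from noise-free synthetic data (assumed values)."""
    theta = [0.0, 0.0]
    p = [[1000.0, 0.0], [0.0, 1000.0]]   # large initial covariance
    for k in range(n_steps):
        current = 1.0 + math.sin(0.3 * k)   # varying current for excitation
        voltage = ocv - r0 * current        # simulated terminal voltage
        phi = [1.0, -current]               # regressor so that y = phi . theta
        theta, p = rls_update(theta, p, phi, voltage)
    return theta
```

In the reviewed methods, the estimated parameters are subsequently compared across cells or against a threshold; that evaluation step is omitted here.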
A string of studies on thermal fault detection using the battery thermal model and the ECM was introduced by the same group of authors in [55–57]. In [55], the Li-ion battery was modeled via ECM and a two-state thermal model. A diagnostic scheme based on the Luenberger observer was proposed to detect and isolate three thermal faults, which were internal thermal resistance fault, convective cooling resistance fault, and thermal runaway, by generating and filtering two unique residuals from the models. This work is further developed in [56], where nonlinear observers using Lyapunov analysis were implemented, and an adaptive threshold generator was developed to deal with the presence of modeling uncertainties. Most recently, the authors proposed a partial differential equation model-based scheme with two state observers to improve the accuracy and reliability of thermal fault detection [57]. The method proposed in these works becomes progressively more precise and reliable, but the computational effort also increases as the method develops.

Non-Model-Based Methods
Non-model-based methods include signal processing and knowledge-based methods. These methods primarily rely on battery data collection, although still using battery modeling to an extent. They can improve fault diagnostic accuracy but might require a large amount of fault data, which is often not available, or have very high computational cost, which is impractical for usage in the BMS.
Several recent works implement signal processing for Li-ion battery fault diagnosis. Kong et al. [58] obtained the remaining charging capacity, which can increase due to extra charge depletion caused by micro-short circuiting, by transforming the cell charging voltage curve. The micro-short circuit fault was then identified from the remaining charging capacities of adjacent charges, which were used to obtain the leakage current of the shorted cell. The correlation coefficient was used in [59] to detect the initial stage of short circuits by capturing the off-trend voltage drop and reflecting the drop variation in the correlation coefficient. Another work [60] proposed the use of an interclass correlation coefficient to analyze battery short circuit faults, also by capturing the off-trend voltage drop. Both methods using the correlation coefficient employ moving windows to prolong the fault memory and do not need extra hardware design or model adjustment. However, they can be subject to measurement noise. Entropy-based fault diagnosis has also been studied recently. The authors in [61] implemented the Shannon entropy and the Z-score method to detect any abnormality in the battery temperature, as well as to predict the time and location of the fault, in order to prevent thermal runaway. Liu et al. [62,63] proposed the use of a modified Shannon entropy with the Z-score method to capture abnormality in cell voltage and predict the time and location of the voltage fault occurrence. The entropy-based methods are effective in detecting battery faults, but the computational cost increases with the desired diagnostic precision.
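The moving-window correlation idea can be sketched with a plain Pearson coefficient between two cell voltage series. This is an illustrative simplification, not the exact formulation of [59] or [60]; the window length, the synthetic data, and the handling of flat windows are assumptions.

```python
from math import sin, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0   # assumption: a flat window carries no evidence of divergence
    return sxy / sqrt(sxx * syy)


def windowed_correlation(v_a, v_b, window=20):
    """Correlation of two voltage series over a sliding window of fixed length."""
    return [pearson(v_a[k - window:k], v_b[k - window:k])
            for k in range(window, len(v_a) + 1)]


# Illustrative data: both cells follow the same load ripple; cell B develops
# a steady off-trend voltage drop from sample 60 onward.
v_a = [3.70 + 0.01 * sin(0.3 * k) for k in range(140)]
v_b = [v + 0.005 - (0.002 * (k - 60) if k >= 60 else 0.0)
       for k, v in enumerate(v_a)]
corr = windowed_correlation(v_a, v_b, window=20)
```

A constant offset between the cells leaves the coefficient near one, whereas the off-trend drop pulls it down, which is why cell-to-cell inconsistency alone does not trigger the detector. A longer window retains the fault signature for longer but delays detection.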
Knowledge-based methods are another type of non-model-based fault diagnostic method. In some earlier works, Xiong et al. [64] implemented an expert system, including a rule-based method and a probabilistic method, to detect an overdischarge fault, where the rules for failure are an unusual increase in temperature and an unusual decrease in voltage. This method is easy to implement, but the thresholds for the rules are very difficult to obtain for different applications and cell chemistries. In [65], the authors characterized external short circuit faults through three criteria, namely voltage variation, current variation, and temperature change rate, and constructed thresholds for these criteria to detect external short circuit faults. This method is simple to implement, but it is unreliable because Li-ion cells vary in characteristics, and more experiments would need to be conducted to validate the set thresholds. In [66], the authors used fuzzy logic to analyze temperature, SOC, and voltage residuals from an electrochemical model to detect overcharge, overdischarge, and accelerated degradation. Later works explore the application of machine learning in Li-ion battery fault diagnosis. A random forest classifier was used in [67] to detect the electrolyte leakage behavior that occurs during an external short circuit fault. The two root-mean-square-error indicators for an external short circuit fault in this study are a higher maximum temperature rise and a lower discharge capacity. The leakage condition is determined by training the classifier on a large amount of data. The normal and faulty cells can then be classified effectively, and the external short circuit fault can be diagnosed accurately. Zhao et al. [68] established a battery fault diagnostic model through the combination of the 3σ multi-level screening strategy and a machine learning algorithm. The 3σ multi-level screening strategy was utilized to build the criteria for normal operating cell voltage, and a neural network was applied to simulate the cell fault distribution in a battery pack. This method requires an extended period of battery data collection to detect battery faults reliably. Djeziri et al. [69] proposed the use of a Wiener process to model and monitor the drift of voltage to obtain the voltage identification (VID), which determines the voltage needed for a processor to run continuously and at 100%. This data-driven method can be used to predict the degradation trajectory of a system and adjust the parameters accordingly. Although it has not been tested with a Li-ion battery system, its ability to adapt to the degradation process shows its potential to be effective in Li-ion battery fault diagnosis.
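The 3σ criterion itself is simple to illustrate. The sketch below shows a single screening level applied to one snapshot of cell voltages; the multi-level strategy and the neural network of [68] are not reproduced, and the function name `three_sigma_outliers` is hypothetical.

```python
from statistics import mean, pstdev


def three_sigma_outliers(cell_voltages, n_sigma=3.0):
    """Return indices of cells whose voltage deviates more than n_sigma
    population standard deviations from the pack mean."""
    mu = mean(cell_voltages)
    sigma = pstdev(cell_voltages)
    if sigma == 0:
        return []                          # identical voltages: nothing to flag
    return [i for i, v in enumerate(cell_voltages)
            if abs(v - mu) > n_sigma * sigma]
```

Because the mean and standard deviation are computed from the same snapshot, a single abnormal cell inflates the spread it is judged against. With ten or fewer cells, one outlier can never exceed 3σ of the population standard deviation, so this form of screening is only meaningful for larger packs or for statistics gathered from historical data.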

Model-Based Methods
For external battery fault diagnosis, model-based methods are often implemented to detect sensor faults and cooling system faults. In some earlier works, Marcicki et al. [70] proposed the use of nonlinear parity equations to generate residuals for voltage, current, and fan setting from the ECM, thermal, and SOC models. The thresholds for the residuals were selected using the probability density function. Current and voltage sensor faults, as well as the cooling fan fault, were then detected and isolated. However, this method can only detect faults of large magnitude due to errors from the observer. In [71], sensor fault detection was achieved using Kirchhoff's circuit equations to generate current and voltage residuals and the temperature diffusion model to generate temperature residuals. This method has a low computational cost but is sensitive to measurement noise. In more recent works, Xu et al. [72] used a proportional-integral-observer-based method to accurately detect and estimate the current sensor fault, as a means to improve the state of energy estimation. In [73], a model-based diagnostic scheme using sliding mode observers designed based on the electrical and thermal dynamics of the battery was developed. A set of fault detection filter expressions was derived on the sliding surfaces of each observer. The outputs of these filters were used as residual signals to detect, isolate, and estimate voltage, current, and temperature sensor faults, under the assumption that the faults and their time derivatives are bounded and finite. Tran et al. [35] proposed a parameter estimation method using RLS to detect and isolate voltage and current sensor faults. The ECM was used, and the estimated parameters were put through a weighted moving average filter and a statistical cumulative sum (CUSUM) test to evaluate the generated residuals for any abnormalities. This method was also able to accurately detect sensor faults when the battery underwent degradation.
The authors in [36] presented a fault diagnostic method for current and voltage sensor faults using state estimation, specifically SOC estimation. The residual was generated from the true SOC, calculated by the coulomb counting method, and the estimated SOC, obtained by the recursive least squares and unscented Kalman filter joint estimation method. Due to sensor noises in real applications, there are some uncertainties associated with the threshold selection for the residuals in this method, which require further research.
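The SOC residual described here can be sketched as follows. The coulomb counting part is standard; the model-based estimate (RLS and unscented Kalman filter in [36]) is not implemented and is instead passed in as a precomputed series, which is an assumption of this example. The threshold value and function names are illustrative.

```python
def coulomb_counting_soc(currents_a, soc0, capacity_ah, dt_s=1.0):
    """SOC trajectory from integrating current (positive current = discharge)."""
    soc = soc0
    trajectory = []
    for i_k in currents_a:
        soc -= i_k * dt_s / (capacity_ah * 3600.0)   # Ah removed / capacity
        trajectory.append(soc)
    return trajectory


def soc_residual_alarm(soc_counted, soc_estimated, threshold=0.02):
    """Return the first index where |counted - estimated| exceeds the
    threshold, or None if the residual stays within bounds."""
    for k, (a, b) in enumerate(zip(soc_counted, soc_estimated)):
        if abs(a - b) > threshold:
            return k
    return None
```

A current sensor bias accumulates in the coulomb-counted SOC but not in an estimate that is corrected by the voltage measurement, so the residual grows steadily after the fault. The detection delay therefore depends directly on the chosen threshold, which is the source of the threshold-selection uncertainty noted above.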
A group of authors produced a series of studies on sensor fault diagnosis using structural analysis [2,34,74] and an extended Kalman filter [75][76][77]. In [34] and [74], structural analysis was applied to obtain the structurally overdetermined part of the system model by analyzing the structural model represented by an incidence matrix. Sequential residual generation was then used as a diagnostic test to detect different faults, including voltage and current sensor faults and cooling system faults, by calculating the unknown variables in a sequence. The specific fault was isolated with many different minimal structurally overdetermined sets. This method was improved in [2] by applying an extended Kalman filter to address the issue of inaccurate initial SOC. In addition, a statistical CUSUM test was implemented to evaluate the generated residuals from the diagnostic tests to determine the presence of a fault. This addition, however, made the proposed method significantly more computationally complex. Another model-based fault diagnostic method was developed by the same authors using an extended Kalman filter in [75]. The extended Kalman filter was used to estimate the output voltage from the ECM to generate residuals between measured and estimated values. The residuals were then evaluated by a statistical CUSUM test to detect current and voltage sensor faults. This work was further developed in [76] and [77] by implementing an adaptive extended Kalman filter to update the process and measurement noise matrices appropriately, which helps eliminate the noises.
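The residual evaluation step can be illustrated with a standard two-sided CUSUM test. This is a generic textbook form, not necessarily the exact variant used in [2] or [75]; the `drift` and `threshold` parameters are tuning values chosen here for illustration.

```python
def cusum_alarm(residuals, target=0.0, drift=0.5, threshold=5.0):
    """Two-sided CUSUM test on a residual sequence.

    Returns the first index at which either cumulative sum exceeds the
    threshold, or None if no change is detected.
    """
    g_pos = g_neg = 0.0
    for k, r in enumerate(residuals):
        # Accumulate deviations beyond the allowed drift, resetting at zero
        g_pos = max(0.0, g_pos + (r - target) - drift)
        g_neg = max(0.0, g_neg - (r - target) - drift)
        if g_pos > threshold or g_neg > threshold:
            return k
    return None
```

The drift term suppresses small fluctuations that stay within the expected noise band, while the threshold sets the trade-off between detection delay and false alarms. This is the same robustness-versus-sensitivity trade-off that applies to any residual threshold.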

Non-Model-Based Methods
For cell connection fault diagnosis, entropy theory and statistical analysis methods have been proposed. Zheng et al. [78] used demonstration data from 96 cells in series collected over three months to obtain cell resistances from the ECM. By calculating the Shannon entropy of the cell resistances, the contact resistance fault was successfully isolated. In [39], the authors proposed a method of fault detection for the connection of Li-ion cells based on entropy. Cell voltage data was obtained and filtered using the discrete cosine filter method. Ensemble Shannon entropy, local Shannon entropy, and sample entropy were applied to predict the time and location of the connection failure occurrence based on voltage fluctuations. Sample entropy produced the most accurate results but required a large amount of data, while ensemble Shannon entropy was able to predict the connection fault between cells effectively. Sun et al. [79] used the wavelet decomposition with three layers developed by Daubechies to smooth the voltage signal and eliminate noise interference. After the wavelet transformation, the Shannon entropy of charge and discharge cycles was calculated, and the change in entropy was monitored to detect real-time cell connection fault. However, this method can be easily affected by the set value of the interval parameters. Ma et al. [80] proposed the use of a modified Z-score test to analyze the abnormal voltage coefficients to detect cell connection fault. The temperature rise rate was utilized as a secondary parameter to determine the severity of the fault. However, using this method, a fault occurring at a location where there is no cross-voltage test cannot be reliably detected.
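A basic histogram-based Shannon entropy, the quantity underlying these methods, can be sketched as below. The binning range, the number of bins, and the function name are assumptions; the ensemble, local, and sample entropy variants of [39] and the wavelet preprocessing of [79] are not reproduced.

```python
from math import log2


def shannon_entropy(samples, lo, hi, n_bins=10):
    """Shannon entropy (in bits) of samples binned uniformly over [lo, hi].

    Values outside the range are clamped into the first or last bin.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        idx = int((s - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    n = len(samples)
    # Empty bins contribute nothing to the sum
    return -sum((c / n) * log2(c / n) for c in counts if c)
```

A cell with a stable voltage concentrates its samples in one bin and yields an entropy near zero, whereas the fluctuating voltage of a loose connection spreads over several bins and raises the entropy. As noted above for the interval parameters, the result depends on the bin settings, so they need to be chosen with care.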
Since battery packs are often made up of many cells in series, topology-based fault diagnostic methods have been proposed by a few studies. Xia et al. [81] proposed a fault-tolerant voltage measurement method, where the voltage sum of multiple cells was measured instead of the voltage of individual cells. A matrix interpretation of the sensor topology was developed to isolate sensor and cell faults by locating abnormal signals. Since this method is sensitive to measurement noises, the authors further developed an improved measurement topology in [82], where the noise limit and trend of the interleaved voltage measurement method were derived to improve the noise sensitivity. In [83], a multi-fault diagnostic strategy based on an interleaved voltage measurement topology and an improved correlation coefficient method was presented, which can diagnose several types of faults, such as internal and external short circuits, sensor faults and connection faults. The voltage measurement method correlated each battery and contact resistance with two different sensors to accurately identify the location and type of faults. The improved correlation coefficient method was used to monitor fault signatures to eliminate the effect of battery inconsistencies and measurement errors. These topology-based methods can isolate various faults accurately, but they are only suitable for battery packs that have multiple cells in series with interleaved voltage measurements.
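The isolation logic behind an interleaved measurement topology can be sketched under a simplified assumption: sensor i reads the sum of cells i and i+1, with the last sensor wrapping around to cell 0. The actual topologies in [81], [82], and [83] may differ, and the nominal voltage and tolerance used here are hypothetical.

```python
def build_readings(cell_v):
    """Sensor i measures cell i plus cell (i + 1), wrapping at the end."""
    n = len(cell_v)
    return [cell_v[i] + cell_v[(i + 1) % n] for i in range(n)]


def isolate_fault(readings, nominal_cell_v, tol=0.05):
    """Classify the abnormality pattern (intended for three or more cells).

    One abnormal sensor  -> sensor fault at that index.
    Two adjacent sensors -> fault in the cell they share.
    """
    n = len(readings)
    abnormal = [i for i, r in enumerate(readings)
                if abs(r - 2 * nominal_cell_v) > tol]
    if not abnormal:
        return ("none", None)
    if len(abnormal) == 1:
        return ("sensor", abnormal[0])
    if len(abnormal) == 2:
        a, b = abnormal
        if b - a == 1:
            return ("cell", b)             # cell b is read by sensors b-1 and b
        if a == 0 and b == n - 1:
            return ("cell", 0)             # wrap-around pair shares cell 0
    return ("undetermined", None)
```

Because every cell appears in two readings, a cell fault and a sensor fault produce different abnormality patterns, which is what allows them to be told apart without extra hardware per cell.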

Current Progress and Future Challenges of Li-Ion Battery Fault Diagnosis
In summary, the fault diagnostic algorithms that were discussed have made certain progress on improving Li-ion battery safety, but they still have some limitations in real-life applications. A summary of all the reviewed algorithms is shown in Table 1. Model-based methods can quickly detect and isolate a fault in real-time but require high modeling accuracy. Therefore, further research needs to be conducted to reach a better understanding of the internal battery operation to develop a precise, but not overly complicated, battery model. In addition, the trade-off between robustness and sensitivity of the diagnostic approach, which depends on the fault thresholds, should also be considered. Non-model-based methods can avoid the difficult requirements for battery modeling but still have some drawbacks. Signal processing methods have good dynamic performance but are sensitive to measurement noises and are not able to detect early faults reliably. Simple knowledge-based methods, such as expert systems, require effective rules to precisely detect faults, which is challenging as some battery faults are still not fully understood. Complex knowledge-based methods, such as machine learning, have high accuracy and compatibility with a nonlinear system, such as the Li-ion battery, but the training process is time-consuming and requires a large amount of data.
Overall, even though model-based methods are more common, many difficulties still exist in improving battery model accuracy, especially throughout the entire lifespan of the battery. Non-model-based methods, particularly data-driven methods, can have a crucial role in predicting battery behavior as it degrades and aiding the model development process. Therefore, the most effective approach for Li-ion battery fault diagnosis should be a combination of both model-based and non-model-based methods.
There are several challenges that all of these fault diagnostic methods face. First, because some faults have similar effects on the battery, it is difficult to isolate each fault accurately to provide appropriate responses. Current methods often assume that the other components in the system are operating normally to avoid the need to isolate and identify the detected faults. Future research should focus on the development of fault identification methods after successful fault detection. Another challenging task is the determination of effective fault thresholds for early and accurate detection, due to the lack of understanding of fault behavior. Furthermore, a better understanding of battery fault behavior is needed to develop fault simulation tools, because producing a real physical fault can be impractical, costly, and unsafe. Therefore, more studies on the behavior of faults need to be conducted through well-designed experiments to collect data for modeling and simulation purposes. Finally, the BMS computational capability needs to be enhanced to accommodate more complex algorithms that can improve fault diagnostic accuracy significantly. This can be achieved by future hardware development in the BMS, as well as the utilization of cloud-based technologies for battery condition monitoring.

Table 1. Summary of Lithium-ion (Li-ion) fault diagnostic algorithms.

Algorithm Types | Definitions | Algorithms | References
State estimation | The system state is estimated from a model using filters or observers. A fault is detected from the residuals between estimated and measured values. | |
 | | Nonlinear parity equations | [70,71]
Structural analysis | The structural overdetermined part of the system model is analyzed to detect and isolate a fault. | Structural analysis | [2,34,74]
Signal processing | Measured signals are transformed into fault parameters, such as entropy or correlation coefficient. A fault is detected from abnormalities in these fault parameters. | |

Conclusions
The safety of the Li-ion battery system has attracted a considerable amount of attention from researchers. Battery faults, including internal and external faults, can hinder the operation of the battery and lead to many potentially hazardous consequences, including fires or explosion. One main function of the BMS is fault diagnosis, which is responsible for detecting faults early and providing control actions to minimize fault effects. Therefore, Li-ion battery fault diagnostic methods have been extensively developed in recent years. This paper provides a comprehensive review of existing fault diagnostic methods for the Li-ion battery system.
Fault diagnostic approaches are categorized into model-based and non-model-based methods. Model-based methods often have low computational cost and fast detection time but are reliant on the accuracy of battery modeling. A simple and precise battery model has not been fully developed. Non-model-based methods are less reliant on battery modeling. However, they require a time-consuming training process that needs a large amount of data. There has not been an effective and practical solution to detect and isolate all potential faults in the Li-ion battery system. There are several challenges in Li-ion battery fault diagnosis, including assumption-free fault isolation, fault threshold selection, fault simulation tools development, and BMS hardware limitations. The summary of the algorithms provided in this paper serves as a basis for researchers to develop more effective fault diagnostic methods for Li-ion battery systems in the future to improve battery safety.