Abstract
Effective thermal runaway (TR) detection is critical for the safety of lithium-ion battery packs, particularly in electric vehicles. However, deploying laboratory-validated methods into resource-constrained battery management systems (BMSs) presents significant engineering challenges. This review surveys the state of the art in on-board TR monitoring, with an emphasis on the practical constraints of automotive applications. We first examine available precursor signals, including thermal, electrical, gas, and acoustic signals, and evaluate their trade-offs regarding response speed and integration complexity. Second, diagnostic algorithms, from threshold-based logic to deep learning, are assessed against key performance metrics such as computational latency, false alarm rates, and lead time. Third, the review covers essential deployment considerations, including model compression techniques, inference hardware architectures, and compliance with functional safety standards. It also examines the implementation challenges of multi-modal data fusion, with a particular focus on the constraints imposed by limited hardware resources and long-term sensor reliability. Finally, future directions regarding data standardization and cloud-edge collaboration are outlined.
1. Introduction
Lithium-ion batteries (LIBs) have become the backbone of modern electrification due to their high energy density and long cycle life. As LIB deployment scales into multi-megawatt-hour battery energy storage systems (BESS) and high-volume electric vehicles (EVs), the operating envelope and system complexity continue to expand [1,2]. LIBs have a limited thermal stability margin under abusive or off-nominal conditions, and a fault can initiate self-accelerating heat generation that escalates into thermal runaway (TR). Once triggered, thermal runaway may propagate across cells and modules within a short time window and can result in major safety and economic consequences, which makes pack-level safety a primary engineering constraint rather than a secondary consideration.
Recent high-profile TR incidents in EVs and grid-scale BESS installations, characterized by toxic gas release, rapid fire propagation, and significant property damage, have intensified regulatory attention. Government safety agencies continue to report dozens of TR-related events annually, highlighting the insufficient early-warning capability of existing battery management systems (BMSs) [3]. Early warning with a lead time of several to tens of seconds is essential because it enables actionable mitigation such as load shedding, active cooling, or electrical isolation before propagation begins [4,5]. In safety-critical applications, early warning is valuable only when it can trigger reliable protective actions under controlled false-alarm constraints.
However, existing sensing architectures on production BMSs often fail to detect pre-TR abnormalities with sufficient lead time, especially when the fault signatures are weak. Among various failure modes, internal short circuits (ISCs) are particularly hazardous because they generate localized Joule heating and trigger parasitic exothermic reactions that can accelerate toward TR [6,7]. Importantly, ISCs often evolve silently, with minimal change in terminal voltage or surface temperature, making early detection technically challenging [7,8,9].
Numerous laboratory techniques, including gas analysis, infrared imaging, acoustic emission sensing, and high-fidelity physics-based modeling, have demonstrated strong early-detection capability under controlled conditions [10,11,12,13]. These methods typically rely on specialized instrumentation or computational resources that are incompatible with automotive on-board environments. In contrast, on-board BMSs operate under tight constraints in sensor type, sensor placement, power consumption, and computational capacity, making it difficult to directly translate laboratory methods into deployable solutions [14,15]. Meanwhile, modern AI has enabled compact data-driven models to be considered for on-board early warning, yet practical adoption still depends on deterministic execution, model compression, and robust validation under distribution shift.
To bridge the gap between laboratory capability and on-board feasibility, this review focuses on practical pathways for engineering early TR detection systems. Specifically, we provide:
- A structured taxonomy of detectable pre-TR precursors and an evaluation of sensing technologies suitable for pack-level deployment.
- A comparative review of diagnostic algorithms, covering rule-based, model-based, and data-driven approaches, with emphasis on lead time, robustness, and computational cost.
- A system-level assessment of deployment constraints, including model compression, inference hardware, communication architecture, and functional safety requirements.
Unlike prior reviews that primarily examine TR mechanisms [7,8] or summarize detection algorithms in laboratory settings, this work highlights the engineering trade-offs and integration challenges that ultimately determine whether a method can be deployed on-board, with explicit attention to ASIL-driven constraints and embedded MCU execution limits.
The remainder of this paper is organized as follows. Section 2 reviews detectable precursors and sensing technologies. Section 3 summarizes diagnostic algorithms and evaluation metrics. Section 4 discusses deployment considerations, including model compression, hardware, and communication. Section 5 concludes with key challenges and future opportunities. The overall schematic of this on-board detection framework, bridging sensing to actionable diagnostics, is illustrated in Figure 1.
2. On-Board Detectable Precursors and Sensor Integration
Effective mitigation of TR relies on the timely capture of precursor signals within the critical interval between defect initiation and catastrophic failure. In practical vehicle environments, these precursors manifest as anomalies in thermal, electrical, gaseous, or acoustic domains [11]. However, detecting localized overheating, voltage micro-fluctuations, or trace electrolyte decomposition is complicated by the dynamic noise floor of an operating vehicle. In addition to topology- and operation-induced attenuation, voltage-based indicators also exhibit chemistry-specific limitations. For lithium iron phosphate (LiFePO4, LFP) cells, which are widely deployed in cost-sensitive EV platforms, the open-circuit voltage (OCV) remains relatively flat over a broad state-of-charge (SOC) window. This characteristic reduces the informativeness of terminal-voltage deviations for early-stage abnormalities, especially under dynamic loads where voltage fluctuations are dominated by polarization and measurement noise. Consequently, voltage-only diagnosis is often insufficient for LFP packs, and practical on-board solutions increasingly rely on multi-sensor fusion to improve sensitivity while maintaining an acceptable false-alarm rate.
Capturing these precursors reliably requires navigating a fundamental engineering trade-off between sensing performance and deployment feasibility. Candidate technologies must not only demonstrate high sensitivity but also survive the automotive-grade operating envelope: encompassing temperature cycling ( °C to °C), mechanical shock (>50 g), and a service life exceeding 15 years [16]. Furthermore, they must align with strict Size, Weight, Power, and Cost (SWaP-C) budgets while maintaining a high signal-to-noise ratio (SNR) in the presence of electromagnetic interference and road vibration [17].
Based on these considerations, this section reviews state-of-the-art sensor technologies and sampling strategies for early-stage TR detection. The discussion is organized by signal modality, following a logical progression from inherent measurements to specialized, high-fidelity indicators. We begin with temperature sensing (Section 2.1) and electrical anomaly monitoring (Section 2.2), which are integrated in almost all BMSs. Subsequently, we examine gas and pressure monitoring (Section 2.3) and acoustic emission detection (Section 2.4), evaluating these as additional modalities that offer earlier warning capabilities but entail higher integration complexity.
This inherent trade-off is visualized schematically in Figure 2. The diagram illustrates a clear temporal hierarchy: signals that offer the earliest warning typically impose the highest integration penalties, whereas lagging indicators benefit from being inherent to standard BMS architectures [7,18,19,20].
Figure 2.
Schematic Evolution of Multi-Sensor Signals Before and During Lithium-Ion Battery Thermal Runaway. (a) Signal Dynamics: Solid lines represent typical signal trajectories. Note that voltage (blue) and surface temperature (red) typically exhibit lagging responses compared to the early onset of acoustic (purple) and gas (green) precursors. (b) Intervention Window: The yellow shaded region denotes the effective time window for BMS intervention. Note: Curves are conceptual; specific lead times depend on cell chemistry and trigger mechanisms.
2.1. Temperature Signal Monitoring
Temperature rise is among the earliest and most intuitive precursors of TR. As heat generation in lithium-ion cells generally precedes catastrophic failure, temperature monitoring constitutes a fundamental element of commercial BMSs. In production implementations, temperature channels are typically refreshed at about 1 Hz within the BMS, whereas temperature values reported through vehicle-level interfaces are often updated at about 0.1–1 Hz. This low update rate is consistent with the slow thermal dynamics and the limited bandwidth of pack-level measurement pipelines [21,22]. Because temperature monitoring imposes relatively modest bandwidth and computing requirements and relies on mature automotive-qualified sensors, it is straightforward to integrate in practice, and a broad range of sensing technologies has been developed for capturing thermal variations under both laboratory and in-vehicle conditions [17].
Thermistors [23] and thermocouples [24] remain the dominant temperature sensors in battery systems because of their low cost, fast response, and established automotive qualification [25,26]. In practice, they are mounted on cell or module surfaces to track bulk thermal trends and enable threshold-based protections. However, surface deployment inherently limits spatial resolution and prevents direct observation of localized internal heat generation, particularly in large-format prismatic and pouch cells [27]. This limitation originates not from the sensor principles themselves, but from the thermal transfer function between internal heat sources and external measurement points. Strongly anisotropic heat conduction in layered electrodes and jelly-roll or stacked architectures effectively acts as a low-pass filter, attenuating and delaying internal thermal transients. As a result, surface temperature responses can lag internal short-circuit or decomposition events by tens of seconds to minutes under abuse conditions, reducing the available window for timely mitigation [28,29]. Beyond physical lag, the engineering reality of sensor integration further degrades measurement fidelity. Thermistors are typically attached via thermal adhesives or pads. Over a vehicle’s 15-year lifespan, thermal cycling and vibration induce delamination or adhesive degradation, increasing the contact thermal resistance. In this scenario, a normal temperature reading in the BMS may simply indicate a detached sensor rather than a healthy cell, increasing the risk of false negatives in temperature-based logic [27].
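The low-pass behaviour of the internal-to-surface heat path described above can be illustrated with a first-order lag model. The sketch below is only an illustration under stated assumptions: the single time constant `tau_s`, the 20 K internal step, and the 5 K surface alarm threshold are hypothetical values chosen for the example, not measurements from the cited studies, and real cells exhibit anisotropic, higher-order thermal dynamics.

```python
import math


def surface_delay_s(step_k: float, threshold_k: float, tau_s: float) -> float:
    """Delay until a first-order lag reaches threshold_k after an internal step.

    Assumed model: T_surface(t) = step_k * (1 - exp(-t / tau_s)).
    Solving T_surface(t) = threshold_k gives t = -tau_s * ln(1 - threshold_k / step_k).
    """
    if not 0.0 < threshold_k < step_k:
        raise ValueError("threshold must lie strictly between 0 and the step size")
    return -tau_s * math.log(1.0 - threshold_k / step_k)


# Hypothetical values: 20 K internal rise, 5 K surface alarm threshold.
for tau in (30.0, 120.0):
    delay = surface_delay_s(step_k=20.0, threshold_k=5.0, tau_s=tau)
    print(f"tau = {tau:5.1f} s -> surface threshold crossed after {delay:.1f} s")
```

In this simplified model the alarm delay scales linearly with the thermal time constant, which is one way to see why increased contact resistance from adhesive degradation directly erodes the available warning window.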
To overcome the spatial limitations of point sensors, fiber-optic sensors have been explored for their high-resolution distributed sensing capabilities [30,31] and immunity to electromagnetic interference [32,33]. Fiber Bragg grating (FBG) sensors [34], in particular, can be embedded into modules to detect temperature gradients and internal thermal heterogeneity. However, their automotive deployment faces a significant system-level cost-complexity trade-off [35,36,37]. While the optical fibers are inexpensive, the required signal demodulation units are costly and bulky. Furthermore, integrating fragile glass fibers into a battery pack subject to >50 g mechanical shock requires specialized packaging that conflicts with energy density optimization [35,37].
While fiber optics offer contact-based distributed sensing, infrared (IR) thermal imaging provides an alternative non-contact means of observing surface thermal distribution [38]. It provides visual insight into cell-to-cell heat propagation and is particularly effective in identifying abnormal heating patterns during abuse testing [39]. However, unlike sensors that can be embedded or sandwiched between cells, IR systems rely on unobstructed optical access to active surfaces. In practical battery packs, dense cell stacking, structural frames, busbars, insulation layers, and sealing materials severely restrict the required line-of-sight, fundamentally limiting IR applicability for in situ monitoring [40,41]. Consequently, IR imaging is more suitable for research diagnostics or external inspection than for real-time on-board deployment.
Reducing latency would require moving the sensing point closer to internal heat sources, which has motivated studies on embedded micro-sensors and thin-film RTDs. However, production adoption is still constrained by manufacturability and electrochemical compatibility. The sensors must survive the cell environment without disrupting winding and stacking processes or compromising long-term sealing integrity [42,43]. As a result, temperature channels in current on-board packs remain a robust safety baseline, but their use for early warning should explicitly account for thermal lag and installation-related faults. This motivates complementary electrical precursors discussed in the next subsection.
2.2. Electrical Signal Monitoring
Electrical signals, particularly voltage, current, and impedance, provide indirect yet practical indicators of abnormal cell behavior, as they are inherently measured by battery management systems without requiring additional hardware [11]. However, their diagnostic reliability is fundamentally constrained by pack topology and system architecture. In production BMSs, voltage and current channels are typically refreshed on the order of – Hz, whereas temperature channels are commonly refreshed on the order of Hz [21]. Beyond refresh-rate limitations, on-board electrical diagnosis must contend with pack-level signal attenuation, where parallel-connected cells suppress weak cell-level perturbations and reduce observability, together with practical limits in time alignment between voltage and current measurements across distributed acquisition chains and interference from the BMS’s own control and protection logic.
Voltage monitoring serves as the primary line of defense but faces inherent limitations in large-format battery packs. In modern EV modules employing parallel-connected cells, the collective behavior of healthy cells effectively forms a stiff voltage source, clamping the terminal voltage of the parallel group. Consequently, the subtle voltage perturbations typical of early-stage micro-shorts are heavily attenuated, often vanishing into background noise long before the fault escalates to a detectable severity [44]. This masking effect is particularly dangerous given the surging dominance of LiFePO4 chemistries [45]. Since LFP cells naturally exhibit a flat OCV-SOC plateau, they obscure early thermal voltage signatures even further, exposing the inherent vulnerability of relying exclusively on electrical indicators [46].
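The clamping effect of a parallel group can be made concrete with a simple resistive model. The sketch below is a minimal illustration, assuming identical cells modelled as an ideal OCV source in series with an ohmic resistance, a short placed across the terminals of one cell, and no external load. The values of `ocv_v`, `r0_ohm`, and `r_isc_ohm` are hypothetical, and the model ignores the slow SOC depletion that a real short would also cause.

```python
def terminal_drop_mv(n_parallel: int,
                     ocv_v: float = 3.3,
                     r0_ohm: float = 0.002,
                     r_isc_ohm: float = 50.0) -> float:
    """Instantaneous terminal-voltage drop (mV) caused by a resistive short.

    The n parallel cells form a Thevenin source (ocv_v, r0_ohm / n_parallel);
    the short acts as a load of r_isc_ohm across that source.
    """
    if n_parallel < 1:
        raise ValueError("n_parallel must be at least 1")
    r_group = r0_ohm / n_parallel
    v_terminal = ocv_v * r_isc_ohm / (r_isc_ohm + r_group)
    return (ocv_v - v_terminal) * 1e3


for n in (1, 2, 4, 8):
    print(f"{n} cell(s) in parallel -> drop of {terminal_drop_mv(n):.4f} mV")
```

Under these assumptions the ohmic signature shrinks roughly in proportion to the number of parallel cells, and is already well below a millivolt for a single cell, which is consistent with the masking behaviour discussed above.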
Compounding these observability limits is the interference from the BMS itself. Conventional BMS balancing strategies can further delay recognition, because balancing algorithms are designed for state-of-charge equalization rather than fault diagnosis. A cell exhibiting abnormal self-discharge may be interpreted as having a lower SOC. Subsequent balancing actions can partially compensate for the voltage deviation, thereby delaying fault recognition and reducing the available warning time before thermal runaway [46,47].
Current measurements provide essential contextual information for interpreting voltage anomalies, yet their practical utility is constrained by synchronization and communication latency. While sudden current transients may indicate internal irregularities, distinguishing them from load-induced fluctuations requires precise temporal alignment with voltage measurements. In distributed BMS architectures, voltage and current measurements are not always acquired at exactly the same time, and time alignment is often degraded by sampling and communication delays [48]. During dynamic operating conditions, such misalignment can generate artificial impedance transients, increasing the risk of false alarms that are difficult to suppress through filtering alone [49,50].
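How a timing offset turns into an apparent impedance change can be shown with a toy calculation. The sketch below assumes a purely ohmic cell with a known OCV, a hypothetical load ramp from 50 A to 150 A over 10 ms, and a current sample that is taken `skew_s` seconds earlier than the voltage sample. All numerical values are illustrative assumptions.

```python
def current_a(t_s: float) -> float:
    """Hypothetical load: 50 A rising linearly to 150 A over 10 ms from t = 0."""
    ramp = min(max(t_s / 0.010, 0.0), 1.0)
    return 50.0 + 100.0 * ramp


def apparent_resistance_mohm(t_s: float, skew_s: float,
                             r_true_ohm: float = 0.002,
                             ocv_v: float = 3.3) -> float:
    """Resistance estimate (mOhm) when voltage and current samples are misaligned."""
    v_sampled = ocv_v - r_true_ohm * current_a(t_s)   # voltage taken at t_s
    i_sampled = current_a(t_s - skew_s)               # current taken skew_s earlier
    return (ocv_v - v_sampled) / i_sampled * 1e3


for skew_ms in (0.0, 1.0, 2.0):
    r_app = apparent_resistance_mohm(t_s=0.005, skew_s=skew_ms / 1e3)
    print(f"skew = {skew_ms:.1f} ms -> apparent resistance {r_app:.3f} mOhm")
```

In this example a 2 ms skew during the ramp inflates the estimate from 2.0 mΩ to 2.5 mΩ even though the cell itself is unchanged, which is the kind of artificial transient that filtering alone cannot easily remove.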
By combining voltage and current information, impedance-based indicators theoretically offer higher sensitivity to internal degradation processes [51]. Although full electrochemical impedance spectroscopy (EIS) is impractical for on-board implementation, simplified dynamic resistance estimation has attracted growing interest. Nevertheless, its reliability in aging battery packs is challenged by contact resistance growth at interconnects, busbars, and wiring harnesses. Such system-level resistance changes can mimic impedance signatures associated with electrolyte depletion or internal damage. Without effective mechanisms to decouple cell-intrinsic faults from connection-related degradation, impedance-based diagnostics are prone to elevated false alarm risks in long-life vehicle applications [52,53].
In on-board implementations, the usefulness of electrical signal diagnostics is constrained by the bandwidth and resolution limits of the measurement and communication chain. Vehicle-level links are typically designed for control messaging rather than continuous streaming of high-rate waveforms. For instance, CAN FD commonly uses 1 Mbit/s in the arbitration phase and 5 Mbit/s in the data phase [54]. This motivates deployment-oriented designs in which high-frequency signals are processed locally near the module-level front end and only compact diagnostic indicators are forwarded to higher-level controllers, reducing communication load and enabling deterministic execution under tight resource budgets [17,55]. These considerations motivate complementary modalities whose diagnostic value is less dependent on sustained high-rate electrical streaming.
2.3. Gas and Pressure Signal Monitoring
Gas evolution offers a high-fidelity confirmation of internal failure, distinguishable from thermal/electrical signals by its chemical specificity [11]. However, effective on-board detection relies on capturing the initial venting event. Unlike the massive gas release observed during thermal runaway, which is typically dominated by CO2, H2, and light hydrocarbons, early-stage venting is primarily characterized by volatile organic compounds (VOCs) derived from the electrolyte, such as dimethyl carbonate (DMC) and ethyl methyl carbonate (EMC). Catching this volatile signature requires placing sensors in regions of the pack with low convective airflow, so that the vent plume is not dispersed by the thermal management system’s cooling fans; this requirement often conflicts with the pack’s structural design [19,56,57].
Metal-oxide semiconductor (MOS) sensors are widely investigated for detecting early-stage battery venting [10]. Their operational principle relies on measuring changes in the electrical conductivity of a semiconducting oxide upon exposure to target gases [58]. These sensors are characterized by high sensitivity, a compact form factor, and low cost [59]. However, their practical application exhibits significant challenges such as cross-sensitivity, thermal instability, and significant baseline drift, especially within the dynamic operating environments of EVs [60]. Unlike sealed laboratory chambers, vehicle battery packs employ breathing valves for pressure equalization. This exposes internal sensors to roadway pollutants; high concentrations of traffic exhaust (, ) in tunnels can potentially trigger false alarms. Furthermore, volatile siloxanes from battery potting compounds and thermal adhesives impair the sensing layer of MOS sensors, causing significant sensitivity drift over the vehicle’s lifespan [60,61].
Addressing the selectivity limitations of MOS sensors, non-dispersive infrared (NDIR) sensors offer superior selectivity, particularly for detecting CO2, a dominant byproduct of electrolyte decomposition [11]. These sensors quantify gas concentration by analyzing IR absorption at specific wavelengths. While NDIR sensors are valued for their high accuracy and long-term stability [43], they still face integration hurdles: the required optical path length necessitates a bulky form factor, and the optics are vulnerable to being blocked by aerosolized electrolyte or soot during a venting event. Alternatively, electrochemical sensors offer compactness but struggle with high-concentration exposure. Sudden high-concentration bursts typical of thermal runaway can overwhelm the sensing electrode, resulting in either temporary saturation or long-term, irreversible sensitivity degradation due to electrolyte depletion [62,63].
Complementing chemical gas detection, monitoring internal pressure buildup offers a physical indicator of early cell degradation [43]. Pressure sensors, which can be embedded within modules or mounted externally to monitor vent gas release, are capable of detecting physical anomalies such as swelling, gas accumulation, or separator failure [63]. Relevant technologies include optical fiber-based sensors and piezoresistive elements, which have been investigated in both laboratory and field trials [64]. However, pressure-based diagnostics are strongly affected by environmental variations. Thermal expansion and altitude-induced pressure changes in large-format packs produce significant baseline fluctuations, rendering absolute thresholds unreliable. Consequently, practical implementations rely on detecting transient pressure gradients (dP/dt) associated with valve rupture, which requires elevated sampling rates and increases BMS communication bandwidth demands [47,64].
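The preference for rate-based over absolute pressure thresholds can be sketched in a few lines. The example below is a minimal illustration, assuming a backward-difference rate estimate on uniformly sampled data. The 100 Hz sampling rate, the synthetic trace, and the 5 kPa/s limit are hypothetical values, and a practical implementation would also need noise filtering.

```python
from typing import Optional, Sequence


def first_rate_alarm(samples_kpa: Sequence[float], dt_s: float,
                     rate_limit_kpa_per_s: float) -> Optional[int]:
    """Index of the first sample whose backward-difference rate exceeds the limit."""
    for k in range(1, len(samples_kpa)):
        rate = (samples_kpa[k] - samples_kpa[k - 1]) / dt_s
        if rate > rate_limit_kpa_per_s:
            return k
    return None


# Hypothetical 100 Hz trace: slow baseline drift (0.1 kPa/s), then a vent-like jump (50 kPa/s).
drift = [101.0 + 0.001 * k for k in range(50)]
burst = [drift[-1] + 0.5 * (k + 1) for k in range(5)]
trace = drift + burst

print("alarm index on full trace:", first_rate_alarm(trace, dt_s=0.01, rate_limit_kpa_per_s=5.0))
print("alarm index on drift only:", first_rate_alarm(drift, dt_s=0.01, rate_limit_kpa_per_s=5.0))
```

The slow drift never triggers the rate criterion regardless of how far the baseline moves, whereas the abrupt rise is flagged at its first sample; the price is that the sampling interval must be short enough to resolve the transient.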
Gas and pressure signals are often among the more sensitive channels for early abnormalities, yet that sensitivity comes with a robustness cost in real vehicles. They can offer warning horizons on the order of tens of seconds to minutes ahead of surface temperature rise, but open-road disturbances and environmental variability can undermine simple thresholds, making them unreliable as standalone safety triggers [43]. A more deployment-friendly approach is to use gas or pressure as a wake-up cue that initiates short bursts of high-frequency electrical or thermal monitoring, while final protective actions are taken only after confirmation from more stable channels rather than granting an independent cutoff authority [65].
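The wake-up-and-confirm arrangement described above can be expressed as a small state machine. The sketch below is one possible reading of that idea, not a reference design: the state names, the `watch_len` window, and the boolean inputs `gas_cue` and `confirmed` are hypothetical, and conventional temperature and voltage protections are assumed to run independently and are not modelled here.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    IDLE = "idle"      # normal low-rate monitoring
    WATCH = "watch"    # burst monitoring after a gas or pressure cue
    ALARM = "alarm"    # protective action requested (latched)


def step(state: State, gas_cue: bool, confirmed: bool,
         watch_left: int, watch_len: int = 50) -> Tuple[State, int]:
    """Advance the sketch by one sample; returns (next_state, remaining watch samples)."""
    if state is State.ALARM:
        return State.ALARM, 0                 # stays latched until serviced
    if state is State.IDLE:
        # A gas cue alone only wakes up burst monitoring; it has no cutoff authority.
        return (State.WATCH, watch_len) if gas_cue else (State.IDLE, 0)
    # state is WATCH: wait for confirmation from a more stable channel.
    if confirmed:
        return State.ALARM, 0
    if watch_left <= 1:
        return State.IDLE, 0                  # window expired without confirmation
    return State.WATCH, watch_left - 1


state, left = step(State.IDLE, gas_cue=True, confirmed=False, watch_left=0)
print(state, left)
state, left = step(state, gas_cue=False, confirmed=True, watch_left=left)
print(state, left)
```

The design choice illustrated here is that the fast but noisy channel can only shorten the time to confirmation; it cannot trigger a cutoff by itself.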
2.4. Acoustic Signal Monitoring
Acoustic signals theoretically offer one of the earliest precursors by capturing the elastic energy released during crack initiation, electrode delamination, or valve rupture [43,66]. Unlike thermal or chemical precursors, which rely on slow diffusion processes, acoustic emissions (AE) propagate at the speed of sound through the solid structure, offering minimal intrinsic propagation delay. This physical characteristic enables acoustic sensing to detect mechanochemical degradation events ahead of observable temperature rise or gas concentration accumulation in certain failure scenarios [67,68].
Microelectromechanical system (MEMS) microphones represent a cost-effective and readily available technology for detecting acoustic emissions originating from battery modules [43]. Their compact size and digital interface simplify their integration with battery enclosures [47]. Several studies have demonstrated their utility in identifying distinct acoustic signatures associated with events like gas venting or casing rupture in overcharged cells [63]. However, their automotive deployment is significantly constrained by spectral overlap between battery-related acoustic emissions and ambient vehicle noise. The operational bandwidth of standard MEMS microphones (<20 kHz) coincides directly with the dominant frequency range of road noise, powertrain harmonics, and HVAC vibration. Consequently, early-stage battery fault signals are easily masked by the vehicle’s dynamic noise floor. While newer MEMS microphones extend to the ultrasonic range (>40 kHz) to bypass this mechanical noise, they primarily detect air-borne sound, which suffers from rapid attenuation in battery packs filled with dense potting compounds and thermal foams [47,63].
To probe internal structural integrity, ultrasonic sensors operating in the megahertz (MHz) range have been investigated as a powerful complement to passive microphones [66]. By propagating ultrasonic guided waves through the battery casing, these sensors can detect shifts in acoustic impedance resulting from internal delamination, gas evolution, or structural degradation [67]. Variations in ultrasonic amplitude and time-of-flight have been demonstrated to be sensitive indicators of progressive cell damage [51]. Despite their high sensitivity under controlled conditions, on-board implementation faces a significant integration constraint. High-frequency stress waves require a rigid mechanical interface for transmission. However, EV battery modules are designed with compliant thermal pads and breathing gaps to accommodate swelling and shock. Maintaining a stable and void-free acoustic coupling interface over 15 years of thermal expansion cycles and vibration is an immense materials engineering challenge. Dry-out or delamination of the coupling medium results in immediate signal loss, rendering the system unreliable [51,67].
An alternative acoustic sensing strategy is passive AE monitoring, which detects spontaneous stress-wave emissions generated during cell degradation without requiring active excitation [66]. By capturing transient signals over a broad frequency range, AE sensors have been used to characterise critical failure events such as separator rupture and abrupt gas release [43]. However, shifting from active probing to passive listening transfers the primary challenge from mechanical coupling to event discrimination. In a moving vehicle, routine mechanical disturbances—such as debris impacts on the chassis or structural flexure when traversing potholes—can generate transient stress waves that closely resemble those produced by electrode cracking. Reliably distinguishing genuine electrochemical failure precursors from such background events requires high sampling rates, precise time synchronization, and computationally intensive classification algorithms, imposing bandwidth and cost burdens that remain incompatible with mass-market BMS architectures [66,69].
Acoustic sensing can respond very quickly, yet it also brings the highest integration burden among precursor modalities, particularly in packaging, calibration, and in-vehicle noise management [70]. To make the signal more repeatable on the road, recent implementations increasingly emphasize structure-borne sensing rather than air-borne monitoring, because rigid pack components can guide acoustic waves while reducing exposure to environmental interference [71,72]. However, data handling becomes the next bottleneck. Ultrasonic analysis often requires sampling rates exceeding 100 kS/s, which makes continuous raw-stream transmission unrealistic for automotive buses. Deployment therefore tends to rely on edge-side processing near the sensing front end, where compact accelerators extract features locally and forward only event-level indicators to the BMS, keeping bandwidth use and decision latency within practical limits [73].
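A back-of-the-envelope calculation shows why raw ultrasonic streaming is impractical. The sketch below compares the raw data rate at 100 kS/s with an event-level summary, assuming a hypothetical 16-bit sample width and a hypothetical 8-byte indicator sent at 10 Hz. The 5 Mbit/s figure is the CAN FD data-phase rate quoted in Section 2.2, and protocol overhead and arbitration are ignored.

```python
def stream_mbit_s(rate_hz: float, bits_per_item: int, channels: int = 1) -> float:
    """Payload data rate in Mbit/s for a periodic stream (no protocol overhead)."""
    return rate_hz * bits_per_item * channels / 1e6


CAN_FD_DATA_PHASE_MBIT_S = 5.0  # nominal data-phase rate quoted in Section 2.2

raw = stream_mbit_s(rate_hz=100_000, bits_per_item=16)   # assumed 16-bit samples
event = stream_mbit_s(rate_hz=10, bits_per_item=64)      # assumed 8-byte indicator at 10 Hz

print(f"raw stream, 1 channel : {raw:.3f} Mbit/s "
      f"({100 * raw / CAN_FD_DATA_PHASE_MBIT_S:.0f}% of nominal data-phase rate)")
print(f"raw stream, 4 channels: {4 * raw:.3f} Mbit/s")
print(f"event indicators      : {event:.6f} Mbit/s")
```

Under these assumptions a handful of raw channels would exceed the nominal bus rate before any other traffic is considered, whereas event-level indicators occupy a negligible share, which is the motivation for processing near the sensing front end.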
2.5. Sensor Deployment Considerations
Sensor placement is constrained far more by topology and safety clearances than by sensing principles alone [17]. Chasing the fastest thermal response would suggest attaching NTCs close to electrode tabs, yet this can violate clearance requirements unless bulky potting protection is added, which conflicts with packing density and serviceability. In practice, mounting on busbars with reinforced insulation is often the workable compromise, with a predictable thermal impedance penalty accepted upfront [27].
Gas sensing runs into a different constraint that comes from flow and geometry. Dense module stacking and cooling plates can create locally stagnant zones, so a sensor placed in an unfavourable pocket may not see the initial vent plume until diffusion smooths the concentration field minutes later [74]. This is why placement near pressure relief paths is typically guided by computational fluid dynamics (CFD) rather than by convenience alone, especially when the sensor is expected to contribute to a time-critical decision.
For acoustics, the bottleneck is mechanical coupling. Sensors prefer rigid structural contact, while packs are commonly designed with compliant foams and interfaces for vibration isolation [75,76]. The mismatch pushes integrators toward mounting on frames or stiff load paths and relying on structure-borne propagation instead of air-borne transmission.
Placement is only half of the integration story. Harness complexity quickly becomes the dominant limiter for sensor density. Instrumenting every cell with discrete devices produces an unmanageable wiring bundle, adding weight and assembly cost while creating a large number of potential failure points. A common practical response is sparse sensing that targets statistically critical cells, combined with wireless BMS (wBMS) and flexible printed circuits (FPC) to reduce copper harnesses and connector-related reliability issues [77].
These constraints also explain why a single modality rarely satisfies functional safety expectations by itself. Thermal channels tend to be robust but slow, while gas and acoustic cues can be faster yet more exposed to environmental variability. A defense-in-depth design can allocate roles across orthogonal modalities, such as using a fast cue for wake-up and then confirming with more stable electrical evidence before triggering protective actions [78]. Table 1 summarizes the resulting trade space in response time, integration complexity, and cost.
Table 1.
Comparison of On-board Sensors for Battery Safety Monitoring. This comparison highlights the trade-off between detection speed and engineering deployment feasibility.
3. Diagnostic Algorithms for Thermal Runaway: Capabilities and Limitations
Developing on-board diagnostic algorithms is not merely a data science challenge but a safety-critical engineering optimization. The core conflict lies in the diagnostic trade-off: maximizing lead time and minimizing false alarm rate while fitting within the resource constraints of microcontroller units (MCUs). Before reviewing specific methodologies, we establish a rigorous evaluation framework defining the acceptance criteria for automotive deployment.
3.1. Unified Evaluation Framework and Performance Metrics
Objective assessment must rely on metrics that reflect the operational reality of a vehicle fleet, rather than balanced laboratory datasets. Key performance indicators (KPIs) are as follows.
Lead Time: Defined as the interval between the alarm and the onset of hazardous thermal runaway. While academic studies aim to maximize warning lead time, the hard engineering constraint is the “5-Minute Rule” mandated by GB 38031-2025 [80] and UN GTR No. 20 [81]. Accordingly, an algorithm is only considered compliant if it provides a lead time of more than 300 s to enable safe occupant evacuation across all trigger conditions [13]. Lead times reported in the literature span a wide range, from seconds to minutes or longer, and should be interpreted together with the test scenario and sensing bandwidth [60,82,83].
False Alarm Rate (FAR): For Original Equipment Manufacturers (OEMs), FAR is the dominant metric. Because TR is an extremely rare event, standard accuracy metrics are misleading on imbalanced datasets (the accuracy paradox). A model with 99% accuracy could still generate thousands of false alarms across a fleet, leading to driver desensitization and prohibitive warranty costs. Therefore, industry standards typically mandate an FAR of <1 ppm (parts per million) for critical cut-off signals [47].
Sensitivity: Complementing FAR, this measures the ability to capture subtle precursors before they escalate. For example, early-stage internal shorts are often characterized by an equivalent short resistance on the order of 10–10² Ω [7,84], and the engineering challenge is distinguishing these low-magnitude signatures from the dynamic noise of regenerative braking or fast charging. A viable algorithm must maintain high sensitivity without crossing the FAR threshold, effectively maximizing the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC).
Computational Latency: Automotive MCUs operate on strict real-time operating system (RTOS) schedules. Hence, latency is defined here as the on-board inference runtime measured by the worst-case execution time (WCET) on the target MCUs or system-on-chips (SoCs), excluding offline training. In practice, WCET must fit within the allocated task period to meet real-time deadlines. Separately, the effective update rate of diagnosis is bounded by the refresh rate of available BMS measurements: voltage/current are commonly refreshed at 1–10 ms intervals, while temperature is typically refreshed at 0.1–1 s intervals [21,22]. Therefore, the inference time should be kept well within the task period so that diagnostic computation does not compete with time-critical protection actions such as relay control [85].
Generalization: A critical failure mode for many academic algorithms is overfitting to fresh cells. As batteries age, their internal resistance increases and capacity fades, shifting the baseline of normal behavior. A robust algorithm must be chemistry-agnostic or adaptive, maintaining performance across the full State-of-Health (SOH) range without requiring frequent recalibration [86].
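The accuracy paradox noted under FAR can be made concrete with a short calculation. The sketch below is illustrative only: the number of healthy and faulty samples and the detector operating points are assumed values chosen for the example, not figures from the cited studies.

```python
def fleet_alarm_stats(n_healthy, n_faulty, sensitivity, false_alarm_rate):
    """Return (true_alarms, false_alarms, precision) for one batch of decisions.

    All inputs are hypothetical; false_alarm_rate is the fraction of healthy
    samples that are wrongly flagged.
    """
    true_alarms = n_faulty * sensitivity
    false_alarms = n_healthy * false_alarm_rate
    total_alarms = true_alarms + false_alarms
    precision = true_alarms / total_alarms if total_alarms > 0 else float("nan")
    return true_alarms, false_alarms, precision


# Assumed population: one million decisions, only ten of them truly faulty.
N_HEALTHY = 999_990
N_FAULTY = 10

# Detector A: about 99% accurate (1% of healthy samples are flagged).
detector_a = fleet_alarm_stats(N_HEALTHY, N_FAULTY, 0.99, 0.01)

# Detector B: same sensitivity, but FAR held at 1 ppm.
detector_b = fleet_alarm_stats(N_HEALTHY, N_FAULTY, 0.99, 1e-6)

print("A: false alarms = %.1f, precision = %.4f" % (detector_a[1], detector_a[2]))
print("B: false alarms = %.3f, precision = %.4f" % (detector_b[1], detector_b[2]))
```

Under these assumptions, detector A raises roughly ten thousand false alarms for about ten real events, so almost every alarm is spurious, whereas the 1 ppm operating point keeps most alarms genuine.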
3.2. Threshold-Based Diagnostic Approaches
Threshold-based diagnostics represent the foundation of industrial battery safety strategies, valued for their deterministic logic and minimal computational demand. Leveraging the sensor data discussed in Section 2, this methodology operates by comparing measured parameters, such as temperature, voltage, current, or their time-derivatives, against predefined safety limits.
Among these signals, temperature serves as the primary trigger. Wang et al. [87] and Feng et al. [7] established the industry-standard multi-level alarm strategy: combining absolute thresholds (e.g., >70 °C for warning, >100 °C for cutoff) with rate-of-rise limits (expressed in °C/s). While these static rules form the functional safety baseline, the algorithm inherits the sensing latency described in Section 2.1. Consequently, by the time a static threshold is breached, the internal reaction is often already self-sustaining. This limits the tactical utility of thresholding to late-stage confirmation rather than proactive prevention.
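The multi-level logic described above can be sketched in a few lines. This is a minimal illustration, not the implementation of [87] or [7]: the 70 °C and 100 °C levels come from the example in the text, while the rate-of-rise limit `RATE_LIMIT_C_PER_S` is a placeholder value assumed here because the source does not state one.

```python
from enum import IntEnum


class AlarmLevel(IntEnum):
    NORMAL = 0
    WARNING = 1
    CUTOFF = 2


WARN_TEMP_C = 70.0          # warning level quoted in the text
CUTOFF_TEMP_C = 100.0       # cutoff level quoted in the text
RATE_LIMIT_C_PER_S = 1.0    # ASSUMED placeholder, not taken from the source


def classify(temp_c, prev_temp_c, dt_s):
    """Combine absolute thresholds with a rate-of-rise check."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    rate_c_per_s = (temp_c - prev_temp_c) / dt_s
    if temp_c > CUTOFF_TEMP_C:
        return AlarmLevel.CUTOFF
    if temp_c > WARN_TEMP_C or rate_c_per_s > RATE_LIMIT_C_PER_S:
        return AlarmLevel.WARNING
    return AlarmLevel.NORMAL
```

For example, `classify(50.0, 45.0, 1.0)` returns `AlarmLevel.WARNING` because the assumed rate limit is exceeded even though the absolute temperature is still moderate.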
The dominance of threshold methods stems from their minimal Computational Latency. However, their deployment involves a stringent trade-off between sensitivity and FAR. Relying on raw signals renders the system susceptible to dynamic measurement noise: electromagnetic interference (EMI) on voltage lines or thermal transients from coolant pump cycling can mimic fault signatures. A threshold set too conservatively ensures zero FAR but results in a high missed-detection rate for early precursors; a threshold set too aggressively triggers frequent false alarms during dynamic driving, leading to driver desensitization [87].
To improve the signal-to-noise ratio (SNR) without abandoning the low-complexity framework, researchers have introduced signal-processing-enhanced thresholding. Instead of monitoring raw amplitude, these methods apply thresholds to extracted statistical features that quantify signal disorder. For instance, Widodo et al. [88] utilized Sample Entropy (SampEn) to quantify voltage time-series complexity, successfully identifying degradation patterns hidden within the voltage noise floor. Similarly, Shang et al. [89] extended this using Modified Sample Entropy (MSE). By thresholding these non-linear features rather than raw data, such methods significantly decouple fault signatures from linear dynamic loads, improving robustness against operating conditions.
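To make the entropy feature concrete, the following sketch computes the standard Sample Entropy of a time series using the Chebyshev distance. It follows the textbook definition rather than the specific variants of [88] or [89]; the defaults `m=2` and `r_factor=0.2` are common conventions assumed here, not values from those studies.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Standard SampEn(m, r) with r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance to every later template (no self-matches).
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matches(m)        # matches of length m
    a = count_matches(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined; treated as maximally irregular
    return float(-np.log(a / b))
```

A regular signal such as a sine wave yields a low value, while white noise yields a high one, which is why a threshold on this feature can separate ordered from disordered voltage behavior.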
Threshold-based logic remains prevalent in production BMSs because it aligns well with regulatory expectations and can be executed within tight real-time budgets. Its dependence on lagging indicators and limited ability to adapt to aging, however, leaves little usable lead time for early warning. For this reason, many intelligent BMS designs position threshold rules as a conservative safety guard, serving as a final fail-safe layer that backs up more informative model-based and data-driven estimators.
3.3. Model-Based Diagnostic Approaches
Model-based diagnostics attempt to overcome the limited internal-state visibility of thresholding, which treats the cell as a black box, by running a parallel mathematical simulation of the battery [90]. Evaluated against the framework in Section 3.1, these methods trade increased Computational Latency for extended lead time by estimating unmeasurable internal states. The core mechanism relies on residual generation, where faults are identified when the deviation between model predictions and sensor measurements exceeds a dynamic bound. However, the reliability of this residual is fundamentally limited by the discrepancy between the idealized model physics and the actual aging state of the physical cell.
Equivalent Circuit Models (ECMs) dominate current on-board implementations due to their minimal computational latency. By abstracting complex electrochemistry into lumped resistor-capacitor (RC) networks, ECMs fit comfortably within the floating-point operation budgets of standard automotive MCUs. Studies have successfully demonstrated the use of ECM-based Kalman filters to detect voltage anomalies associated with internal short circuits [91]. However, ECMs are limited by their lumped-parameter nature. Since they treat the cell as a homogeneous point mass, they inherently lack the capability to resolve localized thermal hotspots or uneven current distributions characteristic of early-stage failures [92]. This lack of spatial granularity often delays detection until the local fault has propagated sufficiently to affect global cell parameters, thereby reducing the effective sensitivity to early-stage failures [93].
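The residual-generation idea can be illustrated with a first-order RC equivalent circuit run in open loop. This is a simplified sketch rather than the Kalman-filter approach of [91]: it assumes a constant open-circuit voltage, a fixed residual bound instead of a dynamic one, discharge current taken as positive, and parameter values (`ocv_v`, `r0`, `r1`, `tau_s`, `bound_v`) that are purely hypothetical.

```python
import math


def ecm_residuals(current_a, voltage_v, dt_s,
                  ocv_v=3.7, r0=0.002, r1=0.001, tau_s=20.0):
    """Residual = measured voltage minus first-order RC model prediction."""
    alpha = math.exp(-dt_s / tau_s)
    v_rc = 0.0                      # polarization voltage across the RC pair
    residuals = []
    for i_k, v_k in zip(current_a, voltage_v):
        v_model = ocv_v - i_k * r0 - v_rc
        residuals.append(v_k - v_model)
        # Exact discretization of the RC branch for piecewise-constant current.
        v_rc = alpha * v_rc + r1 * (1.0 - alpha) * i_k
    return residuals


def flag_faults(residuals, bound_v=0.02):
    """Flag samples whose residual magnitude exceeds a fixed bound."""
    return [abs(r) > bound_v for r in residuals]
```

In a production observer the bound would typically adapt to operating conditions and the model would be corrected by a filter; the open-loop form is used here only to show where the residual comes from.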
Electrochemical models, particularly the Pseudo-Two-Dimensional (P2D) model [94], offer the highest theoretical sensitivity by resolving spatial distributions of lithium-ion concentration and potential [95]. This allows for the prediction of internal exothermic triggers around 140 °C, offering the longest lead time [96]. Ren et al. [97] further coupled the P2D framework with thermal models to quantify heat generation during overcharge, achieving less than 5% error in thermal runaway onset time prediction. Nevertheless, P2D models are currently computationally prohibitive for embedded deployment. Solving the coupled partial differential equations (PDEs) requires matrix operations that exceed the capability of standard BMS hardware. More critically, these models face significant challenges regarding parameter identifiability; accurately determining the dozens of electrochemical parameters, such as solid-phase diffusion coefficients, for an aging cell in a moving vehicle is operationally difficult. Consequently, P2D models remain primarily tools for offline validation or cloud-based digital twins rather than on-board inference [55].
The Single Particle Model (SPM) reduces computational load by assuming a uniform current distribution, simplifying the electrode to a single representative particle [98]. While this simplification retains sufficient accuracy for state-of-charge estimation, the SPM is structurally limited for thermal runaway diagnostics [99]. Its fundamental assumption of spatial homogeneity leads to an averaging effect, where the localized signal of a micro-short or lithium plating spot is mathematically averaged out over the entire electrode area [100]. Consequently, the SPM yields a high missing rate for the very localized defects that typically initiate thermal runaway, offering little advantage over simpler ECMs in this specific safety context.
Reduced-order models (ROMs) have emerged as a practical compromise, balancing Computational Latency and Detection Rate. Han et al. [101] applied orthogonal decomposition to compress P2D equations into 3–5 ordinary differential equations, reducing computation time by 98%. Deng et al. [102] implemented polynomial-approximated ROMs on embedded BMS hardware, enabling millisecond-response fault detection. For TR prognosis, Xu et al. [55] developed cloud-based ROMs that analyze voltage or temperature residuals to trigger alarms 15 min before the event with 90% reliability. Despite this potential, the practical deployment of ROMs faces a challenge in Generalization Ability: as cells age, internal parameters drift, necessitating complex online parameter identification to prevent model divergence and false alarms [103].
3.4. Data-Driven Diagnostic Approaches
Data-driven approaches diverge from physical modeling by learning a mapping from historical BMS measurements to fault probabilities, typically through offline training on servers or GPUs followed by on-board inference on resource-constrained hardware. Evaluated against the metrics in Section 3.1, these methods theoretically offer the highest sensitivity to non-linear early-stage faults, such as micro-shorts obscured by dynamic loads. However, their on-board reliability is frequently overestimated in literature due to the data leakage phenomenon, where training and testing datasets share similar environmental biases. The central engineering challenge is not learning the fault pattern, but maintaining generalization when the vehicle encounters novel drive cycles or ambient conditions not represented in the training distribution [104].
From a deployment perspective, data-driven diagnosis typically follows a two-stage workflow: offline training and online inference. Training is usually performed off-board on GPUs and primarily affects development cost and model update cycles, rather than real-time feasibility on the vehicle. In contrast, on-board operation executes inference only on the target BMS hardware. Therefore, unless otherwise stated, the computational metrics discussed in this subsection refer to on-board inference.
Feature-driven Machine Learning (ML) represents the most deployment-ready subset [104]. By utilizing engineered electro-thermal descriptors, these methods train lightweight classifiers that fit comfortably within the on-board inference latency of automotive MCUs. For instance, Jiang et al. [86] combined features derived from EIS with temperature–gradient cues to suppress FAR by 40%. Yao et al. [105] demonstrated that a support vector machine (SVM) using voltage–hysteresis features can detect micro-shorts in real-world vehicle data with approximately 95% accuracy and an inference time under 50 ms on MCUs, illustrating the feasibility of real-time deployment. Wang et al. [106] employed XGBoost with voltage–temperature correlations for over-discharge diagnosis, attaining an F1-score improvement of about 28% relative to threshold-based baselines. Extending beyond detection, Zhao et al. [107] modeled hidden state transitions in satellite batteries with Gaussian mixture models (GMMs) to forecast capacity-fade trajectories with less than 3% error, while Hong et al. [108] combined random forests with Shannon-entropy features to provide 30-minute early warnings for thermal anomalies. However, the robustness of this approach is entirely dependent on the quality of the input features. Some of these descriptors are mathematically sensitive to measurement noise and sampling asynchrony; a noisy voltage sensor can generate artificial feature spikes, causing the classifier to fail despite its underlying logic being sound. Collectively, these studies demonstrate that the primary trade-off of traditional ML is the reliance on manual feature engineering, which requires deep domain knowledge and may limit the detection of novel, undefined fault types.
Deep Learning (DL) extends this approach by learning cross-signal interactions directly from raw time-series data, eliminating the need for manual feature extraction. This capability significantly enhances sensitivity and lead time. For instance, Liu et al. [109] converted voltage sequences into Gramian Angular Fields and trained Convolutional Neural Networks (CNNs) to identify cell-level faults with approximately 97% precision. Similarly, Sun et al. [110] employed Bi-LSTMs to capture module-level thermal propagation patterns, achieving approximately 35-minute early warnings under controlled test conditions. In the domain of health prognostics, Tian et al. [111] utilized Transformers to predict degradation paths with about 1.5% SOH error. In a related study, Wei et al. [112] integrated CNN and LSTM to model internal temperature distributions, achieving a mean absolute error of 0.8 °C. However, deploying end-to-end DL models faces three distinct engineering hurdles. First, the computational cost of processing high-dimensional tensors often necessitates specialized Neural Network Processing Units (NPUs), conflicting with the cost structure of standard BMS hardware. Second, the black box nature of DL presents a certification barrier under ISO 21448 (Safety of the Intended Functionality, SOTIF) [113], as it is difficult to formally prove that the network will not behave hazardously in unknown scenarios. Third, the extreme class imbalance in real-world data—where faulty samples are virtually non-existent compared to normal driving data—biases models toward false positives. Although techniques like Generative Adversarial Networks (GANs) and Transfer Learning are being explored to synthesize fault data [114], guaranteeing their fidelity to physical reality remains an open research question.
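The Gramian Angular Field encoding mentioned above converts a one-dimensional voltage sequence into an image that a CNN can consume. The sketch below shows the summation variant; the text does not state which variant [109] used, so this choice is an assumption.

```python
import numpy as np


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a GASF image: G[i, j] = cos(phi_i + phi_j)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")
    # Min-max rescale to [-1, 1] so that arccos is defined.
    scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)   # guard against rounding overshoot
    phi = np.arccos(scaled)               # polar-angle representation
    return np.cos(phi[:, None] + phi[None, :])
```

A sequence of length n becomes an n × n symmetric matrix, which illustrates why memory and compute grow quadratically with window length on embedded targets.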
Hybrid models that integrate physical principles with data-driven techniques offer a promising alternative to purely data-driven architectures [92]. These models combine the generalization of physical constraints with the adaptability of machine learning, thereby enhancing performance under noisy or dynamic conditions while reducing the reliance on extensive labeled datasets. For instance, Finegan et al. [115] embedded electrochemical equations with physics-informed neural networks (PINNs) to stabilize inference of TR-related reaction kinetics, improving reliability in the presence of measurement noise. Similarly, Li et al. [116] combined model-based observers with LSTMs. Their hybrid model successfully isolated voltage and current sensor anomalies within approximately 10 s, complementing purely data-driven detectors by incorporating consistency checks grounded in first principles. At the fleet scale, Xu et al. [117] developed a federated digital twin for distributed diagnosis of over 10,000 EV batteries, enabling cloud–edge collaboration without centralizing raw data. By incorporating physical constraints, hybrid models significantly reduce the black box risk and lower the requirement for massive labeled datasets, offering a balanced pathway for robust on-board implementation.
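As a generic illustration of how physical constraints enter the training of a physics-informed network, the loss is commonly written as a weighted sum of a data-fit term and a physics-residual term. This is the general form only and is not the specific formulation of [115]. The symbols are illustrative assumptions: $\hat{y}_\theta$ is the network prediction, $y_i$ are measurements at inputs $x_i$, $\mathcal{F}$ is the governing equation written in residual form and evaluated at collocation points $z_j$, and $\lambda$ is a tunable weight.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N}\sum_{i=1}^{N}\left\| \hat{y}_\theta(x_i) - y_i \right\|^2
  + \lambda\,\frac{1}{M}\sum_{j=1}^{M}\left\| \mathcal{F}\!\left[\hat{y}_\theta\right](z_j) \right\|^2
```

The second term penalizes predictions that violate the governing equations even where no labeled data exist, which is one reason such models can need fewer labeled fault samples.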
Algorithm selection is shaped by the hardware and safety partitioning across the BMS stack. On safety MCUs, feature-driven machine learning is often preferred because it can be made deterministic and interpretable while meeting strict timing constraints. Deep learning can provide earlier warnings for complex failure modes, but it typically requires hardware acceleration and a heavier SOTIF-oriented validation burden. A layered deployment is therefore common in practice, with lightweight feature-based logic enforcing immediate protection and higher-complexity models operating as advisory detectors on a higher-performance SoC or a cloud digital twin.
Recognizing these complementary strengths, recent prototype implementations have begun to synthesize these approaches. They strategically combine curated features for stability, compact sequence learners for early warnings, and lightweight physical constraints to ensure model plausibility. This integrated methodology aims to concurrently satisfy demanding requirements for low latency, minimal false alarms, and robust cross-platform generalization.
A comparative summary of these diagnostic algorithms, evaluated against the key performance indicators defined in Section 3.1, is presented in Table 2.
Table 2.
Evaluation of Diagnostic Algorithms against Key Performance Indicators. The comparison assumes a standard automotive-grade MCU/SoC environment without cloud acceleration. TRL denotes Technology Readiness Level.
4. Deployment Considerations for On-Board TR Diagnostics
While advanced diagnostic algorithms demonstrate high fidelity in laboratory settings, their transition to on-board BMS requires navigating a complex set of hardware constraints and safety mandates. This section bridges the gap between algorithmic theory and vehicular reality, focusing on model optimization, inference hardware, system architecture, and regulatory compliance.
4.1. Safety Limits, Overdrive Conditions, and Cooling Constraints
Before entering model compression and hardware choices, it is helpful to clarify the safety envelope that ultimately defines whether an on-board warning remains actionable. In engineering practice, this envelope is often expressed through temperature, voltage, and current limits, beyond which irreversible degradation or self-sustaining reactions may occur [78,121]. These boundaries should not be treated as fixed numbers. Once a pack operates closer to overdrive regimes, the effective distance to critical limits can shrink and the intervention window narrows accordingly [83]. Under such conditions, lead time and decision latency are better interpreted against the remaining safety margin rather than a nominal operating envelope.
Thermal management introduces another layer of uncertainty. Pack-level cooling is typically designed to handle spatially averaged heat rejection, whereas an internal fault may remain highly localized for a non-trivial period. Its protective effect is therefore scenario-dependent and may be weakened by non-uniform coolant distribution, delayed actuation, or partial degradation of the thermal loop [122,123]. In addition, localized heat generation can originate deep inside the cell and must traverse multiple thermal resistances before it becomes observable at surface or pack-level sensors, which may delay thermal observability even when cooling hardware is present. As a result, detection strategies should not assume that cooling can universally compensate emerging hazards. Early-warning logic is preferably designed to trigger before cooling saturation and thermal propagation occur, and can be made cooling-aware when thermal-loop indicators are available.
4.2. Model Compression Techniques
Deploying data-driven models on embedded platforms often requires aggressive compression, especially when memory and deterministic execution are constrained [124,125]. Quantization is widely adopted because fixed-point arithmetic is efficient on automotive-grade processors, and it can substantially reduce memory footprint and bandwidth demand [126]. Structured pruning can further lower latency by removing redundant channels or kernels, yet the gain is not free [127]. If compression becomes too aggressive, sensitivity to weak and high-frequency micro-short signatures may be lost, and early precursors risk being absorbed into background noise. Knowledge distillation offers a more conservative compromise in many deployments, since a compact student model can be trained to reproduce a larger teacher’s behavior while retaining decision boundaries for rare fault events under tight resource budgets [128].
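The memory saving from quantization, and the rounding error it introduces, can be seen in a minimal sketch of symmetric per-tensor int8 quantization. This is one common scheme chosen for illustration; the text does not prescribe a particular quantization method, and production toolchains usually add calibration and per-channel scales.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by q * scale."""
    w = np.asarray(weights, dtype=np.float32)
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Recover an approximate float32 tensor from int8 values."""
    return q.astype(np.float32) * scale
```

Storage drops by a factor of four relative to float32, and the reconstruction error per weight is bounded by about half the scale. Small-amplitude weights that encode weak fault signatures are the ones most affected by this rounding.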
4.3. Automotive-Grade Inference Platforms
Executing these compressed models efficiently and reliably requires carefully selected hardware. The choice of inference platform involves a critical trade-off between computational throughput, power consumption, and functional safety integrity levels, across both active operation and parked states [129].
Microcontrollers (MCUs). Traditional automotive MCUs serve as the safety backbone of BMS [129]. While they offer the highest safety integrity (ASIL-D) required for critical cutoff decisions, their architecture lacks the parallel Multiply-Accumulate (MAC) units necessary for deep learning. Consequently, MCUs are best suited for rule-based logic or lightweight ML models where execution determinism is critical, acting as the safe actuator rather than the intelligent engine [120].
Systems-on-Chips (SoCs). For data-intensive diagnostic tasks, heterogeneous SoCs have become the platform of choice. These devices integrate high-performance ARM cores with dedicated NPUs, enabling real-time inference of complex architectures like CNNs or LSTMs within the strict size, weight, and power (SWaP) constraints of sealed packs. However, qualifying complex AI accelerators to the highest ASIL targets is non-trivial, so safety architectures typically retain an independent safety controller for final actuation decisions. This necessitates a dual-chip architecture, where the SoC processes early warnings while a safety MCU verifies the final critical action, a trend increasingly observed in intelligent BMS designs [120].
Field-Programmable Gate Arrays (FPGAs). Offering massive parallelism and hardware-level customization, FPGAs are uniquely capable of ultra-low, deterministic latency [130]. Research has demonstrated that FPGAs can execute complex state estimation algorithms with latency on the order of tens of microseconds. This makes them particularly suitable for fusing high-frequency sensor streams that would saturate standard MCU interrupts. Despite these technical advantages, high unit costs and development complexity currently limit their adoption to premium prototype vehicles.
Hardware-Software Co-design. Ultimately, hardware selection is not an independent decision but is deeply intertwined with model architecture. The emerging trend is co-optimization, where algorithms are tailored to the specific sparsity patterns or quantization schemes supported by the target hardware’s compiler [130]. This integrated approach ensures that theoretical algorithmic gains translate into actual latency reductions in vehicle-grade deployments, avoiding the bottleneck where memory access speed lags behind computation capability.
This hardware hierarchy is also useful for power management. During parking, the BMS typically operates in a low-power mode, so it is rarely practical to keep a high-performance SoC continuously active. Instead, a low-power MCU domain can remain awake to perform sparse monitoring and lightweight screening, while the SoC is activated only when needed, either periodically in a duty-cycled manner or upon simple event triggers, to run short bursts of higher-complexity inference. Yet even with optimal hardware, ensuring timely and reliable warning delivery requires robust communication and integration strategies.
4.4. Communication and System Integration
Beyond on-chip execution, diagnostic outputs must be effectively integrated into the broader vehicle architecture. The timely and reliable transmission of warnings [131] is as critical as their initial detection.
In standard centralized or distributed BMS architectures, high-frequency data from cell monitoring units (CMUs) must traverse bandwidth-limited daisy chains (isoSPI) or localized CAN buses to reach the master controller. Streaming raw waveforms through these legacy channels induces latency that can delay critical cutoff decisions. To mitigate this, edge computing strategies are increasingly adopted: feature extraction occurs locally at the module level, and only compact health flags or compressed feature vectors are transmitted to the master BMS, reducing bus load by orders of magnitude.
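A rough sense of the bus-load saving can be obtained by comparing the size of a raw sample window with the size of a locally extracted feature vector. The sketch below is illustrative: the five features, the 16-bit raw sample width, and the float32 packing format are assumptions made for the example, not a format defined in the source.

```python
import statistics
import struct

RAW_BYTES_PER_SAMPLE = 2   # ASSUMED 16-bit ADC samples


def extract_features(window):
    """Reduce a raw voltage window to five summary statistics."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    return (
        statistics.fmean(window),
        min(window),
        max(window),
        statistics.pstdev(window),
        max(abs(d) for d in diffs),   # largest sample-to-sample step
    )


def pack_features(features):
    """Serialize the feature tuple as five little-endian float32 values."""
    return struct.pack("<5f", *features)


# Hypothetical 1000-sample window of cell voltage readings.
window = [3.70 + 0.001 * (k % 5) for k in range(1000)]
payload = pack_features(extract_features(window))
reduction = (RAW_BYTES_PER_SAMPLE * len(window)) / len(payload)
```

With these assumed sizes, 2000 bytes of raw data shrink to a 20-byte payload, a hundredfold reduction per window; transmitting a single health flag instead would reduce the load further.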
To accommodate data-intensive methodologies, automotive architectures are evolving beyond CAN-FD. Automotive Ethernet is being explored for high-bandwidth backbone communication. Furthermore, wireless BMS (wBMS) has emerged as a promising technology. By replacing heavy copper harnesses with secure RF communication, wBMS reduces pack weight and eliminates wiring failure points, though it introduces new challenges in signal reliability within the electromagnetic noise of an EV pack.
Over-the-Air (OTA) connectivity enables cloud-edge collaboration, where on-board algorithms are periodically updated based on fleet-wide learning. However, this connectivity expands the attack surface. Ensuring the integrity of safety-critical updates requires adherence to ISO/SAE 21434 [132]. Robust authentication protocols, secure boot mechanisms, and encrypted data tunnels are mandatory to prevent malicious tampering that could disable TR warnings or induce thermal hazards.
4.5. Functional Safety and Regulatory Standards
Battery TR detection systems are safety-critical and must therefore comply with stringent functional safety standards.
In the automotive sector, ISO 26262 is the governing standard [133] for mitigating risks caused by system malfunctions. Achieving a high ASIL rating (ASIL C/D) for TR warning often necessitates hardware redundancy and diverse software monitors.
A unique challenge for data-driven diagnostics is that ISO 26262 does not cover performance limitations of non-deterministic algorithms, such as an AI misinterpreting noise as a fault. This gap is addressed by ISO 21448. For black box DL models to be certified, SOTIF requires rigorous validation against unknown unsafe scenarios. This places a premium on Explainable AI (XAI) and often mandates rule-based safety guards, deterministic logic that overrides AI outputs if physical bounds are violated, to ensure a safe state.
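The rule-based safety guard described above can be sketched as a deterministic wrapper around the AI output. This is a minimal illustration under stated assumptions: the 100 °C limit reuses the cutoff example from Section 3.2, while the voltage bounds, the probability threshold, and the names `PhysicalLimits` and `guarded_decision` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLimits:
    max_temp_c: float = 100.0   # cutoff example from Section 3.2
    min_cell_v: float = 2.5     # ASSUMED lower voltage bound
    max_cell_v: float = 4.3     # ASSUMED upper voltage bound


def guarded_decision(ai_fault_prob, temp_c, cell_v,
                     limits=PhysicalLimits(), ai_threshold=0.5):
    """Return 'CUTOFF', 'WARNING', or 'NORMAL'.

    Deterministic physical bounds are checked first and override the AI
    output; the AI score alone can only raise an advisory warning.
    """
    voltage_ok = limits.min_cell_v <= cell_v <= limits.max_cell_v
    if temp_c > limits.max_temp_c or not voltage_ok:
        return "CUTOFF"
    if ai_fault_prob >= ai_threshold:
        return "WARNING"
    return "NORMAL"
```

Because the guard never depends on the learned model to reach the safe state, its behavior can be verified exhaustively, which is the property that certification arguments typically rely on.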
Beyond process safety, regulations define the minimum performance capability. Notably, GB 38031-2020 (China) and UN GTR No. 20 (Global) mandate that the system must provide a warning signal at least 5 min prior to any hazardous event in the passenger compartment. This “5-min rule” sets the rigorous lead time benchmark that all diagnostic algorithms discussed in Section 3 strive to exceed.
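The lead-time requirement can be expressed as a simple acceptance check over a set of abuse tests. The sketch below is illustrative: the trial data structure and trigger names are hypothetical, and the check uses "at least 300 s" as stated in this paragraph.

```python
REQUIRED_LEAD_TIME_S = 300.0   # 5 min, per the rule described in the text


def lead_time_s(t_alarm_s, t_hazard_s):
    """Interval between the alarm and the hazardous event."""
    return t_hazard_s - t_alarm_s


def is_compliant(trials):
    """Check every trigger condition; one short or missing alarm fails all.

    trials maps a trigger name to (t_alarm_s, t_hazard_s); t_alarm_s is
    None when no alarm was raised before the hazard.
    """
    for t_alarm_s, t_hazard_s in trials.values():
        if t_alarm_s is None:
            return False
        if lead_time_s(t_alarm_s, t_hazard_s) < REQUIRED_LEAD_TIME_S:
            return False
    return True
```

For instance, `is_compliant({"overcharge": (100.0, 500.0), "heating": (200.0, 450.0)})` returns `False`, because the second trial provides only 250 s of warning even though the first exceeds the requirement.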
For stationary battery energy storage systems (BESS), UL 9540A serves as the definitive test methodology for evaluating thermal runaway propagation. Unlike automotive standards, which focus on passenger egress, UL 9540A data directly influence the design of active fire suppression systems and separation distances, requiring diagnostic algorithms to trigger well before cell venting evolves into module-level propagation.
To visualize the engineering positioning of these diverse strategies, Figure 3 maps each diagnostic category based on its computational cost and detection sensitivity. The analysis highlights that Hybrid/ROM approaches currently occupy the ‘optimal zone’ for on-board deployment, balancing fidelity with hardware constraints.
Figure 3.
Performance trade-off analysis of diagnostic algorithms for on-board deployment. The bubble chart classifies diagnostic methodologies based on three critical engineering metrics: (1) X-axis: Computational cost and inference latency (Low to High); (2) Y-axis: Detection sensitivity and early-warning lead time (Low to High); (3) Bubble Size: Generalization ability and robustness against aging. Threshold-based methods offer minimal cost but limited sensitivity. Deep Learning provides high sensitivity but demands significant resources. The green zone highlights Hybrid models and Reduced-Order Models as the optimal engineering compromise.
4.6. Field-Deployed Case Studies
Industrial implementation follows a staged evolution path. Current State (Gen 1): OEMs rely on enhanced firmware logic running on safety MCUs, offering high robustness but limited lead time. Emerging Phase (Gen 2): Pilot programs in stationary storage are validating advanced sensing, though cost prevents automotive adoption. Future Architecture (Gen 3): The target state is Hybrid Sensor-AI Fusion, running on heterogeneous dual-chip platforms (see Figure 4). Prototypes integrating gas sensors with compressed CNNs have demonstrated lead time extensions from seconds to minutes [134]. However, mass adoption awaits the standardization of validation protocols to certify these probabilistic systems against functional safety mandates [63,135].
Figure 4.
Hierarchical system architecture for on-board thermal runaway detection. The diagram illustrates the data flow and hardware allocation across three levels: (1) Edge Level: Sensors capture multi-modal data. To overcome bandwidth bottlenecks, the Analog Front End (AFE)/Cell Supervision Circuit (CSC) performs preliminary feature extraction. (2) Pack Level (Dual-Chip): A heterogeneous architecture combines a high-performance SoC (ASIL-B) for running complex AI models with a safety-critical MCU (ASIL-D). A safety guard mechanism ensures that AI outputs are cross-checked against physical rules. (3) Cloud Level: Digital twins leverage historical fleet data to retrain models, pushing optimized parameters back to the vehicle via OTA updates.
5. Challenges, Future Directions, and Conclusions
5.1. Hardware Frontiers: From Sensor Ruggedization to Novel Chemistries
The physical layer of TR detection faces a dual challenge: ensuring longevity for current technologies while adapting to next-generation energy storage.
First, sensor reliability remains a key bottleneck for long-term on-board deployment. Unlike laboratory instrumentation, automotive sensors must endure a 15-year lifecycle involving extreme thermal shock and vibration. A critical failure mode is sensor drift. Gas and acoustic sensors often lose calibration accuracy over time due to electrolyte exposure or mechanical fatigue, leading to an unacceptable rise in FAR as the vehicle ages [15,136]. Future packaging must evolve toward non-intrusive retrofitting, such as embedding sensors into structural components like busbars, equipped with self-calibration algorithms to mitigate drift.
Beyond these immediate drift mechanisms, battery aging introduces a slow but systematic shift in the cell baseline. Capacity fade and resistance growth can alter the nominal voltage/temperature/impedance signatures over the service life, gradually eroding the validity of fixed thresholds and biasing residual-based diagnostics if model parameters are not updated. In practice, long-term robustness benefits from making decision logic SOH-aware and from incorporating maintenance-friendly recalibration mechanisms, so that diagnostic sensitivity does not degrade as the pack ages.
Simultaneously, the industry must prepare for emerging chemistries. Solid-state batteries (SSBs) and sodium-ion systems introduce unique failure modes that render legacy strategies obsolete [137,138,139]. For instance, SSBs lack the characteristic early-stage off-gassing sequence of liquid electrolytes, potentially invalidating current gas detection architectures. Consequently, future sensing frameworks must be chemistry-agnostic, shifting towards mechanical or active ultrasonic probing to capture the distinct physical signatures of solid-state failure [140,141].
5.2. The Software Ecosystem: Data Standardization and Cloud Collaboration
On the algorithmic front, reproducibility is limited by fragmented proprietary datasets and heterogeneous test protocols. The absence of standardized validation protocols and benchmarking datasets further complicates objective comparison among TR detection algorithms. Existing studies often rely on differing experimental setups, proprietary abuse tests, or narrowly defined operating conditions, making reported performance metrics difficult to reproduce or generalize. As a result, it remains unclear whether observed gains stem from algorithmic advances or from dataset-specific biases and evaluation choices. Establishing shared benchmarks with clearly defined fault scenarios, aging states, and false-alarm constraints is therefore essential for credible algorithm assessment and technology transfer. In this context, open-access initiatives such as the Battery Data Genome provide an important foundation for validating algorithms against realistic long-tail corner cases before vehicle integration [142].
Furthermore, overcoming the computational constraints of on-board MCUs calls for a shift toward cloud-edge collaboration. While the on-board BMS (edge) executes lightweight, deterministic safety guards, a digital twin (cloud) can leverage high-fidelity physics models to track long-term aging parameters. This collaboration enables adaptive diagnostics: the cloud periodically pushes updated thresholds to the vehicle via over-the-air (OTA) updates, so that sensitivity does not degrade as the battery ages. Achieving this vision requires robust cybersecurity frameworks (ISO/SAE 21434) to protect the data pipeline [135].
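One way to keep the edge-side guard deterministic while still accepting cloud-computed thresholds is to validate every proposal against a pre-approved envelope before applying it. The sketch below is a minimal illustration under that assumption; the `ThresholdBounds` fields, the function name, and the numerical limits are hypothetical, and message authentication (for example, signature checks on the OTA payload) is deliberately left out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdBounds:
    min_v: float       # lowest threshold the edge will accept, in volts
    max_v: float       # highest threshold the edge will accept, in volts
    max_step_v: float  # largest change accepted in a single update, in volts


def apply_cloud_update(current_v: float, proposed_v: float,
                       bounds: ThresholdBounds) -> float:
    """Return the threshold the edge should use after a cloud proposal."""
    # Proposals outside the approved envelope are rejected outright.
    if not (bounds.min_v <= proposed_v <= bounds.max_v):
        return current_v
    # In-envelope proposals are rate-limited so one update cannot shift behavior abruptly.
    step = proposed_v - current_v
    step = max(-bounds.max_step_v, min(bounds.max_step_v, step))
    return current_v + step


bounds = ThresholdBounds(min_v=0.03, max_v=0.12, max_step_v=0.02)
small = apply_cloud_update(0.05, 0.06, bounds)     # accepted as proposed
limited = apply_cloud_update(0.05, 0.10, bounds)   # accepted, but limited to one step
rejected = apply_cloud_update(0.05, 0.50, bounds)  # outside the envelope, ignored
print(small, limited, rejected)
```

With this arrangement the cloud can adapt sensitivity to aging, while the worst-case behavior of the on-board guard remains bounded by limits fixed at design time.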
5.3. Conclusions
This review has systematically evaluated the transition of TR detection from laboratory concepts to on-board implementations. We highlight the engineering trade-offs inherent in precursor sensing, demonstrating that while acoustic and gas signals offer earlier warning capabilities, their implementation is strongly constrained by environmental interference and packaging limitations. Meanwhile, electrical and thermal measurements that are routinely available in standard BMS architectures remain essential reference signals for robust decisions. Moreover, our assessment indicates that no single technique is sufficient for robust TR detection across operating conditions. In practice, threshold-based logic remains the safety baseline, while data-driven approaches can be more sensitive to weak precursors but require validation under representative operating conditions.
Thermal runaway detection has evolved from simple temperature monitoring to a multi-disciplinary field integrating advanced sensing, electrochemical modeling, and artificial intelligence. Effective on-board implementation is not merely an algorithmic challenge but a systems engineering optimization.
We therefore emphasize three priorities for practical deployment: (1) Multi-modal Fusion: reducing single-signal dependence by combining fast acoustic or gas cues with the reliability of electrical and thermal references in a layered architecture. (2) Hardware-Aware AI: co-designing diagnostic models with automotive computing platforms, such as dual-chip MCU and SoC architectures, to meet real-time and ASIL-related constraints. (3) Lifecycle Robustness: maintaining low FAR over the full vehicle lifetime, rather than optimizing performance only on fresh cells.
Ultimately, progress in physics-informed AI, standardized benchmarking, and cloud-edge collaboration can help translate laboratory prototypes into reliable, deployable safety functions for electric vehicles.
Funding
This research was funded by Hunan Provincial Natural Science Foundation, grant number 2024JJ4072.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
References
- Tarascon, J.M.; Armand, M. Issues and Challenges Facing Rechargeable Lithium Batteries. Nature 2001, 414, 359–367.
- Xue, N.; Du, W.; Greszler, T.A.; Shyy, W.; Martins, J.R.R.A. Design of a Lithium-Ion Battery Pack for PHEV Using a Hybrid Optimization Method. Appl. Energy 2014, 115, 591–602.
- Sun, P.; Bisschop, R.; Niu, H.; Huang, X. A review of battery fires in electric vehicles. Fire Technol. 2020, 56, 1361–1410.
- Gardner, D.W.; Charles, G.; Nguyen, T.; Javey, A.; Fahad, H.M. Mitigating lithium-ion cell thermal runaway via selective trace H2 sensing. Cell Rep. Phys. Sci. 2025, 6, 102859.
- Zhu, Y.; Shang, Y.; Gu, X.; Tao, X.; Li, X.; Fu, X.; Cheng, Z. Multi-level early warning of thermal runaway based on internal pressure-temperature fusion for lithium-ion batteries. Green Energy Intell. Transp. 2025, 100368.
- Liu, B.; Jia, Y.; Yuan, C.; Wang, L.; Gao, X.; Yin, S.; Xu, J. Safety Issues and Mechanisms of Lithium-Ion Battery Cell upon Mechanical Abusive Loading: A Review. Energy Storage Mater. 2020, 24, 85–112.
- Feng, X.; Ouyang, M.; Liu, X.; Lu, L.; Xia, Y.; He, X. Thermal Runaway Mechanism of Lithium Ion Battery for Electric Vehicles: A Review. Energy Storage Mater. 2018, 10, 246–267.
- Golubkov, A.W.; Fuchs, D.; Wagner, J.; Wiltsche, H.; Stangl, C.; Fauler, G.; Voitic, G.; Thaler, A.; Hacker, V. Thermal-runaway experiments on consumer Li-ion batteries with metal-oxide and olivin-type cathodes. RSC Adv. 2014, 4, 3633–3642.
- Finegan, D.P.; Scheel, M.; Robinson, J.B.; Tjaden, B.; Hunt, I.; Mason, T.J.; Millichamp, J.; Di Michiel, M.; Offer, G.J.; Hinds, G.; et al. In-operando high-speed tomography of lithium-ion batteries during thermal runaway. Nat. Commun. 2015, 6, 6924.
- Wang, Z.; Zhu, L.; Liu, J.; Wang, J.; Yan, W. Gas Sensing Technology for the Detection and Early Warning of Battery Thermal Runaway: A Review. Energy Fuels 2022, 36, 6038–6057.
- Wang, X.X.; Li, Q.T.; Zhou, X.Y.; Hu, Y.M.; Guo, X. Monitoring Thermal Runaway of Lithium-Ion Batteries by Means of Gas Sensors. Sens. Actuators B Chem. 2024, 411, 135703.
- Lee, H.; Seo, Y.H.; Ma, P.S. Advanced ultrasonic detection of lithium-ion battery thermal runaway under various heating powers. Appl. Energy 2025, 396, 126328.
- Yin, S.; Liu, J.; Cong, B. Review of Thermal Runaway Monitoring, Warning and Protection Technologies for Lithium-Ion Batteries. Processes 2023, 11, 2345.
- Liu, K.; Zhao, S.; Wang, Y.; Li, K.; Wang, J.; Sun, Y.; Wu, Q.; Peng, Q. Advanced fault diagnosis in batteries: Insights into fault mechanisms, sensor fusion, and artificial intelligence. Adv. Appl. Energy 2025, 20, 100247.
- Mohammadi Moradian, J.; Ali, A.; Yan, X.; Pei, G.; Zhang, S.; Naveed, A.; Shehzad, K.; Shahnavaz, Z.; Ahmad, F.; Yousaf, B. Sensors Innovations for Smart Lithium-Based Batteries: Advancements, Opportunities, and Potential Challenges. Nano Micro Lett. 2025, 17, 279.
- ISO 16750-1:2023; Road vehicles–Environmental conditions and testing for electrical and electronic equipment–Part 1: General. International Organization for Standardization: Geneva, Switzerland, 2023.
- Cheng, A.; Xin, Y.; Wu, H.; Yang, L.; Deng, B. A Review of Sensor Applications in Electric Vehicle Thermal Management Systems. Energies 2023, 16, 5139.
- Larsson, F.; Bertilsson, S.; Furlani, M.; Albinsson, I.; Mellander, B.E. Gas explosions and thermal runaways during external heating abuse of commercial lithium-ion graphite-LiCoO2 cells at different levels of ageing. J. Power Sources 2018, 373, 220–231.
- Fernandes, Y.; Bry, A.; De Persis, S. Identification and Quantification of Gases Emitted during Abuse Tests by Overcharge of a Commercial Li-Ion Battery. J. Power Sources 2018, 389, 106–119.
- Appleberry, M.C.; Kowalski, J.A.; Africk, S.A.; Mitchell, J.; Ferree, T.C.; Chang, V.; Parekh, V.; Xu, Z.; Ye, Z.; Whitacre, J.F.; et al. Avoiding Thermal Runaway in Lithium-Ion Batteries Using Ultrasound Detection of Early Failure Mechanisms. J. Power Sources 2022, 535, 231423.
- Tadoum, D.D.; Berger, F.; Krause, F.; Wasylowski, D.; Ringbeck, F.; Li, W.; Sauer, D.U. Standards and Regulations for Battery Management Systems in Germany: Review and Improvement Potentials. Glob. Chall. 2025, 9, e00129.
- Berger, F.; Joest, D.; Barbers, E.; Quade, K.; Wu, Z.; Sauer, D.U.; Dechent, P. Benchmarking battery management system algorithms-Requirements, scenarios and validation for automotive applications. eTransportation 2024, 22, 100355.
- Shadman Rad, M.; Danilov, D.; Baghalha, M.; Kazemeini, M.; Notten, P. Adaptive Thermal Modeling of Li-Ion Batteries. Electrochim. Acta 2013, 102, 183–195.
- Bajzek, T. Thermocouples: A Sensor for Measuring Temperature. IEEE Instrum. Meas. Mag. 2005, 8, 35–40.
- Childs, P.R.; Greenwood, J.; Long, C. Review of temperature measurement. Rev. Sci. Instrum. 2000, 71, 2959–2978.
- Duff, M.; Towey, J. Two ways to measure temperature using thermocouples feature simplicity, accuracy, and flexibility. Analog Dialogue 2010, 44, 1–6.
- Zheng, Y.; Che, Y.; Hu, X.; Sui, X.; Stroe, D.I.; Teodorescu, R. Thermal State Monitoring of Lithium-Ion Batteries: Progress, Challenges, and Opportunities. Prog. Energy Combust. Sci. 2024, 100, 101120.
- Forgez, C.; Vinh Do, D.; Friedrich, G.; Morcrette, M.; Delacourt, C. Thermal Modeling of a Cylindrical LiFePO4/Graphite Lithium-Ion Battery. J. Power Sources 2010, 195, 2961–2968.
- Richardson, R.R.; Howey, D.A. Sensorless Battery Internal Temperature Estimation Using a Kalman Filter with Impedance Measurement. IEEE Trans. Sustain. Energy 2015, 6, 1190–1199.
- Yang, G.; Leitão, C.; Li, Y.; Pinto, J.; Jiang, X. Real-Time Temperature Measurement with Fiber Bragg Sensors in Lithium Batteries for Safety Usage. Measurement 2013, 46, 3166–3172.
- Meyer, J.; Nedjalkov, A.; Doering, A.; Angelmahr, M.; Schade, W. Fiber Optical Sensors for Enhanced Battery Safety. In Proceedings of the Fiber Optic Sensors and Applications XII; Pickrell, G., Udd, E., Du, H.H., Eds.; SPIE: Bellingham, WA, USA, 2015; Volume 9480, p. 94800Z.
- Sommer, L.W.; Raghavan, A.; Kiesel, P.; Saha, B.; Schwartz, J.; Lochbaum, A.; Ganguli, A.; Bae, C.J.; Alamgir, M. Monitoring of Intercalation Stages in Lithium-Ion Cells over Charge-Discharge Cycles with Fiber Optic Sensors. J. Electrochem. Soc. 2015, 162, A2664.
- Sommer, L.W.; Kiesel, P.; Ganguli, A.; Lochbaum, A.; Saha, B.; Schwartz, J.; Bae, C.J.; Alamgir, M.; Raghavan, A. Fast and Slow Ion Diffusion Processes in Lithium Ion Pouch Cells during Cycling Observed with Fiber Optic Strain Sensors. J. Power Sources 2015, 296, 46–52.
- Kersey, A.; Davis, M.; Patrick, H.; LeBlanc, M.; Koo, K.; Askins, C.; Putnam, M.; Friebele, E. Fiber Grating Sensors. J. Light. Technol. 1997, 15, 1442–1463.
- Huang, J.; Albero Blanquer, L.; Bonefacino, J.; Logan, E.R.; Alves Dalla Corte, D.; Delacourt, C.; Gallant, B.M.; Boles, S.T.; Dahn, J.R.; Tam, H.Y.; et al. Operando Decoding of Chemical and Thermal Events in Commercial Na(Li)-Ion Cells via Optical Sensors. Nat. Energy 2020, 5, 674–683.
- Louli, A.J.; Ellis, L.D.; Dahn, J.R. Operando Pressure Measurements Reveal Solid Electrolyte Interphase Growth to Rank Li-Ion Cell Performance. Joule 2019, 3, 745–761.
- Cheng, X.; Pecht, M. In Situ Stress Measurement Techniques on Li-Ion Battery Electrodes: A Review. Energies 2017, 10, 591.
- Zhang, B.; Wu, N.; Zhao, X.-q.; Ding, Z.-d.; Zhu, J. Thermal Out-of-Control Monitoring System for Power Batteries Based on Infrared Thermal Imaging Technology. Chin. Battery Ind. 2019, 23, 171–175, 185.
- Yiting, S.; Mengran, Z.; Qiang, H.; Yong, M.; Chao, W.; Yang, J. Thermal propagation process between the pouch and aluminum LFP battery under the condition of overcharge. Electr. Power Eng. Technol. 2020, 39, 191.
- Wang, S.; Li, K.; Tian, Y.; Wang, J.; Wu, Y.; Ji, S. Infrared Imaging Investigation of Temperature Fluctuation and Spatial Distribution for a Large Laminated Lithium–Ion Power Battery. Appl. Therm. Eng. 2019, 152, 204–214.
- Giuliano, M.R.; Advani, S.G.; Prasad, A.K. Thermal Analysis and Management of Lithium–Titanate Batteries. J. Power Sources 2011, 196, 6517–6524.
- Fan, J.; Liu, C.; Li, N.; Yang, L.; Yang, X.G.; Dou, B.; Hou, S.; Feng, X.; Jiang, H.; Li, H.; et al. Wireless Transmission of Internal Hazard Signals in Li-Ion Batteries. Nature 2025, 641, 639–645.
- Wang, W.; Zou, J.; Tang, C.; Sun, J.; Zhang, W.; Zhang, X.; Gao, W.; Wang, Y.; Jin, Q.; Jian, J. Design and Batch Preparation of a High-Performance Temperature Sensor for New Energy Vehicles Using Platinum Film. IEEE Sens. J. 2023, 23, 13909–13916.
- Liu, B.; Jia, Y.; Li, J.; Yin, S.; Yuan, C.; Hu, Z.; Wang, L.; Li, Y.; Xu, J. Safety Issues Caused by Internal Short Circuits in Lithium-Ion Batteries. J. Mater. Chem. A 2018, 6, 21475–21484.
- International Energy Agency. Global EV Outlook 2025; Technical Report; International Energy Agency: Paris, France, 2025.
- Xiong, R.; Sun, W.; Yu, Q.; Sun, F. Research Progress, Challenges and Prospects of Fault Diagnosis on Battery System of Electric Vehicles. Appl. Energy 2020, 279, 115855.
- Han, D.; Wang, J.; Yin, C.; Zhao, Y. Advances in Early Warning of Thermal Runaway in Lithium-Ion Battery Energy Storage Systems. Adv. Sens. Res. 2025, 4, 2400165.
- Texas Instruments. Wired vs. Wireless Communications in EV Battery Management (SLYY197, Rev. A); Technical Report; Texas Instruments: Dallas, TX, USA, 2020.
- Zhao, J.; Feng, X.; Tran, M.K.; Fowler, M.; Ouyang, M.; Burke, A.F. Battery Safety: Fault Diagnosis from Laboratory to Real World. J. Power Sources 2024, 598, 234111.
- Hu, J.; Bian, X.; Wei, Z.; Li, J.; He, H. Residual Statistics-Based Current Sensor Fault Diagnosis for Smart Battery Management. IEEE J. Emerg. Sel. Top. Power Electron. 2022, 10, 2435–2444.
- Wang, X.; Wei, X.; Zhu, J.; Dai, H.; Zheng, Y.; Xu, X.; Chen, Q. A Review of Modeling, Acquisition, and Application of Lithium-Ion Battery Impedance for Onboard Battery Management. eTransportation 2021, 7, 100093.
- Li, D.; Wang, L.; Duan, C.; Li, Q.; Wang, K. Temperature Prediction of Lithium-Ion Batteries Based on Electrochemical Impedance Spectrum: A Review. Int. J. Energy Res. 2022, 46, 10372–10388.
- Wei, X.; Wang, X.; Dai, H. Practical On-Board Measurement of Lithium Ion Battery Impedance Based on Distributed Voltage and Current Sampling. Energies 2018, 11, 64.
- NXP Semiconductors. AN12728: CAN with Flexible Data-Rate; Technical Report; NXP Semiconductors: Eindhoven, The Netherlands, 2020.
- Xu, Y.; Ge, X.; Guo, R.; Shen, W. Recent Advances in Model-Based Fault Diagnosis for Lithium-Ion Batteries: A Comprehensive Review. Renew. Sustain. Energy Rev. 2025, 207, 114922.
- Koch, S.; Birke, K.P.; Kuhn, R. Fast Thermal Runaway Detection for Lithium-Ion Cells in Large Scale Traction Batteries. Batteries 2018, 4, 16.
- Zhang, Y.; Song, J.; He, L.; Deng, X.; Zhao, Z.; Wu, L. Performance study of Tesla valve-based direct cooling thermal management system for batteries. Appl. Therm. Eng. 2025, 278, 127207.
- Essl, C.; Seifert, L.; Rabe, M.; Fuchs, A. Early Detection of Failing Automotive Batteries Using Gas Sensors. Batteries 2021, 7, 25.
- Koch, S.; Fill, A.; Birke, K.P. Comprehensive Gas Analysis on Large Scale Automotive Lithium-Ion Cells in Thermal Runaway. J. Power Sources 2018, 398, 106–112.
- Torres-Castro, L.; Bates, A.M.; Johnson, N.B.; Quintana, G.; Gray, L. Early Detection of Li-Ion Battery Thermal Runaway Using Commercial Diagnostic Technologies. J. Electrochem. Soc. 2024, 171, 020520.
- Gulsoy, B.; Vincent, T.; Sansom, J.; Marco, J. In-Situ Temperature Monitoring of a Lithium-Ion Battery Using an Embedded Thermocouple for Smart Battery Applications. J. Energy Storage 2022, 54, 105260.
- Xu, M.; Xu, Y.; Tao, J.; Wen, L.; Zheng, C.; Yu, Z.; He, S. Development of a Compact NDIR CO2 Gas Sensor for Harsh Environments. Infrared Phys. Technol. 2024, 136, 105035.
- Cai, T.; Valecha, P.; Tran, V.; Engle, B.; Stefanopoulou, A.; Siegel, J. Detection of Li-Ion Battery Failure and Venting with Carbon Dioxide Sensors. eTransportation 2021, 7, 100100.
- Mei, W.; Liu, Z.; Wang, C.; Wu, C.; Liu, Y.; Liu, P.; Xia, X.; Xue, X.; Han, X.; Sun, J.; et al. Operando Monitoring of Thermal Runaway in Commercial Lithium-Ion Cells via Advanced Lab-on-Fiber Technologies. Nat. Commun. 2023, 14, 5251.
- Jin, Y.; Zheng, Z.; Wei, D.; Jiang, X.; Lu, H.; Sun, L.; Tao, F.; Guo, D.; Liu, Y.; Gao, J.; et al. Detection of Micro-Scale Li Dendrite via H2 Gas Capture for Early Safety Warning. Joule 2020, 4, 1714–1729.
- Pan, Y.; Xu, K.; Wang, R.; Wang, H.; Chen, G.; Wang, K. Lithium-Ion Battery Condition Monitoring: A Frontier in Acoustic Sensing Technology. Energies 2025, 18, 1068.
- Wang, Y.; Lai, X.; Chen, Q.; Han, X.; Lu, L.; Ouyang, M.; Zheng, Y. Progress and Challenges in Ultrasonic Technology for State Estimation and Defect Detection of Lithium-Ion Batteries. Energy Storage Mater. 2024, 69, 103430.
- He, Y.; Zeng, Q.; Tang, L.; Liu, F.; Li, Q.; Yin, Y.; Xu, S.; Deng, B. State of health estimation of lithium-ion battery based on full life cycle acoustic emission signals. J. Energy Storage 2025, 139, 118725.
- Yu, Q.; Wang, C.; Li, J.; Xiong, R.; Pecht, M. Challenges and Outlook for Lithium-Ion Battery Fault Diagnosis Methods from the Laboratory to Real World Applications. eTransportation 2023, 17, 100254.
- Ramos, I.E.; Coric, A.; Su, B.; Zhao, Q.; Eriksson, L.; Krysander, M.; Tidblad, A.A.; Zhang, L. Online acoustic emission sensing of rechargeable batteries: Technology, status, and prospects. J. Mater. Chem. A 2024, 12, 23280–23296.
- Li, X.; Wu, C.; Fu, C.; Zheng, S.; Tian, J. State characterization of lithium-ion battery based on ultrasonic guided wave scanning. Energies 2022, 15, 6027.
- Zhang, K.; Yin, J.; He, Y. Acoustic Emission Detection and Analysis Method for Health Status of Lithium Ion Batteries. Sensors 2021, 21, 712.
- Wang, Z.; Zhao, X.; Zhang, H.; Zhen, D.; Gu, F.; Ball, A. Active acoustic emission sensing for fast co-estimation of state of charge and state of health of the lithium-ion battery. J. Energy Storage 2023, 64, 107192.
- Ferrario, F.; Hildebrand, S.; da Costa Barata, R.; Lazareanu, M.; Lebedeva, N.; Busini, V. Simulation of Li-ion battery electrolyte vapour dispersion in an enclosed and quiescent environment: An experimental and computational fluid dynamics approach. J. Energy Storage 2025, 137, 118495.
- Tang, C.; Yuan, Z.; Liu, G.; Jiang, S.; Hao, W. Acoustic emission analysis of 18650 lithium-ion battery under bending based on factor analysis and the fuzzy clustering method. Eng. Fail. Anal. 2020, 117, 104800.
- Gao, J.; Lyu, Y.; Chen, H.; Song, W.; Liu, H.; Wu, B.; He, C. Guided waves propagation in lithium-ion batteries: Theoretical modeling and experimental analysis. NDT E Int. 2024, 145, 103102.
- Song, J.; Hilmert, D.; Kiel, F. Mechanisms of failure and state analysis of electrical connectors in automobiles. Eng. Fail. Anal. 2025, 173, 109427.
- Gu, X.; Shang, Y.; Li, J.; Zhu, Y.; Tao, X.; Geng, H.; Zhang, Z.; Zhang, C. Early Warning of Thermal Runaway Based on State of Safety for Lithium-Ion Batteries. Commun. Eng. 2025, 4, 106.
- Analog Devices. LTC6811-1/LTC6811-2: Multicell Battery Monitor Datasheet. 2019. Available online: https://www.analog.com/media/en/technical-documentation/data-sheets/ltc6811-1-6811-2.pdf (accessed on 26 January 2026).
- GB 38031-2025; Electric Vehicles Traction Battery Safety Requirements. State Administration for Market Regulation and Standardization Administration of China: Beijing, China, 2025.
- United Nations Economic Commission for Europe. UN Global Technical Regulation No. 20; Global Technical Regulation on Electric Vehicle Safety; United Nations Economic Commission for Europe: Geneva, Switzerland, 2018.
- Hong, H.S.; Quang, L.T.; Van Giang, P.; Binh, D.T.; Thuan, N.D.; Ha, N.H. A real-time framework for early detection and severity prediction of thermal runaway in Li-ion batteries. J. Energy Storage 2025, 135, 118310.
- Li, K.; Chen, L.; Han, X.; Gao, X.; Lu, Y.; Wang, D.; Tang, S.; Zhang, W.; Wu, W.; Cao, Y.c.; et al. Early warning for thermal runaway in lithium-ion batteries during various charging rates: Insights from expansion force analysis. J. Clean. Prod. 2024, 457, 142422.
- Ouyang, M.; Zhang, M.; Feng, X.; Lu, L.; Li, J.; He, X.; Zheng, Y. Internal short circuit detection for battery pack using equivalent parameter and consistency method. J. Power Sources 2015, 294, 272–283.
- AUTOSAR. Recommended Methods and Practices for Timing Analysis and Design within the AUTOSAR Development Process; Technical Report; AUTOSAR CP R19-11; AUTOSAR: Munich, Germany, 2019.
- Jiang, L.; Deng, Z.; Tang, X.; Hu, L.; Lin, X.; Hu, X. Data-Driven Fault Diagnosis and Thermal Runaway Warning for Battery Packs Using Real-World Vehicle Data. Energy 2021, 234, 121266.
- Wang, Q.; Ping, P.; Zhao, X.; Chu, G.; Sun, J.; Chen, C. Thermal Runaway Caused Fire and Explosion of Lithium Ion Battery. J. Power Sources 2012, 208, 210–224.
- Widodo, A.; Shim, M.C.; Caesarendra, W.; Yang, B.S. Intelligent Prognostics for Battery Health Monitoring Based on Sample Entropy. Expert Syst. Appl. 2011, 38, 11763–11769.
- Shang, Y.; Lu, G.; Kang, Y.; Zhou, Z.; Duan, B.; Zhang, C. A multi-fault diagnosis method based on modified Sample Entropy for lithium-ion battery strings. J. Power Sources 2020, 446, 227275.
- Long, T.; Guo, Y. Research on the temperature radius stratification model based on electrochemical-thermal-force coupling in Lithium-ion batteries. Electrochem. Commun. 2025, 180, 108052.
- Feng, X.; Weng, C.; Ouyang, M.; Sun, J. Online Internal Short Circuit Detection for a Large Format Lithium Ion Battery. Appl. Energy 2016, 161, 168–180.
- Tran, M.K.; Mevawalla, A.; Aziz, A.; Panchal, S.; Xie, Y.; Fowler, M. A review of lithium-ion battery thermal runaway modeling and diagnosis approaches. Processes 2022, 10, 1192.
- Chen, Z.; Xiong, R.; Tian, J.; Shang, X.; Lu, J. Model-Based Fault Diagnosis Approach on External Short Circuit of Lithium-Ion Battery Used in Electric Vehicles. Appl. Energy 2016, 184, 365–374.
- Newman, J.; Tiedemann, W. Porous-Electrode Theory with Battery Applications. AIChE J. 1975, 21, 25–41.
- Bizeray, A.M.; Zhao, S.; Duncan, S.R.; Howey, D.A. Lithium-ion battery thermal-electrochemical model-based state estimation using orthogonal collocation and a modified extended Kalman filter. J. Power Sources 2015, 296, 400–412.
- Zhang, Q.; Wang, D.; Yang, B.; Cui, X.; Li, X. Electrochemical Model of Lithium-Ion Battery for Wide Frequency Range Applications. Electrochim. Acta 2020, 343, 136094.
- Ren, D.; Feng, X.; Lu, L.; Ouyang, M.; Zheng, S.; Li, J.; He, X. An Electrochemical-Thermal Coupled Overcharge-to-Thermal-Runaway Model for Lithium Ion Battery. J. Power Sources 2017, 364, 328–340.
- Doyle, M.; Fuller, T.F.; Newman, J. Modeling of Galvanostatic Charge and Discharge of the Lithium/Polymer/Insertion Cell. J. Electrochem. Soc. 1993, 140, 1526.
- Tamilselvi, S.; Gunasundari, S.; Karuppiah, N.; Razak RK, A.; Madhusudan, S.; Nagarajan, V.M.; Sathish, T.; Shamim, M.Z.M.; Saleel, C.A.; Afzal, A. A review on battery modelling techniques. Sustainability 2021, 13, 10042.
- Hao, W.; Guo, F.; Li, J.; Xie, J. Influence of Physical Parameters on Lithium Dendrite Growth Based on Phase Field Theory. Metals 2025, 16, 41.
- Han, S.; Tang, Y.; Khaleghi Rahimian, S. A Numerically Efficient Method of Solving the Full-Order Pseudo-2-Dimensional (P2D) Li-Ion Cell Model. J. Power Sources 2021, 490, 229571.
- Deng, Z.; Yang, L.; Deng, H.; Cai, Y.; Li, D. Polynomial Approximation Pseudo-Two-Dimensional Battery Model for Online Application in Embedded Battery Management System. Energy 2018, 142, 838–850.
- Lin, X.; Perez, H.E.; Siegel, J.B.; Stefanopoulou, A.G.; Li, Y.; Anderson, R.D.; Ding, Y.; Castanier, M.P. Online Parameterization of Lumped Thermal Dynamics in Cylindrical Lithium Ion Batteries for Core Temperature Estimation and Health Monitoring. IEEE Trans. Control Syst. Technol. 2013, 21, 1745–1755.
- Samanta, A.; Chowdhuri, S.; Williamson, S.S. Machine learning-based data-driven fault detection/diagnosis of lithium-ion battery: A critical review. Electronics 2021, 10, 1309.
- Yao, L.; Fang, Z.; Xiao, Y.; Hou, J.; Fu, Z. An Intelligent Fault Diagnosis Method for Lithium Battery Systems Based on Grid Search Support Vector Machine. Energy 2021, 214, 118866.
- Wang, Z.; Song, C.; Zhang, L.; Zhao, Y.; Liu, P.; Dorrell, D.G. A Data-Driven Method for Battery Charging Capacity Abnormality Diagnosis in Electric Vehicle Applications. IEEE Trans. Transp. Electrif. 2022, 8, 990–999.
- Zhao, D.; Zhou, Z.; Tang, S.; Cao, Y.; Wang, J.; Zhang, P.; Zhang, Y. Online Estimation of Satellite Lithium-Ion Battery Capacity Based on Approximate Belief Rule Base and Hidden Markov Model. Energy 2022, 256, 124632.
- Hong, J.; Wang, Z.; Yao, Y. Fault Prognosis of Battery System Based on Accurate Voltage Abnormity Prognosis Using Long Short-Term Memory Neural Networks. Appl. Energy 2019, 251, 113381.
- Liu, X.; Wu, L.; Guo, X.; Andriukaitis, D.; Królczyk, G.; Li, Z. A Novel Approach for Surface Defect Detection of Lithium Battery Based on Improved K-Nearest Neighbor and Euclidean Clustering Segmentation. Int. J. Adv. Manuf. Technol. 2023, 127, 971–985.
- Sun, Z.; Wang, Z.; Liu, P.; Qin, Z.; Chen, Y.; Han, Y.; Wang, P.; Bauer, P. An Online Data-Driven Fault Diagnosis and Thermal Runaway Early Warning for Electric Vehicle Batteries. IEEE Trans. Power Electron. 2022, 37, 12636–12646.
- Tian, J.; Xiong, R.; Shen, W.; Lu, J. Data-Driven Battery Degradation Prediction: Forecasting Voltage-Capacity Curves Using One-Cycle Data. Ecomat 2022, 4, e12213.
- Wei, Z.; Li, P.; Cao, W.; Chen, H.; Wang, W.; Yu, Y.; He, H. Machine Learning-Based Hybrid Thermal Modeling and Diagnostic for Lithium-Ion Battery Enabled by Embedded Sensing. Appl. Therm. Eng. 2022, 216, 119059.
- ISO 21448:2022; Road Vehicles—Safety of the Intended Functionality. International Organization for Standardization: Geneva, Switzerland, 2022.
- Dong, C.; Sun, D. Multi-source domain transfer learning with small sample learning for thermal runaway diagnosis of lithium-ion battery. Appl. Energy 2024, 365, 123248.
- Finegan, D.P.; Zhu, J.; Feng, X.; Keyser, M.; Ulmefors, M.; Li, W.; Bazant, M.Z.; Cooper, S.J. The Application of Data-Driven Methods and Physics-Based Learning for Improving Battery Safety. Joule 2021, 5, 316–329.
- Li, D.; Zhang, Z.; Liu, P.; Wang, Z.; Zhang, L. Battery Fault Diagnosis for Electric Vehicles Based on Voltage Abnormality by Combining the Long Short-Term Memory Neural Network and the Equivalent Circuit Model. IEEE Trans. Power Electron. 2021, 36, 1303–1315.
- Xu, C.; Li, L.; Xu, Y.; Han, X.; Zheng, Y. A vehicle-cloud collaborative method for multi-type fault diagnosis of lithium-ion batteries. eTransportation 2022, 12, 100172.
- Banbury, C.; Reddi, V.J.; Torelli, P.; Holleman, J.; Jeffries, N.; Kiraly, C.; Montino, P.; Kanter, D.; Ahmed, S.; Pau, D.; et al. MLPerf Tiny benchmark. arXiv 2021, arXiv:2106.07597.
- David, R.; Duke, J.; Jain, A.; Janapa Reddi, V.; Jeffries, N.; Li, J.; Kreeger, N.; Nappier, I.; Natraj, M.; Wang, T.; et al. TensorFlow Lite Micro: Embedded machine learning for TinyML systems. Proc. Mach. Learn. Syst. 2021, 3, 800–811.
- Lai, L.; Suda, N.; Chandra, V. CMSIS-NN: Efficient neural network kernels for Arm Cortex-M CPUs. arXiv 2018, arXiv:1801.06601.
- Jeevarajan, J.A.; Joshi, T.; Parhizi, M.; Rauhala, T.; Juarez-Robles, D. Battery hazards for large energy storage systems. ACS Energy Lett. 2022, 7, 2725–2733.
- Shahid, S.; Agelin-Chaab, M. A review of thermal runaway prevention and mitigation strategies for lithium-ion batteries. Energy Convers. Manag. X 2022, 16, 100310.
- Hwang, F.S.; Confrey, T.; Reidy, C.; Picovici, D.; Callaghan, D.; Culliton, D.; Nolan, C. Review of battery thermal management systems in electric vehicles. Renew. Sustain. Energy Rev. 2024, 192, 114171.
- Lin, J.; Chen, W.M.; Lin, Y.; Gan, C.; Han, S. MCUNet: Tiny deep learning on IoT devices. Adv. Neural Inf. Process. Syst. 2020, 33, 11711–11722.
- Cheng, Y.; Wang, D.; Zhou, P.; Zhang, T. Model compression and acceleration for deep neural networks: The principles, progress, and challenges. IEEE Signal Process. Mag. 2018, 35, 126–136.
- Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2704–2713.
- Li, H.; Kadav, A.; Durdanovic, I.; Samet, H.; Graf, H.P. Pruning filters for efficient convnets. arXiv 2016, arXiv:1608.08710.
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531. [Google Scholar] [CrossRef]
- Lelie, M.; Braun, T.; Knips, M.; Nordmann, H.; Ringbeck, F.; Zappen, H.; Sauer, D.U. Battery management system hardware concepts: An overview. Appl. Sci. 2018, 8, 534. [Google Scholar] [CrossRef]
- Sze, V.; Chen, Y.H.; Yang, T.J.; Emer, J.S. Efficient processing of deep neural networks: A tutorial and survey. Proc. IEEE 2017, 105, 2295–2329. [Google Scholar] [CrossRef]
- Yang, S.; Zhang, Z.; Cao, R.; Wang, M.; Cheng, H.; Zhang, L.; Jiang, Y.; Li, Y.; Chen, B.; Ling, H.; et al. Implementation for a Cloud Battery Management System Based on the CHAIN Framework. Energy AI 2021, 5, 100088. [Google Scholar] [CrossRef]
- ISO/SAE 21434:2021; Road Vehicles—Cybersecurity Engineering. International Organization for Standardization and SAE International: Geneva, Switzerland, 2021.
- Lai, T.; Zhao, H.; Song, Y.; Wang, L.; Wang, Y.; He, X. Mechanism and Control Strategies of Lithium-Ion Battery Safety: A Review. Small Methods 2025, 9, 2400029.
- Song, Y.; Jiang, X.; Lyu, N.; Lu, H.; Zhang, D.; Li, H.; Jin, Y. Early warning of lithium-ion battery thermal runaway based on gas sensors. eTransportation 2025, 26, 100502.
- Zhao, J.; Qu, X.; Wu, Y.; Fowler, M.; Burke, A.F. Artificial intelligence-driven real-world battery diagnostics. Energy AI 2024, 18, 100419.
- Dennler, N.; Rastogi, S.; Fonollosa, J.; Van Schaik, A.; Schmuker, M. Drift in a Popular Metal Oxide Sensor Dataset Reveals Limitations for Gas Classification Benchmarks. Sens. Actuators B Chem. 2022, 361, 131668.
- Kuang, H.; Liao, H.; Zhang, Z.; Huang, S.; Mao, B.; Shan, W.; Duan, X.; Feng, X.; Li, T. Dynamic Diels-Alder reaction crosslinked metal-organic framework/poly(ionic liquid) composite solid electrolyte for lithium-metal batteries. J. Colloid Interface Sci. 2025, 707, 139638.
- Gao, T.; Wu, Y. Applications and Advances of Machine Learning in the Development of Solid-State Electrolytes for Lithium-Ion Batteries. ACS Omega 2025, 10, 60094–60109.
- Lv, D.; Huang, X.; Li, X.; Zhu, L.; Chai, J.; Gao, Y.; Liu, Z.; Yao, X. Bioinspired Ultrastrong and Ion-Selective Gel Electrolytes by Interfacial Coacervation for High-Performance Lithium-Metal Batteries. Carbon Energy 2026, e70153.
- Luo, Y.; Rao, Z.; Yang, X.; Wang, C.; Sun, X.; Li, X. Safety Concerns in Solid-State Lithium Batteries: From Materials to Devices. Energy Environ. Sci. 2024, 17, 7543–7565.
- Yang, S.J.; Hu, J.K.; Jiang, F.N.; Yuan, H.; Park, H.S.; Huang, J.Q. Safer solid-state lithium metal batteries: Mechanisms and strategies. InfoMat 2024, 6, e12512.
- Ward, L.; Babinec, S.; Dufek, E.J.; Howey, D.A.; Viswanathan, V.; Aykol, M.; Beck, D.A.; Blaiszik, B.; Chen, B.R.; Crabtree, G.; et al. Principles of the battery data genome. Joule 2022, 6, 2253–2271.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.



