1. Introduction
The global transition towards electric vehicles (EVs) has substantially reshaped the automotive sector, with lithium-ion batteries serving as the core technology governing driving range, operational safety, and total cost of ownership [1]. Extensive prior research has investigated long-term battery degradation phenomena, including capacity fade, impedance growth, and cycle-life prediction [2,3]. In contrast, the detection of short-term performance degradation during real-world vehicle operation remains comparatively underexplored. Such short-term degradation events include transient voltage drops, abrupt increases in effective internal resistance, and temporary power delivery limitations, which may develop within hours or days due to causes such as aggressive driving behavior, fast charging, or rapid thermal fluctuations [4]. Although many of these effects are partially reversible, their occurrence can reduce driver confidence, impair accurate state-of-charge (SoC) estimation, and potentially accelerate irreversible battery aging if not identified and mitigated in a timely manner.
Early detection of short-term battery degradation poses several fundamental technical challenges. Modern EV fleets exhibit pronounced heterogeneity in battery chemistry, vehicle platforms, and operating environments, resulting in highly variable electrical and thermal load profiles. Furthermore, labeled degradation events are inherently scarce, as many abnormal behaviors do not trigger battery management system (BMS) diagnostic codes until significant deterioration has already occurred. Therefore, any onboard detection strategy must operate in real time using only signals that are routinely available from the BMS and the controller area network (CAN), including terminal voltage, current, SoC, battery and ambient temperatures, and auxiliary power consumption. Approaches that rely on controlled excitation or predefined test sequences are impractical for naturalistic driving conditions. From a safety-critical deployment perspective, accurate detection alone is insufficient; decision mechanisms must also provide quantifiable and risk-controlled guarantees, particularly with respect to false negative outcomes that may allow hazardous conditions to persist undetected.
Existing battery monitoring and anomaly detection methods can be broadly categorized as physics-based, data-driven, or hybrid approaches. Physics-based techniques such as equivalent circuit models (ECMs) and electrochemical impedance spectroscopy (EIS) offer interpretable estimates of internal resistance and diffusion-related parameters [5,6]. However, these methods typically assume idealized current excitation patterns that rarely occur in real driving, rendering parameter estimation from naturalistic data sparse, noisy, and highly dependent on operating conditions. Data-driven methods, including support vector machines, random forests, and recurrent neural networks [7,8], are capable of capturing complex nonlinear sensor relationships but often lack physical grounding. As a result, they may misinterpret normal operational variability as degradation and can exhibit limited robustness under distribution shifts. Hybrid approaches [9,10] partially address these limitations, yet many still depend on explicit current step detection, make poor use of multi-sensor information, and do not provide formal guarantees around decision risk.
These limitations motivate the development of a physics-guided multi-sensor learning framework explicitly designed for real-time deployment under realistic operating conditions. This paper addresses the problem of early warning for short-term EV battery performance degradation, with an emphasis on detection timeliness, robustness, and computational efficiency rather than on pointwise anomaly classification accuracy alone. The main contributions of this work are summarized as follows:
We propose a physics-guided multi-sensor learning framework, termed SensorFusion-Former (SFF), that integrates a physics-based baseline model with data-driven temporal learning. The physics model normalizes operational variability, allowing the learning architecture to focus on degradation-relevant residual dynamics instead of nominal operating fluctuations.
A multi-sensor fusion attention mechanism is introduced to explicitly capture cross-modality interactions among electrical, thermal, and auxiliary signals. This mechanism is combined with a lightweight transformer architecture to achieve effective temporal representation learning while maintaining low inference latency suitable for real-time battery management systems.
A weak supervision strategy based on physics-consistent residual analysis and temporal smoothing is developed, enabling scalable model training without the need for densely labeled degradation events. This approach substantially reduces annotation cost while preserving early-warning sensitivity.
To enhance deployment reliability, evidential uncertainty modeling and conformal calibration are incorporated into the early warning head, yielding statistically controlled decision thresholds with bounded false alarm risk under distributional variability.
Extensive experiments conducted on a real driving cycle dataset from IEEE DataPort demonstrate that the proposed framework consistently outperforms classical machine learning methods, deep neural networks, and standard transformer models. The proposed approach achieves superior early warning lead time and lower false alarm rates while maintaining competitive discriminative performance and reduced inference latency across diverse thermal operating scenarios.
The remainder of this paper is organized as follows. Section 2 reviews prior work on battery health diagnostics and fault detection, multi-sensor fusion and deep learning architectures, uncertainty-aware decision-making, and physics-guided machine learning for battery systems. Section 3 presents the proposed system model and algorithms, including the multi-sensor problem formulation, physics-guided surrogate voltage model, SensorFusion-Former architecture, probabilistic multi-task prediction heads, unified training objective, and complete training and deployment pipeline, together with an analysis of computational complexity and real-time feasibility. Section 4 reports the experimental setup and a comprehensive evaluation of the proposed approach, covering overall comparisons with baseline models, ablation studies, cross-scenario generalization across diverse thermal domains, and early warning capability analysis. Finally, Section 5 concludes the paper and outlines directions for future work.
3. System Model and Algorithms
This section presents the proposed system for early detection of short-term performance degradation in EV lithium-ion batteries. The system operates on routinely logged vehicle telemetry and consists of three key components: construction of physics-guided surrogate targets, derivation of weak degradation labels, and training of a multi-sensor deep learning model that produces calibrated and risk-controlled early warning alerts.
Figure 1 illustrates an overview of the proposed system architecture. The framework comprises four main stages. First, multi-sensor data ingestion is performed together with a physics-guided baseline model to normalize operating conditions (left). Second, the SensorFusion-Former model processes the normalized inputs through seven internal layers, including cross-sensor attention, physics-conditioned biasing, and causal temporal attention based on FAVOR+ kernels (center). The core methodological innovations are highlighted using orange blocks and marked with the symbol ★. Third, multi-task probabilistic prediction heads generate outputs for degradation regression, event classification, early warning, and physics consistency forecasting (right). Finally, offline training and conformal calibration pipelines are employed to enable domain adaptation and risk-controlled deployment (bottom).
The proposed methodology is built upon three core design components. First, the cross-sensor attention module (Layer 1) captures instantaneous inter-domain dependencies among electrical, thermal, and auxiliary sensor groups. Second, physics-conditioned biasing (Layer 2) injects grey-box model outputs (including the voltage residual, reference voltage, and ohmic resistance estimate) into the latent representations without introducing future information leakage. Third, causal temporal attention based on FAVOR+ kernels (Layers 4–6) achieves computational complexity that is linear in the window length, enabling real-time inference in embedded battery management systems while preserving expressive attention modeling.
Table 1 summarizes the key symbols used in the problem formulation, physics-based modeling, and architectural design.
3.1. Multi-Sensor Problem Formulation
At each discrete time index $t$ with sampling interval $\Delta t$, the battery management system observes a multi-sensor feature vector
$$ x_t = \big[x^{e}_t;\; x^{\mathrm{th}}_t;\; x^{\mathrm{aux}}_t\big] \in \mathbb{R}^{d}, $$
where $x^{e}_t$ denotes electrical signals, $x^{\mathrm{th}}_t$ denotes thermal signals, and $x^{\mathrm{aux}}_t$ denotes auxiliary operational signals, with $d$ the total number of channels.
Specifically, the electrical channel vector $x^{e}_t = [V_t, I_t, s_t, P_t]^{\top}$ includes terminal voltage $V_t$, current $I_t$, state-of-charge $s_t$, and traction power $P_t$. The thermal channel vector $x^{\mathrm{th}}_t = [T^{\mathrm{bat}}_t, T^{\mathrm{amb}}_t, \dot m_t]^{\top}$ captures battery temperature, ambient temperature, and coolant mass flow rate. The auxiliary channel vector $x^{\mathrm{aux}}_t = [P^{\mathrm{hvac}}_t, P^{\mathrm{heat}}_t, v_t, a_t]^{\top}$ includes power consumption of the heating, ventilation, and air conditioning (HVAC) system, heating power, vehicle speed, and longitudinal acceleration.
Each sensor group provides complementary information about battery operation. The electrical signals reflect the instantaneous electrochemical response of the battery, the thermal signals capture temperature-dependent reaction kinetics and aging mechanisms, and the auxiliary signals describe external load conditions and vehicle usage patterns that indirectly influence battery stress. This structured multi-sensor representation enables the model to differentiate between benign operational effects such as transient voltage drops during aggressive acceleration and potential degradation signatures such as sustained increases in internal resistance under moderate load conditions.
Direct interpretation of raw sensor measurements is challenging due to their strong dependence on operating context, including state-of-charge, temperature, and instantaneous power demand. For example, a voltage drop of several volts may be expected at high discharge rates and low ambient temperatures, yet may indicate abnormal behavior under moderate load at nominal conditions. To decouple operation-induced variability from degradation-related effects, a physics-guided baseline model is introduced in the following subsection.
3.2. Physics-Guided Surrogate Voltage Model
Direct interpretation of raw voltage deviations is difficult because observed variations may be caused by benign operating factors, including load transients, temperature changes, and SoC dependence rather than true degradation. In order to separate operating effects from degradation-related behavior, we introduce a grey-box physics-guided surrogate voltage model that approximates the expected pack voltage under nominally healthy conditions. The resulting reference voltage serves as a baseline for constructing operation-normalized deviation signals.
3.2.1. Three-Component Voltage Decomposition
We express the reference pack voltage as the sum of three physically interpretable components:
$$ V_{\mathrm{ref},t} = V_{\mathrm{OCV}}(s_t, T_t) - R_0(s_t, T_t)\, I_t + V_{\mathrm{dyn}}(\mathcal{H}_t), $$
where $V_{\mathrm{OCV}}(s_t, T_t)$ denotes the monotone non-decreasing open-circuit-voltage (OCV) surface that characterizes the equilibrium potential and satisfies $\partial V_{\mathrm{OCV}}/\partial s \ge 0$. The term $R_0(s_t, T_t)$ is a non-negative ohmic resistance map governing the instantaneous current-induced voltage drop. The dynamic component $V_{\mathrm{dyn}}(\mathcal{H}_t)$ is modeled as a stable and causal filtering operator that captures time-dependent polarization and diffusion effects driven by the recent excitation history $\mathcal{H}_t$.
The parameter set $\theta$ collects the learnable coefficients of the three components. We estimate $\theta$ from nominally healthy operation segments $\mathcal{D}_h$ by solving
$$ \hat{\theta} = \arg\min_{\theta} \sum_{t \in \mathcal{D}_h} \rho_{\delta}\big(V_t - V_{\mathrm{ref},t}(\theta)\big) + \lambda\, \Omega(\theta), $$
where $\rho_{\delta}$ is the Huber loss and $\Omega(\theta)$ imposes soft shape constraints to preserve monotonicity of $V_{\mathrm{OCV}}$ and non-negativity of $R_0$. The regularization weight $\lambda$ balances data fit and physical plausibility.
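To make the fitting step concrete, the following is a minimal NumPy sketch of Huber-robust surrogate fitting via iteratively reweighted least squares (IRLS). It assumes a deliberately simplified surrogate (linear OCV in SoC, constant ohmic resistance, no dynamic term) with illustrative synthetic parameter values; it is not the paper's full grey-box model or its shape-constrained optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "healthy" telemetry: SoC in [0.2, 0.9], current in [-50, 150] A.
n = 2000
s = rng.uniform(0.2, 0.9, n)
I = rng.uniform(-50.0, 150.0, n)

# Simplified ground truth: linear OCV in SoC, constant ohmic resistance,
# no dynamic term (all numeric values are illustrative, not from the paper).
a_true, b_true, R0_true = 330.0, 60.0, 0.05
V = a_true + b_true * s - R0_true * I + rng.normal(0.0, 0.2, n)

def fit_surrogate_huber(s, I, V, delta=0.5, iters=50):
    """Fit V ~ a + b*s - R0*I with a Huber penalty via IRLS."""
    X = np.column_stack([np.ones_like(s), s, -I])
    theta = np.linalg.lstsq(X, V, rcond=None)[0]          # least-squares warm start
    for _ in range(iters):
        res = V - X @ theta
        ares = np.maximum(np.abs(res), 1e-12)             # guard against div-by-zero
        w = np.where(ares <= delta, 1.0, delta / ares)    # Huber IRLS weights
        Xw = X * w[:, None]
        theta = np.linalg.solve(X.T @ Xw, X.T @ (w * V))  # weighted normal equations
    return theta

a_hat, b_hat, R0_hat = fit_surrogate_huber(s, I, V)
```

The Huber weights down-weight samples whose residuals exceed `delta`, which is what makes the fit robust to occasional abnormal segments mixed into nominally healthy data.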
Figure 2 illustrates the decomposition on a representative driving segment. The OCV surface captures equilibrium voltage variation with SoC and temperature, the ohmic term explains instantaneous losses that scale with current, and the dynamic term accounts for polarization and diffusion effects driven by recent current and temperature history.
As shown in Figure 2a, the reference voltage closely tracks the measured voltage over diverse operating regimes during healthy operation, including high discharge at low temperature, moderate load at nominal temperature, and regenerative braking. Fitting the surrogate using (3) produces a reference trajectory that accounts for expected variations induced by SoC evolution, thermal conditions, and load changes, so that residual deviations become more indicative of abnormal behavior.
During the degradation episode, a persistent discrepancy emerges between the measured and reference voltages that cannot be explained by the calibrated healthy baseline. Such unexplained deviations may reflect increased effective internal resistance or abnormal polarization dynamics, and they motivate an operation-normalized residual, since the magnitude of the raw voltage deviation is strongly dependent on current level.
3.2.2. Operation-Normalized Residual and Severity Index
To quantify deviations in a manner that is robust to operating variability, we define the operation-normalized residual
$$ r_t = \frac{V_{\mathrm{ref},t} - V_t}{R_0(s_t, T_t)\,|I_t| + \epsilon}, $$
where $V_{\mathrm{ref},t}$ is the surrogate reference voltage, $R_0(\cdot)$ the ohmic resistance map, and $\epsilon > 0$ prevents numerical instability under near-zero current conditions. The normalization scales the absolute deviation by the predicted ohmic drop, so that $r_t$ reflects relative unexplained losses rather than raw voltage magnitude.
Figure 2b validates this design. Despite large voltage excursions caused by acceleration, coasting, and regenerative braking, the residual $r_t$ remains consistently small during healthy operation, indicating effective suppression of operation-induced confounders. In contrast, during the degradation episode $r_t$ increases markedly and exceeds the calibrated threshold, enabling clear separation between degradation-related behavior and benign operating variability. The highlighted region where the threshold is exceeded is later converted into frame-level labels via the temporal smoothing procedure in the next subsection.
The early-warning interval in Figure 2b illustrates the intended predictive setting. Specifically, for a horizon of $H$ samples, the model is trained to predict both reactive event labels and early-warning labels (defined in Section 3.2.3), allowing an alert to be issued prior to the onset of a confirmed event.
Single-sample residuals may be noisy and influenced by short-lived transients. Therefore, we define a windowed severity index $S_t$ over a horizon of length $K$:
$$ S_t = \frac{1}{K}\sum_{k=0}^{K-1} w_{t-k}\, \rho_{\delta}\big(r_{t-k}\big), $$
where $\rho_{\delta}$ denotes the Huber function with threshold parameter $\delta$. The weights $w_{t-k}$ emphasize operating points that are informative for degradation assessment. In practice, the weights are derived from a kernel density estimate in the operating-condition space; operating regimes that occur frequently under healthy conditions are down-weighted, whereas rarer but diagnostically informative regimes receive higher weights.
The resulting severity index summarizes recent operation-normalized deviations in a manner that is robust to outliers while remaining sensitive to sustained abnormal behavior. This scalar sequence serves as the primary signal for automatic event label generation.
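A minimal NumPy sketch of the residual normalization and the windowed Huber severity, under simplifying assumptions (uniform weights, constant ohmic resistance, and a plain causal moving average; function names are illustrative, not the authors' code):

```python
import numpy as np

def normalized_residual(V, V_ref, R0, I, eps=1e-3):
    """Operation-normalized residual: deviation scaled by the predicted ohmic drop."""
    return (V_ref - V) / (R0 * np.abs(I) + eps)

def huber(x, delta=1.0):
    """Elementwise Huber function: quadratic near zero, linear in the tails."""
    ax = np.abs(x)
    return np.where(ax <= delta, 0.5 * x**2, delta * (ax - 0.5 * delta))

def severity_index(r, K=20, w=None, delta=1.0):
    """Windowed severity: (weighted) causal moving average of Huber residuals."""
    w = np.ones_like(r) if w is None else w
    h = w * huber(r, delta)
    # Causal average over the last K samples (zero-padded at the start).
    return np.convolve(h, np.ones(K) / K, mode="full")[: len(r)]

# A sustained residual shift produces a sustained severity increase,
# while isolated spikes are damped by the windowed average.
r = np.zeros(100)
r[50:] = 3.0
S = severity_index(r, K=20)
```

The Huber transform keeps the index sensitive to sustained moderate deviations while preventing a single extreme outlier from dominating the window.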
3.2.3. Event Labeling with Hysteresis and Early Warning
Since ground-truth labels for short-term degradation events are rarely available, we construct weak labels from the severity index $S_t$. A degradation threshold $\tau$ is calibrated on healthy data as
$$ \tau = Q_{1-q}\big(\{S_t : t \in \mathcal{D}_h\}\big), $$
where $Q_{1-q}$ denotes the empirical $(1-q)$-quantile with a small exceedance level $q$, ensuring that only a small fraction of healthy samples exceed $\tau$.
Raw frame-level flags are defined as
$$ \tilde{y}_t = \mathbb{1}\big[S_t > \tau\big]\cdot \mathbb{1}\big[|I_t| > I_{\min}\big], $$
where the current gate $|I_t| > I_{\min}$ filters out low-current intervals that are typically less informative.
To reduce spurious detections induced by sensor noise and transient fluctuations, we apply three postprocessing operations. First, a hysteresis rule enforces temporal consistency by confirming an event only after at least $n_{\mathrm{hys}}$ consecutive samples satisfy $\tilde{y}_t = 1$. Second, candidate segments shorter than $n_{\min}$ samples are removed. Third, neighboring segments separated by gaps no larger than $n_{\mathrm{gap}}$ samples are merged, preventing a single anomaly from being fragmented into multiple detections.
These steps address complementary failure modes of threshold-based detection. Hysteresis suppresses isolated spikes, the minimum duration constraint removes short-lived artifacts, and gap merging consolidates fragmented segments caused by varying current magnitude. Together, the procedure balances sensitivity with robustness to false alarms while yielding event intervals that better correspond to physically meaningful degradation episodes.
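The three postprocessing steps can be sketched as follows in plain Python. Parameter names (`n_confirm`, `min_len`, `max_gap`) and default values are illustrative placeholders, not the paper's settings:

```python
def refine_flags(flags, n_confirm=3, min_len=4, max_gap=5):
    """Refine raw 0/1 flags into event intervals (start, end), inclusive indices.

    Steps: hysteresis confirmation, minimum-duration filtering, gap merging.
    """
    # Extract runs of consecutive 1s (sentinel 0 closes a trailing run).
    runs, start = [], None
    for i, f in enumerate(list(flags) + [0]):
        if f and start is None:
            start = i
        elif not f and start is not None:
            runs.append((start, i - 1))
            start = None
    # 1) Hysteresis: confirm only runs with >= n_confirm consecutive flags.
    runs = [(a, b) for a, b in runs if b - a + 1 >= n_confirm]
    # 2) Minimum duration: drop segments shorter than min_len samples.
    runs = [(a, b) for a, b in runs if b - a + 1 >= min_len]
    # 3) Gap merging: fuse neighbors separated by <= max_gap samples.
    merged = []
    for a, b in runs:
        if merged and a - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], b)
        else:
            merged.append((a, b))
    return merged

flags = [1 if (i <= 2 or 10 <= i <= 20 or 24 <= i <= 30) else 0 for i in range(31)]
events = refine_flags(flags)
```

With these illustrative parameters, the short run at the start is discarded while the two nearby runs are merged into a single event interval.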
Figure 3 shows how raw threshold crossings are refined into coherent event intervals and corresponding early-warning windows. After postprocessing we obtain a set of $J$ disjoint event intervals $\{[a_j, b_j]\}_{j=1}^{J}$, where $a_j$ and $b_j$ denote the start and end indices of the $j$th event. The binary event label is defined as
$$ y^{\mathrm{ev}}_t = \mathbb{1}\big[\exists j : a_j \le t \le b_j\big], $$
and the $H$-step early-warning label is defined as
$$ y^{\mathrm{ew}}_t = \mathbb{1}\big[\exists j : a_j - H \le t < a_j\big]. $$
The early-warning label marks samples within $H$ steps prior to event onset, enabling the model to learn predictive precursors rather than only reactive detection.
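The two label definitions translate directly into a short NumPy routine (the function name is an illustrative choice):

```python
import numpy as np

def make_labels(n, intervals, H=30):
    """Frame-level event labels and H-step early-warning labels from intervals.

    intervals: list of (start, end) index pairs, inclusive on both ends.
    """
    y_ev = np.zeros(n, dtype=int)
    y_ew = np.zeros(n, dtype=int)
    for a, b in intervals:
        y_ev[a : b + 1] = 1          # samples inside the confirmed event
        y_ew[max(0, a - H) : a] = 1  # the H samples preceding event onset
    return y_ev, y_ew

y_ev, y_ew = make_labels(100, [(50, 60)], H=10)
```

Note that the early-warning window ends strictly before the event onset, so the two label streams never overlap at the onset sample itself.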
3.3. Sensor Fusion-Former Architecture
3.3.1. Sensor Group Tokenization
For each sensor group $g \in \{e, \mathrm{th}, \mathrm{aux}\}$ and time index $t$, we map group-specific inputs to a shared latent space via
$$ z^{g}_t = \mathrm{LN}\big(f_g(x^{g}_t)\big) + e_g, $$
where $\mathrm{LN}$ denotes layer normalization, $f_g$ is a group-specific feedforward network, and $e_g$ is a learnable group embedding. This design preserves modality-specific characteristics while enabling subsequent cross-group interaction modeling in a common representation space.
3.3.2. Cross-Sensor Attention
To capture instantaneous dependencies among sensor groups, we concatenate the group embeddings and apply multi-head self-attention (MHSA):
$$ u_t = \mathrm{MHSA}\big([z^{e}_t, z^{\mathrm{th}}_t, z^{\mathrm{aux}}_t]\big), $$
where $\mathrm{MHSA}$ denotes a multi-head self-attention operator applied over the three group tokens at the same time step. The fused token $u_t$ summarizes cross-sensor interactions and serves as the input to subsequent temporal modeling.
3.3.3. Physics-Conditioned Feature Injection
To incorporate physics-guided information without violating causality, we inject grey-box outputs through a learned conditioning function:
$$ \tilde{u}_t = u_t + g_{\phi}\big(r_t,\, V_{\mathrm{ref},t},\, \hat{R}_{0,t}\big), $$
where $g_{\phi}$ is a lightweight multilayer perceptron (MLP). Because the conditioning variables are computed from current and past observations only, the injection does not introduce future information leakage.
3.3.4. Causal Temporal Modeling with FAVOR+
To model temporal dependencies over a causal window of length $W$, we construct a context matrix
$$ U_t = \big[\tilde{u}_{t-W+1}, \ldots, \tilde{u}_t\big] \in \mathbb{R}^{W \times d_h}. $$
The sequence is processed by $L$ causal transformer blocks:
$$ H^{(\ell)}_t = \mathrm{Block}_{\ell}\big(H^{(\ell-1)}_t\big), \quad \ell = 1, \ldots, L, \qquad H^{(0)}_t = U_t, $$
where each block implements causal attention to prevent access to future tokens.
Standard self-attention requires computing all pairwise similarities within a length-$W$ window, which incurs $\mathcal{O}(W^2 d_h)$ time complexity and $\mathcal{O}(W^2)$ memory. Such quadratic scaling can become a deployment bottleneck when streaming inference is required on resource-constrained battery management systems.
To improve efficiency, we adopt FAVOR+ (Fast Attention Via positive Orthogonal Random features) attention [49], which approximates softmax attention using random feature maps. This yields linear time complexity $\mathcal{O}(W r d_h)$ with memory $\mathcal{O}(W r + r d_h)$, where $r$ denotes the number of random features.
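To illustrate why the linearization works, the following NumPy sketch implements causal linear attention with positive random features in the FAVOR+ style. It is a simplification for exposition: orthogonalization of the random projections and multi-head structure are omitted, and all names are illustrative rather than the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def favor_features(x, omega):
    """Positive random features approximating the softmax (exp) kernel."""
    r = omega.shape[0]
    proj = x @ omega.T                                    # (W, r) random projections
    return np.exp(proj - 0.5 * np.sum(x**2, axis=-1, keepdims=True)) / np.sqrt(r)

def causal_linear_attention(Q, K, V, n_features=64):
    """Causal attention in O(W*r*d) time via running prefix sums."""
    d = Q.shape[-1]
    omega = rng.normal(size=(n_features, d))              # orthogonalization omitted
    Qf = favor_features(Q / d**0.25, omega)               # splitting 1/sqrt(d) between
    Kf = favor_features(K / d**0.25, omega)               # queries and keys
    S = np.zeros((n_features, d))                         # running sum of phi(k) v^T
    z = np.zeros(n_features)                              # running sum of phi(k)
    out = np.empty_like(V)
    for t in range(len(Q)):                               # causal: past tokens only
        S += np.outer(Kf[t], V[t])
        z += Kf[t]
        out[t] = (Qf[t] @ S) / (Qf[t] @ z + 1e-6)
    return out

T, d = 16, 4
Q = 0.1 * rng.normal(size=(T, d))
K = 0.1 * rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
out = causal_linear_attention(Q, K, V)
```

The per-step state consists only of the prefix sums `S` and `z`, so streaming inference needs $\mathcal{O}(r d)$ memory per layer instead of the full attention matrix; at the first step, attention over a single token simply reproduces that token's value.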
Table 2 summarizes the computational and memory complexity of FAVOR+ relative to representative efficient attention variants.
Finally, we aggregate the temporal context into a single latent representation:
$$ h_t = \mathrm{Pool}\big(H^{(L)}_t\big), $$
where $\mathrm{Pool}(\cdot)$ can be implemented using the last token, global average pooling, or attention-weighted pooling. In our implementation, we use the last token in order to preserve causality and emphasize the most recent context.
3.4. Probabilistic Multi-Task Prediction Heads
The proposed architecture employs probabilistic multi-task prediction heads to jointly estimate degradation severity, event occurrence, and early-warning likelihood while explicitly modeling prediction uncertainty. This design enables risk-aware decision-making and supports subsequent conformal calibration.
3.4.1. Heteroscedastic Regression for Severity
To model both the expected value and uncertainty of degradation severity, we adopt a heteroscedastic regression formulation. Specifically, the predictive mean and variance are given by
$$ \hat{\mu}_t = f_{\mu}(h_t), \qquad \hat{\sigma}^2_t = \mathrm{softplus}\big(f_{\sigma}(h_t)\big), $$
where $\hat{\mu}_t$ denotes the predicted mean severity and $\hat{\sigma}^2_t$ represents the input-dependent predictive variance.
The regression loss is defined as the negative log-likelihood of a Gaussian distribution:
$$ \mathcal{L}_{\mathrm{reg}} = \frac{1}{N}\sum_{t} w_t \left[\frac{\big(S_t - \hat{\mu}_t\big)^2}{2\hat{\sigma}^2_t} + \frac{1}{2}\log \hat{\sigma}^2_t\right], $$
where $S_t$ is the degradation severity index defined in Section 3.2 and $w_t$ are sample-specific weights that reflect the operating-point density introduced in Section 3.2. This formulation penalizes both large prediction errors and overconfident uncertainty estimates.
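A minimal sketch of the weighted Gaussian negative log-likelihood (the function name is illustrative; constant terms of the likelihood are dropped, as they do not affect optimization):

```python
import numpy as np

def weighted_gaussian_nll(S, mu, var, w=None):
    """Heteroscedastic regression loss: weighted Gaussian NLL (constants dropped)."""
    S, mu, var = map(np.asarray, (S, mu, var))
    w = np.ones_like(S) if w is None else np.asarray(w)
    nll = 0.5 * ((S - mu) ** 2 / var + np.log(var))
    return float(np.mean(w * nll))
```

The first term penalizes errors scaled by the predicted variance, while the log-variance term prevents the model from inflating variance to hide errors: with zero error the loss still grows as the predicted variance grows beyond one.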
3.4.2. Evidential Classification
For binary event detection, we employ an evidential classification framework based on the Beta–Bernoulli model, which provides a principled representation of epistemic uncertainty. The parameters of the Beta distribution are predicted as
$$ \alpha_t = 1 + \mathrm{softplus}\big(f_{\alpha}(h_t)\big), \qquad \beta_t = 1 + \mathrm{softplus}\big(f_{\beta}(h_t)\big), $$
ensuring $\alpha_t > 1$ and $\beta_t > 1$ for numerical stability. The resulting predictive event probability is given by
$$ \hat{p}_t = \frac{\alpha_t}{\alpha_t + \beta_t}, $$
and the associated predictive variance is
$$ \mathrm{Var}\big[p_t\big] = \frac{\alpha_t \beta_t}{(\alpha_t + \beta_t)^2 (\alpha_t + \beta_t + 1)}, $$
which serves as a measure of epistemic uncertainty.
The evidential classification loss combines data fidelity and uncertainty regularization:
$$ \mathcal{L}_{\mathrm{ev}} = \mathrm{BCE}\big(\hat{p}_t, y^{\mathrm{ev}}_t\big) + \lambda_{\mathrm{ev}}\,\mathcal{R}\big(\alpha_t, \beta_t\big), $$
where $\mathrm{BCE}$ denotes the binary cross-entropy loss, $y^{\mathrm{ev}}_t$ is the event label defined in Section 3.2.3, $\mathcal{R}$ is a regularizer that penalizes unwarranted evidence, and $\lambda_{\mathrm{ev}}$ controls the strength of uncertainty regularization. This objective encourages accurate predictions while discouraging unwarranted overconfidence.
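The Beta head itself is a few lines of NumPy. This sketch assumes a `1 + softplus` parameterization (the function and variable names are illustrative):

```python
import numpy as np

def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    x = np.asarray(x, dtype=float)
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def beta_head(logits_a, logits_b):
    """Evidential Beta parameters, mean event probability, epistemic variance."""
    alpha = 1.0 + softplus(logits_a)
    beta = 1.0 + softplus(logits_b)
    p = alpha / (alpha + beta)                            # Beta mean
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
    return alpha, beta, p, var

# Same predicted probability (0.5), but more accumulated evidence -> lower variance.
_, _, p_lo, var_lo = beta_head(np.array([0.0]), np.array([0.0]))
_, _, p_hi, var_hi = beta_head(np.array([10.0]), np.array([10.0]))
```

This separation is the point of the evidential head: two inputs can yield the same event probability while differing sharply in how much evidence supports that estimate.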
An analogous evidential formulation is applied to early-warning prediction. Specifically, a separate classification head is trained using the corresponding early-warning labels, yielding an early-warning probability and loss defined in the same manner.
3.5. Risk-Controlled Decision Making via Weighted Conformal Prediction
To provide finite-sample performance guarantees under distributional variability, we adopt a weighted conformal calibration strategy on a held-out calibration set $\mathcal{D}_{\mathrm{cal}}$. The use of sample-dependent weights allows the calibration procedure to account for nonuniform operating conditions commonly observed in real-world electric vehicle data.
3.5.1. Regression Calibration
For degradation severity prediction, we compute a weighted conformal quantile based on normalized regression residuals:
$$ \hat{q}_{1-\alpha} = Q^{w}_{1-\alpha}\left(\left\{\frac{\big|S_i - \hat{\mu}_i\big|}{\hat{\sigma}_i} : i \in \mathcal{D}_{\mathrm{cal}}\right\}\right), $$
where $Q^{w}_{1-\alpha}$ denotes the weighted $(1-\alpha)$-quantile operator and the weights are sample-specific weights proportional to the local data density in the operating-condition space. This calibration ensures that the normalized residual exceeds $\hat{q}_{1-\alpha}$ with probability at most $\alpha$ on unseen data drawn from a similar distribution.
During deployment, a severity exceedance is declared whenever the predicted severity, inflated by the conformal margin $\hat{q}_{1-\alpha}\,\hat{\sigma}_t$, crosses the degradation threshold, yielding a risk-controlled decision rule with a finite-sample guarantee.
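A weighted quantile is the only nonstandard ingredient here; a minimal NumPy sketch (illustrative names, no tie-breaking refinements):

```python
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative normalized weight reaches `level`."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cdf = np.cumsum(w) / np.sum(w)
    idx = min(int(np.searchsorted(cdf, level)), len(s) - 1)
    return s[idx]

# With uniform weights this reduces to the ordinary empirical quantile.
q90 = weighted_quantile(np.arange(1, 101), np.ones(100), 0.9)
```

In the calibration step, `scores` would be the normalized residuals $|S_i - \hat{\mu}_i|/\hat{\sigma}_i$ on the calibration set and `weights` the density-ratio weights, with `level` set to $1-\alpha$.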
3.5.2. Classification Calibration
For event detection, we determine a probability threshold $\hat{p}_{\mathrm{th}}$ that explicitly controls the false negative rate at level $\alpha_{\mathrm{fn}}$. The threshold is selected on the calibration set as
$$ \hat{p}_{\mathrm{th}} = \max\left\{p : \frac{\sum_{i \in \mathcal{D}_{\mathrm{cal}}} w_i\, \mathbb{1}\big[y^{\mathrm{ev}}_i = 1\big]\, \mathbb{1}\big[\hat{p}_i < p\big]}{\sum_{i \in \mathcal{D}_{\mathrm{cal}}} w_i\, \mathbb{1}\big[y^{\mathrm{ev}}_i = 1\big]} \le \alpha_{\mathrm{fn}}\right\}, $$
where $\hat{p}_i$ denotes the calibrated predictive probability. This procedure yields a data-driven decision threshold that bounds the empirical false-negative rate on the calibration set and supports risk-aware deployment.
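The threshold search reduces to scanning candidate thresholds and keeping the largest one whose weighted false-negative rate on calibration positives stays within budget. A sketch under these assumptions (names illustrative):

```python
import numpy as np

def fnr_threshold(p_hat, y, weights, alpha_fn=0.1):
    """Largest threshold whose weighted FNR over positives stays <= alpha_fn."""
    pos = y == 1
    p_pos, w_pos = p_hat[pos], weights[pos]
    best = 0.0
    # Candidate thresholds: 0 plus the sorted positive-class probabilities.
    for th in np.concatenate([[0.0], np.sort(p_pos)]):
        fnr = np.sum(w_pos * (p_pos < th)) / np.sum(w_pos)
        if fnr <= alpha_fn:
            best = th                      # still within the false-negative budget
    return best

# Ten positives with probabilities 0.1..1.0: at alpha_fn = 0.1 we may miss
# at most one positive, so the threshold lands at the second-lowest value.
th = fnr_threshold(np.arange(1, 11) / 10.0, np.ones(10, dtype=int),
                   np.ones(10), alpha_fn=0.1)
```

Raising the threshold trades false alarms against missed events; this scan makes the trade explicit and auditable on held-out data.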
The sample-specific weights in Equations (25) and (27) are computed as follows. Let $c_i$ be the operating-condition vector. Each dimension is standardized using training-split statistics to obtain $\tilde{c}_i$. A Gaussian KDE with bandwidth set by Scott's rule ($h \propto n^{-1/(d+4)}$) is fitted separately on the training set (density $f_{\mathrm{tr}}$) and the calibration set (density $f_{\mathrm{cal}}$). The raw conformal weight for each calibration sample is the density ratio $\tilde{w}_i = f_{\mathrm{tr}}(\tilde{c}_i)/f_{\mathrm{cal}}(\tilde{c}_i)$, with a small additive constant in the denominator for stability. Weights are clipped to a fixed range and then $\ell_1$-normalized; clipping precedes normalization in order to prevent extreme ratios from dominating the weighted quantile. The same weights are used for both regression (Equation (25)) and classification (Equation (27)) calibration. The training-time weights in Equations (5) and (20) follow an analogous procedure but use the inverse density $1/f_{\mathrm{tr}}(\tilde{c}_i)$, clipped to a fixed range, serving the complementary purpose of up-weighting rare but diagnostically informative operating regimes during model training.
3.6. Unified Training Objective
The complete training objective integrates all learning components into a single loss function
$$ \mathcal{L} = \lambda_{\mathrm{reg}}\,\mathcal{L}_{\mathrm{reg}} + \lambda_{\mathrm{ev}}\,\mathcal{L}_{\mathrm{ev}} + \lambda_{\mathrm{ew}}\,\mathcal{L}_{\mathrm{ew}} + \lambda_{\mathrm{phy}}\,\mathcal{L}_{\mathrm{phy}} + \lambda_{\mathrm{con}}\,\mathcal{L}_{\mathrm{con}}, $$
where $\lambda_{\mathrm{reg}}$, $\lambda_{\mathrm{ev}}$, $\lambda_{\mathrm{ew}}$, $\lambda_{\mathrm{phy}}$, and $\lambda_{\mathrm{con}}$ are non-negative weighting coefficients that balance the contributions of each loss term.
The physics-consistency loss $\mathcal{L}_{\mathrm{phy}}$ encourages consistency between the learned representations and the underlying voltage dynamics by penalizing discrepancies in next-step voltage prediction.
In addition, a contrastive learning component $\mathcal{L}_{\mathrm{con}}$ implements an InfoNCE objective over temporally adjacent windows:
$$ \mathcal{L}_{\mathrm{con}} = -\log \frac{\exp\big(\mathrm{sim}(h_t, h^{+}_t)/\tau_c\big)}{\sum_{k \in \mathcal{B}} \exp\big(\mathrm{sim}(h_t, h_k)/\tau_c\big)}, $$
where $h_t$ and $h^{+}_t$ denote representations of positive temporal pairs, $\mathcal{B}$ is the batch set of candidate keys, and $\tau_c$ is the contrastive temperature. This term promotes temporal consistency and improves representation quality for downstream prediction tasks.
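A compact NumPy sketch of a batch InfoNCE loss with cosine similarity, assuming in-batch negatives (names and the choice of cosine similarity are illustrative):

```python
import numpy as np

def info_nce(h, h_pos, tau=0.1):
    """InfoNCE: the i-th anchor's positive is h_pos[i]; other rows are negatives."""
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    h_pos = h_pos / np.linalg.norm(h_pos, axis=1, keepdims=True)
    sim = (h @ h_pos.T) / tau                         # (B, B) scaled similarities
    sim = sim - sim.max(axis=1, keepdims=True)        # numerical stability
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(logp)))             # positives sit on the diagonal

# Perfectly aligned pairs yield near-zero loss; mismatched pairs a large one.
aligned = info_nce(np.eye(4), np.eye(4))
shuffled = info_nce(np.eye(4), np.roll(np.eye(4), 1, axis=0))
```

In the paper's setting, `h` and `h_pos` would be representations of temporally adjacent windows from the same drive, so minimizing the loss pulls neighboring windows together in latent space.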
Model optimization is performed using the AdamW optimizer with gradient clipping (norm bounded by 1.0), cosine learning rate decay, and mixed-precision training to improve numerical stability and computational efficiency.
3.7. Training and Deployment Algorithm
Algorithm 1 integrates all components of the proposed framework into a unified training and deployment workflow. The procedure starts by estimating the parameters of the physics-based baseline model using nominally healthy data according to Equation (3). Based on the trained baseline, operation-normalized residuals and degradation severity indices are computed for the full dataset. A degradation threshold is then calibrated and the corresponding temporal event labels are generated using the temporal smoothing strategy described in Section 3.2.3.
The SensorFusion-Former model is subsequently trained under the unified multi-task objective in Equation (28). Optimization is performed using the AdamW optimizer with gradient clipping and early stopping to promote stable convergence. After model training, weighted conformal calibration is conducted on the held-out calibration set to estimate the conformal quantile and the probability thresholds. These calibrated quantities are used during deployment to enable risk-controlled decision making for both severity assessment and event detection.
Algorithm 1. SensorFusion-Former Training and Calibration.
Require: Raw telemetry, healthy subset annotation, validation set, hyperparameters
Ensure: Trained SFF model, calibrated thresholds
1: // Phase 1: Physics Baseline Training
2: Initialize baseline parameters (e.g., pretrained OCV curves)
3: for each sample in the healthy subset do
4:   Compute the reference voltage via (2)
5: end for
6: Fit the baseline parameters by solving (3) via L-BFGS-B
7: // Phase 2: Weak Label Generation
8: for each sample in the full dataset do
9:   Compute the normalized residual via (4) using the fitted baseline
10:   Compute the severity index
11: end for
12: Set the degradation threshold from the healthy-data quantile
13: Generate event and early-warning labels via (9) and (10)
14: // Phase 3: SFF Model Training
15: Initialize model weights (Xavier/He initialization)
16: for each training epoch do
17:   Shuffle the training data and partition into mini-batches
18:   for each mini-batch do
19:     for each sample do
20:       Construct the causal context matrix via (15)
21:       Forward pass through SFF ((18))
22:       Compute the per-sample losses
23:     end for
24:     Evaluate the total loss via (28)
25:     Update parameters with gradient clipping
26:   end for
27:   if validation loss does not improve for P epochs then
28:     break (early stopping)
29:   end if
30: end for
31: // Phase 4: Conformal Calibration
32: Partition the held-out data into calibration and test sets
33: Compute the conformal quantile via (25) on the calibration set
34: Compute the probability thresholds via (27) on the calibration set
35: return the trained model and calibrated thresholds
3.8. Computational Complexity and Real-Time Feasibility
We analyze the computational requirements of the proposed SensorFusion-Former architecture to assess its suitability for real-time deployment in embedded BMS with limited computational resources.
Theorem 1 (Per-Step Inference Complexity). Consider a causal context window of length $W$, hidden dimension $d_h$, $L$ transformer layers, $H$ attention heads, and FAVOR+ rank $r$. The per-step forward-pass computational complexity of the proposed model is
$$ \mathcal{O}\big(G^2 d_h + L\,W\,r\,d_h + L\,W\,d_h^2\big), $$
where the three terms correspond to sensor group tokenization and fusion, linearized causal attention, and position-wise feedforward networks, respectively.
Proof. The overall complexity is derived by analyzing each component of the forward pass. First, the cross-sensor attention operates over $G = 3$ sensor groups. Computing group-wise projections and attention incurs $\mathcal{O}(G^2 d_h + G d_h^2)$ operations, which is negligible compared with temporal modeling costs. Second, each FAVOR+ causal attention layer processes a sequence of length $W$ with hidden dimension $d_h$ using $r$ random features per attention head, resulting in $\mathcal{O}(W r d_h)$ operations per layer. Third, the position-wise feedforward networks require $\mathcal{O}(W d_h^2)$ operations per layer. Summing these terms over $L$ layers yields the stated complexity. Since $r \ll W$ by design, the overall complexity scales linearly with the window length $W$. □
For comparison, a standard transformer with vanilla self-attention incurs a per-step complexity of $\mathcal{O}(L\,W^2 d_h + L\,W d_h^2)$, which is dominated by the quadratic dependence on the sequence length. Under typical deployment settings with $r \ll W$, the FAVOR+ attention mechanism reduces the attention-related computation by roughly a factor of $W/r$, often more than an order of magnitude relative to vanilla attention, while preserving the expressive power of softmax-based attention.
The resulting linear scaling with respect to $W$ enables real-time inference at the sampling intervals typical of embedded automotive platforms used in battery management systems. This computational efficiency leaves sufficient headroom for concurrent BMS tasks, including state estimation, thermal control, and safety monitoring, thereby supporting practical onboard deployment.
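The attention-cost comparison can be made concrete with a back-of-the-envelope FLOP estimate. The parameter values below are assumptions for illustration, not the paper's deployment settings:

```python
# Illustrative per-step FLOP comparison: vanilla vs. linearized attention.
# W: window length, d_h: hidden dim, L: layers, r: FAVOR+ random features.
# All values are assumed for illustration only.
W, d_h, L, r = 512, 128, 4, 64

vanilla_attn = L * W * W * d_h     # pairwise similarities + value weighting
favor_attn = L * W * r * d_h       # prefix-sum linearized attention
ffn = L * W * d_h * d_h            # position-wise feedforward (same in both)

speedup_attn = vanilla_attn / favor_attn   # simplifies to W / r
print(f"attention-only speedup: {speedup_attn:.1f}x")
```

With these assumed values the attention term shrinks by a factor of $W/r = 8$; note the feedforward cost is unchanged, which is why the theorem keeps the $L\,W\,d_h^2$ term in both regimes.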
3.9. Complete Methodology Pipeline
Figure 4 provides an integrated overview of the proposed methodology by connecting all components introduced in this section into a unified processing pipeline. The workflow begins with the estimation of the physics-guided baseline model parameters using nominally healthy telemetry data according to Equation (3). This stage establishes reference voltage predictions and operation-normalized residuals, which form the foundation for subsequent degradation quantification.
In the second phase, weak supervision signals are constructed by computing the degradation severity index, calibrating the degradation threshold, and applying temporal smoothing operations, including hysteresis, minimum-duration filtering, and gap merging. These steps yield both frame-level event labels and horizon-based early-warning labels, enabling the learning of both reactive detection and predictive warning capabilities.
The third phase trains the SensorFusion-Former model using the unified multi-task objective defined in Equation (28). This objective jointly optimizes heteroscedastic regression for severity estimation, evidential classification for event detection and early warning, and physics-consistency forecasting through next-step voltage prediction. Model optimization is performed using the AdamW optimizer with gradient clipping and early stopping to ensure stable and robust convergence.
In the final phase, weighted conformal prediction is applied on a held-out calibration set to derive risk-controlled decision thresholds, including the conformal quantile and the probability threshold. The calibrated model is then deployed for real-time inference on board electric vehicles.
As illustrated by the red dashed feedback loop in Figure 4, the proposed pipeline supports continuous post-deployment refinement. Newly collected fleet-scale data can be used to update the domain alignment and calibration components, allowing the system to maintain robustness under seasonal variability, shifting usage patterns, and platform drift, with updated parameters being periodically redistributed across the vehicle fleet.
5. Conclusions and Future Work
This paper proposes a unified framework for early warning of short-term electric vehicle battery performance degradation, with explicit emphasis on early warning timeliness, probabilistic reliability, and practical deployability. By integrating a physics-guided baseline with a multi-sensor fusion transformer architecture, the proposed SensorFusion-Former (SFF) is able to capture subtle degradation precursors that are difficult to identify using conventional convolutional, recurrent, or generic attention-based models. The use of weak supervision derived from physics-consistent residual signals enables scalable training without reliance on densely annotated degradation events, while evidential uncertainty modeling and conformal calibration provide principled mechanisms for risk-controlled decision-making in safety-critical deployment settings.
Extensive experimental evaluations across multiple scenarios demonstrate that SFF consistently outperforms a diverse set of baseline methods. In particular, the proposed approach achieves substantially longer early warning lead times with reduced false alarm rates while maintaining competitive discriminative performance and significantly lower inference latency. Cross-scenario experiments under nominal, hot-climate, and cold-climate operating conditions further confirm the robustness and generalization capability of the framework. These results collectively validate the effectiveness of combining physics-guided normalization, explicit cross-sensor interaction modeling, and lightweight temporal attention for real-time battery health monitoring.
Several directions remain open for future investigation. First, extending the framework to support online or continual learning would allow the model to adapt to long-term battery aging effects and evolving operating conditions. Second, incorporating richer physics-informed priors such as degradation-aware electrochemical models or advanced state estimation techniques could further improve interpretability and robustness. Third, future work might explore the joint optimization of early-warning models with downstream control policies, including adaptive charging and thermal management strategies, in order to establish a closed-loop connection between detection and mitigation. Finally, large-scale deployment and validation across heterogeneous vehicle platforms at the fleet level would provide valuable insights into scalability, transferability, and real-world operational impact.
In summary, this work establishes a principled and deployable foundation for early-warning detection of short-term electric vehicle battery degradation, offering a general paradigm for integrating physics guidance, multi-sensor fusion, and uncertainty-aware learning in safety-critical time series monitoring applications.