Article

Knowledge-Enhanced Time Series Anomaly Detection for Lithium Battery Cell Screening

1 School of Automation, Central South University, Changsha 410083, China
2 Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
* Author to whom correspondence should be addressed.
Processes 2026, 14(2), 371; https://doi.org/10.3390/pr14020371
Submission received: 30 October 2025 / Revised: 25 December 2025 / Accepted: 6 January 2026 / Published: 21 January 2026
(This article belongs to the Special Issue Process Safety and Control Strategies for Urban Clean Energy Systems)

Abstract

The increasing application of lithium-ion batteries in manufacturing and energy storage systems necessitates high-precision screening of abnormal cells during manufacturing to ensure safety and performance. Existing methods struggle to break down the barrier between prior knowledge and data, and they suffer from limitations such as insufficient detection accuracy and poor interpretability; these limitations become even more prominent under distributional shifts in the data. In this study, we propose a knowledge-enhanced anomaly detection framework for cell screening. The framework integrates domain knowledge, such as electrochemical principles, expert heuristic rules, and manufacturing constraints, into data-driven models. By combining features extracted from charging/discharging curves with rule-based prior knowledge, the proposed framework not only improves detection accuracy but also enables a traceable reasoning process behind anomaly identification. Experiments on real-world battery production data demonstrate that the proposed framework outperforms baseline models in both precision and recall, making it a promising solution for quality control in intelligent battery manufacturing.

1. Introduction

Lithium-ion batteries (LIBs) have become the cornerstone of modern energy systems, with applications ranging from electric vehicles to grid-scale energy storage. As the demand for high-performance and highly reliable battery cells continues to grow, strict quality control mechanisms must be adopted to detect and eliminate anomalous cells before the batteries are put into use. Failure to detect such anomalies may lead to severe safety hazards, degraded system performance, and significant economic losses [1,2]. In practice, lithium-ion battery cells are commonly assembled in series or parallel configurations to form battery packs that meet the high-power and high-capacity requirements of large-scale applications, such as industrial equipment and energy storage systems. These battery packs often consist of hundreds or even thousands of individual cells and are managed by a battery management system (BMS). The failure of any single cell can significantly compromise the overall performance, reliability, and safety of the entire pack [2].
Conventional approaches for cell anomaly detection (AD) generally fall into two categories [2]. The first relies heavily on domain expertise and typically involves rule-based techniques, such as classifying cells based on capacity, voltage, or internal resistance, or applying fixed threshold criteria. While simple and computationally efficient, such rules are coarse and lack the granularity needed to capture subtle anomalies [3,4,5]. The second category adopts data-driven techniques, including traditional machine learning, statistical modeling, and probabilistic methods. These models, however, commonly suffer from poor generalization performance, especially as data volume scales up [6,7]. For example, in modern manufacturing environments, with the advancement of information technology and the increasing use of sensors, a typical lithium battery production line can yield over 500,000 cells per day, and massive amounts of timestamped data can now be collected directly from formation equipment. This multi-dimensional time-series data, covering measurements such as voltage, current, internal resistance, and temperature, offers new opportunities for more precise anomaly detection. It also exacerbates the limitations of conventional approaches: traditional methods often struggle to model long, high-resolution time series effectively, further limiting their applicability in high-throughput industrial scenarios. In addition, machine learning-based approaches tend to be sensitive to noise, outliers, and distribution shifts, especially when labeled anomaly samples are scarce or imbalanced.
Moreover, most existing studies in this domain focus on post-assembly battery packs, leveraging BMS data to monitor and analyze cell behavior. However, there has been relatively limited attention paid to the early-stage production process, leading to a time lag in anomaly detection. If defective cells are not identified during manufacturing and proceed to the assembly stage—or worse, enter the market—they pose severe risks to both user safety and property.
To address the issues mentioned above, we propose a knowledge-enhanced time-series anomaly detection model to screen abnormal lithium-ion battery cells. First, we introduce a TCN-based anomaly detection framework that utilizes multi-source data, including voltage, current, and internal resistance. To mitigate the blind-spot problem of stacked convolutional kernels, we design randomly sampled dilated convolution kernels, which help the model capture long-range dependencies without sacrificing performance. Second, we propose a knowledge enhancement approach that synthesizes typical manufacturing domain knowledge and encodes it as a regularization term in the model’s loss function. This term constrains the reconstruction process, incorporating expert knowledge to improve the model’s interpretability and robustness. The main contributions of this work are summarized as follows:
1. Development of a TCN-based anomaly detection framework: We introduce a novel time-series anomaly detection framework based on Temporal Convolutional Networks (TCNs), which effectively handles long-range dependencies and overcomes the blind-spot problem of convolutional kernels by using randomly sampled dilated convolution kernels.
2. Knowledge-enhanced regularization for improved model interpretability: We propose a knowledge-enhanced approach that encodes manufacturing domain knowledge into the model’s loss function as regularization terms. This additional knowledge constraint improves model robustness and interpretability, particularly in identifying subtle anomalies that may not be captured by data-driven methods alone.
3. Validation on real-world industrial data: We validate our method on real-world industrial datasets, demonstrating significant improvements in detection performance over existing baselines.
This study aims to bridge the gap between expert knowledge and deep learning in battery cell anomaly detection, offering a practical and scalable solution for intelligent manufacturing systems.

2. Related Work

In recent years, research in anomaly detection has increasingly focused on deep neural network (DNN)-based models, which leverage their powerful non-linear modeling capabilities to overcome the limitations of traditional methods such as classical machine learning, statistical learning, and probabilistic approaches [8,9]. There are three main paradigms for DNN-based anomaly detection: prediction-based, end-to-end, and reconstruction error-based methods.
The prediction-based approach uses prior data samples to predict the current state of a battery cell; a deviation between prediction and observation indicates an anomaly. However, building highly accurate predictive models remains a challenge, especially in industrial scenarios [10], and even slight deviations in prediction accuracy can have serious consequences. The end-to-end model divides the dataset into two categories, anomalous and normal; when a new battery cell sample is input into the model, it outputs the probability of the sample belonging to either category. This method requires a well-curated dataset in which labels are complete and the categories are balanced. However, in large-scale production environments, obtaining a sufficient number of anomalous samples is infeasible, making this method difficult to apply in practice [11]. Recent studies on end-to-end models have explored data augmentation techniques, such as generative adversarial networks (GANs), to generate large quantities of anomalous samples [12]. However, the differences between normal and anomalous lithium-ion cells are often subtle in terms of data patterns, and the anomalies primarily manifest as underlying mechanistic differences that can only be detected with domain knowledge. Alternative methods, such as noise-based data augmentation (e.g., adding random Gaussian noise), have been proposed to expand the anomalous sample set. Nonetheless, these methods still struggle to avoid confusion between the distributions of normal and anomalous cells. As a result, the prediction-based and end-to-end methods mentioned above remain largely confined to laboratory research and are difficult to apply in actual production [1,13,14].
Reconstruction error (RE)-based anomaly detection methods take a different approach. They can be supervised, unsupervised, or semi-supervised; when applied to lithium battery anomaly detection, they are primarily unsupervised or semi-supervised [13]. Such methods are usually trained on sets of normal samples, enabling them to fully learn the features and patterns of normal cells. When a sample to be detected is fed into the model, a poor reconstruction indicates that the sample deviates from the learned normal patterns and is therefore labeled as an anomaly. Autoencoders (AEs) are among the most commonly used RE-based models due to their simplicity and effectiveness [15,16,17]. An AE uses an encoder to map the data to a lower-dimensional manifold, capturing its representation, and a decoder to map this representation back to the input space to reconstruct the data. The reconstruction error is usually measured using the mean squared error (MSE). However, AEs do not incorporate modules for extracting temporal dependencies, which can lead to fluctuations in reconstruction results and misalignment of higher-order features [18]. Some studies have begun to explore the integration of knowledge and machine learning [19], for example using LSTMs to extract features from battery cell data such as charge/discharge voltage, internal resistance, and capacity, and then fusing these features with expert knowledge [1]. Ossai et al. proposed a battery life prediction and detection method based on expert knowledge and data [20]. Convolutional Neural Networks (CNNs) optimize the extraction of local features using convolution kernels, which are effective in capturing local dependencies in battery cell charge/discharge time series. By stacking multiple convolutional layers, CNNs can capture dependencies within a larger field of view [21]. However, they struggle to capture short-term and long-term dependencies simultaneously [22]. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks can address this limitation, but they often fail to preserve long-term dependencies when the time series spans a long duration [17,23]. To balance long- and short-term dependencies, Temporal Convolutional Networks (TCNs) introduce dilated convolutions and causal convolutions; the dilated convolution gives the model a larger receptive field [24]. Modern TCNs, which incorporate advanced convolution kernels, further expand the receptive field [24]. However, TCN-based methods are still susceptible to the blind-spot problem of convolutional kernels, where the model struggles to capture certain temporal dependencies. FEDformer, an improvement on Autoformer and Informer, introduces fast Fourier transform (FFT)-based analysis for better handling of time-series data [25,26,27]. However, these Transformer-based methods are computationally expensive, making practical deployment challenging.
Time-series anomaly detection relies on capturing complex temporal patterns, such as short-term and long-term dependencies. However, the aforementioned methods employ regular convolution kernels or fixed feature extractors, which limits the range of data patterns they can extract. Furthermore, an effective framework for integrating data with the prior knowledge available in lithium battery manufacturing has not yet been established.

3. Methodology

The proposed model, the Knowledge-enhanced and Partially Randomly Sampling based TCN (KP-TCN), is illustrated in Figure 1. For the AD model, we adopt an improved TCN as the backbone network, into which we introduce randomly sampled dilated convolution kernels (RS-DCKs) to alleviate the blind-spot phenomenon caused by stacked dilated convolution kernels (DCKs). For knowledge enhancement, through in-depth research and collaboration with manufacturers, we have collected domain knowledge from the manufacturing process and encoded it as regularization terms in the model’s loss function. During backpropagation, these terms work in synergy with the original loss function to improve the model’s performance and robustness.

3.1. TCN and Blind Spot

3.1.1. Temporal Dependency Acquisition

The core idea behind TCN is to use causal convolutions where the output at time t depends only on current and previous time steps, ensuring the network does not violate the temporal order. This is particularly important when modeling time series data, as future information should not influence past predictions. The causal convolution can be expressed as
y_t = \sum_{i=0}^{k-1} w_i \, x_{t-i}
where y_t is the output at time step t, x_t is the input at time t, and w_i represents the convolution filter weights over the previous k time steps. For deeper networks with non-linear activation functions, this extends to
y_t^{(l)} = \sigma\left( \sum_{i=0}^{k-1} w_i^{(l)} \, y_{t-i}^{(l-1)} + b^{(l)} \right)
where σ(·) denotes a non-linear activation function (e.g., ReLU or GELU), l indicates the layer index, w_i^{(l)} represents the weights in layer l, and b^{(l)} is the bias term for that layer. To enhance the ability to model long-range dependencies, TCNs introduce dilated convolutions. A dilated convolution increases the receptive field without increasing the number of parameters or the computational complexity. It is achieved by skipping certain input elements, which allows the model to cover a larger range of past inputs with each convolution operation. The dilated convolution operation can be written as
y_t = \sum_{i=0}^{k-1} w_i \, x_{t - r \cdot i}
where r is the dilation rate. A higher dilation rate means the model considers more distant time steps, increasing its ability to capture long-range dependencies. For a network with multiple layers, the effective receptive field (RF) after L layers can be calculated as
RF = 1 + \sum_{l=1}^{L} (k_l - 1) \cdot \prod_{m=1}^{l} r_m
where k_l is the kernel size of layer l, and r_m is the dilation rate of layer m. This formula demonstrates how exponentially increasing dilation rates lead to exponentially growing receptive fields.
In practice, TCNs typically employ multiple layers of dilated convolutions with exponentially increasing dilation rates, which allows the model to effectively capture both short- and long-term dependencies across the sequence. The dilation rate in each layer can be represented as
r_l = 2^{l-1}, \quad \text{for } l = 1, 2, \ldots, L
For time-series anomaly detection specifically, TCNs often optimize a combined loss function that includes both reconstruction error and regularization terms:
\mathrm{RE} = \sum_{t=1}^{T} \| x_t - \hat{x}_t \|_2^2
\mathcal{L} = \mathrm{RE} + \lambda \sum_{l=1}^{L} \left( \| W^{(l)} \|_2^2 + \| b^{(l)} \|_2^2 \right) + \gamma \, H(p_t)
where x̂_t is the reconstructed input, λ controls the strength of the L2 regularization, γ weights the contribution of the regularization term H(p_t), and p_t represents the anomaly probability distribution at time t. This architecture helps TCNs achieve a wider receptive field while maintaining computational efficiency, making them particularly suitable for large-scale time-series anomaly detection.
The primary advantages of TCNs over traditional RNNs and LSTMs lie in their ability to process long-range dependencies more efficiently. By leveraging causal and dilated convolutions, TCNs are able to capture temporal patterns across multiple time scales while avoiding the vanishing gradient problem often encountered in RNNs and LSTMs. These properties make TCNs an ideal choice for time-series anomaly detection tasks, such as lithium-ion battery cell screening, where detecting subtle temporal irregularities is crucial.
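To make the construction above concrete, the following is a minimal PyTorch sketch of a causal dilated convolution layer (an illustrative example, not the authors’ implementation; channel counts, kernel size, and the example input are arbitrary). The input is padded on the left by (k − 1)·r so that the output at time t depends only on inputs at or before t.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalDilatedConv1d(nn.Module):
    """One causal, dilated 1-D convolution layer: the basic TCN building block."""
    def __init__(self, in_channels, out_channels, kernel_size=3, dilation=1):
        super().__init__()
        # Left padding of (k - 1) * r guarantees y_t depends only on x_{<=t}.
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x):                      # x: (batch, channels, time)
        x = F.pad(x, (self.left_pad, 0))       # pad the time axis on the left only
        return self.act(self.conv(x))

# Stack layers with exponentially increasing dilation rates r_l = 2^(l-1).
tcn = nn.Sequential(*[
    CausalDilatedConv1d(4 if l == 0 else 32, 32, kernel_size=3, dilation=2 ** l)
    for l in range(4)
])
y = tcn(torch.randn(8, 4, 128))                # 8 sequences, 4 channels, 128 steps
```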

3.1.2. Blind Spot Phenomenon

The blind spot of convolutional kernels describes a phenomenon in convolutional kernel-based neural networks where certain regions of the input data are neglected or ignored during the convolution process. It occurs when dilation rates are excessively large or overly regular, which restricts the convolution kernels to a fixed pattern and causes them to consistently skip important local features, resulting in the loss of valuable information from these regions. Essentially, these blind spots in the convolution process prevent the network from capturing critical patterns, thereby degrading performance. The issue is particularly prominent when handling time-series data, where local dependencies play a crucial role. To quantify this phenomenon, consider a dilated convolution with kernel size k and dilation rate r operating on a time-series input x_t. The total number of input elements spanned by the convolution (i.e., the interval from the earliest to the latest time step covered) is
S = r \cdot (k - 1) + 1
where S represents the span of the convolution in time steps. However, the kernel only explicitly processes k discrete elements within this span, meaning the number of skipped elements (i.e., the size of the blind spot) is
B = S - k
This formula reveals that the size of the blind spot B grows linearly with both the dilation rate r and kernel size k, explaining why excessively large dilation rates create significant information gaps.
For deep networks with multiple dilated layers, the cumulative blind spot effect across L layers can be approximated as the sum of individual layer contributions, weighted by their respective receptive field expansion. This cumulative effect exacerbates information loss in deep architectures, making local pattern capture increasingly challenging. This issue is particularly problematic when working with long sequences of data, as increasing the receptive field (via dilation) may inadvertently skip over fine-grained details that are necessary for accurate anomaly detection or feature learning. The phenomenon can be compared to a “gap” in the model’s ability to perceive certain time steps, especially when crucial time-series patterns are hidden behind the dilation.
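As a quick illustration of the two expressions above, the snippet below computes the span S = r·(k − 1) + 1 and the blind-spot size B = S − k for a few kernel/dilation combinations (the specific (k, r) pairs are arbitrary examples chosen only to show the linear growth of B).

```python
def span_and_blind_spot(kernel_size: int, dilation: int) -> tuple:
    """Return (S, B): the temporal span covered by a dilated kernel and the
    number of time steps inside that span that the kernel never touches."""
    span = dilation * (kernel_size - 1) + 1    # S = r * (k - 1) + 1
    blind = span - kernel_size                 # B = S - k
    return span, blind

for k, r in [(3, 1), (3, 4), (3, 16), (11, 2), (11, 8)]:
    s, b = span_and_blind_spot(k, r)
    print(f"k={k:2d}, r={r:2d} -> span={s:3d}, blind spot={b:3d}")
```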

3.2. Partially Randomly Sampling

To address this issue, we propose a partially randomly sampled dilated convolution mechanism based TCN (PRS-TCN). Rather than replacing all dilation patterns with randomness, our approach preserves the backbone structure of the standard TCN, and selectively introduces randomly sampled dilation kernels into a portion of the convolutional layers. This hybrid design enhances the model’s ability to perceive diverse temporal patterns while maintaining the stability and efficiency of conventional TCNs.
Specifically, we first define a candidate dilation pool R, which contains a set of possible dilation offsets. For each randomly injected layer, we then sample a dilation kernel from this pool. Let the kernel size be k, and let R = \{ r_1, r_2, \ldots, r_M \} be the predefined candidate set (with M ≥ k). For a given convolutional kernel, we randomly select k offsets from R without replacement:
\{ r^{(1)}, r^{(2)}, \ldots, r^{(k)} \} \sim \mathrm{Sample}(R, k).
The output at time step t is then computed as
y_t = \sum_{i=1}^{k} w_i \cdot x_{t - r^{(i)}}.
These randomly sampled kernels are only applied to a subset of layers, while the remaining layers use traditional fixed dilation rates (e.g., powers of two). This design allows the model to benefit from a broader and more irregular receptive field, improving its robustness to temporal variation and mitigating the feature solidification phenomenon. In our lithium-ion battery cell screening scenario, this design is particularly effective in capturing abnormal behaviors that occur across different temporal resolutions, while avoiding the inefficiencies and blind spots of fixed dilation strategies.
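The sketch below shows one possible PyTorch realization of such a randomly sampled kernel, under stated assumptions rather than as the authors’ code: the contents of the candidate pool R, the per-channel (depthwise) weighting, and the causal left padding are illustrative choices. The k offsets are drawn once at initialization, and the output follows y_t = Σ_i w_i · x_{t − r^(i)}.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class RandomlySampledDilatedConv1d(nn.Module):
    """1-D convolution whose k taps sit at temporal offsets sampled (without
    replacement) from a candidate pool; offsets are fixed after initialization."""
    def __init__(self, channels, kernel_size=3, offset_pool=(1, 2, 3, 5, 8, 13, 21)):
        super().__init__()
        self.offsets = sorted(random.sample(offset_pool, kernel_size))   # r^(1..k)
        # Per-channel (depthwise) weights w_i; a simplification of a full convolution.
        self.weights = nn.Parameter(0.1 * torch.randn(channels, kernel_size))

    def forward(self, x):                          # x: (batch, channels, time)
        pad = max(self.offsets)
        x_pad = F.pad(x, (pad, 0))                 # causal left padding
        T = x.size(-1)
        # y_t = sum_i w_i * x_{t - r^(i)}
        taps = [x_pad[..., pad - r: pad - r + T] for r in self.offsets]
        stacked = torch.stack(taps, dim=-1)        # (batch, channels, time, k)
        return (stacked * self.weights[None, :, None, :]).sum(dim=-1)

layer = RandomlySampledDilatedConv1d(channels=32, kernel_size=3)
out = layer(torch.randn(4, 32, 100))               # output shape: (4, 32, 100)
```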

3.3. Knowledge-Enhanced Regularization

In machine learning, the incorporation of domain-specific knowledge into model training has been shown to significantly improve generalization and robustness, especially in cases where data is limited or noisy. This can be achieved through a knowledge-enhanced regularization term, which introduces prior knowledge (e.g., expert rules, physical constraints, or industry standards) into the model’s loss function. This approach constrains the model’s learning process to adhere to known patterns and relationships, ensuring that the learned model does not deviate from established domain principles.
The primary objective of knowledge-enhanced regularization is to guide the model’s learning by enforcing certain predefined domain rules or patterns. This not only helps the model avoid learning spurious patterns that do not reflect real-world phenomena but also assists in improving model performance in scenarios where sufficient labeled data is unavailable. The incorporation of domain knowledge has been particularly useful in applications where explicit rules and physical relationships are known, yet the data may be sparse or difficult to interpret. The knowledge-enhanced regularization term can be formalized as an additional component in the loss function. The overall loss function becomes
L_{total} = (1 - \lambda) \, L_{data} + \lambda \cdot L_{kn}
where L_total is the total loss function, incorporating both the data-driven loss and the knowledge-driven regularization; L_data is the standard loss term (e.g., mean squared error, cross-entropy) that reflects the model’s performance on the data; L_kn is the regularization term derived from domain knowledge, which introduces constraints to guide the model’s learning process; and λ is a hyperparameter that controls the influence of the knowledge-enhanced regularization term.
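As a minimal sketch, the weighted combination can be wired into a training objective as follows; the data term is taken here as the MSE reconstruction error, and knowledge_penalty stands for any of the knowledge terms sketched later in this section (the names and the value of λ are placeholders).

```python
import torch.nn.functional as F

def total_loss(x, x_hat, knowledge_penalty, lam=0.2):
    """L_total = (1 - lambda) * L_data + lambda * L_kn."""
    l_data = F.mse_loss(x_hat, x)      # data-driven term L_data
    l_kn = knowledge_penalty           # knowledge-driven term L_kn
    return (1 - lam) * l_data + lam * l_kn
```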
In the context of lithium-ion battery cell anomaly detection, there are several domain-specific rules and constraints that are widely accepted. These include physical relationships between battery parameters such as voltage, current, resistance, and temperature, as well as expected operational behaviors during charging and discharging cycles. These prior knowledge elements can be effectively incorporated into the model through knowledge-enhanced regularization.
For instance, it is well established that the voltage of a lithium-ion battery should remain within a certain range during charging and discharging. If the voltage exceeds the upper threshold or falls below the lower threshold, the battery is likely to exhibit abnormal behavior. To incorporate this domain knowledge, we can define a regularization term that penalizes deviations from these voltage constraints:
L_{kn} = \sum_{t} \left[ \max(0, V_t - V_{max}) + \max(0, V_{min} - V_t) \right]
where V_t is the voltage of the battery at time step t, and V_max and V_min are the maximum and minimum allowable voltage thresholds, respectively. This regularization term ensures that the model’s predictions do not violate the known physical limits of battery voltage, reducing the likelihood of false positives and improving the accuracy of anomaly detection.
Additionally, there are other domain-specific patterns that can be encoded as regularization terms. For example, the relationship between battery voltage and current during a charging cycle is often known to follow a specific pattern. If the model’s output deviates from this expected relationship, the knowledge-enhanced regularization can penalize the model accordingly:
L_{kn} = \sum_{t} \left| I_t - f(V_t) \right|
where I_t is the current at time step t, and f(V_t) is the expected current as a function of voltage, derived from domain knowledge or empirical measurements. This term forces the model to respect the known relationship between voltage and current during the battery’s operation, further improving its robustness and interpretability.
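Both penalty terms translate almost directly into code. The sketch below assumes the reconstructed voltage and current sequences are available as tensors; the thresholds V_min and V_max and the linear f(V) are illustrative placeholders, not values from the paper.

```python
import torch

def voltage_bound_penalty(v, v_min=2.5, v_max=4.2):
    """L_kn = sum_t [ max(0, V_t - V_max) + max(0, V_min - V_t) ]:
    penalizes voltages outside the allowed operating window."""
    return (torch.clamp(v - v_max, min=0) + torch.clamp(v_min - v, min=0)).sum()

def voltage_current_penalty(i, v, f_of_v=lambda v: 0.5 * v - 1.0):
    """L_kn = sum_t | I_t - f(V_t) |, where f encodes the expected
    current-voltage relationship (placeholder linear f here)."""
    return (i - f_of_v(v)).abs().sum()

v = torch.tensor([3.6, 3.7, 4.5, 2.3])   # toy voltage trace (V)
i = torch.tensor([0.8, 0.9, 1.0, 0.1])   # toy current trace (A)
l_kn = voltage_bound_penalty(v) + voltage_current_penalty(i, v)
```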

4. Experimental Results

In order to validate the effectiveness of the proposed KP-TCN for lithium-ion battery anomaly detection, we conducted extensive experiments on a real-world lithium-ion battery production dataset. This dataset includes time-series data collected from various sensors during the battery manufacturing process, including voltage, current, internal resistance, and temperature, among others. These real-time data reflect the dynamic conditions and operational characteristics of the battery cells as they undergo different stages of manufacturing, such as formation and charge/discharge cycles.

4.1. Dataset Description

The training set consisted of 50,000 normal battery samples, while the test set contained 5000 samples, of which 1000 were labeled as anomalous. These anomalous samples represent battery cells that exhibit abnormal behavior, potentially due to defects in the manufacturing process, such as faulty internal wiring or issues with the battery’s chemical composition. The dataset provides a challenging yet realistic scenario for evaluating the effectiveness of our knowledge-enhanced model, as it includes a mix of clean, high-quality data as well as rare, potentially noisy anomalous instances. Additionally, we applied min-max normalization preprocessing to the original dataset, scaling all features to the range of 0 to 1. This ensures that different features have similar scales during training, preventing certain features from dominating the model’s learning process due to their larger numerical ranges.
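For reference, per-feature min-max scaling to [0, 1] can be written as below (a generic sketch; the array shape and feature order are hypothetical).

```python
import numpy as np

def min_max_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each column (feature) of x to [0, 1] using its own min and max."""
    x_min = x.min(axis=0, keepdims=True)
    x_max = x.max(axis=0, keepdims=True)
    return (x - x_min) / (x_max - x_min + eps)

# x: (num_samples, num_features), e.g. voltage, current, resistance, temperature
x = np.random.rand(1000, 4) * np.array([4.2, 2.0, 0.05, 45.0])
x_scaled = min_max_normalize(x)
```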

4.2. Experimental Setup

We employed a semi-supervised learning approach for the training of our model. In this setup, the model was trained primarily on normal battery samples to learn the general characteristics of healthy battery cells. The semi-supervised approach allows the model to leverage the abundance of normal data to learn the typical behavior of battery cells, while also being capable of identifying anomalies during the testing phase. This method is particularly suited for industrial settings where labeled anomalous samples are scarce but normal operation data is abundant. The distribution of anomalies in the test set reflects the typical rarity of defect occurrences in large-scale production environments, which adds to the challenge of accurate anomaly detection. The architecture of our model consists of a 9-layer TCN structure, with each layer containing 128 channels and a convolution kernel size of 11. We introduced partially randomly sampled dilated convolution kernels in the first two convolutional layers, with dilation factors set to 1 or 2. This design allows the model to capture richer data features while mitigating the blind spot issue in the receptive field. We used the Adam optimizer for model training, with an initial learning rate set to 0.001 and a dynamic adaptive learning rate adjustment strategy. During training, the batch size was set to 64, and the model was trained for 100 epochs.
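The hyperparameters listed above can be collected into a small configuration sketch; only the numbers come from the text, while KPTCN, train_loader, knowledge_penalty, and the loss wiring are hypothetical placeholders.

```python
import torch

config = {
    "num_layers": 9,        # 9-layer TCN backbone
    "channels": 128,        # channels per layer
    "kernel_size": 11,      # convolution kernel size
    "rs_layers": [0, 1],    # randomly sampled dilated kernels in the first two layers
    "rs_dilations": (1, 2), # dilation factors used for those layers
    "lr": 1e-3,             # Adam initial learning rate
    "batch_size": 64,
    "epochs": 100,
}

# Hypothetical training skeleton (KPTCN, train_loader, knowledge_penalty are placeholders):
# model = KPTCN(config["num_layers"], config["channels"], config["kernel_size"])
# optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
# scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)   # dynamic LR adjustment
# for epoch in range(config["epochs"]):
#     for x in train_loader:
#         loss = total_loss(x, model(x), knowledge_penalty(x))
#         optimizer.zero_grad(); loss.backward(); optimizer.step()
```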
First, we compare our method with the autoencoder (AE), TCN, MTCN, and several advanced Transformer-based methods, namely Autoformer, Informer, and FEDformer, without incorporating domain knowledge into these baselines. Then, we conduct an ablation study in Section 4.4 to demonstrate the effectiveness of the proposed partially random sampling mechanism and knowledge-enhanced regularization.
In the anomaly detection of lithium-ion batteries, Anomaly is defined as the positive class. The relevant indicator formulas are as follows:
Recall (REC): Also known as the True Positive Rate (TPR), it represents the proportion of actual anomalous samples that are correctly predicted as anomalous. The formula is Recall = TP / (TP + FN), where TP (True Positive) is the number of samples correctly predicted as anomalous, and FN (False Negative) is the number of samples that are actually anomalous but are mispredicted as normal.
Precision (PRE): It measures the proportion of samples predicted as anomalous that are actually anomalous. The formula is Precision = TP / (TP + FP), where FP (False Positive) is the number of samples that are actually normal but are mispredicted as anomalous.
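Both metrics follow directly from the confusion counts; a minimal sketch:

```python
def precision_recall(y_true, y_pred):
    """y_true, y_pred: iterables of 0 (normal) / 1 (anomalous)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy example: 3 true anomalies, 3 flagged, 2 of the flags correct.
print(precision_recall([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))   # approx. (0.667, 0.667)
```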

4.3. Main Results

We trained the proposed KP-TCN on the training set and then observed its performance on the test dataset. We first verify the convergence of the algorithm and simultaneously evaluate the inference efficiency of the model, then observe the performance of KP-TCN and comparison methods on the test dataset, and finally validate the effectiveness of the proposed partially random sampling DCKs strategy and knowledge-enhanced strategy through ablation experiments. In the subsequent experiments, all experimental results are based on the average value of five independent runs.

4.3.1. Convergence

During the training process, all models adopted a dynamic adaptive learning rate to adjust the magnitude of model parameter updates in each gradient descent iteration, aiming to achieve the optimal fitting state. The training loss observations of all models are shown in Figure 2. It can be seen that all models converged successfully.
We present the comparison results of online real-time detection time for different methods, as shown in Table 1. An Intel 10400 CPU was used for the computations during inference. The proposed KP-TCN not only maintains high detection performance but also demonstrates strong competitiveness in online real-time detection time, indicating that the proposed method has high practical value in actual industrial applications.

4.3.2. Detailed Results

We define the threshold α as the percentage of samples labeled as anomalies in each detection, and we observe the performance of KP-TCN and the comparative methods under different detection intensities by adjusting the value of α (a sketch of this flagging rule is given after the list below). Specifically:
1. When α equals 0.2, the number of samples labeled as anomalies in each detection is 1000 (20% of the 5000-sample test set).
2. When α is less than 0.2, the number of samples labeled as anomalies is smaller than the actual number of anomalies in the test set (1000).
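The flagging rule can be sketched as ranking samples by their anomaly score (e.g., reconstruction error) and labeling the top α fraction as anomalous; the scores below are random placeholders, not model outputs.

```python
import numpy as np

def flag_top_alpha(scores: np.ndarray, alpha: float) -> np.ndarray:
    """Label the alpha fraction of samples with the highest anomaly scores
    as anomalous (1) and the rest as normal (0)."""
    n_flag = int(round(alpha * len(scores)))            # e.g. 0.2 * 5000 = 1000
    flagged = np.zeros(len(scores), dtype=int)
    flagged[np.argsort(scores)[::-1][:n_flag]] = 1      # highest scores first
    return flagged

scores = np.random.rand(5000)              # placeholder anomaly scores
labels = flag_top_alpha(scores, alpha=0.2)
print(int(labels.sum()))                   # -> 1000 samples flagged as anomalous
```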
We provide detailed observations of the metrics, including precision and recall (TPR), as shown in Table 2. The performance changes of the models are observed by modifying the value of α. The comparison shows that our method achieves a recall of 100% when α = 35%, significantly outperforming the other methods. In particular, when α ≤ 5%, our method achieves higher PRE and REC, far exceeding the other methods. This indicates that even at a low detection intensity, our method maintains high precision and recall.
As shown in Figure 3, the PRE-REC curves are presented, where the horizontal axis represents recall and the vertical axis represents precision. Each curve indicates the precision of the model at the corresponding recall level. It can be observed that the proposed KP-TCN achieves the best performance, followed by TCN and MTCN.
Figure 4 shows the AUC-ROC curves, with the horizontal axis representing the False Positive Rate (FPR) and the vertical axis representing the True Positive Rate (TPR, i.e., recall). The AUC-ROC evaluates model performance by measuring the area under the ROC curve, where a higher AUC value indicates stronger detection capability. Notably, the proposed KP-TCN achieves the highest AUC score in this metric, followed by TCN and Autoformer.

4.4. Ablation Study

To demonstrate the effectiveness of the proposed partially random sampling dilated convolution mechanism and knowledge-enhanced regularization, we conducted the ablation experiments reported in Table 3. First, we compared the backbone model without knowledge enhancement (the partially randomly sampling TCN, PRS-TCN) with the original TCN to verify the effectiveness of the partially random sampling dilated convolution mechanism. Then, we compared the TCN with knowledge enhancement (KE-TCN) against the original TCN to validate the effectiveness of the proposed knowledge-enhanced regularization.
Importance of knowledge-enhanced regularization: The combination of the knowledge-enhanced regularization and the model training strategy in this study enables the model to learn richer intrinsic patterns of normal samples, thereby better performing the anomaly detection task. It can be seen that the proposed knowledge-enhanced regularization allows the original TCN to achieve a REC (Recall) of 100% when α = 40%, which demonstrates the effectiveness of the knowledge-enhanced regularization mechanism.
Importance of the partially randomly sampling mechanism: The partially random sampling mechanism addresses a shortcoming of the TCN by avoiding blind spots of the convolutional kernels. Experimental results show that the proposed method prevents the emergence of blind spots, keeping the TCN from ignoring critical features when capturing data characteristics and thereby enhancing the model’s performance. Notably, PRS-TCN achieves a REC (Recall) of 100% when α = 40%, significantly outperforming the TCN without this strategy, which demonstrates the effectiveness of the partially random sampling mechanism.

5. Conclusions

In this study, we propose a TCN-based model integrated with knowledge enhancement and a partially random sampling mechanism (KP-TCN), which has two main advantages:
1. Partially random sampling mechanism: This mechanism effectively avoids the blind-spot problem of convolutional kernels, preventing the TCN from missing critical features when capturing data patterns.
2. Knowledge-enhanced regularization: By leveraging prior knowledge, this component enables the model to learn richer intrinsic patterns from normal samples.
Experimental results demonstrate that the proposed method significantly outperforms comparative approaches. Additionally, ablation experiments validate the superiority of KP-TCN by confirming the effectiveness of its key components. In future work, we plan to apply the proposed method to more practical scenarios to validate its generalization capabilities. Additionally, we intend to explore more types of prior knowledge to further enhance the model’s performance.

Author Contributions

Conceptualization, methodology, software, data curation, writing, visualization, supervision, Z.L.; validation, investigation, project administration, funding acquisition, resources, Y.W.; formal analysis, J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported in part by the National Key Research and Development Program of China under Grant 2024YFB4709100; in part by the National Natural Science Foundation of China under Grants 62503478, 62425310, 62073321, 62473367; in part by National Defense Basic Scientific Research Program JCKY2019203C029; in part by Science and technology Innovation Project of China Academy of Chinese Medical Sciences CI2023C005YG; in part by the Science and Technology Development Fund, Macau SAR under Grants FDCT-22-009-MISE, 0060/2021/A2 and 0015/2020/AMJ; in part by the financial support from the National Defense Basic Scientific Research Project JCKY2020130C025.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wang, Y.; Bai, X.; Liu, C.; Tan, J. A multi-source data feature fusion and expert knowledge integration approach on lithium-ion battery anomaly detection. J. Electrochem. Energy Convers. Storage 2022, 19, 021003.
2. Wang, Y.; Tan, J.; Liu, Z.; Ditta, A. Lithium-ion Battery Screening by K-means with Dbscan for Denoising. CMC-Comput. Mater. Cont. 2020, 65, 2111–2122.
3. Pan, Y.; Kong, X.; Yuan, Y.; Sun, Y.; Han, X.; Yang, H.; Zhang, J.; Liu, X.; Gao, P.; Li, Y.; et al. Detecting the foreign matter defect in lithium-ion batteries based on battery pilot manufacturing line data analyses. Energy 2023, 262, 125502.
4. Dey, S.; Mohon, S.; Pisu, P.; Ayalew, B. Sensor fault detection, isolation, and estimation in lithium-ion batteries. IEEE Trans. Control Syst. Technol. 2016, 24, 2141–2149.
5. Liu, W.; Zhang, Z.; Zhao, Z.; Zhang, W. A Framework for Anomaly Cell Detection in Energy Storage Systems Based on Daily Operating Voltage and Capacity Increment Curves. Batteries 2025, 11, 316.
6. Schaeffer, J.; Lenz, E.; Gulla, D.; Bazant, M.Z.; Braatz, R.D.; Findeisen, R. Lithium-Ion Battery System Health Monitoring and Fault Analysis from Field Data Using Gaussian Processes. arXiv 2024, arXiv:2406.19015.
7. Shen, D.; Yang, D.; Lyu, C.; Ma, J.; Hinds, G.; Sun, Q.; Du, L.; Wang, L. Multi-sensor multi-mode fault diagnosis for lithium-ion battery packs with time series and discriminative features. Energy 2024, 290, 130151.
8. Boukerche, A.; Zheng, L.; Alfandi, O. Outlier detection: Methods, models, and classification. ACM Comput. Surv. 2020, 53, 1–37.
9. Zamanzadeh Darban, Z.; Webb, G.I.; Pan, S.; Aggarwal, C.; Salehi, M. Deep learning for time series anomaly detection: A survey. ACM Comput. Surv. 2024, 57, 1–42.
10. Schmidl, S.; Wenig, P.; Papenbrock, T. Anomaly detection in time series: A comprehensive evaluation. Proc. VLDB Endow. 2022, 15, 1779–1797.
11. Pang, G.; Shen, C.; Cao, L.; Hengel, A.V.D. Deep learning for anomaly detection: A review. ACM Comput. Surv. 2021, 54, 1–38.
12. Xia, X.; Pan, X.; Li, N.; He, X.; Ma, L.; Zhang, X.; Ding, N. GAN-based anomaly detection: A review. Neurocomputing 2022, 493, 497–535.
13. Li, X.; Wang, Q.; Xu, C.; Wu, Y.; Li, L. Survey of Lithium-Ion Battery Anomaly Detection Methods in Electric Vehicles. In IEEE Transactions on Transportation Electrification; IEEE: Piscataway, NJ, USA, 2024.
14. Dong, G.; Lin, M. Model-based thermal anomaly detection for lithium-ion batteries using multiple-model residual generation. J. Energy Storage 2021, 40, 102740.
15. Cheng, Z.; Wang, S.; Zhang, P.; Wang, S.; Liu, X.; Zhu, E. Improved autoencoder for unsupervised anomaly detection. Int. J. Intell. Syst. 2021, 36, 7103–7125.
16. Chow, J.K.; Su, Z.; Wu, J.; Tan, P.S.; Mao, X.; Wang, Y.H. Anomaly detection of defects on concrete structures with the convolutional autoencoder. Adv. Eng. Inform. 2020, 45, 101105.
17. Sun, C.; He, Z.; Lin, H.; Cai, L.; Cai, H.; Gao, M. Anomaly detection of power battery pack using gated recurrent units based variational autoencoder. Appl. Soft Comput. 2023, 132, 109903.
18. Zhao, H.; Zhang, C.; Liao, C.; Wang, L.; Liu, W.; Wang, L. Data-driven strategy: A robust battery anomaly detection method for short circuit fault based on mixed features and autoencoder. Appl. Energy 2025, 382, 125267.
19. Wang, X.; Du, Y.; Lin, S.; Cui, P.; Shen, Y.; Yang, Y. adVAE: A self-adversarial variational autoencoder with Gaussian anomaly prior knowledge for anomaly detection. Knowl.-Based Syst. 2020, 190, 105187.
20. Ossai, C.I.; Egwutuoha, I.P. Anomaly detection and extra tree regression for assessment of the remaining useful life of lithium-ion battery. In Proceedings of the International Conference on Advanced Information Networking and Applications, Caserta, Italy, 15–17 April 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 1474–1488.
21. Cheng, Z.; Sun, H.; Takeuchi, M.; Katto, J. Deep convolutional autoencoder-based lossy image compression. In Proceedings of the 2018 Picture Coding Symposium (PCS), San Francisco, CA, USA, 24–27 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 253–257.
22. Arbaoui, S.; Samet, A.; Ayadi, A.; Mesbahi, T.; Boné, R. Data-driven strategy for state of health prediction and anomaly detection in lithium-ion batteries. Energy AI 2024, 17, 100413.
23. Yin, C.; Zhang, S.; Wang, J.; Xiong, N.N. Anomaly detection based on convolutional recurrent autoencoder for IoT time series. IEEE Trans. Syst. Man Cybern. Syst. 2020, 52, 112–122.
24. Luo, D.; Wang, X. ModernTCN: A modern pure convolution structure for general time series analysis. In Proceedings of the Twelfth International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024; pp. 1–43.
25. Wu, H.; Xu, J.; Wang, J.; Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Adv. Neural Inf. Process. Syst. 2021, 34, 22419–22430.
26. Ilbert, R.; Odonnat, A.; Feofanov, V.; Virmaux, A.; Paolo, G.; Palpanas, T.; Redko, I. SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention. In Machine Learning Research, Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 21–27 July 2024; PMLR: New York, NY, USA, 2024; Volume 235, pp. 20924–20954.
27. Zhou, T.; Ma, Z.; Wen, Q.; Wang, X.; Sun, L.; Jin, R. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; PMLR: New York, NY, USA, 2022; pp. 27268–27286.
Figure 1. Framework of the proposed Knowledge-enhanced and Partially Randomly Sampling based Temporal Convolutional Network (KP-TCN). The original TCN includes causal convolutions and dilated convolutions, among which we incorporate random convolution kernels into the dilated convolutions. Additionally, we add a knowledge-based regularization term to the loss function to enhance the model’s capability.
Figure 2. The training losses of all models show that they all converged.
Figure 3. The Precision–Recall curves of the proposed KP-TCN and all comparative methods show that KP-TCN achieves the best performance.
Figure 4. The AUC-ROC curves of the proposed KP-TCN and all comparative methods show that KP-TCN achieves the best performance.
Table 1. Comparison of online real-time detection time (in seconds) for different methods.
| Method | AE | TCN | MTCN | Autoformer | Informer | FEDformer | KP-TCN |
| Time (s) | 16.4 | 7.6 | 7.1 | 23.1 | 231.3 | 27.8 | 7.9 |
Table 2. Detailed Experimental Results.
| Method | Metric | α = 1% | 2% | 3% | 4% | 5% | 10% | 15% | 20% | 25% | 30% | 35% | 40% | 45% | 50% |
| AE | PRE | 0.94 | 0.93 | 0.933 | 0.95 | 0.94 | 0.872 | 0.811 | 0.741 | 0.673 | 0.611 | 0.542 | 0.486 | 0.438 | 0.395 |
| AE | REC | 0.047 | 0.094 | 0.14 | 0.19 | 0.235 | 0.436 | 0.608 | 0.741 | 0.842 | 0.917 | 0.949 | 0.973 | 0.986 | 0.989 |
| TCN | PRE | 1 | 1 | 1 | 0.985 | 0.976 | 0.928 | 0.849 | 0.771 | 0.703 | 0.633 | 0.563 | 0.499 | 0.444 | 0.4 |
| TCN | REC | 0.05 | 0.1 | 0.15 | 0.197 | 0.244 | 0.464 | 0.637 | 0.771 | 0.879 | 0.95 | 0.985 | 0.999 | 1 | 1 |
| MTCN | PRE | 1 | 0.99 | 0.973 | 0.97 | 0.964 | 0.932 | 0.855 | 0.772 | 0.696 | 0.622 | 0.546 | 0.482 | 0.434 | 0.395 |
| MTCN | REC | 0.05 | 0.099 | 0.146 | 0.194 | 0.241 | 0.466 | 0.641 | 0.772 | 0.87 | 0.933 | 0.955 | 0.964 | 0.976 | 0.987 |
| Autoformer | PRE | 1 | 1 | 0.987 | 0.98 | 0.968 | 0.904 | 0.831 | 0.759 | 0.690 | 0.623 | 0.557 | 0.494 | 0.442 | 0.399 |
| Autoformer | REC | 0.05 | 0.1 | 0.148 | 0.196 | 0.242 | 0.452 | 0.623 | 0.759 | 0.863 | 0.934 | 0.975 | 0.987 | 0.995 | 0.999 |
| Informer | PRE | 1 | 1 | 0.993 | 0.985 | 0.972 | 0.9 | 0.824 | 0.733 | 0.646 | 0.583 | 0.53 | 0.48 | 0.436 | 0.399 |
| Informer | REC | 0.05 | 0.1 | 0.149 | 0.197 | 0.243 | 0.45 | 0.618 | 0.733 | 0.807 | 0.875 | 0.927 | 0.96 | 0.982 | 0.999 |
| FEDformer | PRE | 1 | 1 | 0.987 | 0.965 | 0.94 | 0.856 | 0.813 | 0.751 | 0.688 | 0.633 | 0.567 | 0.498 | 0.443 | 0.398 |
| FEDformer | REC | 0.05 | 0.1 | 0.148 | 0.193 | 0.235 | 0.428 | 0.61 | 0.751 | 0.86 | 0.949 | 0.993 | 0.996 | 0.997 | 0.997 |
| KP-TCN | PRE | 1 | 1 | 1 | 0.995 | 0.98 | 0.962 | 0.924 | 0.886 | 0.798 | 0.666 | 0.571 | 0.5 | 0.444 | 0.4 |
| KP-TCN | REC | 0.05 | 0.1 | 0.15 | 0.199 | 0.245 | 0.481 | 0.693 | 0.886 | 0.998 | 0.999 | 1 | 1 | 1 | 1 |
Table 3. Results of Ablation Study.
| Method | Metric | α = 1% | 2% | 3% | 4% | 5% | 10% | 15% | 20% | 25% | 30% | 35% | 40% |
| TCN | PRE | 0.94 | 0.93 | 0.933 | 0.95 | 0.976 | 0.928 | 0.849 | 0.772 | 0.703 | 0.6333 | 0.5629 | 0.499 |
| TCN | REC | 0.047 | 0.094 | 0.14 | 0.19 | 0.244 | 0.464 | 0.637 | 0.772 | 0.879 | 0.95 | 0.985 | 0.999 |
| PRS-TCN | PRE | 1 | 0.98 | 0.98 | 0.975 | 0.98 | 0.944 | 0.873 | 0.814 | 0.738 | 0.657 | 0.571 | 0.5 |
| PRS-TCN | REC | 0.05 | 0.098 | 0.147 | 0.195 | 0.245 | 0.472 | 0.655 | 0.814 | 0.922 | 0.986 | 0.999 | 1 |
| KE-TCN | PRE | 1 | 1 | 0.993 | 0.98 | 0.978 | 0.948 | 0.889 | 0.829 | 0.72 | 0.637 | 0.565 | 0.5 |
| KE-TCN | REC | 0.05 | 0.1 | 0.149 | 0.196 | 0.244 | 0.474 | 0.667 | 0.829 | 0.901 | 0.956 | 0.989 | 1 |
| KP-TCN | PRE | 1 | 1 | 1 | 0.995 | 0.98 | 0.962 | 0.924 | 0.886 | 0.798 | 0.666 | 0.571 | 0.5 |
| KP-TCN | REC | 0.05 | 0.1 | 0.15 | 0.199 | 0.245 | 0.481 | 0.693 | 0.886 | 0.998 | 0.999 | 1 | 1 |