1. Introduction
To enhance operational safety and ensure product quality, process monitoring and fault isolation have become critical components in modern industrial systems [1]. Over the past few decades, various techniques have been developed to address these tasks, among which multivariate statistical process monitoring (MSPM) has emerged as a mainstream approach due to its effectiveness in handling high-dimensional process data and uncovering abnormal behavior based on statistical patterns [2,3,4].
Conventional MSPM techniques, such as principal component analysis (PCA) and its variants, typically assume that process data follow a time-invariant and unimodal Gaussian distribution [5,6,7]. However, practical industrial processes often operate under multiple distinct modes due to variations in raw material properties, production specifications, throughput demands, and control strategies [8,9,10]. Under such multimodal conditions, the performance of standard MSPM approaches degrades significantly, as they fail to capture the complex distributional shifts and mode-dependent dynamics.
In response to these limitations, a growing body of research has explored multimode process monitoring methods. For instance, Li et al. proposed a local neighborhood standardization strategy to preprocess multimodal data [11]. Yu et al. adopted a finite Gaussian Mixture Model (GMM) combined with Bayesian inference for mode-wise monitoring [12]. Jiang et al. further integrated GMM with optimal principal component selection to enhance diagnosis capability in multimode processes [13]. More recent studies have extended Bayesian or sparse approaches, such as mode identification with transitional modeling [14], sparse principal component selection coupled with Bayesian inference-based probability [15], and Bayesian reconstruction strategies for specific applications like coal mills [16]. Cui et al. additionally proposed dimensionality-reducing GMM-based reconstruction combined with deep learning models for nonlinear multimode monitoring [17]. These methods demonstrate the effectiveness of combining GMM partitioning with local models or probabilistic inference, yet they still mainly focus on clustering and detection, with limited exploration of discriminative dictionary learning or variable-level fault isolation. Further, Kodamana et al. developed mixtures of probabilistic PCA for modeling and monitoring multimode dynamic processes [18]. Zhang et al. proposed a modified PCA algorithm, designated PCA-EWC, for monitoring multimode processes to overcome catastrophic forgetting of PCA for successive modes [19]. Although these approaches improve upon traditional MSPM in handling multimodal characteristics, they are still fundamentally rooted in variance-preserving dimensionality reduction and assume smooth Gaussian latent structures.
In addition, real-world process data may exhibit sparsity and non-Gaussianity that are poorly modeled by PCA-type methods. Sparse representations not only enhance interpretability but also improve generalization and robustness by emphasizing the most informative features [20,21]. Recent studies have demonstrated that sparse coding techniques offer considerable advantages in high-dimensional and heterogeneous environments [22]. Nevertheless, most current multimode monitoring frameworks do not explicitly exploit sparse structures or address non-Gaussian feature distributions. Furthermore, despite increasing interest in fault detection, few studies have systematically tackled the more challenging problem of fault isolation in multimode systems, where overlapping features and operating conditions complicate variable-wise interpretation. Inspired by developments in computer vision and signal processing [23,24,25,26], where dictionary learning has proven effective in extracting sparse and discriminative representations, this paper proposes a novel process monitoring approach based on label-consistent K-SVD (LC-KSVD) dictionary learning [27,28]. Unlike traditional MSPM models, LC-KSVD embeds supervision directly into the dictionary learning process, enabling the model to construct sparse representations that preserve both structural similarity and class-specific information.
Beyond industrial process monitoring, similar methodological challenges have been widely addressed in the domain of structural health monitoring (SHM). Recent advances have integrated machine learning, deep learning, and metaheuristic optimization with vibration-based analysis to enhance structural damage detection. For instance, hybrid approaches combining particle swarm optimization (PSO), radial basis functions, and emerging algorithms such as the YUKI method have been developed for accurate crack identification in composite beams, demonstrating improved robustness and convergence compared with classical PSO [29]. Other studies have explored optimized neural networks and bio-inspired algorithms for defect prediction in civil and mechanical systems [30,31], as well as deep learning-based frameworks for automated SHM tasks [32]. These developments, while originating in SHM, reflect parallel challenges to those faced in multimode process monitoring, reinforcing the relevance of designing monitoring frameworks that not only achieve high detection accuracy but also maintain interpretability and robustness [33,34].
In this context, the present study focuses on industrial process monitoring and aims to address these parallel challenges by integrating probabilistic mode partitioning, discriminative sparse representation, and interpretable fault isolation into a unified framework. This design ensures that the proposed method remains both accurate and practically interpretable under multimode operating conditions.
The proposed framework comprises three major components. First, a parallelized GMM-based mode segmentation method is employed to partition training data into coherent operating regimes. Second, an LC-KSVD model is trained using the labeled multimodal data, and monitoring statistics are derived from the reconstruction error under the learned sparse dictionary. Third, a novel fault isolation strategy based on reconstruction error and missing value estimation is developed, leveraging insights from missing data analysis and reconstruction-based contribution evaluation. Experimental results on both a numerical simulation and the widely studied Continuous Stirred Tank Heater (CSTH) benchmark confirm that the proposed method not only outperforms conventional techniques in fault detection accuracy and false alarm suppression but also provides interpretable and precise fault localization, even under complex multimodal conditions.
In summary, the contribution of this work lies in the development of an enhanced multimode monitoring framework that advances beyond existing approaches in several important aspects. GMM is improved through a parallel EM algorithm, which significantly accelerates convergence and makes the method scalable to high-dimensional process data. This improvement is further coupled with LC-KSVD dictionary learning, allowing the monitoring model to capture discriminative and interpretable sparse representations across different operating modes. On this basis, we design a reconstruction-error-guided isolation mechanism that extends sparse dictionary learning from fault detection to variable-level fault localization under multimode conditions. By combining these elements into a unified framework, the proposed method achieves efficient mode partitioning, accurate detection, and interpretable fault isolation.
The remainder of this paper is organized as follows: Section 2 details the methodological foundation, including the enhanced GMM for mode identification and the LC-KSVD algorithm for sparse dictionary learning and monitoring. Section 3 introduces the monitoring and fault isolation procedures under the proposed framework. Section 4 presents two case studies to validate the performance of the proposed method. Finally, Section 5 concludes the paper and discusses potential avenues for future research.
3. Process Monitoring and Fault Detection Procedure
Define the training data as , consisting of c modes, where each represents data from the i-th mode with samples. The procedure for process monitoring using LC-KSVD follows a workflow analogous to PCA-based monitoring.
LC-KSVD models are trained using multimode normal condition data. A statistical model of normal conditions is constructed by evaluating the monitoring statistic derived from the reconstruction error of the LC-KSVD model. Once the model is built, the statistics of test data can be calculated and used to detect abnormalities based on control limits.
The reconstruction error is defined as
To determine whether a test sample contains fault information, control limits for the statistic must be predefined. A kernel density estimator (KDE) [38] is used to approximate the probability density function of the reconstruction error,
$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right),$$
where $x$ is the data under consideration, $x_i$ is a sample value from the dataset, $n$ is the number of samples, $h$ is the window width, and $K(\cdot)$ is the kernel function.
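As a minimal sketch of how a KDE-based control limit could be computed, the following Python snippet estimates the density of the normal-condition reconstruction errors and reads off a quantile of the resulting distribution. The Gaussian kernel, Silverman's rule-of-thumb bandwidth, the 0.99 confidence level, and the function names `kde_pdf` and `kde_control_limit` are assumptions made for illustration; none of them is specified in the text.

```python
import numpy as np


def kde_pdf(x, samples, h):
    """Gaussian-kernel density estimate of `samples`, evaluated at points `x`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    u = (x[:, None] - samples[None, :]) / h  # scaled distances to every sample
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (samples.size * h)


def kde_control_limit(samples, confidence=0.99, h=None, grid_size=2000):
    """Control limit = `confidence`-quantile of the KDE fitted to `samples`."""
    samples = np.asarray(samples, dtype=float)
    if h is None:
        # Assumed default: Silverman's rule-of-thumb window width.
        h = 1.06 * samples.std(ddof=1) * samples.size ** (-0.2)
    grid = np.linspace(samples.min() - 4.0 * h, samples.max() + 4.0 * h, grid_size)
    pdf = kde_pdf(grid, samples, h)
    # Numerically integrate the density to obtain a normalized CDF on the grid.
    cdf = np.cumsum(pdf) * (grid[1] - grid[0])
    cdf /= cdf[-1]
    idx = min(int(np.searchsorted(cdf, confidence)), grid_size - 1)
    return float(grid[idx])
```

In use, `samples` would be the reconstruction errors of the normal training data, and a test sample would be flagged when its statistic exceeds the returned limit.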
The control limit is determined by a prescribed quantile of the estimated distribution. In summary, the LC-KSVD-based monitoring procedure consists of the following steps:
(1) Acquire multimode normal operation data with c modes.
(2) Normalize the data using the mean and standard deviation of each variable.
(3) Train the LC-KSVD model on normal data using the K-SVD algorithm.
(4) Construct the reconstruction error set from normal operation data.
(5) Determine control limits using KDE.
(6) Solve the sparse coding optimization problem using FISTA [39], where the regularization parameter balances reconstruction fidelity and sparsity. Larger values enforce sparser codes, while smaller values allow denser representations.
(7) Compute the statistic Re of the new data and determine whether a fault has occurred by comparing it with the threshold.
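The online part of this procedure (sparse coding with FISTA, then thresholding the reconstruction error) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the objective is taken to be the standard l1-regularized least-squares form 0.5·||x − Da||² + λ||a||₁, the statistic is taken to be the squared residual norm, and the names `fista_lasso`, `reconstruction_error`, and `is_faulty` are hypothetical. The dictionary `D` is assumed to have been learned beforehand, and `x` is assumed to be already normalized.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fista_lasso(D, x, lam, n_iter=200):
    """Approximately minimize 0.5*||x - D a||_2^2 + lam*||a||_1 with FISTA."""
    L = max(np.linalg.norm(D, 2) ** 2, 1e-12)  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    y = a.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ y - x)
        a_next = soft_threshold(y - grad / L, lam / L)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = a_next + ((t - 1.0) / t_next) * (a_next - a)  # momentum step
        a, t = a_next, t_next
    return a


def reconstruction_error(D, x, lam):
    """Assumed monitoring statistic: squared norm of the sparse-coding residual."""
    residual = x - D @ fista_lasso(D, x, lam)
    return float(residual @ residual)


def is_faulty(D, x, lam, limit):
    """Flag a sample whose statistic exceeds the control limit."""
    return reconstruction_error(D, x, lam) > limit
```

A larger `lam` shrinks more coefficients to zero, which matches the trade-off between reconstruction fidelity and sparsity described in step (6).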
After detecting a fault in a sample, it is essential to isolate the fault source. Inspired by reconstruction-based contribution analysis and missing value analysis in multivariate statistics, we propose a fault isolation strategy.
Assume
is a faulty sample, and its revised version is given by
where
is a direction vector with 1 at the
i-th index and 0 elsewhere, and
is the revised amplitude.
The corresponding reconstruction error is defined as
If the correct fault direction and amplitude are identified, it is possible to achieve . For each dimension, we assume the associated variable to be faulty and remove it. If the reconstruction error drops below the threshold, that variable is likely the fault source.
Hence, Equation (19) is reformulated as
where
is a direction vector with the
i-th element set to 0 and all others set to 1.
This optimization problem can be solved using the FISTA algorithm. The full fault isolation procedure is summarized in Algorithm 1.
Algorithm 1 Fault Isolation for LC-KSVD Process Monitoring
1: Initialize:
2: for each do
3:   Initialize , where for , otherwise
4:   Solve Equation (20) using FISTA, add to set Res
5:   Calculate and corresponding index i, add i to fs
6:   if or then return fs
7:   else
8:     Update , reset Res
9:   end if
10: end for
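One possible reading of the isolation procedure is sketched below. Each candidate variable is masked out in turn, the sparse code is re-estimated from the remaining variables, and the variable whose removal gives the smallest masked reconstruction error is added to the fault set `fs`; the loop stops once that error falls below the control limit. The masked l1-regularized least-squares objective, the stopping rule, the `max_vars` cap, and the function names are assumptions for illustration, since the exact formulation of Equation (20) is not reproduced here.

```python
import numpy as np


def fista_lasso(D, x, lam, n_iter=200):
    """Approximately minimize 0.5*||x - D a||_2^2 + lam*||a||_1 with FISTA."""
    L = max(np.linalg.norm(D, 2) ** 2, 1e-12)  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    y = a.copy()
    t = 1.0
    for _ in range(n_iter):
        v = y - D.T @ (D @ y - x) / L
        a_next = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = a_next + ((t - 1.0) / t_next) * (a_next - a)
        a, t = a_next, t_next
    return a


def isolate_fault(D, x, lam, limit, max_vars=None):
    """Greedy variable-level isolation based on masked reconstruction error."""
    n_vars = x.size
    max_vars = n_vars - 1 if max_vars is None else max_vars
    fs = []  # indices of variables judged faulty so far
    while len(fs) < max_vars:
        res = {}
        for j in range(n_vars):
            if j in fs:
                continue
            keep = np.ones(n_vars, dtype=bool)
            keep[fs + [j]] = False  # treat these variables as missing
            a = fista_lasso(D[keep], x[keep], lam)
            r = x[keep] - D[keep] @ a
            res[j] = float(r @ r)
        best = min(res, key=res.get)  # removal that best explains the fault
        fs.append(best)
        if res[best] <= limit:
            break
    return fs
```

Masking rows of both `D` and `x` is equivalent to multiplying the residual by a direction vector that is 0 at the suspected variables and 1 elsewhere, which is the construction the text describes.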
Compared with prior multimode sparse coding and GMM–Bayesian methods, which primarily emphasize mode partitioning and global fault detection [12,13,15,16,17], the proposed framework goes a step further by explicitly coupling mode identification with discriminative dictionary learning and extending the monitoring scope to variable-level fault isolation. Specifically, the parallel EM-based GMM ensures scalable and robust mode separation, while LC-KSVD leverages label consistency to construct mode-aware sparse representations that enhance discriminability between normal and faulty conditions. Building upon these representations, a reconstruction-error-guided strategy provides interpretable diagnostics by localizing the contribution of individual variables to detected faults. This joint design addresses the limitations of previous multimode approaches, which often lack either discriminative modeling capability or transparent fault isolation, thereby offering a more comprehensive solution for process monitoring and fault detection.
4. Case Studies
In this section, the performance of the proposed method is evaluated using both a numerical simulation and the Continuous Stirred Tank Heater (CSTH) process. The LC-KSVD method is compared with traditional FGMM [12] and LNS-PCA [40] algorithms. To quantitatively assess the performance of these approaches, two widely used indicators are adopted: the fault detection rate (FDR) and the false alarm rate (FAR), defined as follows:
where
denotes the number of samples in set
x, and
f is the fault indicator.
In addition to FDR and FAR, we also report the mean squared error (MSE) [41] of reconstruction as a supplementary measure of model accuracy.
Since the proposed framework relies on sparse representation and reconstruction, MSE provides a direct evaluation of how well the monitoring model captures normal process behavior. A lower reconstruction error on normal data indicates that deviations caused by faults will yield more distinguishable residuals, thereby enhancing fault detection sensitivity.
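For concreteness, the three indicators can be computed as in the sketch below. It assumes the common convention that FDR is the fraction of truly faulty samples that raise an alarm and FAR is the fraction of normal samples that raise an alarm, with an alarm defined as the statistic exceeding the control limit; the function names and argument layout are illustrative.

```python
import numpy as np


def detection_rates(statistic, limit, is_fault):
    """Return (FDR, FAR) for a monitoring statistic and a fault indicator."""
    statistic = np.asarray(statistic, dtype=float)
    is_fault = np.asarray(is_fault, dtype=bool)
    alarm = statistic > limit
    fdr = alarm[is_fault].mean()   # alarms among faulty samples
    far = alarm[~is_fault].mean()  # alarms among normal samples
    return float(fdr), float(far)


def reconstruction_mse(X, X_hat):
    """Mean squared error between data X and its reconstruction X_hat."""
    X = np.asarray(X, dtype=float)
    X_hat = np.asarray(X_hat, dtype=float)
    return float(np.mean((X - X_hat) ** 2))
```

Both rates require at least one faulty and one normal sample in the test set, which holds for the test sequences used in the case studies (a normal segment followed by a faulty segment).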
4.1. Simulation Case
To demonstrate the effectiveness of the proposed method, a numerical simulation model with five variables is considered. The system is formulated as
where
are zero-mean white noise with a standard deviation of 0.01, and
are two dependent latent variables (data sources). The process operates in three modes, defined as
where
denotes the Gaussian distribution with mean
and variance
, and
represents a uniform distribution within the interval
.
For the training phase, 1200 normal samples (400 per mode) are generated according to Equation (24). These samples are used to train the monitoring models. For testing, 1000 samples are generated for each fault scenario, with the first 400 being normal and the remaining 600 containing faults. The specific fault settings are detailed in Table 1. In the simulation case, the dictionary size and sparsity level were set to 200 and 1, respectively. The regularization parameters in the optimization problem in Equation (18) were selected through cross-validation on the training data, resulting in
,
, and
. The reconstruction accuracy of the proposed method was evaluated using MSE on the normalized training data. The results show that the three operating modes yield consistently low MSE values of 0.078, 0.082, and 0.075, respectively. These small errors indicate that the learned dictionaries accurately capture the intrinsic structure of each mode, thereby ensuring reliable reconstruction of normal process behavior. Such high reconstruction precision provides a solid basis for distinguishing deviations caused by faults and confirms the suitability of the proposed framework for multimode monitoring.
The fault detection performance was quantitatively evaluated using the FDR and FAR defined in Equations (21) and (22), respectively. Comparative results for LC-KSVD, LNS-PCA (SPE and T² statistics), FGMM (BIP index) [12], DAE [42], MultiMode PCA [43], and VAE [44] are summarized in Table 2 and Table 3. As shown in Table 2, the proposed LC-KSVD approach achieves the highest sensitivity, with perfect detection rates (FDR = 1.0) for Faults 1 and 2 and 0.965 for Fault 3. In contrast, PCA-based indices show significant performance degradation under multimodal distributions, while the neural network-based and GMM-based methods remain consistently inferior. Table 3 further indicates that LC-KSVD also provides the lowest false alarm rates across all three fault scenarios, with values at or below 0.0125.
The monitoring curves in Figure 1, Figure 2 and Figure 3 provide a visual confirmation of these quantitative results. After the onset of each fault, the reconstruction-error statistic of LC-KSVD promptly exceeds the control limit, with clearer separability between normal and faulty samples than the baseline approaches. These results collectively demonstrate that the proposed LC-KSVD framework not only improves detection sensitivity but also achieves more reliable fault discrimination under complex multimode operating conditions. Moreover, Figure 4 and Figure 5 compare the original and reconstructed feature space representations for Faults 1 and 3, respectively. The reconstruction preserves essential features while filtering noise, thereby enhancing robustness. In addition to fault detection, the capability of the proposed method to isolate the source of the fault was also evaluated. Figure 6 presents the fault isolation result for Fault 2 using the reconstruction-error-based variable contribution analysis. The contribution values of each process variable are computed after fault occurrence. These results indicate that the proposed LC-KSVD model accurately attributes the fault to the correct variable, even in the presence of multimodal variation. The result further confirms that the reconstruction-error-guided fault localization strategy is effective and interpretable.
Fault isolation performance is further depicted in Figure 7, Figure 8 and Figure 9, where the reconstruction-error-based fault contribution method accurately identifies the faulty variables. In particular, the proposed method exhibits strong discrimination capability even for mild or slowly evolving faults such as Fault 3, where other methods tend to fail due to insufficient feature sensitivity.
In summary, the simulation case study confirms that the LC-KSVD-based monitoring scheme not only achieves high detection sensitivity and low false alarm rates across various multimode conditions but also effectively localizes fault variables through its reconstruction-error-guided isolation strategy. These results collectively verify the applicability and superiority of the method for complex industrial process monitoring scenarios.
4.2. Continuous Stirred Tank Heater Process Case
In this section, the proposed LC-KSVD-based method is applied to the Continuous Stirred Tank Heater (CSTH) process. The CSTH simulation platform, originally proposed by Thornhill [45], is a widely used benchmark in process monitoring research due to its realistic modeling of heat and volumetric balances. It has been extensively adopted for performance comparison of various fault detection and isolation techniques. As illustrated in Figure 10, hot and cold water streams are fed into a stirred vessel. The mixture is heated to a target temperature using steam and then discharged from the bottom of the tank. The process is governed by multiple Proportional-Integral (PI) control loops. The manipulated variables include hot water flow rate, cold water flow rate, and steam valve position, while the measured outputs comprise liquid level, temperature, and the flow rates of hot and cold water.
Detailed mode configurations used in this study are presented in Table 4, and three experimental fault cases designed for the CSTH process are described in Table 5. During the training stage, 800 normal samples are collected for each mode to construct the monitoring model, with an equal number of samples per mode. For testing, 400 samples are collected for each case, where the first 100 samples represent normal operation and the remaining 300 samples contain faults. In this case study, the dictionary size is set to 200, and the sparsity level is constrained to 1 to ensure compact and interpretable sparse representations. Similarly, the parameters of the proposed method were determined using cross-validation, with
,
, and
. For the training set, the reconstruction performance was further assessed across four operating modes. The proposed method achieved MSE values of 0.092, 0.054, 0.083, and 0.074, respectively. These results demonstrate that the framework can maintain consistently low reconstruction errors across different modes, even in the presence of complex dynamics. The particularly small MSE in Mode 2 (0.054) highlights the ability of the model to capture mode-specific characteristics with high fidelity. Overall, the low reconstruction errors across all modes confirm the robustness and precision of the proposed approach in modeling multimode industrial processes, which directly contributes to its superior fault detection capability.
The fault detection performance for the CSTH benchmark is quantitatively evaluated in terms of FDR and FAR, as reported in Table 6 and Table 7. The proposed LC-KSVD method achieves perfect detection for Faults 1 and 2 (FDR = 1.0, FAR = 0), clearly outperforming all baseline approaches. For Fault 1, which corresponds to a multiplicative deviation in the temperature measurement under Mode 3, LNS-PCA (SPE) and MultiMode PCA reach FDRs of 0.887 and 0.900, respectively, while DAE and VAE perform slightly better with 0.923 and 0.943. In contrast, FGMM exhibits very poor sensitivity with an FDR of only 0.107. LC-KSVD, however, consistently detects all faulty samples without false alarms. For Fault 2, which involves a level bias across Modes 1 and 2, most methods yield relatively high detection rates (e.g., LNS-PCA (T²) and FGMM both achieve 1.0, while DAE and VAE reach 0.863 and 0.910, respectively), but several of them suffer from reduced performance under certain modes, and only LC-KSVD maintains both perfect FDR and zero FAR.
The monitoring results in Figure 11 and Figure 12 provide further evidence of these advantages. After the fault onset at sample 101, the reconstruction-error statistic of LC-KSVD promptly and distinctly exceeds the control limit while remaining below it during normal operation, which indicates high sensitivity with minimal over-detection. The other methods either respond more slowly, show larger fluctuations, or produce false alarms. Figure 13 and Figure 14 show the comparison between original and reconstructed signals under fault conditions. The LC-KSVD reconstruction effectively retains dominant process features while filtering out noise and irrelevant variation, thereby improving the clarity and interpretability of the monitoring signals. Figure 15 and Figure 16 demonstrate the fault isolation results for both cases. The proposed reconstruction-error-guided contribution analysis successfully identifies the faulty variables (temperature and level, respectively), even under strong mode-dependent process behavior. This case confirms that the LC-KSVD framework not only detects faults accurately but also provides valuable interpretability through its sparse representation and variable contribution structure.
For Fault 3, which introduces a random disturbance to the cold-water flow under Modes 1 and 3, the performance gap among methods becomes more evident. As shown in Table 6, LC-KSVD again achieves perfect detection (FDR = 1.0, FAR = 0), whereas the other approaches display noticeable degradation. For example, LNS-PCA (SPE) and MultiMode PCA (T²) report FDRs of 0.580 and 0.673, respectively, while DAE and VAE improve sensitivity to 0.860 and 0.930 but still fall short of LC-KSVD. FGMM shows moderate performance with 0.850 FDR but suffers from higher false alarms. The monitoring curves in Figure 17 confirm that LC-KSVD promptly captures the fault onset at sample 101 with a clear and stable separation from the control limit, whereas alternative methods exhibit delayed or fluctuating responses. Moreover, Figure 18 highlights that the reconstruction step in LC-KSVD enhances detectability relative to the original signals, while Figure 19 shows that the proposed contribution analysis correctly isolates the cold-water flow variable as faulty.
5. Conclusions
This study presents a multimode process monitoring framework that combines a parallelized GMM for mode separation with LC-KSVD dictionary learning for fault detection and isolation. The proposed method addresses key challenges in industrial settings, including high dimensionality, overlapping modes, and subtle fault signatures. By leveraging the strengths of probabilistic mode clustering and supervised sparse representation, the method not only improves monitoring accuracy but also enhances interpretability through contribution-based fault localization. Experiments on a numerical simulation and the CSTH benchmark confirm the advantage of the proposed approach in terms of detection sensitivity, false alarm suppression, and fault isolation accuracy. Compared with conventional methods, the LC-KSVD framework achieves near-perfect fault detection rates while maintaining zero or near-zero false alarms across all tested scenarios. Additionally, the reconstruction-error-based analysis enables precise identification of faulty variables under multimodal dynamics. Overall, the proposed monitoring strategy demonstrates strong potential for deployment in industrial applications requiring robust and explainable fault diagnosis under diverse and evolving operating conditions.
Although the proposed framework demonstrates strong performance in both the numerical and CSTH case studies, several limitations should be acknowledged. As a supervised dictionary learning method, LC-KSVD may be sensitive to mislabeled training data, which could compromise the discriminative power of the learned representations. In addition, while the parallel EM and sparse coding steps improve efficiency, the scalability to very high-dimensional processes with hundreds of sensors or very long time-series data still requires further validation. Future research will focus on incorporating robust or semi-supervised strategies to alleviate the impact of noisy labels, conducting more systematic evaluations on gradual and non-stationary faults, and exploring hybrid schemes that integrate LC-KSVD with deep learning architectures to enhance scalability and representation power in large-scale industrial applications.