A Statistical Modeling and Monitoring Framework for Dynamic Processes Based on Knowledge Graph and Dissimilarity Analysis

Hao, Yunhan; Zhu, Shanliang

doi:10.3390/math14122047


by Yunhan Hao ¹ and Shanliang Zhu ²,*

¹ School of Mathematics and Statistics, Qingdao University, Qingdao 266071, China

² School of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(12), 2047; https://doi.org/10.3390/math14122047 (registering DOI)

Submission received: 18 April 2026 / Revised: 2 June 2026 / Accepted: 5 June 2026 / Published: 8 June 2026

(This article belongs to the Special Issue AI-Based and Data-Driven Modeling and Control: Mathematical Methods and Industrial Applications)


Abstract

Dynamic industrial processes often exhibit complex variable interactions and time-varying behaviors, which pose significant challenges to conventional multivariate statistical monitoring methods. To address these issues, this paper proposes a novel data-driven monitoring framework that integrates knowledge-informed bipartite graph embedding with multi-scale dissimilarity analysis. First, a bipartite graph-embedding strategy is developed to incorporate mechanistic knowledge into the modeling process, enabling a more interpretable representation of dynamic relationships among process variables. On this basis, a multi-scale recursive dissimilarity monitoring method is designed to enhance detection performance by capturing process variations across different temporal scales while reducing sensitivity to sliding window selection. The effectiveness of the proposed framework is validated through a numerical example and a benchmark simulation process. The results demonstrate that the proposed method achieves improved fault detection performance and robustness compared with conventional approaches.

Keywords:

dynamic industrial processes; knowledge-informed bipartite graph; mechanistic knowledge; process monitoring; multi-scale dissimilarity analysis

MSC:

62P30

1. Introduction

Safe operation of industrial systems has long been a major concern, driving the rapid development of process monitoring theories and technologies. After years of advancement, significant theoretical and practical achievements have been made in industrial process monitoring research, with data-driven industrial process monitoring methods now becoming mainstream [1,2,3,4,5].

As an important branch of data-driven industrial process monitoring, multivariate statistical process monitoring has drawn considerable attention because of its good model interpretability. Studies in this area cover various challenges in industrial processes, including non-Gaussian problems [6], nonlinear problems [7], nonstationary problems [8], dynamic problems [9], and their combined variants [10,11,12]. In addition, notable progress has been made in areas such as distributed process monitoring [13], batch process monitoring [14], and incipient fault detection [15].

This study focuses on the modeling and state monitoring of dynamic industrial processes. Currently, studies on dynamic process monitoring based on multivariate statistical theory mainly employ two types of modeling approaches. The first comprises sample matrix augmentation methods, such as dynamic principal component analysis (DPCA) [16], recursive dynamic component analysis [17], and auto-correlative feature analysis [9]. The second is based on autoregressive (AR) models, which can be implemented either in the original variable space or in the latent variable space. For instance, two-step PCA (TS-PCA) [18] is a typical AR-based method. Ma et al. [19] proposed a recursive innovation component statistical analysis method, which divides a dynamic system into dynamic and innovation components by constructing an autoregressive model. Both of these methods characterize the dynamic relationships among variables by building AR models in the original variable space. Furthermore, Dong et al. [20] introduced a novel dynamic inner PCA (DiPCA) that captures dynamic relations using latent variables. However, both kinds of dynamic modeling methods assume a uniform time delay for all process variables and neglect the sparse nature of dynamic structures (i.e., not all variable pairs exhibit dynamic coupling), which may lead to poor model generalization and high computational complexity.

To enhance model generalization capability and reduce computational complexity, modeling methods based on sparse constraints have attracted widespread attention. Common approaches include the Least Absolute Shrinkage and Selection Operator (LASSO) constraint and joint sparse constraints. For instance, Yan et al. [21] proposed a robust PCA (RPCA) to effectively suppress noise interference and applied it to process monitoring. Zhang et al. [22] introduced a self-learning PCA method for monitoring multimodal industrial processes. Yu et al. [23] developed a sparse modeling method based on distribution dissimilarity analysis (DISSIM) for fault detection and diagnosis in industrial systems. For distributed process monitoring, Sun et al. [24] proposed a modeling method based on

L_{2, 1}

-norm joint sparse constraints. Furthermore, for quality-related process monitoring, Xiu et al. [25] introduced a modeling approach using

L_{2, 0}

-norm joint sparse constraints. Although these methods were originally proposed for static processes, they can be extended to their dynamic counterparts. In research on dynamic process monitoring with sparse constraints, Zhang et al. [26] developed a sparse DiPCA (SDiPCA) for monitoring multimodal nonlinear processes. Zhang et al. [27] proposed a variational Bayesian sparse PCA that employs sparse autoregressive analysis to characterize correlations among dynamic latent variables. However, the aforementioned sparse-constrained studies primarily focus on reducing computational complexity and do not regularize the sparse constraints from the perspective of dynamic structure. This may lead to degraded modeling accuracy, thereby affecting monitoring performance.

In recent years, graph learning-based modeling methods have gained increasing attention [28]. These approaches can capture the intrinsic structure of data and have thus been widely applied in fields such as large-scale data analysis and text clustering [29,30]. As a special type of graph, a bipartite graph is particularly suitable for representing relationships between two distinct sets of entities. Inspired by this, Cui et al. [31] were the first to propose a bipartite graph-based sparse dynamic matrix estimation method (SDMEM-BG) to capture temporal correlations between variables (i.e., the relationship between the current and past states of a dynamic process), which decouples the identification of a dynamic process into the identification of the dynamic structure and the strength of dynamic correlations. However, the graph structure matrix in SDMEM-BG is constructed in a purely data-driven manner, without considering additional prior information such as expert knowledge and physical connectivity among process variables. As a result, the learned graph structure may lack sufficient physical interpretability and credibility, especially in complex industrial systems with strong mechanistic characteristics. Moreover, the graph construction strategy in SDMEM-BG is relatively inflexible since it mainly relies on statistical correlations extracted from process data.

To overcome these limitations, this study proposes a knowledge-informed bipartite graph embedding (KBGE) framework, where the graph structure matrix can be flexibly constructed by integrating multiple sources of prior information, including expert knowledge, physical connections, and process topology information. Compared with existing sparse graph learning-based dynamic monitoring methods, the proposed framework provides a more interpretable and physically meaningful representation of dynamic process structures, thereby improving both modeling accuracy and model reliability.

In addition, existing methods based on DISSIM and their variants perform process monitoring by detecting changes in sample distribution within a sliding window [23,32,33]. However, the selection of window width has been a limiting factor for further improving monitoring performance. Furthermore, existing recursive monitoring methods in graph-based dynamic monitoring frameworks are generally developed under a single-scale sliding window structure, which may be insufficient for capturing process variations occurring at different temporal scales. To address this issue, this paper develops a recursive dissimilarity analysis method based on a multi-scale sliding window (MSSW), which ensures flexibility in window width selection while avoiding excessively high computational complexity.

Compared with conventional single-scale recursive monitoring strategies, the proposed MSSW-based monitoring framework can simultaneously characterize both short-term and long-term dynamic variations, thereby improving monitoring robustness and fault sensitivity under complex operating conditions.

Despite significant progress in dynamic process monitoring, several challenges remain unaddressed. Existing dynamic modeling methods often assume uniform time delays among all process variables, which may limit model generalization and increase computational complexity. Sparse-constrained approaches primarily focus on reducing computational cost but generally overlook the regularization of dynamic structures, potentially leading to degraded modeling accuracy. Graph-based methods, such as SDMEM-BG, construct dynamic structures purely from data, without incorporating prior knowledge or mechanistic insights, resulting in limited interpretability and credibility. Moreover, conventional single-scale recursive monitoring strategies are insufficient for capturing process variations occurring across multiple temporal scales, restricting fault detection performance in complex industrial systems.

To address these limitations, this study proposes a knowledge-informed bipartite graph-embedding framework for dynamic process monitoring, which integrates multiple sources of prior information, including expert knowledge, physical connections, and process topology. In addition, a multi-scale sliding window recursive dissimilarity analysis method is developed to flexibly capture both short-term and long-term process variations. Compared with existing methods, the proposed framework provides a more interpretable and physically meaningful representation of dynamic process structures, improves modeling accuracy, and enhances monitoring robustness under complex operating conditions.

The main contributions of this study are summarized as follows:

(1) Knowledge-informed bipartite graph modeling for dynamic processes: A novel bipartite graph-embedding framework is developed to incorporate mechanistic knowledge into data-driven modeling. This approach provides a structured and interpretable representation of dynamic relationships among process variables, thus improving modeling accuracy and enhancing physical interpretability. Unlike existing graph learning methods such as SDMEM-BG, the proposed framework allows the graph structure matrix to be flexibly constructed by integrating expert knowledge, physical connectivity, and process topology information, improving both the reliability and credibility of the learned dynamic structure.
(2) Multi-scale recursive dissimilarity monitoring strategy: A multi-scale recursive monitoring method is proposed to address the sensitivity of conventional DISSIM-based approaches to sliding window selection. By capturing process variations across multiple temporal scales, the proposed method improves fault detection robustness while reducing computational complexity. Compared with existing single-scale recursive monitoring methods, the proposed MSSW-based strategy achieves more comprehensive characterization of dynamic process behaviors and enhances monitoring performance for complex industrial systems with multi-scale temporal dynamics.

2. Background

2.1. Dynamic Process Modeling Method Based on AR Model

Let the dynamic process variables be

x

. m represents the dimension of the process variables, and

x \in R^{m \times 1}

. In industrial processes, dynamic behavior is typically characterized by temporal dependencies, meaning the current states of process variables are correlated with their past states. This relationship is commonly captured using an AR model, as supported by [18,19,34]:

\begin{aligned} x (t) & = A_{1} x (t - 1) + A_{2} x (t - 2) + \dots + A_{q} x (t - q) + u (t) \\ & = \hat{A} \hat{x} (t) + u (t), \end{aligned}

(1)

where q represents the time delay of the dynamic process model,

\hat{A} = [A_{1}, A_{2}, \dots, A_{q}] \in R^{m \times m q}

and

\hat{x} (t) = {[x^{T} (t - 1), x^{T} (t - 2), \dots, x^{T} (t - q)]}^{T} \in R^{m q \times 1}

.

In Equation (1), the coefficient matrices

A_{1}, A_{2}, \dots, A_{q}

describe the dynamic influence of historical variables on the current variables. Specifically,

A_{i}

characterizes how the process state at time

t - i

contributes to the current state at time t. Each element in

A_{i}

can be interpreted as the influence strength from one historical process variable to one current process variable. Therefore,

\hat{A}

can be regarded as a dynamic matrix that collects the temporal relationships among process variables over different time delays.

From an intuitive perspective, Equation (1) decomposes the current process state into two parts. The first part,

\hat{A} \hat{x} (t)

, is the dynamic component that can be predicted from historical process information. It reflects the temporal propagation and coupling relationships within the process. The second part,

u (t)

, is the innovation component, which represents the newly generated information at the current time instant that cannot be explained by past process states. This term may contain external disturbances, unmodeled variations, measurement noise, or other instantaneous changes. In this sense, the AR model separates the predictable dynamic behavior from the remaining innovation/static component. Here, the term “static component” does not mean that the component is constant over time; rather, it refers to the part that remains after removing the predictable dynamic effect and is therefore analyzed from the perspective of instantaneous variation. Importantly,

u (t)

is assumed to be independent of any past process variable

x (t - k)

and any past innovation

u (t - k)

(

k \geq 1

).

Typically, the expectation of

u (t)

in Equation (1) may be non-zero due to steady-state offsets or constant operating levels in industrial processes. Therefore, a centering operation is necessary before monitoring. To achieve this, the difference between two process states separated by a time interval

τ

is considered. Here,

τ

denotes the differencing interval, or equivalently the number of sampling steps between two compared process states. When

τ = 1

, the model focuses on the change between two consecutive samples; when

τ

is larger, the model captures process variations over a longer temporal interval.

Based on Equation (1), the differenced form can be derived as:

\begin{aligned} Δ x (t, τ) & = x (t) - x (t - τ) \\ & = \hat{A} (\hat{x} (t) - \hat{x} (t - τ)) + (u (t) - u (t - τ)) \\ & = \hat{A} Δ \hat{x} (t, τ) + Δ u (t, τ) . \end{aligned}

(2)

By constructing the corresponding sample matrix, Equation (2) can be reformulated as Equation (3).

Δ X (t, τ) = \hat{A} Δ \hat{X} (t, τ) + Δ U (t, τ),

(3)

where

\left\{ \begin{aligned} Δ X (t, τ) & = [Δ x (t_{1}, τ), Δ x (t_{2}, τ), \dots, Δ x (t_{n}, τ)] \\ Δ \hat{X} (t, τ) & = [Δ \hat{x} (t_{1}, τ), Δ \hat{x} (t_{2}, τ), \dots, Δ \hat{x} (t_{n}, τ)] \\ Δ U (t, τ) & = [Δ u (t_{1}, τ), Δ u (t_{2}, τ), \dots, Δ u (t_{n}, τ)] \end{aligned} \right. .

(4)

If

τ

is large enough,

Δ \hat{x} (t, τ)

and

Δ u (t, τ)

are approximately independent, which enables the separation of dynamic and static components. In Equation (3), the dynamic matrix

\hat{A}

can be estimated via the least squares method, with the objective function typically defined as follows.

\begin{aligned} J (\hat{A}) & = \sum_{k = 1}^{n} {∥ Δ x (t_{k}, τ) - \hat{A} Δ \hat{x} (t_{k}, τ) ∥}_{2}^{2} \\ & = {∥ Δ X (t, τ) - \hat{A} Δ \hat{X} (t, τ) ∥}_{F}^{2} . \end{aligned}

(5)

Minimizing the objective function in Equation (5) yields the following result.

\hat{A} = Δ X Δ {\hat{X}}^{T} {(Δ \hat{X} Δ {\hat{X}}^{T})}^{- 1} .

(6)
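The estimate in Equation (6) can be illustrated with a minimal NumPy sketch. The function names `simulate_var1`, `lagged_vectors`, and `estimate_dynamic_matrix`, the column-per-sample data layout, and the simulated first-order process are assumptions made for illustration only and are not taken from the original work.

```python
import numpy as np


def simulate_var1(A1, offset, n_samples, seed=0):
    """Hypothetical test process: x(t) = A1 x(t-1) + u(t), with E[u] = offset."""
    rng = np.random.default_rng(seed)
    m = A1.shape[0]
    X = np.zeros((m, n_samples))
    for t in range(1, n_samples):
        u = offset + rng.normal(size=m)  # innovation with non-zero mean
        X[:, t] = A1 @ X[:, t - 1] + u
    return X


def lagged_vectors(X, q):
    """Return (X_cur, X_hat) for data X of shape (m, N), one sample per column.

    Column k of X_cur is x(t); column k of X_hat stacks x(t-1), ..., x(t-q).
    """
    _, N = X.shape
    X_cur = X[:, q:]
    X_hat = np.vstack([X[:, q - i:N - i] for i in range(1, q + 1)])
    return X_cur, X_hat


def estimate_dynamic_matrix(X, q, tau):
    """Least-squares estimate of A_hat from differenced data, as in Eq. (6)."""
    X_cur, X_hat = lagged_vectors(X, q)
    dX = X_cur[:, tau:] - X_cur[:, :-tau]      # Delta x(t, tau)
    dX_hat = X_hat[:, tau:] - X_hat[:, :-tau]  # Delta x_hat(t, tau)
    G = dX_hat @ dX_hat.T                      # symmetric Gram matrix
    # A_hat = dX dX_hat^T G^{-1}; solve the transposed system instead of inverting G.
    return np.linalg.solve(G, dX_hat @ dX.T).T
```

In this sketch the differencing interval is chosen well above the lag order (for example `tau=20` with `q=1`), in line with the requirement that the differenced innovation be approximately independent of the differenced lagged variables.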

2.2. Basic Principles of DISSIM Monitoring Method

Define two datasets

Y_{i} \in R^{w_{i} \times m}, i = 1, 2

, where

w_{i}

denotes the number of samples in

Y_{i}

. Given standardized datasets and letting

Γ_{i}

be the covariance matrix of

Y_{i}

, Equation (7) follows.

\begin{aligned} Γ & = \frac{1}{w - 1} Y^{T} Y \\ & = \frac{1}{w - 1} {\begin{bmatrix} Y_{1} \\ Y_{2} \end{bmatrix}}^{T} \begin{bmatrix} Y_{1} \\ Y_{2} \end{bmatrix} \\ & = \sum_{i = 1}^{2} \frac{w_{i} - 1}{w - 1} Γ_{i}, \end{aligned}

(7)

where

Y

is composed of

Y_{1}

and

Y_{2}

, and

Γ

is the covariance matrix corresponding to matrix

Y

. w denotes the number of samples in

Y

, and

w = w_{1} + w_{2}

.

Let the eigenvalue decomposition of

Γ

be

Γ = P Λ P^{T}

, where

P

is orthogonal and

Λ

is diagonal. Here,

Λ

is composed of the eigenvalues of

Γ

. Define

\hat{P} = P Λ^{- 1 / 2}

; then

{\hat{P}}^{T} Γ \hat{P} = I

obviously holds. Based on the resulting matrix

\hat{P}

, the following matrix transformation is applied:

{\hat{Y}}_{i} = \sqrt{\frac{w_{i} - 1}{w - 1}} Y_{i} \hat{P}, i = 1, 2 .

(8)

Let

R_{i}

denote the covariance matrix of

{\hat{Y}}_{i}

. Then, Equation (9) follows from Equation (8).

\begin{aligned} R_{1} + R_{2} & = \sum_{i = 1}^{2} \frac{1}{w_{i} - 1} {\hat{Y}}_{i}^{T} {\hat{Y}}_{i} \\ & = \frac{{\hat{P}}^{T} (Y_{1}^{T} Y_{1} + Y_{2}^{T} Y_{2}) \hat{P}}{w - 1} \\ & = {\hat{P}}^{T} Γ \hat{P} \\ & = I_{m} . \end{aligned}

(9)

Equation (10) is obtained by applying eigenvalue decomposition to matrix

R_{i}

.

R_{i} ξ_{j}^{i} = λ_{j}^{i} ξ_{j}^{i}, j = 1, 2, \dots, m .

(10)

where

ξ_{j}^{i}

and

λ_{j}^{i}

denote the jth eigenvector and jth eigenvalue of

R_{i}

, respectively. Thus, Equation (11) can be obtained by combining Equation (9) with Equation (10).

\left\{ \begin{aligned} R_{1} ξ_{j}^{1} & = (I - R_{2}) ξ_{j}^{1} = λ_{j}^{1} ξ_{j}^{1} \\ R_{2} ξ_{j}^{2} & = (I - R_{1}) ξ_{j}^{2} = λ_{j}^{2} ξ_{j}^{2} \end{aligned} \right. .

(11)

The matrices

R_{1}

and

R_{2}

share identical eigenvectors, and their corresponding eigenvalues satisfy

λ_{j}^{1} + λ_{j}^{2} = 1

. Based on this property, Kano et al. [32] introduced the following monitoring indicator to assess the distribution dissimilarity of

Y_{1}

and

Y_{2}

.

D = \frac{4}{m} \sum_{j = 1}^{m} {(λ_{j} - 0.5)}^{2},

(12)

where

λ_{j}

can be set as

λ_{j}^{1}

or

λ_{j}^{2}

.

In DISSIM-based monitoring,

Y_{1}

and

Y_{2}

denote the training and testing datasets, respectively. After selecting a window size w, normal data are windowed to compute the D statistic of Equation (12) and its control limit. During online operation, the sliding window collects new samples, and monitoring is performed by detecting distribution changes across successive windows.
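The computation in Equations (7), (8), and (12) can be sketched as follows. This is a minimal illustration, assuming that both windows are already centered and scaled, that the pooled covariance matrix is positive definite, and that samples are stored row-wise; the function name `dissim_index` is hypothetical.

```python
import numpy as np


def dissim_index(Y1, Y2):
    """Dissimilarity index D of Eq. (12) for two data windows.

    Y1 and Y2 have shapes (w1, m) and (w2, m) and are assumed to be
    standardized beforehand (this function does not center the data).
    """
    w1, m = Y1.shape
    w2 = Y2.shape[0]
    w = w1 + w2
    Y = np.vstack([Y1, Y2])
    Gamma = Y.T @ Y / (w - 1)                           # pooled covariance, Eq. (7)
    eigvals, P = np.linalg.eigh(Gamma)                  # Gamma = P Lambda P^T
    P_hat = P / np.sqrt(eigvals)                        # P Lambda^{-1/2}, column-wise scaling
    Y1_hat = np.sqrt((w1 - 1) / (w - 1)) * Y1 @ P_hat   # transformation of Eq. (8)
    R1 = Y1_hat.T @ Y1_hat / (w1 - 1)                   # covariance of transformed window
    lam = np.linalg.eigvalsh(R1)                        # eigenvalues lie in [0, 1]
    return 4.0 / m * np.sum((lam - 0.5) ** 2)
```

Because the eigenvalues of the two transformed covariance matrices sum to one, only one of them needs to be decomposed. Two windows of equal length with identical sample covariance give eigenvalues of 0.5 and therefore an index of zero.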

3. The Proposed Method

The proposed framework mainly consists of two stages: modeling and monitoring. In the modeling stage, KBGE is employed to construct the dynamic structural model by integrating mechanistic knowledge with process data, thereby estimating the dynamic relationships among process variables. Based on the estimated dynamic matrix, the process variables are further separated into dynamic and static components, enabling the characterization of different process behaviors.

In the monitoring stage, MSSW is applied to perform multi-scale dissimilarity analysis on both the dynamic and static components. Compared with the conventional single-scale sliding window, the multi-scale sliding window can simultaneously capture process variation characteristics across different time scales, thereby improving the detection capability for complex faults and slowly varying abnormalities. Finally, the operating condition of the industrial process is effectively monitored through the integrated DISSIM statistics across multiple scales.

3.1. Knowledge-Informed Bipartite Graph Embedding for Dynamic Process Modeling

According to [31], the optimization of matrix

\hat{A}

in Equation (5) can be transformed into Equation (13).

\begin{aligned} \min J (K, C) & = \sum_{k = 1}^{n} {∥ Δ x (t_{k}, τ) - (K ⊙ C) Δ \hat{x} (t_{k}, τ) ∥}_{2}^{2} \\ & = {∥ Δ X - (K ⊙ C) Δ \hat{X} ∥}_{F}^{2}, \end{aligned}

(13)

where

\hat{A} = K ⊙ C

and “⊙” denotes the Hadamard product.

K

is the knowledge-informed bipartite graph structure matrix, which indicates whether a dynamic relationship exists between a current state and a past state, and

C

represents the dynamic strength between the current state and the past state.

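Because the graph structure matrix is fixed in advance from prior knowledge, only the dynamic strength matrix remains to be optimized, and Equation (13) reduces to Equation (14).

\begin{aligned} \min J (C) & = \sum_{k = 1}^{n} {∥ Δ x (t_{k}, τ) - (K ⊙ C) Δ \hat{x} (t_{k}, τ) ∥}_{2}^{2} \\ & = {∥ Δ X - (K ⊙ C) Δ \hat{X} ∥}_{F}^{2} . \end{aligned}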

(14)

As indicated in (14), a sparse structured matrix

K

is required, and this study proposes a sparsification approach based on a knowledge-informed bipartite graph. The bipartite graph is constructed using the mechanism knowledge of the system, as illustrated in Figure 1.

The bipartite graph

G = \{V, S\}

is an undirected weighted graph, where

V = V_{1} \cup V_{2}

denotes the vertex set, with

V_{1} = \{x^{1}, x^{2}, \dots, x^{m}\}

and

V_{2} = \{{\hat{x}}^{1}, {\hat{x}}^{2}, \dots, {\hat{x}}^{m q}\}

representing the current and lagged process variables, respectively.

For each variable

x^{i}

, its

ε

-nearest-neighbor set

N_{ε} (x^{i})

is defined according to the physical or topological connectivity relationships among process variables. Specifically, two variables are regarded as neighbors if they exhibit direct physical coupling relationships in the industrial process, such as material transfer, energy exchange, equipment connection, or other process interactions determined by expert knowledge and process topology information. Therefore, the neighborhood relationship is not solely determined by statistical correlations in data, but also incorporates prior mechanistic knowledge of the industrial system.

Let matrix

K

denote the adjacency matrix of graph

G

, where

K \in R^{m \times m q}

and

k_{i j} \in \{0, 1\}

. The sparse adjacency matrix

K

is constructed based on the

ε

-nearest-neighbor rule as follows:

k_{i j} = \begin{cases} 1, & {\hat{x}}^{j} \in N_{ε} (x^{i}) \\ 0, & {\hat{x}}^{j} \notin N_{ε} (x^{i}) \end{cases},

(15)

where

N_{ε} (x^{i})

represents the set of variables that satisfy the predefined physical or topological neighborhood criterion of

x^{i}

.

In the proposed knowledge-informed bipartite graph, the matrix

K

is used to characterize the connections between current variables and historical variables. Assuming that the industrial process contains m process variables and the time-lag order is q, the current variable set is denoted as

x (t)

, while the historical variable set is represented as

\{x (t - 1), x (t - 2), \dots, x (t - q)\}

. Accordingly, one side of the bipartite graph corresponds to the current variables, and the other side corresponds to the lagged variables at different historical time instants. The resulting bipartite graph connection matrix is defined as

K \in {\{0, 1\}}^{m \times m q}

(16)

where

K_{i j} = 1

indicates that a known relationship or dependency exists between the

ith

current variable and the

jth

lagged variable, while

K_{i j} = 0

indicates no direct connection.

The construction of

K

mainly relies on available prior or mechanistic knowledge, such as physical connections among equipment, process flow information, known dynamic relationships among variables, and expert knowledge. Based on such information, an adjacency matrix

\tilde{W}

can first be constructed, where each entry is set to 1 if two variables have a direct physical connection, material/energy transfer relationship, or explicit process-related dependency, and set to 0 otherwise. Considering the time-lag q, the adjacency matrix

\tilde{W}

is then extended to the lagged-variable space to obtain the mechanistic-knowledge-based bipartite graph connection matrix:

K^{m} = [\tilde{W}, \tilde{W}, \dots, \tilde{W}]

(17)

Since mechanistic knowledge alone may not fully characterize the complex dependencies in industrial processes, an additional connection matrix

K^{d}

can be further constructed using data-driven methods or existing graph construction approaches reported in the literature. The final knowledge-informed bipartite graph connection matrix is then obtained by fusing the two types of connections:

K = K^{m} \lor K^{d}

(18)

where ∨ denotes the element-wise logical OR operation. In other words, a connection is retained in

K

if it exists in either

K^{m}

or

K^{d}

. Through this strategy, the matrix

K

integrates both mechanistic knowledge and supplementary variable relationship information.
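The construction in Equations (17) and (18) can be sketched in a few lines. The function name `build_structure_matrix` and the argument names `W_tilde` and `K_d` are chosen here for illustration; how the mechanistic adjacency matrix and the supplementary data-driven matrix are obtained is outside the scope of this sketch.

```python
import numpy as np


def build_structure_matrix(W_tilde, q, K_d=None):
    """Build the binary structure matrix K of shape (m, m*q).

    W_tilde: (m, m) 0/1 adjacency matrix from mechanistic knowledge.
    q:       time-lag order.
    K_d:     optional (m, m*q) 0/1 matrix from a data-driven method.
    """
    W = np.asarray(W_tilde).astype(bool)
    K_m = np.hstack([W] * q)                     # Eq. (17): repeat W_tilde for each lag
    if K_d is None:
        return K_m.astype(int)
    K_d = np.asarray(K_d).astype(bool)
    if K_d.shape != K_m.shape:
        raise ValueError("K_d must have shape (m, m*q)")
    return np.logical_or(K_m, K_d).astype(int)   # Eq. (18): element-wise OR
```

For example, with three variables and q = 2, a single extra data-driven link from the third variable at lag 2 to the first current variable corresponds to setting entry (0, 5) of `K_d` to 1.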

Since each variable in practical industrial systems is usually associated with only a subset of variables rather than all variables, the resulting matrix

K

naturally exhibits a sparse structure instead of a fully connected one. Such a sparse connection matrix introduces meaningful prior constraints into the model, reduces unnecessary information propagation among irrelevant variables, and improves the consistency between the graph structure and the physical and dynamic characteristics of industrial processes.

Further, optimization problem (14) is transformed into problem (19).

\begin{aligned} \min \ & \hat{J} (\hat{A}) = {∥ Δ X - \hat{A} Δ \hat{X} ∥}_{F}^{2} \\ \mathrm{s.t.} \ & \hat{A} = K ⊙ C . \end{aligned}

(19)

ADMM [35] is adopted to solve Equation (19) because the optimization problem involves both the data-fitting term and the knowledge-informed structural constraint

\hat{A} = K ⊙ C

. Directly optimizing

\hat{A}

under this element-wise structural constraint is inconvenient, especially when the graph structure is sparse. By introducing the auxiliary matrix

C

, ADMM separates the estimation of the dynamic coefficient matrix from the enforcement of the graph-guided sparsity structure. As a result, the original constrained problem can be decomposed into two simpler subproblems: the update of

\hat{A}

, which corresponds to a quadratic least-squares problem, and the update of

C

, which enforces the knowledge-informed graph structure through an element-wise masking operation. This decomposition makes the optimization procedure more tractable and easier to implement, and the corresponding augmented Lagrangian function is formulated in Equation (20).

\begin{aligned} L (\hat{A}, C, M) = {} & {∥ Δ X - \hat{A} Δ \hat{X} ∥}_{F}^{2} + \frac{μ}{2} {∥ \hat{A} - K ⊙ C ∥}_{F}^{2} \\ & - \mathrm{Tr} [M^{T} (\hat{A} - K ⊙ C)] . \end{aligned}

(20)

At each iteration, ADMM updates the variables as follows:

(1): Update matrix $\hat{A}$

\begin{aligned} {\hat{A}}_{k + 1} = \underset{\hat{A}}{\arg \min} \ & {∥ Δ X - \hat{A} Δ \hat{X} ∥}_{F}^{2} + \frac{μ}{2} {∥ \hat{A} - K ⊙ C_{k} ∥}_{F}^{2} \\ & - \mathrm{Tr} [M_{k}^{T} (\hat{A} - K ⊙ C_{k})] \end{aligned}

(21)

Setting the derivative of the objective function with respect to

\hat{A}

to zero yields

\begin{matrix} {\hat{A}}_{k + 1} = Q {(2 Δ \hat{X} Δ {\hat{X}}^{T} + μ I_{m q \times m q})}^{- 1}, \end{matrix}

(22)

where

Q = 2 Δ X Δ {\hat{X}}^{T} + μ (K ⊙ C_{k}) + M_{k}

.

The known sparse structure of

K

serves as prior information for optimizing

\hat{A}

, thus allowing Equation (22) to be transformed into (23).

\begin{matrix} {\hat{A}}_{k + 1} = Q {(2 Δ \hat{X} Δ {\hat{X}}^{T} + μ I_{m q \times m q})}^{- 1} ⊙ K . \end{matrix}

(23)

(2): Update matrix $C$

Given

{\hat{A}}_{k + 1}

and

M_{k}

, the subproblem with respect to

C

is written as

\begin{matrix} C_{k + 1} = \underset{C}{\arg min} \frac{μ}{2} {∥ {\hat{A}}_{k + 1} - K ⊙ C ∥}_{F}^{2} - T r [{(M_{k})}^{T} ({\hat{A}}_{k + 1} - K ⊙ C)] . \end{matrix}

(24)

Taking the derivative of Equation (24) with respect to

C

gives

\frac{\partial L}{\partial C} = μ K ⊙ (K ⊙ C - {\hat{A}}_{k + 1}) + K ⊙ M_{k} .

(25)

By setting Equation (25) to zero, the update of

C

can be obtained. Since

K

is a binary graph structure matrix,

K ⊙ K = K

. Therefore, the update rule of

C

is expressed as

C_{k + 1} = K ⊙ ({\hat{A}}_{k + 1} - \frac{M_{k}}{μ}) .

(26)

This update indicates that

C

is only estimated on the support of the knowledge-informed graph matrix

K

, while the elements outside the graph structure are forced to zero. Therefore, no element-wise division by

K

is required, and no small positive constant is needed to guard against division by zero in this update.

(3): Update matrix $M$

The matrix

M

serves as the Lagrangian multiplier and is updated via Equation (27).

M_{k + 1} = M_{k} - μ ({\hat{A}}_{k + 1} - K ⊙ C_{k + 1})

(27)

Algorithm 1 summarizes the procedure for optimizing the dynamic matrix

\hat{A}

.

Algorithm 1 ADMM Algorithm for Solving Problem (19)

Initialization:
(1) Construct the data matrix $X$ ;
(2) Specify the differencing interval $τ$ and construct the training sample matrices $Δ X$ and $Δ \hat{X}$ ;
(3) Set the maximum number of iterations N and the threshold $ϵ$ ;
(4) Initialize the parameter $μ$ and the matrices ${\hat{A}}_{0}$ , $C_{0}$ , and $M_{0}$ .
for $k = 0 : N - 1$
Calculate ${\hat{A}}_{k + 1}$ with (23);
Calculate $C_{k + 1}$ with (26);
Calculate $M_{k + 1}$ with (27);
if $(|\hat{J} ({\hat{A}}_{k + 1}) - \hat{J} ({\hat{A}}_{k})| < ϵ)$
break;
end if
end for
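The iteration of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration that applies the update formulas (23), (26), and (27) and the stopping rule (28) as stated; the function name `admm_masked_ls`, the zero initialization, and the default values of `mu`, `n_iter`, and `tol` are assumptions, and no claim is made here about convergence guarantees or parameter tuning.

```python
import numpy as np


def admm_masked_ls(dX, dX_hat, K, mu=1.0, n_iter=200, tol=1e-8):
    """Estimate A_hat = K * C from differenced data (sketch of Algorithm 1).

    dX:     (m, n) matrix Delta X.
    dX_hat: (m*q, n) matrix Delta X_hat.
    K:      (m, m*q) binary structure matrix.
    """
    m = dX.shape[0]
    p = dX_hat.shape[0]
    K = np.asarray(K, dtype=float)
    A = np.zeros((m, p))   # A_hat_0 (assumed zero start)
    C = np.zeros((m, p))   # C_0
    M = np.zeros((m, p))   # M_0, Lagrange multiplier
    G = 2.0 * dX_hat @ dX_hat.T + mu * np.eye(p)  # constant over iterations
    B = 2.0 * dX @ dX_hat.T
    J_prev = np.linalg.norm(dX - A @ dX_hat, "fro") ** 2
    for _ in range(n_iter):
        Q = B + mu * (K * C) + M
        A = np.linalg.solve(G, Q.T).T * K         # Eq. (23): Q G^{-1}, then mask by K
        C = K * (A - M / mu)                      # Eq. (26)
        M = M - mu * (A - K * C)                  # Eq. (27)
        J = np.linalg.norm(dX - A @ dX_hat, "fro") ** 2
        if abs(J - J_prev) < tol:                 # Eq. (28)
            break
        J_prev = J
    return A, C
```

The linear system is solved with `np.linalg.solve` rather than an explicit inverse, which relies on the coefficient matrix being symmetric. Entries outside the support of the structure matrix are exactly zero in the returned estimate because of the final masking in the first update.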

Since the proposed optimization problem is formulated as a well-defined constrained quadratic optimization problem with a linear equality constraint, the ADMM iterations are expected to converge under the standard ADMM framework. In the implementation, the convergence tolerance

ϵ

and the maximum number of iterations are used as the stopping criteria to ensure stable numerical optimization.

The iteration in Algorithm 1 is terminated when the change of the objective function value between two consecutive iterations satisfies

|\hat{J} ({\hat{A}}_{k + 1}) - \hat{J} ({\hat{A}}_{k})| < ϵ

(28)

where

ϵ

denotes a predefined convergence tolerance. This stopping criterion indicates that the objective function has become sufficiently stable and further iterations bring negligible improvement. Since the optimization problem is solved within the ADMM framework, the iterative updating procedure follows the standard convergence behavior of ADMM.

3.2. Multi-Scale Recursive DISSIM Monitoring Method

To improve the fault detection performance, this study proposes a multi-scale recursive sliding window monitoring method based on DISSIM, as shown in Figure 2. During online monitoring, sliding windows of varying lengths are selected, and the DISSIM algorithm is employed to monitor the samples within each window. Hence, the health status of the system can be assessed at different time scales, and the final monitoring results are fused for output to enhance monitoring performance.

To improve the computational efficiency of covariance matrix updating during online monitoring, a recursive covariance update strategy for the DISSIM monitoring method under multi-scale scenarios is adopted in this study. In conventional sliding-window methods, the covariance matrix and its eigenvalue decomposition are usually recalculated using all samples within the updated window after each window shift, resulting in considerable repeated computation. In contrast, the recursive update strategy incrementally modifies the covariance matrix using only the newly added sample and the removed sample, thereby avoiding repeated calculations over the entire historical dataset. This approach significantly reduces the computational burden of online feature updating and eigenvalue decomposition while preserving the statistical characteristics of the covariance estimation, making it more suitable for real-time industrial process monitoring applications. For this purpose, Lemma 1 needs to be introduced.

Lemma 1 ([36]). Suppose $Σ$ is a diagonal matrix, $Σ = \mathrm{diag} \{σ_{1}, σ_{2}, \dots, σ_{m}\}$. Consider $Φ = Σ + θ η η^{T}$ with $θ \in R$ and $η \in R^{m}$. Then the eigenvalues of $Φ$ satisfy

f (λ) = (1 + θ \sum_{i = 1}^{m} \frac{η_{i}^{2}}{σ_{i} - λ}) \prod_{i = 1}^{m} (σ_{i} - λ) = 0

(29)

and the eigenvector corresponding to the eigenvalue $λ_{i}$ satisfies

ξ_{i} = \frac{{(λ_{i} I_{m} - Σ)}^{- 1} η}{∥ ({(λ_{i} I_{m} - Σ)}^{- 1} η) ∥}, i = 1, 2, \dots, m .

(30)
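Lemma 1 can be checked numerically on a small example. The sketch below assumes distinct diagonal entries $σ_i$ and nonzero $η_i$, so that no eigenvalue of $Φ$ coincides with a $σ_i$; the numerical values and the function names are illustrative and not taken from the paper. It compares a dense eigendecomposition of $Φ$ with the secular factor of Equation (29) and the eigenvector formula of Equation (30).

```python
import numpy as np


def secular(lam, sigma, theta, eta):
    """Bracketed factor of Eq. (29): 1 + theta * sum_i eta_i^2 / (sigma_i - lam)."""
    return 1.0 + theta * np.sum(eta**2 / (sigma - lam))


def eigvec_from_lemma(lam, sigma, eta):
    """Eq. (30): normalized (lam * I - Sigma)^{-1} eta for diagonal Sigma."""
    v = eta / (lam - sigma)
    return v / np.linalg.norm(v)


# Illustrative values (assumed): distinct sigma_i and nonzero eta_i.
sigma = np.array([1.0, 2.0, 3.5, 5.0])
eta = np.array([0.5, -0.8, 0.3, 1.0])
theta = 0.3

Phi = np.diag(sigma) + theta * np.outer(eta, eta)
eigvals, eigvecs = np.linalg.eigh(Phi)

# Each eigenvalue should be a root of the secular factor.
residuals = np.array([secular(lam, sigma, theta, eta) for lam in eigvals])

# Each eigenvector should match Eq. (30) up to sign.
alignment = np.array([
    abs(eigvec_from_lemma(lam, sigma, eta) @ eigvecs[:, i])
    for i, lam in enumerate(eigvals)
])
```

In practice the roots of the secular equation would be found by a dedicated solver rather than by a dense eigendecomposition; the dense call is used here only as a reference.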

The specific process of multi-scale recursive sliding window monitoring is as follows:

Let the width of the jth sliding window be $w_{j}$, and define the dynamic component as $\tilde{d} = \hat{A} x$. At time k, the sample matrices corresponding to the dynamic and static components within the jth sliding window are given in Equation (31).

\left\{\begin{matrix} {\tilde{D}}_{w_{j}, k} = [{\tilde{d}}_{k - w_{j} + 1} \; {\tilde{d}}_{k - w_{j} + 2} \; \dots \; {\tilde{d}}_{k}] \\ U_{w_{j}, k} = [u_{k - w_{j} + 1} \; u_{k - w_{j} + 2} \; \dots \; u_{k}] \end{matrix}\right.

(31)

According to the sliding window mechanism, Equation (32) holds for the $(k + 1)$ th sliding window.

\left\{\begin{matrix} {\tilde{D}}_{w_{j}, k + 1} = [{\tilde{d}}_{k - w_{j} + 2} \; {\tilde{d}}_{k - w_{j} + 3} \; \dots \; {\tilde{d}}_{k + 1}] \\ U_{w_{j}, k + 1} = [u_{k - w_{j} + 2} \; u_{k - w_{j} + 3} \; \dots \; u_{k + 1}] \end{matrix}\right.

(32)

Assume that the covariance matrices of ${\tilde{D}}_{w_{j}}$ and $U_{w_{j}}$ are $Γ_{{\tilde{d}}^{j}}$ and $Γ_{u^{j}}$, respectively. The monitoring indicators for the dynamic and static components are then designed as shown in Equation (33).

\left\{\begin{matrix} T_{{\tilde{d}}^{j}} = \frac{4}{m} \sum_{i = 1}^{m} {(λ_{i}^{{\tilde{d}}^{j}} - 0.5)}^{2} \\ T_{u^{j}} = \frac{4}{m} \sum_{i = 1}^{m} {(λ_{i}^{u^{j}} - 0.5)}^{2} \end{matrix}\right.

(33)

Here, $λ^{{\tilde{d}}^{j}}$ and $λ^{u^{j}}$ denote the eigenvalues of $Γ_{{\tilde{d}}^{j}}$ and $Γ_{u^{j}}$, respectively. The integrated monitoring indicators are defined in Equation (34). In this formulation, the window width $w_{j}$ serves as the weighting coefficient for the corresponding monitoring statistic. Larger windows contain more historical observations and generally provide more stable estimates of the process state, whereas smaller windows are more sensitive to local and transient variations. Therefore, assigning weights proportional to the window widths allows the final monitoring statistic to balance short-term sensitivity and long-term stability. The resulting statistic combines complementary information from multiple temporal scales and improves the robustness of process monitoring. Furthermore, the control limits $J_{T_{\tilde{d}}}$ and $J_{T_{u}}$ corresponding to $T_{\tilde{d}}$ and $T_{u}$ can be obtained through kernel density estimation at the confidence level $ϕ$.

\left\{\begin{matrix} T_{\tilde{d}} = \frac{\sum_{j = 1}^{L} w_{j} T_{{\tilde{d}}^{j}}}{\sum_{j = 1}^{L} w_{j}} \\ T_{u} = \frac{\sum_{j = 1}^{L} w_{j} T_{u^{j}}}{\sum_{j = 1}^{L} w_{j}} \end{matrix}\right.

(34)
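Equations (33) and (34) translate directly into a few lines of code. The following is a minimal sketch, assuming the eigenvalues passed in are exactly those appearing in Equation (33); the function names `dissim_index` and `fuse_scales` are hypothetical.

```python
def dissim_index(eigenvalues):
    """Per-scale index of Eq. (33): (4 / m) * sum_i (lambda_i - 0.5)^2."""
    m = len(eigenvalues)
    return 4.0 / m * sum((lam - 0.5) ** 2 for lam in eigenvalues)


def fuse_scales(indices, widths):
    """Width-weighted average of the per-scale indices, as in Eq. (34)."""
    return sum(w * t for w, t in zip(widths, indices)) / sum(widths)


# Illustrative per-scale eigenvalues (made up for the example).
per_scale = [dissim_index(lams) for lams in ([0.5, 0.5], [1.0, 0.0], [0.5, 1.0])]
fused = fuse_scales(per_scale, widths=[500, 800, 1000])
```

The index is zero when every eigenvalue equals 0.5 and reaches one when every eigenvalue is 0 or 1, so the fused value stays in the same range as the per-scale indices whenever the eigenvalues lie in [0, 1].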

By way of illustration, Equation (35) holds for the kth window of the dynamic component.

Γ_{{\tilde{d}}^{j}, k} = \mathrm{cov} ({\tilde{D}}_{w_{j}, k}) = \frac{1}{w_{j} - 1} {\tilde{D}}_{w_{j}, k} {\tilde{D}}_{w_{j}, k}^{T} = {\tilde{P}}_{j, k} Λ_{j, k} {\tilde{P}}_{j, k}^{T},

(35)

where ${\tilde{P}}_{j, k}$ is an orthogonal matrix whose columns are the eigenvectors of $Γ_{{\tilde{d}}^{j}, k}$, and $Λ_{j, k}$ is a diagonal matrix containing the corresponding eigenvalues.

According to [31,36], Equation (36) holds for the $(k + 1)$ th window.

\begin{aligned} Γ_{{\tilde{d}}^{j}, k + 1} &= \mathrm{cov} ({\tilde{D}}_{w_{j}, k + 1}) \\ &= \frac{1}{w_{j} - 1} {\tilde{D}}_{w_{j}, k + 1} {\tilde{D}}_{w_{j}, k + 1}^{T} \\ &= \frac{1}{w_{j} - 1} ({\tilde{D}}_{w_{j}, k} {\tilde{D}}_{w_{j}, k}^{T} - {\tilde{d}}_{k - w_{j} + 1} {\tilde{d}}_{k - w_{j} + 1}^{T} + {\tilde{d}}_{k + 1} {\tilde{d}}_{k + 1}^{T}) \\ &= Γ_{{\tilde{d}}^{j}, k} + \frac{1}{w_{j} - 1} ({\tilde{d}}_{k + 1} {\tilde{d}}_{k + 1}^{T} - {\tilde{d}}_{k - w_{j} + 1} {\tilde{d}}_{k - w_{j} + 1}^{T}) \\ &= {\tilde{P}}_{j, k + 1} Λ_{j, k + 1} {\tilde{P}}_{j, k + 1}^{T} . \end{aligned}

(36)

Combining Equations (35) and (36) yields Equation (37).

\begin{aligned} Γ_{{\tilde{d}}^{j}, k + 1} &= Γ_{{\tilde{d}}^{j}, k} + \frac{1}{w_{j} - 1} ({\tilde{d}}_{k + 1} {\tilde{d}}_{k + 1}^{T} - {\tilde{d}}_{k - w_{j} + 1} {\tilde{d}}_{k - w_{j} + 1}^{T}) \\ &= {\tilde{P}}_{j, k} Λ_{j, k} {\tilde{P}}_{j, k}^{T} + \frac{1}{w_{j} - 1} ({\tilde{d}}_{k + 1} {\tilde{d}}_{k + 1}^{T} - {\tilde{d}}_{k - w_{j} + 1} {\tilde{d}}_{k - w_{j} + 1}^{T}) \\ &= {\tilde{P}}_{j, k} (Λ_{j, k} + \frac{1}{w_{j} - 1} z_{j, 1} z_{j, 1}^{T}) {\tilde{P}}_{j, k}^{T} - \frac{{\tilde{d}}_{k - w_{j} + 1} {\tilde{d}}_{k - w_{j} + 1}^{T}}{w_{j} - 1}, \end{aligned}

(37)

where $z_{j, 1} = {\tilde{P}}_{j, k}^{T} {\tilde{d}}_{k + 1}$.

According to Lemma 1, a rank-one modification for multi-scale DISSIM monitoring can be implemented, as shown in Equation (38).

Λ_{j, k} + \frac{1}{w_{j} - 1} z_{j, 1} z_{j, 1}^{T} = {\bar{P}}_{j, 1} {\bar{Λ}}_{j, 1} {\bar{P}}_{j, 1}^{T} .

(38)

Substituting Equation (38) into Equation (37) gives Equation (39).

\begin{aligned} Γ_{{\tilde{d}}^{j}, k + 1} &= {\tilde{P}}_{j, k} {\bar{P}}_{j, 1} {\bar{Λ}}_{j, 1} {\bar{P}}_{j, 1}^{T} {\tilde{P}}_{j, k}^{T} - \frac{1}{w_{j} - 1} {\tilde{d}}_{k - w_{j} + 1} {\tilde{d}}_{k - w_{j} + 1}^{T} \\ &= {\tilde{P}}_{j, k} {\bar{P}}_{j, 1} ({\bar{Λ}}_{j, 1} - \frac{1}{w_{j} - 1} z_{j, 2} z_{j, 2}^{T}) {\bar{P}}_{j, 1}^{T} {\tilde{P}}_{j, k}^{T}, \end{aligned}

(39)

where $z_{j, 2} = {\bar{P}}_{j, 1}^{T} {\tilde{P}}_{j, k}^{T} {\tilde{d}}_{k - w_{j} + 1}$. The second rank-one modification for multi-scale DISSIM monitoring can then be computed by applying Lemma 1 again, resulting in Equation (40).

{\bar{Λ}}_{j, 1} - \frac{1}{w_{j} - 1} z_{j, 2} z_{j, 2}^{T} = {\bar{P}}_{j, 2} {\bar{Λ}}_{j, 2} {\bar{P}}_{j, 2}^{T} .

(40)

Substituting Equation (40) into Equation (39) gives Equation (41).

\begin{aligned} Γ_{{\tilde{d}}^{j}, k + 1} &= {\tilde{P}}_{j, k} {\bar{P}}_{j, 1} {\bar{P}}_{j, 2} {\bar{Λ}}_{j, 2} {\bar{P}}_{j, 2}^{T} {\bar{P}}_{j, 1}^{T} {\tilde{P}}_{j, k}^{T} \\ &= {\tilde{P}}_{j, k + 1} Λ_{j, k + 1} {\tilde{P}}_{j, k + 1}^{T} \end{aligned}

(41)

Hence, the following relationship is obtained.

\left\{\begin{matrix} {\tilde{P}}_{j, k + 1} = {\tilde{P}}_{j, k} {\bar{P}}_{j, 1} {\bar{P}}_{j, 2} \\ Λ_{j, k + 1} = {\bar{Λ}}_{j, 2} \end{matrix}\right.

(42)

The recursive update process of the covariance matrix for the dynamic component’s multi-scale sliding window is described above. The update process for the static component follows a similar procedure and is therefore omitted for brevity.
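The two rank-one modifications in Equations (37) to (42) can be illustrated with a short numerical sketch. For simplicity, each rank-one eigenproblem is solved here with a dense `numpy.linalg.eigh` call instead of the secular-equation route of Lemma 1, so the sketch reproduces the algebra of the update but not its FLOP savings. The function names and the random test data are assumptions made for illustration, and the covariance follows the uncentered form used in Equation (35).

```python
import numpy as np


def rank_one_eig(lam, theta, z):
    """Eigendecomposition of diag(lam) + theta * z z^T (dense stand-in for Lemma 1)."""
    return np.linalg.eigh(np.diag(lam) + theta * np.outer(z, z))


def recursive_update(P, lam, d_new, d_old, w):
    """Update (P, lam) when d_new enters the window and d_old leaves it."""
    theta = 1.0 / (w - 1)
    z1 = P.T @ d_new                          # z_{j,1}, Eq. (37)
    lam1, P1 = rank_one_eig(lam, theta, z1)   # Eq. (38)
    z2 = P1.T @ (P.T @ d_old)                 # z_{j,2}, Eq. (39)
    lam2, P2 = rank_one_eig(lam1, -theta, z2) # Eq. (40)
    return P @ P1 @ P2, lam2                  # Eq. (42)


# Illustrative data: m = 4 variables, window width w = 50, one extra sample.
rng = np.random.default_rng(0)
m, w = 4, 50
D = rng.standard_normal((m, w + 1))

gamma_k = D[:, :w] @ D[:, :w].T / (w - 1)     # Eq. (35) for window k
lam_k, P_k = np.linalg.eigh(gamma_k)

P_next, lam_next = recursive_update(P_k, lam_k, D[:, w], D[:, 0], w)

gamma_direct = D[:, 1:] @ D[:, 1:].T / (w - 1)  # recomputed from scratch
gamma_recursive = P_next @ np.diag(lam_next) @ P_next.T
max_err = np.max(np.abs(gamma_direct - gamma_recursive))
```

Comparing `gamma_recursive` with `gamma_direct` is a convenient sanity check when replacing the dense calls with a proper rank-one eigenvalue solver.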

3.3. Computational Complexity Analysis

In this study, the number of floating-point operations (FLOPs) is adopted as the metric for computational complexity analysis. According to [17,31], for the j-th scale, the nonrecursive method requires approximately $2 w_{j} m^{2} + 9 m^{3}$ FLOPs, where $w_{j}$ denotes the width of the j-th sliding window and m denotes the variable dimension. Therefore, for L scales, the total computational cost of the nonrecursive method is approximately $\sum_{j = 1}^{L} (2 w_{j} m^{2} + 9 m^{3})$. In contrast, when $Γ_{{\tilde{d}}^{j}, k}$ is known, the recursive method requires approximately $4 m^{3}$ FLOPs at each scale, leading to a total computational cost of approximately $4 L m^{3}$. The computational complexity of the nonrecursive eigenvalue decomposition algorithm therefore depends on both the sliding window width $w_{j}$ and the number of scales L, whereas that of the recursive eigenvalue decomposition algorithm is independent of $w_{j}$ and depends only on L and m. In other words, the sliding window width in the proposed method does not affect the FLOP count of the online eigenvalue update. Since $w_{j} ≫ m$ is generally required to accurately describe the data distribution, the recursive eigenvalue decomposition algorithm has a clear advantage in computational efficiency.

It should be noted that the computational complexity in this study is evaluated using FLOPs rather than empirical execution time. FLOPs provide a hardware-independent measure of computational cost and allow a fair comparison between different algorithms without being affected by implementation details, software environments, or hardware configurations. Since the primary objective of this analysis is to evaluate the computational benefit introduced by the proposed recursive updating mechanism, FLOPs are adopted as the main complexity metric. The significant reduction in FLOPs compared with the corresponding non-recursive implementation indicates the suitability of the proposed method for online monitoring applications.
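As a rough worked example of the FLOP counts quoted in this section, the sketch below evaluates both expressions for a configuration chosen purely for illustration: m = 5 variables and three window widths of 500, 800, and 1000. The function names are hypothetical, and the counts are the approximate figures stated in the text, not measured values.

```python
def nonrecursive_flops(widths, m):
    """Approximate cost sum_j (2 * w_j * m^2 + 9 * m^3) over all scales."""
    return sum(2 * w * m**2 + 9 * m**3 for w in widths)


def recursive_flops(num_scales, m):
    """Approximate cost 4 * L * m^3 of the recursive update."""
    return 4 * num_scales * m**3


widths = [500, 800, 1000]   # illustrative window widths
m = 5                       # illustrative variable dimension

cost_nonrecursive = nonrecursive_flops(widths, m)
cost_recursive = recursive_flops(len(widths), m)
ratio = cost_nonrecursive / cost_recursive
```

For this configuration the nonrecursive estimate is 118,375 FLOPs against 1,500 for the recursive update, and the gap widens as the windows grow because only the first expression depends on the window width.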

To summarize, this study proposes a framework for dynamic process modeling and monitoring that integrates knowledge-informed bipartite graph embedding with multi-scale sliding windows (KBGE-MSSW). The offline training and online testing processes of KBGE-MSSW are summarized in the schematic diagram of Figure 3.

4. Experimental Results and Analysis

In the experiments, a numerical example and the Tennessee Eastman process (TEP) simulation model are used to verify the performance of the proposed method. In the following experiments, $N = 50$, $τ = 60$, $ϵ = 3$, and $μ = 0.5$. The settings of the parameters $τ$, $μ$, and $ϵ$ mainly follow the commonly used values reported in previous studies [9,31], and are further determined by considering the operating characteristics of the studied system and the features of the experimental data.

The selection of the sliding window width $w_{j}$ is closely related to the data sampling frequency and the operating cycle of the system, and the specific values of $w_{j}$ are presented in the corresponding subsections. In general, the amount of data contained within a window should be larger than that corresponding to one complete cycle of the system under steady-state operation, so that the information within the window can adequately reflect the operating state of the system. Meanwhile, an excessively large window width would increase the online computational burden. Therefore, a trade-off between information completeness and computational efficiency was considered in determining the window width.

The number of scales affects both the stability of the experimental results and the computational complexity of the algorithm. A larger number of scales can help characterize the system state at multiple time scales, but it also increases the online computational cost. Therefore, the final number of scales was determined by considering both the reliability of the diagnostic results and the computational efficiency required for online application. Overall, the parameter settings adopted in this study were not based solely on empirical experience, but were determined by jointly considering previous studies, the system operating cycle, sampling frequency, result stability, and computational complexity. The control limit is obtained using KDE, with the confidence level set to $0.99$ based on cross-validation.
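One way to obtain a control limit from KDE is to take the quantile of the estimated density at the chosen confidence level. The sketch below assumes a Gaussian kernel and a common rule-of-thumb bandwidth ($1.06\,\hat{σ}\,n^{-1/5}$); neither choice is stated in the paper, and the function names and the sample values are hypothetical.

```python
import math


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def kde_cdf(x, samples, h):
    """CDF of a Gaussian-kernel density estimate evaluated at x."""
    scale = h * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def kde_control_limit(samples, confidence=0.99):
    """Smallest x whose KDE-based CDF reaches `confidence`, found by bisection."""
    h = rule_of_thumb_bandwidth(samples)
    lo, hi = min(samples) - 10.0 * h, max(samples) + 10.0 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative "normal operation" statistics (made up for the example).
train_stats = [i / 100 for i in range(100)]
limit = kde_control_limit(train_stats, confidence=0.99)
```

A sample is flagged online when its monitoring statistic exceeds `limit`; with well-behaved training data, roughly 1% of normal samples are then expected to raise an alarm.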

4.1. The Numerical Example 1

To verify the effectiveness of the proposed KBGE-MSSW, the following numerical example with dynamic characteristics is constructed:

\begin{aligned} x (t) = {} & [\begin{matrix} {\bar{A}}_{1} & {\bar{A}}_{2} \end{matrix}] {[\begin{matrix} x^{T} (t - 1), & x^{T} (t - 2) \end{matrix}]}^{T} \\ & + \bar{B} [\begin{matrix} 10 n_{1} (t) + 10 \\ 10 n_{2} (t) + 20 \end{matrix}] + 0.01 w (t), \end{aligned}

where

{\bar{A}}_{1} = [\begin{matrix} 0.00 & 0.70 & 0.00 & - 0.20 & - 0.15 \\ 0.00 & - 0.30 & 0.75 & 0.00 & 0.00 \\ 0.45 & 0.60 & 0.00 & 0.45 & 0.75 \\ 0.00 & 0.60 & 0.00 & 0.00 & 0.15 \\ 0.00 & 0.15 & - 0.45 & 0.30 & - 0.40 \end{matrix}],

{\bar{A}}_{2} = [\begin{matrix} 0.00 & 0.10 & - 0.15 & 0.00 & 0.00 \\ 0.30 & - 0.15 & 0.18 & 0.00 & 0.00 \\ - 0.15 & - 0.45 & 0.12 & 0.15 & 0.00 \\ 0.00 & - 0.30 & 0.07 & 0.00 & - 0.19 \\ 0.00 & 0.00 & 0.00 & - 0.60 & - 0.15 \end{matrix}],

\bar{B} = {[\begin{matrix} 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \end{matrix}]}^{T} .

Each random variable $n_{i}$ is drawn from a standard normal distribution, and the random vector $w$ follows a standard multivariate normal distribution. The initial state is randomly sampled from the range $[- 10, 10]$. The generated data therefore undergo an initial transient before stabilizing. The offline model is trained on 6000 normal samples. In the testing phase, four types of faults are simulated. Each test run contains 2000 samples, with a fault introduced at the 501st sample point. The specific fault scenarios are described below:

Fault 1: A change of matrix ${\bar{A}}_{1}$

${\bar{A}}_{1} = [\begin{matrix} 0.00 & 0.70 & 0.00 & - 0.20 & - 0.15 \\ 0.00 & - 0.30 & 0.75 & 0.00 & 0.00 \\ 0.45 & 0.60 & 0.00 & 0.45 & 0.75 \\ 0.00 & 1.10 & 0.00 & 0.00 & 0.15 \\ 0.00 & 0.15 & - 0.45 & 0.30 & - 0.40 \end{matrix}],$
Fault 2: A change of matrix ${\bar{A}}_{2}$

${\bar{A}}_{2} = [\begin{matrix} 0.00 & 0.10 & - 0.15 & 0.00 & 0.00 \\ 0.30 & - 0.15 & 0.18 & 0.00 & 0.00 \\ - 0.15 & 0.00 & 0.12 & 0.15 & 0.00 \\ 0.00 & - 0.30 & 0.07 & 0.00 & - 0.19 \\ 0.00 & 0.00 & 0.00 & - 0.60 & - 0.15 \end{matrix}],$
Fault 3: Changes from external disturbances

$\bar{B} = {[\begin{matrix} 0.01 & 0.00 & 0.00 & 0.00 & 0.00 \\ 0.01 & 0.00 & 0.06 & 0.00 & 0.00 \end{matrix}]}^{T} .$
Fault 4: A change in the variance of the independent variable

$w = [\begin{matrix} w_{1} \\ w_{2} \\ w_{3} \\ w_{4} \\ w_{5} \end{matrix}] \sim N (μ, Σ), Σ = [\begin{matrix} 0.0004 \\ 0.0004 \\ 0.04 \\ 0.0004 \\ 0.0004 \end{matrix}]$
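The data-generating model and Fault 1 above can be sketched as a short simulation. Several details are assumptions rather than facts from the paper: the two initial lagged states are drawn uniformly from $[-10, 10]$, the random seed is arbitrary, and the stability of the recursion is taken from the paper's statement that the data stabilize and is not verified here. The function name `simulate` is hypothetical.

```python
import numpy as np

A1 = np.array([
    [0.00, 0.70, 0.00, -0.20, -0.15],
    [0.00, -0.30, 0.75, 0.00, 0.00],
    [0.45, 0.60, 0.00, 0.45, 0.75],
    [0.00, 0.60, 0.00, 0.00, 0.15],
    [0.00, 0.15, -0.45, 0.30, -0.40],
])
A2 = np.array([
    [0.00, 0.10, -0.15, 0.00, 0.00],
    [0.30, -0.15, 0.18, 0.00, 0.00],
    [-0.15, -0.45, 0.12, 0.15, 0.00],
    [0.00, -0.30, 0.07, 0.00, -0.19],
    [0.00, 0.00, 0.00, -0.60, -0.15],
])
B = np.zeros((5, 2))            # normal-operation B from the text (all zeros)

A1_FAULT1 = A1.copy()
A1_FAULT1[3, 1] = 1.10          # Fault 1: entry (4, 2) changes from 0.60 to 1.10


def simulate(n_samples, fault_at=None, seed=0):
    """Generate n_samples of x(t); Fault 1 is active from 0-based index fault_at."""
    rng = np.random.default_rng(seed)
    x = np.zeros((n_samples + 2, 5))
    x[:2] = rng.uniform(-10.0, 10.0, size=(2, 5))   # assumed initial lags
    for t in range(2, n_samples + 2):
        faulty = fault_at is not None and (t - 2) >= fault_at
        a1 = A1_FAULT1 if faulty else A1
        n = rng.standard_normal(2)
        drive = np.array([10.0 * n[0] + 10.0, 10.0 * n[1] + 20.0])
        w = rng.standard_normal(5)
        x[t] = a1 @ x[t - 1] + A2 @ x[t - 2] + B @ drive + 0.01 * w
    return x[2:]


# Short illustrative runs; the paper uses 2000 test samples with the fault
# introduced at the 501st sample (0-based index 500).
normal_run = simulate(30)
fault_run = simulate(30, fault_at=10)
```

Because both runs share the same seed, they coincide before the fault index and diverge afterwards, which makes the effect of the parameter change easy to isolate.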

4.1.1. Basic Performance Verification

To evaluate the effectiveness of the proposed knowledge-informed bipartite graph structure, the temporal relationships among variables are assumed to be known and are used as prior knowledge for graph construction. Specifically, if a variable at the current time instant has a known temporal dependency on a variable at a previous time instant, the corresponding edge is retained in the bipartite graph. Otherwise, the corresponding connection is set to zero. In this way, the prior relationships among variables can be encoded into a binary connection matrix, where an entry equal to 1 indicates the existence of a known relationship between two variable nodes, while an entry equal to 0 indicates the absence of a direct relationship.

The structure of the bipartite graph in the experiment is constructed following the method in Ref. [31], except that the adjacency matrix is assumed to be known. Since each variable is generally directly dependent on only a subset of historical variables rather than all variables, the resulting connection matrix is not an all-one matrix but exhibits an explicit sparse structure. This sparse bipartite graph introduces prior dependency information among variables into the model training process, thereby constraining the model connections, reducing unnecessary information propagation between irrelevant variables, and improving the consistency between the model structure and the known system relationships.

Figure 4 shows the convergence curve of the loss function when problem (19) is solved with Algorithm 1. It can be seen that the loss function converges after a finite number of iterations.

To further quantitatively evaluate the similarity between the original dynamic matrix $A$ and the identified dynamic matrix $\hat{A}$, a similarity metric is introduced.

S_{c} = \frac{〈vec (A), vec (\hat{A})〉}{{∥vec (A)∥}_{2} {∥vec (\hat{A})∥}_{2}},

(43)

where $vec (\cdot)$ denotes the vectorization operation and $〈 \cdot, \cdot 〉$ denotes the inner product. A larger $S_{c}$, especially a value close to 1, indicates higher structural similarity between the original and identified dynamic matrices. According to Figure 5, $S_{c} = 87.52 \%$. This value confirms that the dynamic matrix identified by the proposed KBGE is highly consistent with the dynamic matrix of the numerical example, which supports the visual comparison shown in Figure 5 and verifies the effectiveness of the proposed identification method.
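The similarity metric in Equation (43) is the cosine similarity between the vectorized matrices. A minimal sketch is given below; the function name and the small test matrices are illustrative and unrelated to the matrices identified in the paper.

```python
import numpy as np


def matrix_cosine_similarity(A, A_hat):
    """S_c of Eq. (43): cosine of the angle between vec(A) and vec(A_hat)."""
    a = np.asarray(A, dtype=float).ravel()
    b = np.asarray(A_hat, dtype=float).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


A_true = np.array([[0.0, 0.7], [0.45, 0.6]])        # illustrative values
A_scaled = 2.0 * A_true                             # same structure, other scale
A_other = np.array([[1.0, 0.0], [0.0, 0.0]])        # support disjoint from A_true

s_same = matrix_cosine_similarity(A_true, A_true)
s_scaled = matrix_cosine_similarity(A_true, A_scaled)
s_other = matrix_cosine_similarity(A_true, A_other)
```

Note that the metric is invariant to a common scaling of $\hat{A}$, so it measures agreement in structure and relative magnitudes rather than absolute estimation error.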

4.1.2. Ablation Experiment

To validate the efficacy of each component in the proposed KBGE-MSSW framework, ablation studies are carried out. The fault detection rate (FDR) and the false alarm rate (FAR) are adopted as the performance metrics. The number of multi-scale sliding windows is set to $L = 3$. For the multi-scale recursive DISSIM strategy, the window sizes $[500, 800, 1000]$ were selected to represent short-, medium-, and long-term temporal scales, respectively. These window sizes are not designed as an exhaustive parameter grid, but rather as complementary scales to describe process variations with different temporal characteristics. A smaller window is more sensitive to local and abrupt changes, while a larger window provides a more stable representation of long-term operating behavior. The intermediate window balances these two aspects. The interval among the three scales was determined by considering temporal diversity, information completeness, and online computational efficiency. The results are summarized in Table 1.
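For reference, FDR and FAR can be computed from a monitoring statistic, a control limit, and the fault onset as sketched below. The definitions used here (percentage of faulty samples that raise an alarm, and percentage of normal samples that raise an alarm) are the usual ones and are assumed rather than quoted from the paper; the function name and the sample values are hypothetical.

```python
def detection_rates(statistic, control_limit, fault_start):
    """Return (FDR, FAR) in percent.

    statistic     : sequence of monitoring-statistic values
    control_limit : alarm threshold
    fault_start   : 0-based index of the first faulty sample
    """
    alarms = [s > control_limit for s in statistic]
    normal, faulty = alarms[:fault_start], alarms[fault_start:]
    far = 100.0 * sum(normal) / len(normal)   # alarms raised before the fault
    fdr = 100.0 * sum(faulty) / len(faulty)   # alarms raised after the fault
    return fdr, far


# Illustrative series: 4 normal samples followed by 4 faulty samples.
stats = [0.1, 0.2, 1.5, 0.1, 2.0, 0.5, 3.0, 4.0]
fdr, far = detection_rates(stats, control_limit=1.0, fault_start=4)
```

For a test run in which the fault is introduced at the 501st sample, `fault_start` would be 500 under this 0-based convention.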

It should be noted that the proposed KBGE-MSSW framework is developed progressively from the baseline framework SDMEM-BG-SW. Therefore, the ablation study adopts a progressive module replacement strategy. Specifically, SDMEM-BG-SW is first used as the baseline method, where the sparse dynamic matrix estimation method based on bipartite graph (SDMEM-BG) [31] is combined with the conventional single-scale sliding window (SW). Then, the SW strategy in SDMEM-BG-SW is replaced by the multi-scale sliding window strategy, resulting in SDMEM-BG-MSSW. This comparison is used to evaluate the contribution of the proposed multi-scale monitoring mechanism. Subsequently, the data-driven graph construction mechanism in SDMEM-BG-SW is replaced by the proposed knowledge-based graph embedding (KBGE) mechanism, resulting in KBGE-SW. This comparison is used to evaluate the contribution of the knowledge graph modeling mechanism. Finally, both modules are replaced simultaneously to obtain the complete KBGE-MSSW framework, which is used to examine the overall improvement achieved by the joint effect of the two enhanced modules.

Accordingly, the comparisons among SDMEM-BG-SW, SDMEM-BG-MSSW, KBGE-SW, and KBGE-MSSW can reveal the individual and combined contributions of the proposed modules. The performance improvements of MSSW-based variants over their SW-based counterparts, namely SDMEM-BG-MSSW versus SDMEM-BG-SW and KBGE-MSSW versus KBGE-SW, demonstrate that the multi-scale sliding window strategy can enhance monitoring capability by capturing fault-related variations over different temporal scales. Similarly, the superior performance of KBGE-based variants over SDMEM-BG-based variants, namely KBGE-SW versus SDMEM-BG-SW and KBGE-MSSW versus SDMEM-BG-MSSW, indicates that the proposed KBGE mechanism improves the modeling precision by incorporating knowledge-guided graph representation, thereby leading to better fault detection performance.

As shown in Table 1, compared with the baseline SDMEM-BG-SW, SDMEM-BG-MSSW improves the mean FDR from 88.82% to 90.42% for the $S_{\tilde{d}} / T_{\tilde{d}}$ indicator and from 85.60% to 88.38% for the $S_{u} / T_{u}$ indicator, which verifies the effectiveness of the MSSW strategy. Compared with SDMEM-BG-SW, KBGE-SW improves the mean FDR from 88.82% to 89.55% for the $S_{\tilde{d}}$ indicator and from 85.60% to 90.32% for the $S_{u}$ indicator, confirming the contribution of the KBGE-based modeling mechanism. Furthermore, the complete KBGE-MSSW framework achieves the highest mean FDRs of 90.95% and 90.82% for the two indicators, respectively, while maintaining a low FAR. These results demonstrate that the KBGE and MSSW modules are complementary, and that their integration can further improve the overall fault monitoring performance.

4.1.3. Verification of Fault Detection Performance

This section compares the fault detection performance of the proposed method with several established approaches, including PCA, DPCA, TS-PCA, the dynamic RPCA based on a matrix augmentation strategy (D-RPCA), DiPCA, SDiPCA, and SDMEM-BG-SW. In this experiment, the time delay for all methods is uniformly set to $q = 2$, while the remaining hyperparameters follow the settings specified in the original papers. Figure 6 presents the fault detection profiles of KBGE-MSSW for the four fault types, with the corresponding fault detection rate (FDR) and false alarm rate (FAR) values annotated. A comprehensive comparison of the FDR results for all methods is summarized in Table 2.

The results indicate that the proposed KBGE-MSSW achieves the highest overall FDRs among all the compared methods. Specifically, the mean FDRs for both monitoring indicators, $T_{\tilde{d}}$ and $T_{u}$, provided by KBGE-MSSW exceed 90%.

4.2. Experiments on Tennessee Eastman Process

The TEP dataset is a benchmark widely used in optimization, control, fault diagnosis, and process monitoring [37,38]. It comprises 11 manipulated variables and 41 measured variables, resulting in a total of 52 process variables. The TEP simulation encompasses six distinct operating modes and models the closed-loop dynamics of an industrial chemical process with high fidelity. Furthermore, it includes simulations for 21 predefined process faults. A general schematic of the TEP benchmark is provided in Figure 7 (a detailed description can be found in [39]). Due to its comprehensive nature and widespread adoption as a standard testbed, the TEP dataset is employed in this study to evaluate and compare the performance of the proposed method against existing approaches.

For the TE process, the knowledge-informed bipartite graph is constructed by incorporating both process mechanistic knowledge and data-driven relational information. First, the 52 process variables in the TE process are regarded as graph nodes, and an adjacency matrix $\tilde{W}$ is constructed according to the physical connections and process flow information among variables in the TE process model. In this adjacency matrix, if two variables have a direct physical connection or an explicit process-related relationship, the corresponding matrix entry is set to 1. Otherwise, it is set to 0. Therefore, this adjacency matrix characterizes the direct variable relationships determined by mechanistic knowledge in the TE process.

Considering that dynamic time-lag effects may exist among process variables, a time-lag order of q is further introduced, and the adjacency matrix obtained from physical connections is extended to the lagged-variable space. Specifically, based on $\tilde{W}$, the bipartite graph connection matrix is constructed as $K_{1} = [\tilde{W}, \tilde{W}]$, where the two copies of $\tilde{W}$ correspond to the variable connections at different lagged time instants. In this manner, the physical connection information is not only used to describe the relationships among current variables, but is also extended to characterize the dynamic dependencies between current variables and historical variables. However, the adjacency matrix obtained solely from physical connections may not fully capture all dependency relationships in a complex industrial process. To supplement potential variable relationships, another connection matrix $K_{2}$ is constructed according to the graph construction method in Ref. [31]. Finally, the two types of connection information are fused through an element-wise logical OR operation, and the final knowledge-informed bipartite graph connection matrix is obtained as $K = K_{1} \lor K_{2}$. As a result, the constructed knowledge-informed bipartite graph incorporates both the physical connection information of the TE process and supplementary variable relationship information. Since each variable in a practical process is typically associated with only a subset of other variables, the final connection matrix $K$ still maintains a sparse structure.
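The fusion of the two connection matrices can be sketched in a few lines. The example below assumes a lag order of 2, so that $K_1$ consists of two copies of $\tilde{W}$, and uses a hypothetical three-variable system; the matrices `W_phys` and `K_data` are made up and do not represent the TE process.

```python
import numpy as np


def fuse_connection_matrices(W_phys, K_data):
    """Build K1 = [W, W] and return the element-wise OR with K2 as a 0/1 matrix."""
    K1 = np.hstack([W_phys, W_phys]).astype(bool)
    K2 = np.asarray(K_data).astype(bool)
    return np.logical_or(K1, K2).astype(int)


# Hypothetical physical adjacency among 3 variables.
W_phys = np.array([
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
])

# Hypothetical data-driven connections in the lagged-variable space (3 x 6).
K_data = np.zeros((3, 6), dtype=int)
K_data[0, 0] = 1
K_data[2, 5] = 1

K = fuse_connection_matrices(W_phys, K_data)
sparsity = 1.0 - K.sum() / K.size
```

The logical OR keeps every edge that appears in either source, so the fused matrix is never sparser than either input; in this toy case 8 of the 18 entries are active.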

In this experiment, classical dynamic process monitoring methods were used for comparison. The $T^{2}$ and Q indicators are used as monitoring indicators in TS-PCA, D-RPCA, DiPCA, and SDiPCA. The experimental results are shown in Table 3. A synthesis of the FDRs in Table 3 and the mean FAR results in Table 4 reveals a clear trade-off between detection capability and false alarm control across the different methods. The traditional statistical method TS-PCA achieves a relatively high FDR for most significant faults (mean FDR around 82–83%), but it also exhibits high FARs (13.78% for the $T^{2}$ indicator and 15.12% for the Q indicator), indicating that its control limits are frequently exceeded even under normal operation. In contrast, D-RPCA maintains a very low false alarm level (as low as 0.60%) at the cost of significantly compromised detection capability (mean FDR only about 55%), which may result in substantial missed detections in practice. DiPCA and SDiPCA strike a preliminary balance between detection (approximately 68–70% FDR) and false alarms (approximately 3–5% FAR), yet their capability remains limited for complex or subtle faults (e.g., Faults 3, 9, and 15). SDMEM-BG-SW significantly enhances the detection of dynamic faults while maintaining extremely low false alarm rates (0.00% for $T_{\tilde{d}}$ and 0.15% for $T_{u}$), showing particularly strong performance for Faults 5–7, 10, 12, 15, 16, and 19. The proposed KBGE-MSSW method not only retains the advantage of low FARs but also achieves the best overall detection performance with the $T_{u}$ statistic (mean FDR of 96.97%). It demonstrates significant improvement in detecting faults that are challenging for traditional methods (e.g., Faults 3, 9, and 15). This indicates that the introduced knowledge enhancement and MSSW mechanisms effectively improve the model’s discriminative ability and robustness for complex fault patterns. In summary, the proposed KBGE-MSSW method achieves a superior balance between high FDRs and low FARs, showing promising potential for engineering applications.

It is worth noting that several subtle or incipient faults in the TEP dataset, such as Fault 3, Fault 9, and Fault 15, remain challenging for process monitoring methods due to their weak disturbance amplitudes and strong overlap with normal operating dynamics. As shown in Table 3, conventional methods, including TS-PCA, D-RPCA, DiPCA, and SDiPCA, generally achieve very low FDRs for these faults. For example, the FDRs of most traditional dynamic monitoring methods for Fault 9 are lower than 5%, indicating that the fault-related variations are difficult to distinguish from normal process fluctuations.

Compared with existing methods, the proposed KBGE-MSSW method significantly improves the detection performance for these challenging faults under the $T_{u}$ statistic. Specifically, the FDRs for Fault 3, Fault 9, and Fault 15 reach 95.49%, 87.09%, and 97.12%, respectively. This demonstrates that the introduced knowledge-guided graph-embedding mechanism and multi-scale monitoring strategy can effectively enhance the discriminative capability for complex and weak fault patterns.

Nevertheless, it can also be observed that the detection performance under the $T_{\tilde{d}}$ statistic remains limited for certain incipient faults. For instance, the FDRs of Fault 3 and Fault 9 under $T_{\tilde{d}}$ are still close to zero. This indicates that although the proposed method can capture subtle abnormal information in the residual-related subspace, some slowly varying fault characteristics may still be masked by normal dynamic variations in the dominant dynamic subspace.

The above limitation mainly originates from two aspects. First, the disturbance amplitudes of incipient faults are extremely weak, resulting in significant overlap between faulty and normal operating distributions. Second, although the MSSW mechanism provides multi-scale temporal information, the current framework still relies on fixed window-based dynamic modeling, which may not fully characterize long-term slow-varying fault evolution patterns.

5. Conclusions

Although the proposed KBGE-MSSW demonstrates promising performance in simulation environments, several limitations should be acknowledged with caution. First, the construction of the knowledge-informed bipartite graph relies on available prior mechanistic knowledge. If such knowledge is incomplete, inaccurate, or difficult to obtain, the modeling performance and interpretability of the proposed method may be affected. Second, although the proposed framework is effective for the numerical example and the TEP simulation model, its scalability to larger-scale industrial systems with more variables, stronger coupling relationships, and more complex operating modes requires further investigation.

In addition, the performance of the multi-scale recursive monitoring strategy may be influenced by window selection and model parameter settings. Different window lengths and model parameters may affect the balance among detection sensitivity, monitoring robustness, convergence behavior, and computational efficiency. Although the parameter settings adopted in this study follow established practices and recommendations reported in the literature, a comprehensive sensitivity analysis of key parameters has not been conducted. Therefore, systematic parameter selection, robustness evaluation, sensitivity analysis, and adaptive parameter tuning strategies will be further investigated in future work. Furthermore, the selected multi-scale window widths are determined according to the operating characteristics and cycle information of the monitored process. More systematic investigations into optimal window selection and multi-scale configuration strategies remain important topics for future research.

Although the FLOPs analysis demonstrates the computational advantage of the proposed recursive monitoring framework over its non-recursive counterpart, a comprehensive evaluation of empirical execution time under different hardware and software environments has not been conducted in the current study. Since practical online monitoring performance is influenced not only by algorithmic complexity but also by implementation details, numerical libraries, and computing platforms, systematic runtime comparisons and deployment-oriented evaluations will be investigated in future work. The computational cost of graph embedding and recursive multi-scale monitoring should also be carefully considered when applying the proposed method to large-scale or real-time industrial scenarios.

Furthermore, the present study mainly validates the proposed method using a numerical example and the TEP simulation model, which provide controllable and reproducible testing conditions. However, real industrial processes may involve measurement noise, missing data, nonstationary operating conditions, sensor drift, and unrecorded disturbances. These factors may introduce additional challenges to the stability, robustness, and generalization capability of the proposed framework. Therefore, future work will incorporate real industrial data to further evaluate the practical applicability of KBGE-MSSW. Moreover, data preprocessing, abnormal data correction, online updating mechanisms, and fault localization and traceability strategies will be further investigated to improve the applicability of the proposed framework in complex industrial environments.

Author Contributions

Conceptualization, S.Z.; methodology, S.Z.; software, Y.H.; validation, Y.H.; formal analysis, Y.H.; investigation, Y.H.; resources, Y.H. and S.Z.; data curation, Y.H.; writing—original draft preparation, Y.H.; writing—review and editing, S.Z.; visualization, Y.H.; supervision, S.Z.; project administration, S.Z.; funding acquisition, S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DPCA	dynamic principal component analysis
TS-PCA	two-step principal component analysis
AR	autoregressive
DiPCA	dynamic inner principal component analysis
SDiPCA	sparse dynamic inner principal component analysis
LASSO	least absolute shrinkage and selection operator
DISSIM	dissimilarity analysis
KBGE	knowledge-informed bipartite graph embedding
MSSW	multi-scale sliding window
SDMEM-BG	sparse dynamic matrix estimation method based on bipartite graph
SW	sliding window
FDR	fault detection rate
FAR	false alarm rate

Figure 1. A schematic diagram of the knowledge-informed bipartite graph of a dynamic system.

Figure 2. A schematic diagram of the multi-scale recursive sliding window monitoring method.

Figure 3. A schematic illustration of the proposed modeling and monitoring methods.

Figure 4. The convergence curve of the loss function.

Figure 5. Visualization of dynamic matrix identification results.

Figure 6. The fault detection results of different methods.

Figure 7. A general overview of the TEP benchmark model.

Table 1. Ablation study results based on fault detection rates (unit: %).

Methods	SDMEM-BG-SW				SDMEM-BG-MSSW
Indicator	$S_{\tilde{d}}$		$S_{u}$		$T_{\tilde{d}}$		$T_{u}$
Metric	FDR	FAR	FDR	FAR	FDR	FAR	FDR	FAR
Fault 1	82.78	0.00	68.42	0.00	88.25	0.00	80.84	0.38
Fault 2	79.11	0.00	78.17	0.00	79.31	0.00	77.30	0.36
Fault 3	100.00	0.00	100.00	0.20	100.00	0.00	100.00	0.38
Fault 4	93.39	0.00	95.79	0.00	94.13	0.00	95.39	0.00
Mean Value	88.82	0.00	85.60	0.05	90.42	0.00	88.38	0.28
Methods	KBGE-SW				KBGE-MSSW
Indicator	$S_{\tilde{d}}$		$S_{u}$		$T_{\tilde{d}}$		$T_{u}$
Metric	FDR	FAR	FDR	FAR	FDR	FAR	FDR	FAR
Fault 1	82.98	0.00	88.58	0.00	87.72	0.00	89.32	0.00
Fault 2	79.17	0.00	78.84	0.00	79.44	0.00	79.24	0.00
Fault 3	100.00	0.00	100.00	0.20	100.00	0.00	100.00	0.20
Fault 4	96.06	0.00	93.86	0.00	96.66	0.00	94.73	0.00
Mean Value	89.55	0.00	90.32	0.05	90.95	0.00	90.82	0.05

The highest FDR corresponding to each type of fault is bolded.
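For reference, FDR and FAR values of the kind reported in the tables can be obtained from a sequence of alarm decisions as sketched below. The sketch assumes the usual definitions (FDR: percentage of faulty samples that raise an alarm; FAR: percentage of normal samples that raise an alarm) and a single fault onset index. The statistic values and the control limit are hypothetical and are not taken from the experiments.

```python
def detection_rates(alarms, fault_start):
    """Return (FDR, FAR) in percent.

    alarms: one boolean per sample; samples with index >= fault_start are faulty.
    """
    normal = alarms[:fault_start]
    faulty = alarms[fault_start:]
    far = 100.0 * sum(normal) / len(normal) if normal else float("nan")
    fdr = 100.0 * sum(faulty) / len(faulty) if faulty else float("nan")
    return fdr, far


# Hypothetical monitoring statistic and control limit, for illustration only.
stats = [0.2, 0.4, 1.3, 0.1, 0.3, 1.5, 2.0, 0.8, 1.7, 2.2]
limit = 1.0
alarms = [s > limit for s in stats]

print(detection_rates(alarms, fault_start=5))  # (80.0, 20.0)
```

Here one of the five normal samples and four of the five faulty samples exceed the limit, which gives an FAR of 20% and an FDR of 80%.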

Table 2. Comparison of fault detection rates using different methods (unit: %).

Methods	PCA		DPCA		TS-PCA		D-RPCA
Indicator	$T^{2}$	Q	$T^{2}$	Q	$T^{2}$	Q	$T^{2}$	Q
Fault 1	0.00	0.00	0.00	0.00	23.70	8.34	8.68	0.00
Fault 2	8.53	8.27	15.09	7.74	75.83	73.77	66.36	0.00
Fault 3	0.00	0.00	0.00	0.00	99.93	93.59	100.00	0.00
Fault 4	0.00	0.00	0.00	0.00	71.63	64.02	70.29	0.00
Mean Value	2.13	2.07	3.77	1.94	67.77	59.93	61.33	0.00
Methods	DiPCA		SDiPCA		SDMEM-BG-SW		KBGE-MSSW
Indicator	$T^{2}$	Q	$T^{2}$	Q	$T_{\tilde{d}}$	$T_u$	$T_{\tilde{d}}$	$T_u$
Fault 1	27.33	1.07	27.73	0.87	82.78	68.42	87.72	89.32
Fault 2	77.40	61.93	79.60	67.87	79.11	78.17	79.44	79.24
Fault 3	99.80	77.73	99.87	86.67	100.00	100.00	100.00	100.00
Fault 4	75.73	19.00	89.27	41.07	93.39	95.79	96.66	94.73
Mean Value	70.07	39.93	74.12	49.12	88.82	85.60	90.95	90.82

The highest FDR corresponding to each type of fault is bolded.

Table 3. Comparison of fault detection rates of different methods on the Tennessee Eastman process dataset (unit: %).

Methods	TS-PCA		D-RPCA		DiPCA		SDiPCA		SDMEM-BG-SW		KBGE-MSSW
Indicators	$T^{2}$	$Q$	$T^{2}$	$Q$	$T^{2}$	$Q$	$T^{2}$	$Q$	$T_{d}$	$T_{u}$	$T_{d}$	$T_{u}$
fault1	100.00	100.00	94.12	99.37	97.38	99.75	97.38	99.63	72.43	99.75	74.19	99.87
fault2	99.62	99.12	93.49	98.37	97.63	98.13	98.13	97.75	64.91	98.12	69.80	98.25
fault3	16.67	22.18	1.75	1.38	4.50	4.63	5.75	3.88	6.52	94.74	0.00	95.49
fault4	99.87	100.00	100.00	24.16	99.38	99.50	99.63	99.13	32.46	99.62	37.97	99.75
fault5	100.00	100.00	6.88	23.28	99.75	99.63	99.75	99.75	59.65	100.00	64.54	100.00
fault6	100.00	100.00	100.00	100.00	99.75	99.75	99.75	99.75	98.87	100.00	98.87	100.00
fault7	99.37	100.00	99.87	100.00	99.75	75.88	99.75	99.50	98.25	100.00	85.34	100.00
fault8	99.12	99.25	26.16	97.37	83.50	97.50	94.63	97.13	96.99	97.87	96.87	97.99
fault9	15.54	17.42	2.13	0.88	4.13	3.75	3.75	3.75	0.00	85.09	0.00	87.09
fault10	87.97	92.48	3.13	24.66	55.75	50.50	55.75	48.38	95.61	96.99	95.61	96.99
fault11	82.96	88.85	76.85	41.55	75.75	68.00	75.50	71.25	96.87	98.87	96.99	99.00
fault12	99.87	99.87	62.70	97.50	97.63	96.63	98.88	96.75	98.12	99.87	97.74	99.87
fault13	96.24	96.99	64.33	94.37	93.63	95.75	94.38	95.38	93.73	95.61	93.73	95.61
fault14	100.00	97.99	90.36	100.00	83.50	97.25	83.25	98.13	99.25	99.75	99.25	99.87
fault15	19.55	19.42	1.50	4.88	4.25	5.13	2.38	4.13	11.15	97.37	8.40	97.12
fault16	91.48	89.72	4.26	16.52	50.13	46.13	53.38	45.38	97.87	98.87	97.87	98.87
fault17	97.87	98.25	94.74	79.60	86.88	96.75	85.63	96.50	96.87	97.49	96.74	97.62
fault18	92.11	92.23	89.86	89.24	89.50	90.50	89.50	90.38	88.97	90.73	88.97	91.85
fault19	77.44	90.23	8.51	0.88	28.38	18.25	29.63	16.75	94.99	98.62	95.24	98.62
fault20	91.73	90.60	29.91	36.30	58.63	63.25	58.00	65.13	90.10	91.48	89.97	91.60
fault21	54.76	57.39	10.01	31.79	31.13	23.38	43.00	27.75	53.88	90.60	57.02	90.98
Mean Value	82.01	83.43	50.50	55.34	68.61	68.10	69.89	69.34	73.69	96.74	73.58	96.97

The highest FDR corresponding to each type of fault is bolded.

Table 4. Comparison of average false alarm rates of different methods on the Tennessee Eastman process dataset (unit: %).

Methods	TS-PCA		D-RPCA		DiPCA		SDiPCA		SDMEM-BG-SW		KBGE-MSSW
Indicators	$T^{2}$	$Q$	$T^{2}$	$Q$	$T^{2}$	$Q$	$T^{2}$	$Q$	$T_{d}$	$T_{u}$	$T_{d}$	$T_{u}$
Mean FARs	13.78	15.12	1.64	0.60	3.75	4.67	3.10	4.17	0.00	0.15	0.00	0.15

Indicators with a mean FAR over 5% are bolded.
