A Novel Dynamic Process Monitoring Algorithm: Dynamic Orthonormal Subspace Analysis

Weichen Hao; Shan Lu; Zhijiang Lou; Yonghui Wang; Xin Jin; Syamsunur Deprizon

doi:10.3390/pr11071935

¹

School of Information and Control Engineering, Liaoning Petrochemical University, Fushun 113005, China

²

Institute of Intelligence Science and Engineering, Shenzhen Polytechnic, Shenzhen 518055, China

³

Faculty of Engineering, Technology & Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia

⁴

Postgraduate Department, Universitas Bina Darma, Palembang 30111, Indonesia

Processes 2023, 11(7), 1935; https://doi.org/10.3390/pr11071935

This article belongs to the Section Process Control and Monitoring

Abstract

Orthonormal subspace analysis (OSA) is proposed for handling the subspace decomposition issue and the principal component selection issue in traditional key performance indicator (KPI)-related process monitoring methods such as partial least squares (PLS) and canonical correlation analysis (CCA). However, it is not appropriate to apply the static OSA algorithm to a dynamic process since OSA pays no attention to the auto-correlation relationships in variables. Therefore, a novel dynamic OSA (DOSA) algorithm is proposed to capture the auto-correlative behavior of process variables on the basis of monitoring KPIs accurately. This study also discusses whether it is necessary to expand the dimension of both the process variables matrix and the KPI matrix in DOSA. The test results in a mathematical model and the Tennessee Eastman (TE) process show that DOSA can address the dynamic issue and retain the advantages of OSA.

Keywords:

process monitoring; key performance indicators; orthonormal subspace analysis; dynamic process

1. Introduction

Process monitoring and fault detection are two important aspects of process systems engineering because they are key to ensuring the safety and normal operation of industrial processes [1]. As such, traditional data-driven algorithms such as principal component analysis (PCA) [2] and independent component analysis (ICA) [3] have been proposed to monitor processes and to improve product quality. PCA and ICA can effectively detect faults in a process. However, a modern industrial plant contains a large number of widely distributed controllers, sensors, and actuators, and not all of their data need to be analyzed [4,5]. That is to say, not all process variables directly affect safety and product quality. The variables that are highly relevant to product quality and economic benefits are called key performance indicators (KPIs), and their role should be emphasized in process monitoring [6,7]. Both PCA and ICA monitor KPI-related and KPI-unrelated components simultaneously, and they perform poorly in detecting faults in KPI-related components because the fault information may be submerged in the disturbances of numerous KPI-unrelated components. As such, KPI-related process monitoring algorithms such as partial least squares (PLS) [8] and canonical correlation analysis (CCA) [9] have developed rapidly in recent decades, and this development is essential for ensuring production safety and obtaining superior operation performance.

However, there are still some drawbacks to these traditional KPI algorithms. First, the residual subspace calculated by the PLS algorithm is non-orthogonal to the principal components (PCs) subspace, which means that some KPI-related information may leak into the residual spaces [10,11]. Second, the CCA algorithm requires KPIs to be available during both offline training and online monitoring stages as it uses KPI variables to construct indices [12,13]. Third, both PLS and CCA algorithms are unable to extract PCs [14,15].

To address the above issues in traditional KPI-related algorithms, Lou et al. proposed orthonormal subspace analysis (OSA) [16]. OSA can divide the process data and KPIs into three orthonormal subspaces, namely, subspaces of KPI-related components, KPI-unrelated components in process data, and process-unrelated components in KPIs. Furthermore, the cumulative percent variance method is used to select the number of PCs in an OSA algorithm. Due to the ability of the OSA algorithm to independently monitor each subspace, the OSA algorithm is not limited by KPIs during the offline and online stages.

The original OSA was proposed for addressing the monitoring issues in static process problems, so it assumes that the observations are time-independent. However, dynamic features widely exist in most industrial processes, and, hence, the auto-correlation relationships in variables interfere with the extraction of the KPI-related information [17,18]. Therefore, the subspaces obtained by the OSA algorithm are not orthonormal in dynamic processes.

The “time lag shift” method, which lists the historical data as additional variables to the original variable set, is an effective measure for handling the dynamic issue, and it has been adopted in the PLS and CCA algorithms, i.e., the dynamic PLS (DPLS) and dynamic CCA (DCCA) algorithms. Therefore, in this paper, the “time lag shift” method is also combined with the OSA algorithm, named the dynamic OSA (DOSA) algorithm, and is applied to the Tennessee Eastman (TE) process to illustrate its efficiency.

The contributions of this study are as follows. First, this study proposes DOSA to deal with the low detection rates caused by process dynamics. DOSA can determine whether a fault in a dynamic process originates from the KPI-related process variables, the KPI-unrelated process variables, or the measurement of the KPIs. Second, this study discusses whether it is necessary to expand the dimension of both the process variables matrix and the KPI matrix, with a view to reducing the computation. At the same time, a new method to select the time lag number in the “time lag shift” structure is proposed. Additionally, we analyze the impact of the sampling period on DOSA. Third, we place an emphasis on the real-time nature of information and design new monitoring indices. Finally, this study compares the detection rates of the OSA, DOSA, DPLS, and DCCA algorithms.

The remainder of this paper is organized into four sections. Section 2 discusses the classical OSA algorithm and the “time lag shift” method. Section 3 proposes DOSA for dynamic process monitoring. Section 4 compares the DOSA algorithm with other KPI-related algorithms based on TE process testing. Section 5 reviews the contributions of this work.

2. Methods

2.1. Orthonormal Subspace Analysis

Here, we take $X \in R^{n \times s}$ as the process variables matrix (where $n$ is the number of samples, and $s$ is the number of process variables), and the standard PLS identification technique introduces the KPI matrix as $Y \in R^{n \times r}$ (where $r$ is the number of KPIs). OSA decomposes both $X$ and $Y$ into the following bilinear terms:

$\begin{cases} X = T_{com} \Xi_{X}^{T} + E_{OSA} \\ Y = T_{com} \Xi_{Y}^{T} + F_{OSA} \end{cases},$

(1)

where $T_{com} \in R^{n \times \phi}$ ($\phi$ is the number of principal components) contains the common latent variables shared by $X$ and $Y$; $\Xi_{X} \in R^{s \times \phi}$ and $\Xi_{Y} \in R^{r \times \phi}$ are the transformation matrices; and $E_{OSA} \in R^{n \times s}$ and $F_{OSA} \in R^{n \times r}$ are the residual matrices.

OSA, along with PLS and CCA, is therefore called a ‘KPI-related algorithm’. As opposed to PLS and CCA, the subspaces extracted by OSA have been proven to be orthogonal [16]. That is to say, $T_{com}$, $E_{OSA}$, and $F_{OSA}$ in Equation (1) are mutually orthogonal, and, most importantly, they can be monitored independently.

2.2. The “Time Lag Shift” Method

The OSA algorithm described in Section 2.1 implicitly assumes that the current observations are statistically independent of the historical observations [19,20]. That is to say, OSA only considers the correlation between variables at the same time and does not consider the mutual influence of variables at different times. However, most data from industrial processes show some degree of dynamic characteristics; that is, the data sampled at different times are correlated. For such a process, the static OSA algorithm is not applicable.

The most common method to address such a problem is to use an autoregressive (AR) model to describe the dynamic characteristics. Similarly, the OSA algorithm can be extended to take the serial correlations into account by augmenting each observation vector, $X(t) \in R^{1 \times s}$ or $Y(t) \in R^{1 \times r}$, at the current time $t$ with the previous $l_{x}$ or $l_{y}$ observations in the following manner [21]:

$\begin{cases} \tilde{X}(t) = [X(t), X(t - 1), \dots, X(t - l_{x})] \in R^{1 \times [(l_{x} + 1) \times s]} \\ \tilde{Y}(t) = [Y(t), Y(t - 1), \dots, Y(t - l_{y})] \in R^{1 \times [(l_{y} + 1) \times r]} \end{cases},$

(2)

As shown in Equation (2), the first $s$ columns of $\tilde{X}(t)$ and the first $r$ columns of $\tilde{Y}(t)$ represent the data at the current time, and the rest represent the data at past times. For $n$ sampling times, one can obtain the augmented matrices $\tilde{X} \in R^{n \times [(l_{x} + 1) \times s]}$ and $\tilde{Y} \in R^{n \times [(l_{y} + 1) \times r]}$.

By performing the dimension expansion in Equation (2) on the data matrices, the static OSA method can be used to analyze the autocorrelation, cross-correlation, and lagged correlation among the data simultaneously. That is to say, $\tilde{X}$ and $\tilde{Y}$ will be decomposed by OSA. More details can be found in Section 3.
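The augmentation in Equation (2) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `time_lag_shift` and the toy data are assumptions, and the sketch keeps only the rows that have a full history, so the result has `n - lag` rows rather than `n`.

```python
import numpy as np

def time_lag_shift(data: np.ndarray, lag: int) -> np.ndarray:
    """Stack each row with its `lag` predecessors: [d(t), d(t-1), ..., d(t-lag)]."""
    n = data.shape[0]
    if lag < 0 or lag >= n:
        raise ValueError("lag must satisfy 0 <= lag < number of samples")
    # Block j holds the observations delayed by j steps, aligned on rows t = lag .. n-1.
    blocks = [data[lag - j : n - j] for j in range(lag + 1)]
    return np.hstack(blocks)

# Toy example (hypothetical data): 6 samples of s = 2 variables, lag l_x = 2.
X = np.arange(12, dtype=float).reshape(6, 2)
X_aug = time_lag_shift(X, lag=2)
print(X_aug.shape)  # (4, 6)
print(X_aug[0])     # [4. 5. 2. 3. 0. 1.]
```

The first `s` columns of the result are the current observations, which matches the column layout described for Equation (2).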

3. Dynamic Orthonormal Subspace Analysis

3.1. Determination of the Lag Number

As the traditional lag determination methods, such as the Akaike information criterion (AIC) [22] and the Bayesian information criterion (BIC) [23], are only suitable for a steady state, a new lag determination method should be proposed for DOSA.

Suppose the relationship between the data at the current time and the past time is as follows:

$\begin{cases} X(t) = X(t - 1) A_{1} + X(t - 2) A_{2} + \dots + X(t - l_{x}) A_{l_{x}} + D_{x}(t) = \bar{X}(t) \bar{A} + D_{x}(t) \\ Y(t) = Y(t - 1) B_{1} + Y(t - 2) B_{2} + \dots + Y(t - l_{y}) B_{l_{y}} + D_{y}(t) = \bar{Y}(t) \bar{B} + D_{y}(t) \end{cases},$

(3)

where $\bar{X}(t) = [X(t - 1), X(t - 2), \dots, X(t - l_{x})] \in R^{1 \times (l_{x} \times s)}$, $\bar{Y}(t) = [Y(t - 1), Y(t - 2), \dots, Y(t - l_{y})] \in R^{1 \times (l_{y} \times r)}$, $\bar{A} = [A_{1}^{T}, A_{2}^{T}, \dots, A_{l_{x}}^{T}]^{T} \in R^{(l_{x} \times s) \times s}$, and $\bar{B} = [B_{1}^{T}, B_{2}^{T}, \dots, B_{l_{y}}^{T}]^{T} \in R^{(l_{y} \times r) \times r}$. $D_{x}(t) \in R^{1 \times s}$ and $D_{y}(t) \in R^{1 \times r}$ denote the disturbances introduced at each time, which are statistically independent of the past data. The coefficient matrices $\bar{A}$ and $\bar{B}$ can be estimated by the least squares method as follows:

$\begin{cases} \bar{A} = [\bar{X}^{T}(t) \bar{X}(t)]^{-1} \bar{X}^{T}(t) X(t) \\ \bar{B} = [\bar{Y}^{T}(t) \bar{Y}(t)]^{-1} \bar{Y}^{T}(t) Y(t) \end{cases}.$

(4)

Therefore, $D_{x}(t)$ and $D_{y}(t)$ can be estimated as follows:

$\begin{cases} D_{x}(t) = X(t) - \bar{X}(t) \bar{A} = X(t) - \bar{X}(t) [\bar{X}^{T}(t) \bar{X}(t)]^{-1} \bar{X}^{T}(t) X(t) \\ D_{y}(t) = Y(t) - \bar{Y}(t) \bar{B} = Y(t) - \bar{Y}(t) [\bar{Y}^{T}(t) \bar{Y}(t)]^{-1} \bar{Y}^{T}(t) Y(t) \end{cases}.$

(5)

Then, the optimal lag number is the one that brings the following indices

$\begin{cases} Lag_{x} = \| \sum_{t = 1}^{n} D_{x}(t) \|^{2} = \| X - \bar{X} [\bar{X}^{T} \bar{X}]^{-1} \bar{X}^{T} X \|^{2} \\ Lag_{y} = \| \sum_{t = 1}^{n} D_{y}(t) \|^{2} = \| Y - \bar{Y} [\bar{Y}^{T} \bar{Y}]^{-1} \bar{Y}^{T} Y \|^{2} \end{cases}$

(6)

to their minimum, such that the indices do not change significantly if the time lag is increased further.

As opposed to $X(t)$ and $Y(t)$, $D_{x}(t)$ and $D_{y}(t)$ are time-uncorrelated and independent of the initial states of $X(t)$ and $Y(t)$, so they can be applied to a dynamic process in both steady and unsteady states.
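As a rough illustration of the index in Equation (6), the sketch below fits the AR model by least squares and returns the squared Frobenius norm of the residual. The function name `lag_index`, the `max_lag` argument (used so that every candidate lag is scored on the same rows), and the simulated second-order series are all assumptions made for this example, not details from the paper.

```python
import numpy as np

def lag_index(data: np.ndarray, lag: int, max_lag: int) -> float:
    """Squared Frobenius norm of the least-squares AR(lag) residual.

    All lags are evaluated on the rows t = max_lag .. n-1 so that the
    values for different lags are directly comparable.
    """
    n = data.shape[0]
    target = data[max_lag:]
    if lag == 0:
        return float(np.sum(target ** 2))  # no regressors: residual is the data
    # Columns: [d(t-1), d(t-2), ..., d(t-lag)]
    past = np.hstack([data[max_lag - j : n - j] for j in range(1, lag + 1)])
    coef, *_ = np.linalg.lstsq(past, target, rcond=None)
    residual = target - past @ coef
    return float(np.sum(residual ** 2))

# Hypothetical test signal: a scalar AR(2) process.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros((n, 1))
noise = rng.standard_normal(n)
for t in range(2, n):
    x[t, 0] = 0.5 * x[t - 1, 0] + 0.3 * x[t - 2, 0] + noise[t]

values = [lag_index(x, lag, max_lag=4) for lag in range(5)]
print(values)
```

For this simulated series the index should drop noticeably up to a lag of 2 and then flatten, which is the behaviour the selection rule relies on.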

Additionally, we define an index to quantify the statement ‘the value of $Lag_{x}$ or $Lag_{y}$ would not change significantly’, as shown in Equation (7):

$RC\% = \frac{| Lag_{i - 1} - Lag_{i} |}{Lag_{i - 1}} \times 100\%,$

(7)

where $Lag_{i}$ represents the value of $Lag_{x}$ or $Lag_{y}$ when the lag number is $l_{x}$ ($l_{x} > 1$) or $l_{y}$ ($l_{y} > 1$), and $Lag_{i - 1}$ represents the value of $Lag_{x}$ or $Lag_{y}$ when the lag number is $l_{x} - 1$ or $l_{y} - 1$. If the value of $RC\%$ begins to be less than 5%, we say that ‘the value of $Lag_{x}$ or $Lag_{y}$ would not change significantly’.
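The stopping rule based on Equation (7) can be written as a short helper. This is a sketch under one reading of the rule: the lag is taken to be the last one before the first step whose relative change falls below the threshold. The function name `select_lag` and the example values are invented for illustration and are not taken from the paper's tables.

```python
def select_lag(lag_values, threshold=5.0):
    """Return the lag after which RC% first falls below `threshold`.

    `lag_values[i]` is the index value (Lag_x or Lag_y) for lag number i.
    """
    for i in range(1, len(lag_values)):
        rc = abs(lag_values[i - 1] - lag_values[i]) / lag_values[i - 1] * 100.0
        if rc < threshold:
            return i - 1  # going from lag i-1 to lag i no longer helps
    return len(lag_values) - 1  # never flattened within the tested range

# Made-up index values for lags 0..6 (illustrative only).
example = [100.0, 60.0, 35.0, 20.0, 19.6, 19.5, 19.4]
print(select_lag(example))  # 3
```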

3.2. DOSA Procedure

Step 1. The “Time Lag Shift” method mentioned in Section 2.2. Calculate the lag number of $l_{x}$ and $l_{y}$ in Equation (6). Then, augment $X (t)$ and $Y (t)$ with the previous observations shown in Equation (2). In doing so, we can obtain the augmented matrix $\tilde{X}$ and $\tilde{Y}$ with n samples.
Step 2. Traditional OSA mentioned in Section 2.1.

(a): Calculate the Y-related component $X_{OSA} \in R^{n \times [(l_{x} + 1) \times s]}$ and the X-related component $Y_{OSA} \in R^{n \times [(l_{y} + 1) \times r]}$ using Equation (8). $X_{OSA}$ and $Y_{OSA}$ are both called ‘the common component’ and are shown to be equal in reference [16], as shown below:

${\begin{cases} X_{O S A} = \tilde{Y} {({\tilde{Y}}^{T} \tilde{Y})}^{- 1} {\tilde{Y}}^{T} \tilde{X} \\ Y_{O S A} = \tilde{X} {({\tilde{X}}^{T} \tilde{X})}^{- 1} {\tilde{X}}^{T} \tilde{Y} \end{cases} .$

(8)

We tend to focus on process variables related to KPIs in industrial processes. By extracting common components and monitoring them (Step 3), one can know whether there are faults in the variables related to KPIs.

(b): Calculate the non-Y-related component $E_{OSA} \in R^{n \times [(l_{x} + 1) \times s]}$ and the non-X-related component $F_{OSA} \in R^{n \times [(l_{y} + 1) \times r]}$ as

${\begin{cases} E_{O S A} = \tilde{X} - X_{O S A} \\ F_{O S A} = \tilde{Y} - Y_{O S A} \end{cases},$

(9)

where $E_{OSA}$ and $F_{OSA}$ are both called ‘the unique component’. By extracting and monitoring the unique components (Step 3), one can know whether there are faults in the variables unrelated to KPIs.

(c): Extract the PCs in $X_{OSA}$ using the PCA decomposition method because the variables in $X_{OSA}$ might be highly correlated:

${\begin{cases} X_{O S A} = T_{c o m x} P_{c o m}^{T} + E_{f} \\ T_{c o m x} = X_{O S A} P_{c o m} \end{cases},$

(10)

where $T_{c o m x} \in R^{n \times k}$ represents the score matrix of the common component; $P_{c o m} \in R^{[(l_{x} + 1) \times s] \times k}$ is the loading matrix of the common component; $E_{f} \in R^{n \times [(l_{x} + 1) \times s]}$ is the residual matrix; and k is the number of PCs. In this step, the PCs are selected by using the CPV method, and the threshold value follows the PCA criterion, e.g., 85%.

In theory, the score matrices of the common components $X_{OSA}$ and $Y_{OSA}$ are equal unless there is something wrong with the relationship between $X$ and $Y$. We use the sum of squares of the difference between the score matrices to monitor whether there are faults in the relationship between $X$ and $Y$ (Step 3). Similarly to Equation (10), the score matrix of the common component $Y_{OSA}$ is $T_{comy} = Y_{OSA} P_{com}$.

Step 3. Monitoring indices calculation.

Taking into account the real-time nature of the information, PCA monitoring is not performed directly on $X_{OSA}$, $E_{OSA}$, and $F_{OSA}$ because these components contain a great amount of information from past times. The indices are calculated as follows:

(a): The first $s$ columns of $X_{OSA}$ are monitored by the PCA approach and can then be used to generate the $T_{C}^{2}$ and $SPE_{C}$ indices. That is to say, we only monitor the data at the current time.
(b): Similarly, the first $s$ columns of $E_{O S A}$ and the first $r$ columns of $F_{O S A}$ can be monitored by the PCA approach and can then be used to generate the indices $T_{E}^{2}$ , $T_{F}^{2}$ , $S P E_{E}$ , and $S P E_{F}$ .
(c): Furthermore, if there is something wrong with the relationship between X and Y, there will be significant differences between the score matrices $T_{c o m x}$ and $T_{c o m y}$ . Therefore, the following index can be used to test the abnormal relationship:

$S P E_{X Y} = (T_{c o m x} - T_{c o m y}) {(T_{c o m x} - T_{c o m y})}^{T} .$

(11)
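The projections in Equations (8) and (9) and the CPV-based PCA of Equation (10) can be sketched as follows. This is an illustrative sketch only: the helper names, the use of the pseudo-inverse (which reduces to the formula in Equation (8) when the matrix has full column rank), the assumption that the data are already normalized, and the random test data are all choices made here rather than details given in the paper. The $SPE_{XY}$ index of Equation (11) is not covered.

```python
import numpy as np

def projection(M: np.ndarray) -> np.ndarray:
    """Orthogonal projector onto the column space of M."""
    return M @ np.linalg.pinv(M)

def osa_decompose(X_aug: np.ndarray, Y_aug: np.ndarray):
    """Split X_aug and Y_aug into common and unique components."""
    X_osa = projection(Y_aug) @ X_aug   # Y-related part of X, Equation (8)
    Y_osa = projection(X_aug) @ Y_aug   # X-related part of Y, Equation (8)
    E_osa = X_aug - X_osa               # non-Y-related part, Equation (9)
    F_osa = Y_aug - Y_osa               # non-X-related part, Equation (9)
    return X_osa, Y_osa, E_osa, F_osa

def pca_cpv(M: np.ndarray, cpv: float = 0.85) -> np.ndarray:
    """Loading matrix with the fewest PCs whose cumulative variance reaches `cpv`."""
    _, sing, vt = np.linalg.svd(M, full_matrices=False)
    ratio = np.cumsum(sing ** 2) / np.sum(sing ** 2)
    k = int(np.searchsorted(ratio, cpv) + 1)
    return vt[:k].T

# Hypothetical data: two latent signals shared by X and Y, plus unique parts.
rng = np.random.default_rng(0)
n = 300
shared = rng.standard_normal((n, 2))
X_aug = np.hstack([shared, rng.standard_normal((n, 4))])
Y_aug = np.hstack([shared @ rng.standard_normal((2, 2)), rng.standard_normal((n, 2))])

X_osa, Y_osa, E_osa, F_osa = osa_decompose(X_aug, Y_aug)
P_com = pca_cpv(X_osa)        # loading matrix of the common component
T_comx = X_osa @ P_com        # score matrix, Equation (10)
print(P_com.shape, T_comx.shape)
```

By construction, `E_osa` is orthogonal to every column of `Y_aug` and `F_osa` to every column of `X_aug`, which is what allows the unique components to be monitored separately from the common one.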

Figure 1 summarizes the procedure presented above.

Figure 1. The flow chart of DOSA.
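The index calculation in Step 3 can be sketched as below: only the first $s$ columns of a component (the data at the current time) are passed to a standard PCA monitor that returns $T^{2}$ and $SPE$. The function names, the random stand-in for a component such as $X_{OSA}$, and the empirical 99% quantile used as a control limit are assumptions for this example; the paper does not state how its control limits are computed.

```python
import numpy as np

def fit_pca_monitor(train: np.ndarray, cpv: float = 0.85):
    """Fit a PCA monitoring model; returns (mean, loadings, score variances)."""
    mean = train.mean(axis=0)
    centered = train - mean
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    variances = sing ** 2 / (train.shape[0] - 1)
    cumulative = np.cumsum(variances) / variances.sum()
    k = int(np.searchsorted(cumulative, cpv) + 1)  # fewest PCs reaching the CPV
    return mean, vt[:k].T, variances[:k]

def t2_spe(samples, mean, loadings, variances):
    """Hotelling T^2 and squared prediction error for each row of `samples`."""
    centered = samples - mean
    scores = centered @ loadings
    t2 = np.sum(scores ** 2 / variances, axis=1)
    residual = centered - scores @ loadings.T
    spe = np.sum(residual ** 2, axis=1)
    return t2, spe

# Hypothetical stand-in for a lag-augmented component with s = 3 and lag = 2.
rng = np.random.default_rng(1)
s, lag, n = 3, 2, 500
width = (lag + 1) * s
latent = rng.standard_normal((n, 2))
component = latent @ rng.standard_normal((2, width)) + 0.1 * rng.standard_normal((n, width))

current = component[:, :s]  # keep only the data at the current time
mean, loadings, variances = fit_pca_monitor(current)
t2, spe = t2_spe(current, mean, loadings, variances)

# Assumed control limits: empirical 99% quantiles of the training statistics.
t2_limit = np.quantile(t2, 0.99)
spe_limit = np.quantile(spe, 0.99)
print(loadings.shape[1], t2_limit, spe_limit)
```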

3.3. A Dynamic Model Analyzed with DOSA

3.3.1. Dynamic Model

To analyze the characteristics of the DOSA method and compare its performance with that of the OSA algorithm, we use a simple simulated process to illustrate their monitoring performance. Consider a large-scale process in which each single subprocess can be expressed using a time-invariant, state-space model as follows:

$\begin{cases} X(t) = C [X(t - 1), X(t - 2), X(t - 3)] + D [s_{1}, s_{2}] + \xi \\ Y(t) = E [Y(t - 1), Y(t - 2), Y(t - 3)] + F [s_{2}, s_{3}] + \zeta \end{cases},$

(12)

where $s_{1}$, $s_{2}$, and $s_{3}$ are independent Gaussian distributed vectors; $\xi$ and $\zeta$ are the noise components, which are independent of the process measurements; $C$ and $E$ are the coefficient matrices of the dynamic part; and $D$ and $F$ are the coefficient matrices of the static part. Here, we take three algorithms into consideration: OSA; the DOSA that expands the dimension of $X$ only, which is denoted as DOSA-X; and the DOSA that expands the dimension of both $X$ and $Y$, which is denoted as DOSA-XY.
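A simulation of a model with this structure might look like the sketch below. Only the matrix `D` is taken from the paper (Equation (13)); the values of `C`, `E`, and `F`, the number of KPIs (`r = 2`), the noise level, and the sample count are hypothetical choices made so that the example runs and stays stable. The sketch also assumes that $s_{1}$ drives only $X$, $s_{3}$ drives only $Y$, and $s_{2}$ drives both, as the fault descriptions in Section 3.3.3 indicate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, r, order = 1000, 4, 2, 3

# Static coefficient matrix D as given in Equation (13).
D = np.array([[0.1, 0.1, 0.2, 0.0],
              [0.1, 0.0, 0.1, 0.1]])
# Hypothetical values: F, and stable third-order dynamics for C and E.
F = np.array([[0.2, 0.1],
              [0.1, 0.2]])
C = np.vstack([a * np.eye(s) for a in (0.3, 0.2, 0.1)])  # shape (3s, s)
E = np.vstack([a * np.eye(r) for a in (0.3, 0.2, 0.1)])  # shape (3r, r)

S = rng.standard_normal((n, 3))  # columns: s1, s2, s3
X = np.zeros((n, s))
Y = np.zeros((n, r))
for t in range(order, n):
    x_past = np.concatenate([X[t - 1], X[t - 2], X[t - 3]])
    y_past = np.concatenate([Y[t - 1], Y[t - 2], Y[t - 3]])
    # s1 and s2 enter X; s2 and s3 enter Y (s2 is the shared source).
    X[t] = x_past @ C + S[t, [0, 1]] @ D + 0.01 * rng.standard_normal(s)
    Y[t] = y_past @ E + S[t, [1, 2]] @ F + 0.01 * rng.standard_normal(r)

print(X.shape, Y.shape)
```

A step fault such as Fault 1 could then be emulated by adding a constant to the relevant column of `S` from a chosen sample onward.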

3.3.2. The Optimal Numbers of Time Lag

To determine the lag numbers, the dynamic model is fitted to the normal data with several different numbers of lags. Here, $l_{x}$ and $l_{y}$ are the numbers of lags in the matrices $X$ and $Y$, respectively. In this work, we set $l_{x} \in [0, 1, \dots, 6]$ and $l_{y} \in [0, 1, \dots, 6]$, and the resulting values of $Lag_{x}$ and $Lag_{y}$ are shown in Figure 2 and Figure 3.

Figure 2. The values of $Lag_{x}$ under different $l_{x}$ values.

Figure 3. The values of $Lag_{y}$ under different $l_{y}$ values.

From the analyses shown in Figure 2 and Figure 3, the value of $Lag_{x}$ at $l_{x} = 3$ was the lowest among the lag numbers less than or equal to 3, and the value of $Lag_{y}$ at $l_{y} = 3$ likewise tended to be the lowest among the lag numbers less than or equal to 3. Beyond this point, the values of both $Lag_{x}$ and $Lag_{y}$ would not decrease significantly if we continued increasing the values of $l_{x}$ and $l_{y}$. Therefore, the optimal lag numbers were $l_{x} = 3$ and $l_{y} = 3$, and this can be seen intuitively in the figures. Furthermore, the values of $Lag_{x}$, $Lag_{y}$, and $RC\%$ are listed in Table 1 and Table 2.

Table 1. The values of $Lag_{x}$ under different $l_{x}$ values.

Table 2. The values of $Lag_{y}$ under different $l_{y}$ values.

From the data presented in Table 1 and Table 2, the values of $RC\%$ were less than 5% when $l_{x}$ and $l_{y}$ gradually increased from 3. This also means that the optimal numbers of lags were $l_{x} = 3$ and $l_{y} = 3$, which is consistent with the true value.

Here, we take the traditional BIC method, which has a larger penalty than the AIC, as an example to calculate the optimal lag number for this model. When selecting the best model from a set of alternative models, the model with the lowest BIC should be chosen.

From the data presented in Table 3 and Table 4, the optimal numbers of lags according to the BIC were $l_{x} = 2$ and $l_{y} = 3$. However, the model in Section 3.3.1 was built with a third-order lag, so the BIC underestimates $l_{x}$. Therefore, instead of the BIC, the lag determination method proposed in this work was applied to test the algorithm.

Table 3. The values of BIC under different $l_{x}$ values.

Table 4. The values of BIC under different $l_{y}$ values.

3.3.3. Testing Results

(a) Fault 1: a step change with an amplitude of 3 in $s_{1}$. The static parameter $s_{1}$ belongs to the unique part of $X$. The detection rates and false alarm rates of the three algorithms are shown in Table 5. In Table 5, the detection rate of $T_{E}^{2}$ was extremely high, so we could correctly infer that the fault occurred in the unique part of $X$. In other words, the fault was in the process variables rather than in the measurement of the KPIs. More importantly, the detection rates of the two dynamic monitoring methods were higher than the detection rate of OSA. Thus, the dynamic problem could be solved by DOSA in this case. Furthermore, expanding the dimension of both $X$ and $Y$ gave better results than expanding the dimension of $X$ alone. It can be hypothesized that expanding the dimension of the matrices improves the sensitivity of the algorithm to the fault.

Table 5. Fault 1 detection rates and false alarm rates of three algorithms.

(b) Fault 2: a step change with an amplitude of 3 in $s_{3}$. The static parameter $s_{3}$ belongs to the unique part of $Y$. The results are shown in Table 6. As can be seen in Table 6, even though the dimension of $X$ had been expanded, the detection rates of all of the indices were extremely low. The index $T_{F}^{2}$ performed better only when the dimensions of both $X$ and $Y$ were expanded. This means that the fault occurred in the unique part of $Y$. That is to say, there was a fault in the measurement of the KPIs rather than in the process variables. In addition, the detection rate of DOSA-XY was much higher than those of the other two algorithms. Thus, expanding the dimension of both data matrices performs well on dynamic processes and also addresses the dynamic issue.

Table 6. Fault 2 detection rates and false alarm rates of three algorithms.

(c) Fault 3: a step change with an amplitude of 3 in $s_{2}$. The static parameter $s_{2}$ belongs to the common part of both $X$ and $Y$. The results are shown in Table 7. As can be seen in Table 7, we could not judge the location of the fault if we did not expand the dimension of $Y$ because the detection rates of most of the indices were about 50%. The index $T_{C}^{2}$ performed better when the dimensions of both $X$ and $Y$ were expanded. This means that the fault occurred in the common part of both $X$ and $Y$. That is to say, there was a fault in both the process variables and the measurement of the KPIs. In addition, the detection rate of DOSA-XY was much higher than those of the other two algorithms. Thus, expanding the dimension of both data matrices performs well on dynamic processes and also addresses the dynamic issue.

Table 7. Fault 3 detection rates and false alarm rates of three algorithms.

(d) Fault 4: the matrix $D$ changed to $D_{f}$:

$\begin{cases} D = \begin{bmatrix} 0.1 & 0.1 & 0.2 & 0 \\ 0.1 & 0 & 0.1 & 0.1 \end{bmatrix} \\ D_{f} = \begin{bmatrix} 0.1 & 0.3 & 0.2 & 0 \\ 0.1 & 0 & 0.1 & 0.1 \end{bmatrix} \end{cases}.$

(13)

Generally, the coefficient matrix $D$ affects the relationship between $X$ and $Y$. The results are shown in Table 8. In Table 8, the index $SPE_{XY}$, which specifically monitors the relationship between $X$ and $Y$, performed well. We could infer that there was a high probability of a fault in $D$ or $F$. Moreover, the detection rate of DOSA-XY was much higher than those of the other two algorithms. Thus, expanding the dimension of both $X$ and $Y$ performs well and also addresses the dynamic issue.

Table 8. Fault 4 detection rates and false alarm rates of three algorithms.

3.3.4. The Influence of Sampling Period on DOSA

In sum, it is necessary to expand the dimension of both $X$ and $Y$. In this section, we take the effect of the sampling rate on the DOSA algorithm into account. The dynamic model and faults in Section 3.3.1 and Section 3.3.3 still apply to this section.

First, this section discusses the effect of doubling the sampling period on the selection of the lag number. We still set $l_{x} \in [0, 1, \dots, 6]$ and $l_{y} \in [0, 1, \dots, 6]$, and the resulting values of $Lag_{x}$ and $Lag_{y}$ and the corresponding rates of change are listed in Table 9 and Table 10.

Table 9. The values of $Lag_{x}$ for doubling the sampling period.

Table 10. The values of $Lag_{y}$ for doubling the sampling period.

As shown in Table 9 and Table 10, the optimal lag numbers were $l_{x} = 1$ and $l_{y} = 1$ because the values of $RC\%$ were less than 5% when $l_{x}$ and $l_{y}$ gradually increased from 1. That is to say, the optimal lag numbers were affected by the sampling period. Thus, the effect of the sampling period on the detection rates of DOSA was also examined.

(a): Fault 1: the fault occurs in the unique part of $X$. The experimental comparison of the original and doubled sampling periods is shown in Table 11. As shown in the table, the detection rate of $T_{E}^{2}$ decreased by about 9%, and the detection rate of $SPE_{E}$ decreased by about 4%.

Table 11. Comparison of original and doubled sampling periods (Fault 1).

(b): Fault 2: the fault occurs in the unique part of $Y$. The experimental comparison of the original and doubled sampling periods is shown in Table 12. As shown in the table, the detection rate of $T_{F}^{2}$ decreased by about 8%, and the detection rate of $SPE_{F}$ decreased by about 3%.

Table 12. Comparison of original and doubled sampling periods (Fault 2).

(c): Fault 3: the fault occurs in the common part of $X$ and $Y$. The experimental comparison of the original and doubled sampling periods is shown in Table 13. As shown in the table, the detection rate of $T_{C}^{2}$ decreased by about 8%, and the detection rate of $SPE_{C}$ decreased by about 5%.

Table 13. Comparison of original and doubled sampling periods (Fault 3).

(d): Fault 4: the fault occurs in the coefficient matrix $D$, which affects the relationship between $X$ and $Y$. The experimental comparison of the original and doubled sampling periods is shown in Table 14. As can be seen in Table 14, there was no significant change in the detection rate of $SPE_{XY}$.

Table 14. Comparison of original and doubled sampling periods (Fault 4).

Based on the above testing results, we can see that the change in sampling period affected the determination of the lag numbers. The detection rates were also slightly affected. That is to say, the DOSA algorithm is sensitive to the change in sampling period because the AR model, which is constructed by the DOSA, will be different with the change in sampling period. We hope to solve this problem as we continue our improvement of this project in the future.
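Why the fitted AR model changes with the sampling period can be seen on a toy first-order example. For a scalar series $x(t) = a\,x(t-1) + e(t)$, keeping every second sample gives a series whose lag-one coefficient is $a^{2}$ rather than $a$. The sketch below checks this numerically; the coefficient `a = 0.8`, the series length, and the helper name `ar1_coefficient` are assumptions for illustration and are unrelated to the model in Equation (12).

```python
import numpy as np

def ar1_coefficient(series: np.ndarray) -> float:
    """Least-squares estimate of a in x(t) = a * x(t-1) + e(t)."""
    past, current = series[:-1], series[1:]
    return float(past @ current / (past @ past))

rng = np.random.default_rng(0)
a, n = 0.8, 20000
x = np.zeros(n)
noise = rng.standard_normal(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + noise[t]

coef_original = ar1_coefficient(x)       # close to a
coef_doubled = ar1_coefficient(x[::2])   # close to a ** 2 after doubling the period
print(coef_original, coef_doubled)
```

The same data therefore lead to different AR coefficients, and possibly a different lag number, once the sampling period changes.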

3.4. Conclusion

As shown by the above results, we can conclude the following:

(1): It is necessary to expand the dimension of both $X$ and $Y$ .
(2): DOSA could adequately solve the dynamics issue.
(3): DOSA is able to directly analyze the location of the fault. Thus, we can know whether a fault actually occurs in the KPI-related process variables, the KPI-unrelated process variables, or the measurement of the KPIs.
(4): DOSA is sensitive to the change in sampling period.

4. Comparison Study Based on Tennessee Eastman Process

4.1. Tennessee Eastman Process

In this section, we would like to briefly introduce an industrial benchmark of the Tennessee Eastman (TE) process [24,25]. All the discussed methods will be further applied to demonstrate their efficiencies. The TE process model is a realistic simulation program of a chemical plant, which is widely accepted as a benchmark for control and monitoring studies [26]. The flow diagram of the process is described in [27,28], and the FORTRAN code of the process is available on the Internet. The process has two products from four reactants as shown in Equation (14):

$\begin{cases} A(g) + C(g) + D(g) \to G(liq) \\ A(g) + C(g) + E(g) \to H(liq) \\ A(g) + E(g) \to F(liq) \\ 3 D(g) \to 2 F(liq) \end{cases},$

(14)

The TE process has 52 variables, including 41 process variables and 11 manipulated variables. Table 15 lists a set of 15 known faults introduced to the TE process. Training and test sets have been collected by running 25 and 48 h simulations, respectively, in which faults have been introduced 1 and 8 h into the simulation, and each variable is sampled every 3 min. Thus, training sets consist of 500 samples, whereas test sets contain 960 samples per set of simulation [29,30].
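The sample counts follow directly from the simulation lengths and the 3 min sampling interval, as the short check below shows. The variable names are illustrative, and the number of fault-free samples at the start of each test set assumes that the fault is introduced exactly 8 h into the run.

```python
SAMPLE_INTERVAL_MIN = 3  # each variable is sampled every 3 min

def n_samples(hours: int) -> int:
    """Number of samples collected in `hours` of simulation."""
    return hours * 60 // SAMPLE_INTERVAL_MIN

train_samples = n_samples(25)        # 25 h training run
test_samples = n_samples(48)         # 48 h test run
fault_free_in_test = n_samples(8)    # samples before the fault at 8 h (assumed exact)
print(train_samples, test_samples, fault_free_in_test)  # 500 960 160
```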

Table 15. Descriptions of known faults in TE process.

4.2. The Numbers of Time Lag in TE Process

Here, $L_{x}$ and $L_{y}$ are the lag numbers in the augmented process variables matrix and the augmented KPI matrix, respectively. In this work, we set $L_{x} \in [0, 1, \dots, 6]$ and $L_{y} \in [0, 1, \dots, 6]$. The values of $Lag_{x}$ and $Lag_{y}$ and their corresponding rates of change are listed in Table 16 and Table 17.

Table 16. The values of $Lag_{x}$ under different $L_{x}$ values.

Table 17. The values of $Lag_{y}$ under different $L_{y}$ values.

From the data presented in Table 16 and Table 17, the value of $Lag_{x}$ at $L_{x} = 3$ tended to be the lowest, and the value of $Lag_{y}$ at $L_{y} = 3$ tended to be the lowest among the lag numbers less than or equal to 3. Beyond this point, the rate of change was less than 5% when $L_{y}$ gradually increased from 3. That is to say, the value of $Lag_{y}$ would not decrease significantly if we continued increasing $L_{y}$. Therefore, the optimal numbers of lags were $L_{x} = 3$ and $L_{y} = 3$.

4.3. Simulation Study

We focus on the ability to detect KPI-related faults in the TE process. Table 18 lists a set of nine KPI-related faults introduced to the TE process. It shows the detection and false alarm rates for four algorithms: OSA, DOSA, dynamic CCA (DCCA), and dynamic PLS (DPLS).

Table 18. Testing results of KPI-related faults for the TE process.

Considering the data presented in Table 18, DOSA shows better performance compared to the other algorithms for KPI-related faults. Meanwhile, the DOSA algorithm showed a great advantage in Faults 1–2, 8, and 12–13 over the OSA algorithm. From this analysis, it can be concluded that the DOSA algorithm performs better than the OSA algorithm on dynamic problems. Figure 4 shows the simulation diagram of OSA and DOSA monitoring in these faults. The blue line represents the value of the statistic, and the red line represents the value of the control limit. When the blue line is higher than the red line, a fault has occurred. It is obvious that the DOSA algorithm is more sensitive to these faults.

Figure 4. The simulation comparison of OSA and DOSA monitoring in Faults 1–2, 8, and 12–13.
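The detection and false alarm rates reported in the tables follow from comparing a monitoring statistic with its control limit, as described for Figure 4. A minimal sketch is given below; the function name `alarm_rates`, the convention that every sample before `fault_start` is fault-free, and the toy numbers are assumptions for illustration and do not come from the TE results.

```python
import numpy as np

def alarm_rates(statistic, limit: float, fault_start: int):
    """Return (detection rate, false alarm rate) for one monitoring statistic.

    Samples before `fault_start` are treated as normal, the rest as faulty.
    """
    alarms = np.asarray(statistic) > limit
    false_alarm_rate = float(alarms[:fault_start].mean())
    detection_rate = float(alarms[fault_start:].mean())
    return detection_rate, false_alarm_rate

# Toy statistic values (made up): the fault starts at index 4.
stat = [0.5, 0.8, 1.2, 0.9, 2.0, 3.0, 0.7, 2.5]
fdr, far = alarm_rates(stat, limit=1.0, fault_start=4)
print(fdr, far)  # 0.75 0.25
```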

5. Conclusions

In this paper, we have presented an improved algorithm of OSA for conducting large-scale process monitoring, called the DOSA algorithm, and compared its performance against DPLS and DCCA, which are KPI-related algorithms that are also used to solve dynamic problems.

Considering the testing results of the dynamic model, this article showed that it is necessary to expand the dimension of both the process variables matrix and the KPI matrix when using the DOSA algorithm. Furthermore, the DOSA algorithm is able to adequately address the dynamic issue and to locate faults. Thus, we can know whether a fault actually occurs in the KPI-related process variables, the KPI-unrelated process variables, or the measurement of the KPIs.

The comparative study was conducted using the Tennessee Eastman benchmark process, and we can conclude that the DOSA algorithm achieves better detection rates of faults from the analysis of the results obtained. However, the DOSA algorithm is sensitive to the change in sampling period. We intend to solve this problem as we continue the improvement of this project in the future.

Author Contributions

Conceptualization, W.H.; methodology, W.H. and Z.L.; validation, S.L.; formal analysis, W.H.; resources, Y.W. and S.L.; writing—original draft preparation, W.H.; writing—review and editing, Z.L. and W.H.; visualization, W.H.; supervision, S.L., X.J., and S.D.; project administration, Z.L.; funding acquisition, S.L. and Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of Guangdong Province, China (NO. 2022A1515011040), the Natural Science Foundation of Shenzhen, China (NO. 20220813001358001) and the Young Talents program offered by the Department of Education of Guangdong Province, China (2021KQNCX210).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhu, J.; Jiang, M.; Liu, Z. Fault Detection and Diagnosis in Industrial Processes with Variational Autoencoder: A Comprehensive Study. Sensors 2022, 22, 227.
2. Zhao, F.; Rekik, I.; Lee, S.W.; Liu, J.; Zhang, J.; Shen, D. Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets. Complexity 2019, 2019, 5937274.
3. Zhang, S.; Zhao, C. Hybrid Independent Component Analysis (H-ICA) with Simultaneous Analysis of High-Order and Second-Order Statistics for Industrial Process Monitoring. Chemom. Intell. Lab. Syst. 2019, 185, 47–58.
4. Qin, Y.; Lou, Z.; Wang, Y.; Lu, S.; Sun, P. An Analytical Partial Least Squares Method for Process Monitoring. Control Eng. Pract. 2022, 124, 105182.
5. Yin, S.; Zhu, X.; Kaynak, O. Improved PLS Focused on Key-Performance-Indicator-Related Fault Diagnosis. IEEE Trans. Ind. Electron. 2015, 62, 1651–1658.
6. Wang, H.; Gu, J.; Wang, S.; Saporta, G. Spatial Partial Least Squares Autoregression: Algorithm and Applications. Chemom. Intell. Lab. Syst. 2019, 184, 123–131.
7. Tao, Y.; Shi, H.; Song, B.; Tan, S. Parallel Quality-Related Dynamic Principal Component Regression Method for Chemical Process Monitoring. J. Process Control 2019, 73, 33–45.
8. Sim, S.F.; Jeffrey Kimura, A.L. Partial Least Squares (PLS) Integrated Fourier Transform Infrared (FTIR) Approach for Prediction of Moisture in Transformer Oil and Lubricating Oil. J. Spectrosc. 2019, 2019, e5916506.
9. Kanatsoulis, C.I.; Fu, X.; Sidiropoulos, N.D.; Hong, M. Structured SUMCOR Multiview Canonical Correlation Analysis for Large-Scale Data. IEEE Trans. Signal Process. 2019, 67, 306–319.
10. Cai, J.; Dan, W.; Zhang, X. ℓ0-Based Sparse Canonical Correlation Analysis with Application to Cross-Language Document Retrieval. Neurocomputing 2019, 329, 32–45.
11. Su, C.H.; Cheng, T.W. A Sustainability Innovation Experiential Learning Model for Virtual Reality Chemistry Laboratory: An Empirical Study with PLS-SEM and IPMA. Sustainability 2019, 11, 1027.
12. Alvarez, A.; Boente, G.; Kudraszow, N. Robust Sieve Estimators for Functional Canonical Correlation Analysis. J. Multivar. Anal. 2019, 170, 46–62.
13. de Cheveigné, A.; Di Liberto, G.M.; Arzounian, D.; Wong, D.D.E.; Hjortkjær, J.; Fuglsang, S.; Parra, L.C. Multiway Canonical Correlation Analysis of Brain Data. NeuroImage 2019, 186, 728–740.
14. Tong, C.; Lan, T.; Yu, H.; Peng, X. Distributed Partial Least Squares Based Residual Generation for Statistical Process Monitoring. J. Process Control 2019, 75, 77–85.
15. Si, Y.; Wang, Y.; Zhou, D. Key-Performance-Indicator-Related Process Monitoring Based on Improved Kernel Partial Least Squares. IEEE Trans. Ind. Electron. 2021, 68, 2626–2636.
16. Lou, Z.; Wang, Y.; Si, Y.; Lu, S. A Novel Multivariate Statistical Process Monitoring Algorithm: Orthonormal Subspace Analysis. Automatica 2022, 138, 110148.
17. Song, Y.; Liu, J.; Chu, N.; Wu, P.; Wu, D. A Novel Demodulation Method for Rotating Machinery Based on Time-Frequency Analysis and Principal Component Analysis. J. Sound Vib. 2019, 442, 645–656.
18. Zhang, C.; Guo, Q.; Li, Y. Fault Detection Method Based on Principal Component Difference Associated with DPCA. J. Chemom. 2019, 33, e3082.
19. Dong, Y.; Qin, S.J. A Novel Dynamic PCA Algorithm for Dynamic Data Modeling and Process Monitoring. J. Process Control 2018, 67, 1–11.
20. Oyama, D.; Kawai, J.; Kawabata, M.; Adachi, Y. Reduction of Magnetic Noise Originating from a Cryocooler of a Magnetoencephalography System Using Mobile Reference Sensors. IEEE Trans. Appl. Supercond. 2022, 32, 1–5.
21. Lou, Z.; Shen, D.; Wang, Y. Two-step Principal Component Analysis for Dynamic Processes Monitoring. Can. J. Chem. Eng. 2018, 96, 160–170.
22. Sakamoto, W. Bias-reduced Marginal Akaike Information Criteria Based on a Monte Carlo Method for Linear Mixed-effects Models. Scand. J. Stat. 2019, 46, 87–115.
23. Gu, J.; Fu, F.; Zhou, Q. Penalized Estimation of Directed Acyclic Graphs from Discrete Data. Stat. Comput. 2019, 29, 161–176.
24. Wan, J.; Li, S. Modeling and Application of Industrial Process Fault Detection Based on Pruning Vine Copula. Chemom. Intell. Lab. Syst. 2019, 184, 1–13.
25. Huang, J.; Ersoy, O.K.; Yan, X. Fault Detection in Dynamic Plant-Wide Process by Multi-Block Slow Feature Analysis and Support Vector Data Description. ISA Trans. 2019, 85, 119–128.
26. Plakias, S.; Boutalis, Y.S. Exploiting the Generative Adversarial Framework for One-Class Multi-Dimensional Fault Detection. Neurocomputing 2019, 332, 396–405.
27. Zhao, H.; Lai, Z. Neighborhood Preserving Neural Network for Fault Detection. Neural Netw. 2019, 109, 6–18.
28. Suresh, R.; Sivaram, A.; Venkatasubramanian, V. A Hierarchical Approach for Causal Modeling of Process Systems. Comput. Chem. Eng. 2019, 123, 170–183.
29. Amin, M.T.; Khan, F.; Imtiaz, S. Fault Detection and Pathway Analysis Using a Dynamic Bayesian Network. Chem. Eng. Sci. 2019, 195, 777–790.
30. Cui, P.; Zhan, C.; Yang, Y. Improved Nonlinear Process Monitoring Based on Ensemble KPCA with Local Structure Analysis. Chem. Eng. Res. Des. 2019, 142, 355–368.

Figure 1. The flow chart of DOSA.

Figure 2. The values of $Lag_{x}$ under different $l_{x}$ values.

Figure 3. The values of $Lag_{y}$ under different $l_{y}$ values.

Table 1. The values of $Lag_{x}$ under different $l_{x}$ values.

	$l_{x} = 0$	$l_{x} = 1$	$l_{x} = 2$	$l_{x} = 3$	$l_{x} = 4$	$l_{x} = 5$	$l_{x} = 6$
$Lag_{x}$	8000.4	5489.7	3620.8	1540.9	1540	1539.7	1538.6
$RC$	/	31.38%	34.04%	57.44%	0.06%	0.19%	0.71%

Table 2. The values of $Lag_{y}$ under different $l_{y}$ values.

	$l_{y} = 0$	$l_{y} = 1$	$l_{y} = 2$	$l_{y} = 3$	$l_{y} = 4$	$l_{y} = 5$	$l_{y} = 6$
$Lag_{y}$	7999.1	6327.2	5864.1	5276.4	5275	5274.3	5270.1
$RC$	/	20.9%	7.32%	10.02%	0.03%	0.01%	0.08%
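The RC row of Table 2 appears to be the relative decrease of $Lag_{y}$ between consecutive lag numbers. The short sketch below recomputes it from the tabulated $Lag_{y}$ values under that assumption; the formula is inferred from the numbers in the table rather than quoted from the text, and the function name `relative_change` is hypothetical.

```python
# Lag_y values copied from Table 2 (l_y = 0, 1, ..., 6).
lag_y = [7999.1, 6327.2, 5864.1, 5276.4, 5275.0, 5274.3, 5270.1]

def relative_change(values):
    """Percentage decrease between each value and its predecessor.

    Assumed formula: RC(l) = 100 * (Lag(l-1) - Lag(l)) / Lag(l-1).
    """
    return [100.0 * (prev - cur) / prev for prev, cur in zip(values, values[1:])]

rc = relative_change(lag_y)
for l, value in enumerate(rc, start=1):
    print(f"l_y = {l}: RC = {value:.2f}%")
```

Rounded to two decimals, these values should match the RC row of Table 2, with the change becoming negligible from $l_{y} = 4$ onward.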

Table 3. The values of BIC under different $l_{x}$ values.

	$l_{x} = 0$	$l_{x} = 1$	$l_{x} = 2$	$l_{x} = 3$	$l_{x} = 4$	$l_{x} = 5$	$l_{x} = 6$
BIC	−11,427.54	−11,423.19	−11,453.90	−11,447.21	−11,442.30	−11,439.81	−11,433.15

Table 4. The values of BIC under different $l_{y}$ values.

	$l_{y} = 0$	$l_{y} = 1$	$l_{y} = 2$	$l_{y} = 3$	$l_{y} = 4$	$l_{y} = 5$	$l_{y} = 6$
BIC	−9208.71	−9214.21	−9237.06	−9237.20	−9230.39	−9223.74	−9218.98

Table 5. Fault 1 detection rates and false alarm rates of three algorithms.

Method	Rate (%)	$T_{C}^{2}$	$SPE_{C}$	$T_{E}^{2}$	$SPE_{E}$	$T_{F}^{2}$	$SPE_{F}$	$SPE_{XY}$
OSA	Detection	1.2	2.4	61.68	15.17	1.8	1	14.97
OSA	False alarm	1.6	0.6	0.8	0.8	1	0.4	1
DOSA-X	Detection	1.6	2.2	87.62	55.69	2	1.2	2.4
DOSA-X	False alarm	1.8	0.6	1.2	2.2	0.8	0.4	0.8
DOSA-XY	Detection	0.8	1.8	93.21	53.29	2.2	1.2	10.58
DOSA-XY	False alarm	0.8	0.4	2.4	1.2	0.4	0.8	0.2

Table 6. Fault 2 detection rates and false alarm rates of three algorithms.

Method	Rate (%)	$T_{C}^{2}$	$SPE_{C}$	$T_{E}^{2}$	$SPE_{E}$	$T_{F}^{2}$	$SPE_{F}$	$SPE_{XY}$
OSA	Detection	0.6	1	1	0.2	42.91	8.58	32.73
OSA	False alarm	1	0.8	1.6	1	0.8	1.2	2
DOSA-X	Detection	0.4	0.8	1	0.8	44.31	5.6	44.71
DOSA-X	False alarm	1	1.6	1.2	2	1.2	1.2	1
DOSA-XY	Detection	0.6	0.2	2.4	0.4	91.82	9.58	62.48
DOSA-XY	False alarm	2.81	0.6	3.61	1.4	1.6	1.6	1

Table 7. Fault 3 detection rates and false alarm rates of three algorithms.

Method	Rate (%)	$T_{C}^{2}$	$SPE_{C}$	$T_{E}^{2}$	$SPE_{E}$	$T_{F}^{2}$	$SPE_{F}$	$SPE_{XY}$
OSA	Detection	45.51	29.34	30.94	30.94	45.51	12.38	16.97
OSA	False alarm	1.4	1	2.61	1.6	1.4	1.2	0.8
DOSA-X	Detection	45.51	15.17	50.7	52.5	45.51	25.55	1.4
DOSA-X	False alarm	1.4	1.6	1.2	2	1.4	2.4	1.6
DOSA-XY	Detection	90.82	65.67	7.19	37.72	49.1	52.5	11.98
DOSA-XY	False alarm	3.41	1.4	0.8	2.61	2.2	1.6	1.2

Table 8. Fault 4 detection rates and false alarm rates of three algorithms.

Method	Rate (%)	$T_{C}^{2}$	$SPE_{C}$	$T_{E}^{2}$	$SPE_{E}$	$T_{F}^{2}$	$SPE_{F}$	$SPE_{XY}$
OSA	Detection	1.2	1.8	1.4	2	1.2	0.2	64.27
OSA	False alarm	1.2	0.6	1.2	1.2	1.2	0.8	0.4
DOSA-X	Detection	1.2	1	0.6	1	1.2	1.6	77.45
DOSA-X	False alarm	1.2	1.2	1.2	1.8	1.2	1.2	1
DOSA-XY	Detection	0.4	1.6	1	0.6	1	0.8	99.6
DOSA-XY	False alarm	0.4	1.4	2.81	0.4	1.2	1	1.6

Table 9. The values of $Lag_{x}$ for doubling the sampling period.

	$l_{x} = 0$	$l_{x} = 1$	$l_{x} = 2$	$l_{x} = 3$	$l_{x} = 4$	$l_{x} = 5$	$l_{x} = 6$
$Lag_{x}$	501	202.26	194.27	186.8	183.66	179.02	176.04
$RC$	/	59.63%	3.95%	3.84%	1.68%	2.53%	1.66%

Table 10. The values of $Lag_{y}$ for doubling the sampling period.

	$l_{y} = 0$	$l_{y} = 1$	$l_{y} = 2$	$l_{y} = 3$	$l_{y} = 4$	$l_{y} = 5$	$l_{y} = 6$
$Lag_{y}$	501	472.91	468.78	468.64	466.8	465.47	464.08
$RC$	/	5.61%	0.87%	0.03%	0.39%	0.28%	0.30%

Table 11. Comparison of primitive and doubled sampling periods (Fault 1).

Condition	Rate (%)	$T_{E}^{2}$	$SPE_{E}$
Primitive sampling period	Detection	93.21	53.29
Primitive sampling period	False alarm	2.4	1.2
Doubled sampling period	Detection	84.6	49.36
Doubled sampling period	False alarm	1.6	1.2

Table 12. Comparison of primitive and doubled sampling periods (Fault 2).

Condition	Rate (%)	$T_{F}^{2}$	$SPE_{F}$
Primitive sampling period	Detection	91.82	9.58
Primitive sampling period	False alarm	1.6	1.6
Doubled sampling period	Detection	83.13	6.43
Doubled sampling period	False alarm	1.6	1.2

Table 13. Comparison of primitive and doubled sampling periods (Fault 3).

Condition	Rate (%)	$T_{C}^{2}$	$SPE_{C}$
Primitive sampling period	Detection	90.82	65.67
Primitive sampling period	False alarm	3.41	1.4
Doubled sampling period	Detection	82.33	60.84
Doubled sampling period	False alarm	2	1.6

Table 14. Comparison of primitive and doubled sampling periods (Fault 4).

Condition	Rate (%)	$SPE_{XY}$
Primitive sampling period	Detection	99.6
Primitive sampling period	False alarm	1.6
Doubled sampling period	Detection	98.39
Doubled sampling period	False alarm	2.4

Table 15. Descriptions of known faults in TE process.

Fault ID	Process Variable	Type	KPI-Related
1	A/C feed ratio, B composition constant	Step	Yes
2	B composition, A/C ratio constant	Step	Yes
3	D feed temperature	Step	No
4	Reactor cooling water inlet temperature	Step	No
5	Condenser cooling water inlet temperature	Step	Yes
6	A feed loss	Step	Yes
7	C header pressure loss-reduced availability	Step	Yes
8	A, B and C feed composition	Random variation	Yes
9	D feed temperature	Random variation	No
10	C feed temperature	Random variation	Yes
11	Reactor cooling water inlet temperature	Random variation	No
12	Condenser cooling water inlet temperature	Random variation	Yes
13	Reaction kinetics	Slow drift	Yes
14	Reactor cooling water valve	Sticking	No
15	Condenser cooling water valve	Sticking	No

Table 16. The values of $Lag_{x}$ under different $L_{x}$ values.

	$L_{x} = 0$	$L_{x} = 1$	$L_{x} = 2$	$L_{x} = 3$	$L_{x} = 4$	$L_{x} = 5$	$L_{x} = 6$
$Lag_{x}$	159	62.95	28.32	3.6	13,104.51	66,817.24	34,678.86

Table 17. The values of $Lag_{y}$ under different $L_{y}$ values.

	$L_{y} = 0$	$L_{y} = 1$	$L_{y} = 2$	$L_{y} = 3$	$L_{y} = 4$	$L_{y} = 5$	$L_{y} = 6$
$Lag_{y}$	159	112.86	100.61	85.87	83.91	79.84	79
$RC$	/	29.02%	10.85%	14.65%	2.28%	4.85%	1.05%

Table 18. Testing results of KPI-related faults for the TE process.

	DPLS	DCCA	DCCA	OSA	OSA	DOSA	DOSA
	$T^{2}$	$SPE_{1}$	$SPE_{2}$	$T_{C}^{2}$	$SPE_{C}$	$T_{C}^{2}$	$SPE_{C}$
False alarm rate	0	1.3	1.3	0	0	0	0.63
Fault 1	42.625	73.7	91.4	61.75	88.25	99.375	97.375
Fault 2	98.75	86	89	15.375	53.75	97.125	96.375
Fault 5	20.125	98.9	99.9	16.875	11.25	22.375	15.125
Fault 6	96.5	100	100	99.125	100	100	100
Fault 7	38	17.5	34.5	21.5	89	63.75	29.125
Fault 8	68	43.3	53.1	67	51.625	92.875	74.75
Fault 10	5.375	21.9	37.2	60.875	13.125	66.875	70.25
Fault 12	31	66.2	85.2	69.125	51.125	94.625	77.375
Fault 13	66.125	78.6	85.2	80	70.625	90.75	76.625

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
