A Dynamic GLR-Based Fault Detection Method for Non-Gaussian Dynamic Processes

Abstract: Non-Gaussian dynamic processes are ubiquitous due to the presence of non-Gaussian distributed variables. Therefore, fault detection for non-Gaussian dynamic processes plays a vital role in maintaining the safe operation of systems and the symmetry of data distribution. In this paper, a dynamic generalized likelihood ratio (DGLR)-based fault detection method is proposed for non-Gaussian dynamic processes. In contrast to the conventional principal component analysis (PCA)-based, dynamic PCA-based, and PCA-based GLR fault detection methods, the novelty of the proposed method is that the GLR is extended to non-Gaussian dynamic processes, and a randomized algorithm is integrated for threshold setting to attenuate the influence of non-Gaussianity. The application scope of these methods is also discussed. The proposed method is compared with four existing fault detection methods on a numerical simulation and a continuous stirred-tank reactor (CSTR) process. The results show that the proposed method significantly improves detection performance in terms of fault detection rate and prompt response to faults.


Introduction
Fault detection is becoming increasingly important for maintaining high-quality products, the operational safety of processes, and the symmetry of data distribution. In recent years, considerable attention has been paid to research on fault detection problems. Model-based and data-driven methods are the two common types [1][2][3][4][5][6]. Model-based approaches require accurate physical or mathematical models. Data-driven methods, on the other hand, require only historical process data. Data-driven fault detection techniques have been widely used because they are simple to apply and place fewer requirements on development. Commonly used data-driven methods include generalized likelihood ratio (GLR)-based methods [7,8] and multivariate analysis (MVA)-based methods, such as principal component analysis (PCA) [9][10][11][12], partial least squares (PLS) [13][14][15], and canonical correlation analysis (CCA) [16][17][18].
PCA is one of the most widely used MVA techniques for fault detection (FD) and considers a single dataset. It has been applied successfully to a wide range of processes, for example, dynamic, time-varying, non-Gaussian, and nonlinear processes [19]. Several variants of the standard PCA have been developed. These include dynamic PCA, which is used to find dynamic linear relationships between the process variables [20]; moving window PCA, which handles time-varying features [21]; and kernel density-based PCA, which is used in non-Gaussian fault detection [22]. For GLR-based fault detection methods, variants have also been developed for time-varying [23] and non-Gaussian [24] processes, among others. However, the dynamic characteristics of the process are rarely considered in the existing methods, and better detection performance could be achieved by taking the auto-correlation in dynamic processes into account.
Furthermore, the successful application of the GLR-based method requires that the process data follow a Gaussian distribution. In practice, processes with non-Gaussian features pose additional challenges for fault detection. To this end, some variants of the existing fault detection methods have been developed. Commonly, there are two types of methods for fault detection in non-Gaussian processes. Methods of the first type either use distribution-free techniques, such as independent component analysis (ICA) [19] and support vector machine-based methods [25,26], or extract high-order statistics and then apply a standard method to the obtained statistics [27]. For example, in [26], the ICA method is first used to obtain the independent components, and the support vector data description method is then applied to generate a suitable threshold. The authors of [27] used various statistics to quantify process characteristics, such as non-Gaussianity, and then monitored these statistics instead of the process variables themselves to perform fault detection. Methods of the second type first estimate the probability distribution of the monitored variables or statistics and then determine an appropriate threshold from the resulting distribution. These methods can be referred to as distribution estimation-based methods [28]. Many methods have been developed for distribution estimation, including Gaussian mixture models (GMM) [29,30], kernel-based approaches [31,32], and sequential quantile estimation [33]. Motivated by the success of the second type, we adopt the same strategy in this paper. Although these existing methods are successful in this application domain, their fault detection performance is commonly constrained by the choice of kernel structure and method-specific parameters, for example, the bandwidth parameter of a Gaussian kernel [31].
Therefore, because it can update the threshold iteratively, a randomized algorithm-based threshold learning method is used to enhance the dynamic GLR (DGLR)-based method for fault detection in non-Gaussian dynamic processes.
Motivated by the above analysis, a DGLR-based fault detection method combined with the threshold learning method is proposed for non-Gaussian dynamic processes. The contribution of this work is four-fold: (1) to develop a DGLR-based detection statistic for non-Gaussian dynamic processes; (2) to improve DGLR-based fault detection performance by iteratively learning a suitable threshold with a randomized algorithm; (3) to compare the DGLR-based fault detection method with the GLR-based, PCA-based, DPCA-based, and PCA-based GLR ones [34]. To the best of our knowledge, few works compare these methods with the aim of clarifying their application scope and guiding practitioners in selecting a suitable fault detection method; (4) to assess the DGLR-based fault detection performance by comparing it with the GLR-based, PCA-based, DPCA-based, and PCA-based GLR methods using a numerical simulation and the continuous stirred-tank reactor (CSTR) process.

Notation 1. The notation used in this paper is standard. $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space consisting of $n \times 1$ vectors with real components, $\mathbb{R}^{n \times m}$ is the set of all $n \times m$ real matrices, and $\mathrm{diag}(\ldots)$ is a square diagonal matrix. $A(:, i)$ represents the $i$-th column of $A$. $I_n$ is an $n \times n$ identity matrix. $x \sim \mathcal{N}(\mu_x, \Sigma_x)$ denotes that $x$ is a normally distributed random vector with mean $\mu_x$ and covariance $\Sigma_x$. $E(\cdot)$ denotes the expectation operator. $\chi^2(m)$ stands for the chi-square distribution with $m$ degrees of freedom. Let $\mathrm{pr}(\chi^2 > \chi^2_\alpha(m)) = \alpha$, i.e., the probability that $\chi^2 > \chi^2_\alpha(m)$ equals the significance level $\alpha$.

The Basics of GLR-Based Fault Detection Technique
Consider the following fault detection problem using a GLR-based technique. Given the general model

$$y = y^* + f \in \mathbb{R}^m,$$

where $y^* \sim \mathcal{N}(0, \Sigma)$ represents the statistical features of the process and $m$ is the dimension of the variable. Since $f = 0$ denotes the fault-free case, the task is to detect a fault $f \neq 0$ from $N$ available measurements $y_1, \ldots, y_N$. The fault detection task can be solved by testing the following hypotheses based on the available data $y$ [7]:

$H_0$, null hypothesis: $f = 0$, fault-free;
$H_1$, alternative hypothesis: $f \neq 0$, faulty.
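The probability density functions (pdf) of $y^*$ and $y$ are respectively given as

$$p_0(y) = \frac{1}{\sqrt{(2\pi)^m \det\Sigma}} \exp\left(-\frac{1}{2} y^T \Sigma^{-1} y\right), \qquad p_f(y) = \frac{1}{\sqrt{(2\pi)^m \det\Sigma}} \exp\left(-\frac{1}{2} (y - f)^T \Sigma^{-1} (y - f)\right).$$

The log likelihood ratio is defined as

$$s(y) = \ln\frac{p_f(y)}{p_0(y)} = \frac{1}{2}\left[y^T \Sigma^{-1} y - (y - f)^T \Sigma^{-1} (y - f)\right]. \quad (4)$$

To increase the confidence of the decision-making procedure, more samples are generally required. Using $N$ samples of data $y$, Equation (4) is extended as

$$S_1^N = \sum_{i=1}^{N} s(y_i) = \frac{1}{2}\sum_{i=1}^{N}\left[y_i^T \Sigma^{-1} y_i - (y_i - f)^T \Sigma^{-1} (y_i - f)\right].$$

Evidently, $S_1^N$ attains its maximum,

$$\max_f S_1^N = \frac{N}{2}\,\bar{y}^T \Sigma^{-1} \bar{y},$$

when $f = \bar{y}$ with $\bar{y} = E(y)$. Since $E(y)$ is generally unknown, it can be replaced by its maximum likelihood estimate

$$\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i. \quad (7)$$

In practice, $\Sigma$ is also unknown and needs to be identified from the data. It is straightforward that

$$\hat{\Sigma} = \frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{y})(y_i - \bar{y})^T \quad (8)$$

gives an asymptotically unbiased estimate of the covariance matrix. Thus, if $N$ is sufficiently large, the unknown parameter $\Sigma$ can be approximated by its estimate $\hat{\Sigma}$.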
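As a brief worked step, the maximization of the log likelihood ratio can be carried out explicitly. This is a sketch under the Gaussian model stated above; the symbol $\bar{y}_N$ for the sample mean and $\hat{f}$ for the maximizer are introduced here for illustration only.

```latex
\begin{align*}
S_1^N(f) &= \sum_{i=1}^{N} \ln\frac{p_f(y_i)}{p_0(y_i)}
          = N f^T \Sigma^{-1} \bar{y}_N - \frac{N}{2} f^T \Sigma^{-1} f,
          \qquad \bar{y}_N = \frac{1}{N}\sum_{i=1}^{N} y_i, \\
\frac{\partial S_1^N}{\partial f} &= N \Sigma^{-1} (\bar{y}_N - f) = 0
          \;\Longrightarrow\; \hat{f} = \bar{y}_N, \\
2 \max_f S_1^N &= N\, \bar{y}_N^T \Sigma^{-1} \bar{y}_N .
\end{align*}
```

Under $H_0$ with known $\Sigma$ and independent Gaussian samples, $\bar{y}_N \sim \mathcal{N}(0, \Sigma/N)$, so $N \bar{y}_N^T \Sigma^{-1} \bar{y}_N$ follows $\chi^2(m)$. This is why a table-based threshold $\chi^2_\alpha(m)$ is available in the Gaussian case and why it becomes unreliable once the Gaussian assumption fails.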

Randomized Algorithm-Based Threshold Setting
Randomized methods have been widely applied to the analysis and design of robust control systems [35,36]. Recently, this approach has been used for threshold setting in non-Gaussian processes because it is independent of the probability distribution [28]. This successful application can be attributed to the iterative update of the threshold by means of the estimated false alarm rate (FAR). The basic idea is that the required threshold should guarantee a predefined, desired false alarm rate. Let $J_{th}$, $p_{FAR}$, and $\hat{p}_{FAR}$ be the threshold, the allowable FAR, and the estimated FAR, respectively. The lowest threshold that satisfies the given false alarm rate is obtained by Algorithm 1 from [28].
As given in Theorem 1 in [28], for a sufficiently small $\Delta$, the estimated threshold satisfies $J_{th,\min} \le J_{th} \le J_{th,\min} + \Delta$, with $J_{th,\min}$ the lowest threshold. Since $\Delta$ is sufficiently small, the estimated threshold approaches the lowest threshold, i.e., $J_{th} \approx J_{th,\min}$.
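The iterative idea can be illustrated with a minimal Python sketch. It is not Algorithm 1 from [28], which is not reproduced here; it is a simplified stand-in that raises the threshold in steps of $\Delta$ until the empirical FAR on fault-free samples drops to the allowable level. The function names, the synthetic non-Gaussian statistic, and the starting value of zero (reasonable only for a non-negative statistic) are all assumptions made for illustration.

```python
import random


def estimate_far(stats, threshold):
    """Empirical FAR: share of fault-free statistics above the threshold."""
    return sum(s > threshold for s in stats) / len(stats)


def learn_threshold(stats, p_far, delta, start=0.0):
    """Raise the threshold in steps of `delta` until the estimated FAR <= p_far."""
    steps = 0
    while estimate_far(stats, start + steps * delta) > p_far:
        steps += 1
    return start + steps * delta


random.seed(0)
# Hypothetical fault-free, non-Gaussian test statistic (squared exponential noise).
stats = [random.expovariate(1.0) ** 2 for _ in range(5000)]

delta = 0.1
j_th = learn_threshold(stats, p_far=0.01, delta=delta)
far_at_threshold = estimate_far(stats, j_th)
print(j_th, far_at_threshold)
```

Because the search stops at the first grid point whose estimated FAR is acceptable, the returned value lies within one step $\Delta$ of the lowest admissible threshold on the grid, which mirrors the bound quoted from Theorem 1.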

The Proposed Method
In this section, a DGLR-based fault detection method is proposed for non-Gaussian dynamic processes. First, a DGLR-based test statistic is built for detection purposes. It is well known in probability theory that a Gaussian distribution is completely described by its mean and covariance. Therefore, if the measured variable $y$ follows a Gaussian distribution, the threshold can be set using the standard distribution table. In the non-Gaussian case, however, the distribution of the test statistic inevitably deviates from the standard distribution, e.g., the $\chi^2$ distribution. A threshold set under the Gaussian assumption then degrades the detection performance, e.g., through a lower detection rate or more false alarms. One solution to this problem is to set an appropriate threshold that does not rely on the Gaussian assumption. Therefore, in Section 3.1, the DGLR method is integrated with the randomized algorithm (RA)-based threshold-setting algorithm for fault detection in non-Gaussian processes.

DGLR with RA-Based Threshold Setting Algorithm
In the DGLR-based fault detection method, the solution of the fault detection problem consists of two procedures:
• Off-line training. Using the stacked data to identify the unknown parameters, i.e., the mean value E(y) and the covariance matrix Σ;
• On-line implementation. Detecting faults with on-line data.
In the first procedure, with $N$ recorded data available, the data can be augmented and stacked in the following manner:

$$\mathbf{y}_k = \left[y_k^T, y_{k+1}^T, \ldots, y_{k+p}^T\right]^T,$$

where $y_k^T$ denotes the recorded data at time $k$, and $p + 1$ represents the number of stacked samples. The mean value $E(y)$ and the covariance matrix $\Sigma$ can be estimated according to Equations (7) and (8), respectively. For on-line implementation, we first collect on-line measurement data $\mathbf{y}_{k+i} = \left[y_{k+i}^T, y_{k+i+1}^T, \ldots, y_{k+i+p}^T\right]^T$, $i = 1, \ldots, n$, and then calculate

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{y}_{k+i}, \qquad \Delta\bar{y} = \bar{y} - E(y),$$

where $E(y)$ is the mean value estimated off-line. For fault detection purposes, a test statistic in a maximum likelihood ratio-like form can be used:

$$J_{DGLR} = n\left(\Delta\bar{y}^T \hat{\Sigma}^{-1} \Delta\bar{y}\right). \quad (12)$$

Summarizing the previous analysis, an extension of DGLR using RA-based threshold setting is proposed to detect underlying faults in non-Gaussian processes. The step-by-step procedure of the DGLR with RA-based threshold setting algorithm is given in Algorithm 2.
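The off-line estimation and the on-line statistic described above can be sketched in Python with NumPy. This is an illustrative sketch, not the authors' implementation: the function names, the first-order autoregressive test process with uniform noise, the $1/N$ covariance normalization, and the size of the injected offset are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def stack(Y, p):
    """Row k of the result is [y_k, y_{k+1}, ..., y_{k+p}] flattened."""
    N = Y.shape[0]
    return np.hstack([Y[j:N - p + j] for j in range(p + 1)])


def train(Y, p):
    """Off-line step: mean and inverse covariance of the stacked training data."""
    Ys = stack(Y, p)
    mean = Ys.mean(axis=0)
    cov = np.cov(Ys, rowvar=False, bias=True)  # 1/N normalization (assumed)
    return mean, np.linalg.inv(cov)


def dglr_statistic(window, p, mean, cov_inv):
    """On-line step: J = n * dy^T Sigma^{-1} dy over the stacked window."""
    Ys = stack(window, p)
    n = Ys.shape[0]
    dy = Ys.mean(axis=0) - mean
    return float(n * dy @ cov_inv @ dy)


def simulate(N, rng, a=0.8, m=2):
    """Hypothetical dynamic process: AR(1) driven by uniform (non-Gaussian) noise."""
    y = np.zeros((N, m))
    for k in range(1, N):
        y[k] = a * y[k - 1] + rng.uniform(-1.0, 1.0, size=m)
    return y


rng = np.random.default_rng(0)
p = 2
mean, cov_inv = train(simulate(2000, rng), p)

window = simulate(200, rng)[-32:]   # 32 raw samples give n = 30 stacked vectors
j_free = dglr_statistic(window, p, mean, cov_inv)
j_fault = dglr_statistic(window + 5.0, p, mean, cov_inv)  # constant sensor offset
print(j_free, j_fault)
```

Setting `p = 0` removes the stacking and reduces the sketch to the static GLR statistic, which makes the role of the lag parameter easy to isolate in experiments.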
Algorithm 2: DGLR-based fault detection method with RA-based threshold setting.

Off-line training
S1: Compute the mean value $E(y)$ and the covariance matrix $\hat{\Sigma}$ from the stacked training data according to Equations (7) and (8);
S2: Determine the corresponding threshold $J_{th,ng}$ using Algorithm 1 with a given significance level $\alpha$, in which the statistic $J_{DGLR}$ is calculated according to Equation (12);

On-line implementation
S3: Collect real-time measurements $\mathbf{y}_{k+i}$, $i = 1, \ldots, n$, and calculate $\Delta\bar{y}$;
S4: Build the test statistic $J_{DGLR} = n\left(\Delta\bar{y}^T \hat{\Sigma}^{-1} \Delta\bar{y}\right)$;
S5: Check the decision logic: if $J_{DGLR} > J_{th,ng}$, the process is faulty; otherwise, it is fault-free.

Considering the applications of PCA-based, DPCA-based, and PCA-based GLR fault detection methods, this subsection briefly discusses the relationship of these three methods to the GLR-based one in order to distinguish their scope.
In practice, the direct application of the DGLR-based method may be infeasible for numerical reasons, e.g., the invertibility of the estimated covariance matrix. This motivates the use of PCA-based methods, whose core is the singular value decomposition (SVD) of the estimated covariance matrix. Since the principle of DPCA is similar to that of PCA, only the basic PCA technique is introduced here [5]:

$$\hat{\Sigma} = P \Lambda P^T, \qquad P = \left[P_{pc} \;\; P_{res}\right], \qquad \Lambda = \mathrm{diag}\left(\Lambda_{pc}, \Lambda_{res}\right),$$

where $P_{pc} = [p_1, \ldots, p_\gamma] \in \mathbb{R}^{m \times \gamma}$ and $P_{res} = [p_{\gamma+1}, \ldots, p_m] \in \mathbb{R}^{m \times (m-\gamma)}$ consist of the loading vectors, known as the principal components and residual components, respectively; $\gamma$ represents the number of principal components; and $\Lambda_{pc} = \mathrm{diag}(\lambda_1, \ldots, \lambda_\gamma)$ and $\Lambda_{res} = \mathrm{diag}(\lambda_{\gamma+1}, \ldots, \lambda_m)$ contain the corresponding eigenvalues, satisfying $\lambda_1 \ge \cdots \ge \lambda_\gamma \ge \lambda_{\gamma+1} \ge \cdots \ge \lambda_m$. For fault detection, Hotelling's $T^2$ test statistic can be calculated with a single on-line measurement $y$ as

$$J_{PCA} = y^T P_{pc} \Lambda_{pc}^{-1} P_{pc}^T y.$$

From the calculation formulas of the test statistics, it is clear that data normalization plays a central role. It not only provides an estimate of the covariance matrix in off-line training, but also delivers the required residual signal for fault detection.
If the number of principal components satisfies $\gamma < m$, then the matrix $P_{pc} \in \mathbb{R}^{m \times \gamma}$ is rank deficient, i.e., $\mathrm{rank}(P_{pc}) = \gamma < m$. From the fault detection viewpoint, the matrix $P_{pc}$ is not 'all pass' for faults, that is, there exists $f \neq 0$ such that $P_{pc}^T f = 0$. This situation is caused by the artificial design of the rank-deficient matrix $P_{pc}$. If $\gamma$ equals $m$, then the $J_{PCA}$ statistic reduces to Hotelling's $T^2$ statistic, that is, when $\gamma = m$,

$$J_{PCA} = y^T P \Lambda^{-1} P^T y = y^T \hat{\Sigma}^{-1} y, \quad (14)$$

where $P$ and $\Lambda$ collect all $m$ loading vectors and eigenvalues of $\hat{\Sigma}$. As introduced in [5], the test statistic $J_{DGLR}$ in the form (12) is also called Hotelling's $T^2$ test statistic. It is evident that the test statistic in form (14) is equivalent to the one in (12) in the case of a single on-line measurement.
It is worth noting that a PCA-based GLR fault detection method was recently proposed in [34], in which the PCA technique is used only to establish the mathematical process model and the GLR test is used to evaluate the residual signal. Table 1 presents a comparison between them to clarify their relationship. In practice, numerical problems with $\hat{\Sigma}^{-1}$ are rare owing to the presence of process noise and measurement error. Hence, in this paper, we assume that $\hat{\Sigma}^{-1}$ is available.
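Both observations in this paragraph can be checked numerically. The following Python sketch uses hypothetical random training data and an eigendecomposition of the sample covariance (which coincides with the SVD for a symmetric positive semidefinite matrix); the variable and function names are illustrative assumptions, not part of the original method description.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical zero-mean training data with correlated columns (m = 4).
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))
X = X - X.mean(axis=0)
cov = X.T @ X / X.shape[0]

# Eigendecomposition, sorted so that lam[0] >= lam[1] >= ... >= lam[m-1].
lam, P = np.linalg.eigh(cov)
order = np.argsort(lam)[::-1]
lam, P = lam[order], P[:, order]


def j_pca(y, gamma):
    """T^2-type statistic built from the first `gamma` principal components."""
    t = P[:, :gamma].T @ y
    return float(t @ (t / lam[:gamma]))


y = rng.normal(size=4)
full = j_pca(y, 4)                       # gamma = m
t2 = float(y @ np.linalg.solve(cov, y))  # y^T Sigma^{-1} y

f = P[:, 3]                              # fault along a residual loading vector
blind = j_pca(f, 2)                      # gamma = 2 < m: P_pc^T f = 0
seen = j_pca(f, 4)                       # gamma = m: the fault is visible
print(full, t2, blind, seen)
```

With all components retained the two statistics agree up to rounding, while a fault direction lying in the residual subspace produces a statistic of essentially zero when $\gamma < m$.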

Remark 2.
Another use of PCA is dimensionality reduction. However, for fault detection, dimension reduction is not always crucial [32].

Fault Detection in Synthetic Data
In this example, a non-Gaussian dynamic process is first simulated in Matlab. Then, the performance of the GLR-based, DGLR-based, PCA-based, DPCA-based, and PCA-based GLR fault detection methods is assessed by applying them to detect faults in the synthetic data.

Data Generation
Except for the fact that the noise sources are non-Gaussian, the model for data generation, Equation (15), is the same as the one used in [38]. In this model, $y_k$ and $u_k$ are the output and control input vectors, $n_k$ denotes the independent non-Gaussian white noise, and $f_{y,k}$ is the sensor fault. The model includes four controllers. The above model is used to simulate 1000 fault-free data samples. These data are used to estimate the required parameters for the fault detection methods. The number of principal components is determined as four using the cumulative percent variance method. $p$ is set to five, which is determined as given in [20]. To validate the detection performance, the three faults listed in Table 2 are introduced in this process.
After identifying the required parameters, the remaining problem in the training phase is to set the threshold. For threshold setting, the sample number $N = 2.65 \times 10^4$ is set using Equation (9) for $\epsilon = 0.01$, $\delta = 0.01$. The remaining steps follow Algorithm 2. Given a significance level of 0.01, Table 3 lists the thresholds used in this work.

Table 2. Faults introduced in process.

Fault ID    Description                  Value of δ
1           y_{1,i} = y_{1,0} + δ        0.2
2           y_{1,i} = y_{1,0} + δt       0.005
3           y_{1,i} = y_{1,0} + δ        N(0, 0.04)

In order to demonstrate the advantage of the proposed RA-based threshold setting, the $J_{DGLR}$ test statistic is used as an example. The result for this statistic is shown in Figure 1, where the red line represents the Gaussian assumption-based threshold and the green line represents the RA-based threshold. Figure 1 shows that the $J_{DGLR}$ test statistic is always below the Gaussian assumption-based threshold, which means a zero false alarm rate. Unfortunately, a zero FAR leads to a lower fault detection rate (FDR). From Figure 1, we can see that the RA-based threshold brings the FAR of the $J_{DGLR}$ statistic close to 0.01, which satisfies the given significance level. Due to limited space, only this statistic is used as an example. In what follows, only the RA-based threshold is used for comparison purposes.
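The sample number quoted above can be reproduced with a short calculation. Equation (9) is not shown in this text, so the following sketch assumes it is the additive Chernoff bound $N \ge \ln(2/\delta) / (2\epsilon^2)$ commonly used for randomized algorithms; the function name is hypothetical. Under that assumption, $\epsilon = \delta = 0.01$ gives a value consistent with $N = 2.65 \times 10^4$.

```python
import math


def chernoff_sample_size(epsilon, delta):
    """Smallest integer N with N >= ln(2 / delta) / (2 * epsilon**2)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


n_samples = chernoff_sample_size(epsilon=0.01, delta=0.01)
print(n_samples)  # 26492, i.e. about 2.65e4
```

Halving $\epsilon$ quadruples the required number of samples, whereas tightening $\delta$ only adds a logarithmic cost, so the accuracy level dominates the simulation budget.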

Comparing the Five Methods Using Faulty Data
The testing data set, which is simulated using the same model given in Equation (15), consists of 1000 data samples that are completely independent of the training data. In this case, only a single fault is considered at a time. To assess the abilities of the various fault detection methods, Fault 2 was introduced at sample 400. To compare the performances of the GLR-based, DGLR-based, PCA-based, DPCA-based, and PCA-based GLR methods, the receiver operating characteristic (ROC) curves of the five methods are shown in Figure 2, which plots the FDR against the false alarm rate (FAR). The ROC curves provide a measure for comparing the detection accuracy of the test statistics of the five methods as well as their sensitivities to variations in the detection thresholds. Figure 2 shows that there is a trade-off between a high FDR and a low FAR. It can be seen that the PCA-based GLR test provides a higher FDR than the conventional PCA-based method. This is consistent with the conclusion in [34]. The $J_{GLR}$ statistic of the GLR-based method has a detection performance similar to that of the PCA-based GLR test.
Evidently, the DGLR-based method has a higher FDR than the other four test statistics. This clearly shows the advantage of the DGLR-based method over the other methods. Furthermore, the FDR of the DGLR-based method can be improved further by tuning $p$. It should be kept in mind, however, that a large value of $p$ can lead to a long detection delay. Therefore, there is also a trade-off between FDR and detection delay when determining an appropriate $p$.
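An ROC curve of this kind is obtained by sweeping the threshold and recording the resulting (FAR, FDR) pair at each value. The following Python sketch shows the computation on a small made-up set of statistic values; the function name, the data, and the threshold grid are illustrative assumptions and do not correspond to the results in Figure 2.

```python
def roc_points(stats, is_faulty, thresholds):
    """Return one (FAR, FDR) pair per threshold."""
    normal = [s for s, f in zip(stats, is_faulty) if not f]
    faulty = [s for s, f in zip(stats, is_faulty) if f]
    points = []
    for th in thresholds:
        far = sum(s > th for s in normal) / len(normal)  # alarms without fault
        fdr = sum(s > th for s in faulty) / len(faulty)  # alarms with fault
        points.append((far, fdr))
    return points


# Hypothetical statistic values: first four fault-free, last four faulty.
stats = [0.5, 1.2, 0.8, 2.5, 3.1, 0.9, 4.0, 3.6]
is_faulty = [False, False, False, False, True, True, True, True]

points = roc_points(stats, is_faulty, thresholds=[0.0, 1.0, 3.0, 5.0])
print(points)  # [(1.0, 1.0), (0.5, 0.75), (0.0, 0.75), (0.0, 0.0)]
```

Raising the threshold moves along the curve from the upper right (every sample flagged) to the lower left (nothing flagged), which is the FDR versus FAR trade-off discussed above.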

Data Generation
To validate the proposed method, a CSTR process is used with several typical faulty scenarios. CSTRs are widely used in chemical processes, and the Matlab simulation model used in this study is similar to that used in [39]. The schematic of this CSTR is shown in Figure 3, where the reactor temperature $T$ is controlled by manipulating the coolant flow rate $Q_c$. As can be seen, it consists of three inputs ($C_i$, $T_i$, and $T_{ci}$) and four outputs ($C$, $T$, $T_C$, and $Q_c$). In the corresponding model, $v_i$ represents process noise and $k$ is an Arrhenius-type rate constant, $k = k_0 \exp\left(-E/(RT)\right)$. The parameters of the model are listed in Table 4. Five typical faults, described in Table 5, are used to validate the proposed fault detection method. $b_0$, $a_0$, $T_{c,0}$, $C_0$, and $Q_{c,0}$ in the table are nominal values. Fault 1 and Fault 2 represent catalyst decay and heat transfer fouling, and Faults 3-5 are the sensor faults on each of the three measured variables. In the existing literature, there are several ways to check non-Gaussianity. Among them, the probability plot of the variables is the most commonly used. We plot the quantiles of the A phase of converter current as an example. Figure 4 shows the distribution of the variable, where '+' denotes a sample and the dotted line shows the locus of zero-mean, normally distributed samples. It can be seen that the samples do not match the zero-mean normal distribution, that is, the samples from the example system follow a non-Gaussian distribution.

Comparing the Five Methods Using Faulty Data
The CSTR data consist of two blocks: the training and test data blocks. The normal operating data are referred to as the training data. The parameters $E(y)$ and $\hat{\Sigma}$ used in the proposed method are estimated from the training data. After obtaining the necessary parameters, the remaining issue in the training phase is to determine an appropriate threshold. For threshold setting, the sample number $N = 2.65 \times 10^4$ is chosen by means of Equation (9).

To demonstrate the effectiveness of the proposed method, its monitoring performance is discussed. The detection sensitivity of a fault detection method is commonly quantified by the FDR, which is used here to discuss the detection sensitivity of the methods. The response of a fault detection method is commonly represented by the detection delay (DD), which is the time it takes to detect a fault after its occurrence. As the desired FAR is fixed for threshold setting, the two indicators, FDR and DD, are used to assess the detection performance of the fault detection methods. Following the discussion of Figure 2, the detection performance of the proposed method is compared with that of the GLR-based, PCA-based, DPCA-based, and PCA-based GLR methods using all faults described above. The superiority of the test statistic $J_{DGLR}$ over the other test statistics considered in this paper with respect to FDR is shown in Table 6. Owing to the non-Gaussian characteristic, the test statistic $J_{DGLR}$ with RA achieves better performance, with higher FDR values than the other test statistics for the faults compared. Detection delays of all test statistics are presented in Table 7. The unit of DD is the same as the sample interval. As shown in Table 7, the $J_{DGLR}$ test statistic with RA is able to detect most of these faults earlier than the other ones. This also demonstrates the advantage of the test statistic $J_{DGLR}$.
To demonstrate the advantages of DGLR with the RA approach, the detection results for Faults 3 and 5 are shown in Figures 5 and 6, respectively. The solid line represents the test statistic, the red line is the threshold based on a Gaussian assumption, and the blue line is the threshold based on the RA approach. In both figures, from top to bottom, the panels show the $J_{GLR}$ test statistic of the GLR-based method, $J_{DGLR}$ of the DGLR-based method, $J_{PCA}$ of the PCA-based method, $J_{DPCA}$ of the DPCA-based method, and $J_{PG}$ of the PCA-based GLR method, respectively. FDRs of the GLR-based, DGLR-based, PCA-based, DPCA-based, and PCA-based GLR methods are 6.7%, 36.3%, 5.4%, 14.87%, and 21.54%, respectively. DDs of the GLR-based, DGLR-based, PCA-based, DPCA-based, and PCA-based GLR methods are 1, 2, 1, 2, and 2 min, respectively. Both figures show that all test statistics are able to detect the faults. In addition, the test statistics with the RA approach result in a higher FDR and a smaller detection delay than the standard test statistics without RA. It should be noted that the $J_{DGLR}$ test statistic performs better than the other statistics with respect to FDR and DD in both faulty cases.
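The two indicators can be computed directly from a sequence of test statistic values once the threshold and the fault onset are known. The sketch below is a minimal Python illustration on made-up numbers; the function name and data are assumptions, and the delay is taken as the number of samples between fault onset and the first alarm, which is one plausible reading of the definition of DD given above.

```python
def detection_metrics(stats, threshold, fault_start):
    """Return (FAR, FDR, detection delay in samples) for one test run."""
    alarms = [s > threshold for s in stats]
    before, after = alarms[:fault_start], alarms[fault_start:]
    far = sum(before) / len(before)   # alarms raised before the fault occurs
    fdr = sum(after) / len(after)     # alarms raised while the fault is present
    # Delay: index of the first alarm after fault onset; None if never detected.
    delay = next((i for i, a in enumerate(after) if a), None)
    return far, fdr, delay


# Hypothetical statistic values; the fault starts at sample index 4.
stats = [0.2, 0.4, 1.5, 0.3, 0.6, 0.9, 1.8, 2.2, 0.7, 2.5]
far, fdr, delay = detection_metrics(stats, threshold=1.0, fault_start=4)
print(far, fdr, delay)  # 0.25 0.5 2
```

Multiplying the delay by the sample interval converts it to the time unit used when reporting DD.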

Conclusions
In this work, a DGLR-based method has been proposed for non-Gaussian dynamic processes. It combines the standard GLR method with the RA approach to iteratively learn a suitable threshold, which relaxes the assumption of Gaussian distributed variables. Furthermore, the DGLR-based approach has been compared with GLR-based, PCA-based, DPCA-based, and PCA-based GLR fault detection approaches to clarify the application scope of these methods. The major difference between them lies in the inverse of the estimated covariance matrix. In addition, the detection performance of the DGLR-based method has been compared with the aforementioned methods using a numerical simulation of a non-Gaussian process and the CSTR process. The comparison results show that the DGLR-based approach outperforms the other methods. For instance, the average FDRs of the GLR-based, DGLR-based, PCA-based, DPCA-based, and PCA-based GLR approaches are 49.84%, 72.52%, 28.16%, 47.82%, and 62.69% in the CSTR process. The average DDs of the GLR-based, DGLR-based, PCA-based, DPCA-based, and PCA-based GLR methods are 13, 9, 12.8, 20.4, and 20.4 min in the CSTR process. The $J_{DGLR}$ test statistic of the DGLR-based method shows the best detection performance among all test statistics considered in this paper. Because detection performance can be affected by various factors, the robustness of the proposed method will be validated in future work using real data collected from industrial processes.

Conflicts of Interest:
The authors declare that they have no known competing financial interest or personal relationship that could have appeared to influence the work reported in this paper.