Canonical Variate Residuals-Based Fault Diagnosis for Slowly Evolving Faults

This study puts forward a novel diagnostic approach based on canonical variate residuals (CVR) to implement incipient fault diagnosis for dynamic process monitoring. The conventional canonical variate analysis (CVA) fault detection approach is extended to form a new monitoring index based on Hotelling’s T2, Q and a CVR-based monitoring index, Td. A CVR-based contribution plot approach is also proposed based on Q and Td statistics. Two performance metrics: (1) false alarm rate and (2) missed detection rate are used to assess the effectiveness of the proposed approach. The CVR diagnostic approach was validated on incipient faults in a continuous stirred tank reactor (CSTR) system and an operational centrifugal compressor.


Introduction
Rotating machines, such as centrifugal compressors, are widely used due to their high performance and robustness [1].These machines typically operate under adverse conditions such as high pressures and speeds.Therefore, performance deterioration and failure are unavoidable.In order to solve this problem, data-driven machine health monitoring systems (MHMS) [2] were introduced to realize predictive maintenance.Data-driven MHMS comprises four main steps: extracting features from collected data, detecting an incipient fault, determining the variables mostly associated with the fault and implementing a prognostic model to predict machine degradation.It is clear that these technical processes are crucial for the safe, efficient and sustainable operation of any rotating machinery.Therefore, it is not surprising that automated data-driven machine health monitoring has become increasingly popular in recent years.
Failures of rotating machinery can cause unnecessary maintenance operations and large economic losses, and it is crucial to find ways to monitor the status of rotating machines in real time.Early diagnostics of process faults enables the implementation of an appropriate maintenance strategy, alleviating the consequences of unplanned down-time and equipment failure.Multivariate statistical process monitoring (MSPM) algorithms have recently seen improvements in diagnosing process abnormalities.MSPM techniques such as principal component analysis (PCA) [3], independent component analysis (ICA) [4] and canonical variate analysis (CVA) [5] have been widely applied for the detection of process abnormalities in industrial plants and systems.In addition, alternatives to the standard multivariate monitoring methods [6][7][8][9], which take into consideration the correlations between timestamps in the past and the future, have also been put forward for dynamic processes Energies 2019, 12, 726 2 of 16 monitoring.Amongst the aforementioned MSPM techniques, CVA-based approaches were shown to be superior to other monitoring methods in terms of lead time and false positive rates [8].Demand for facilitating fault prognosis has driven increased attention towards the development of incipient fault detection techniques, and great efforts have been made to improve the detectability of slow evolving faults [10][11][12][13].The challenge lies in whether an advanced index (which is suitable for early detection of incipient faults) can be constructed based on process measurements to monitor dynamic processes operating under varying operational conditions.Recently, a canonical variate dissimilarity (CVD) index (hereafter referred to as T d ) was put forward to address the challenge of early fault detection of incipient faults under changing operating conditions [14].However, the CVD index tends to incur higher false alarm rates than traditional health indicators when system dynamics change rapidly [14].In this study, the traditional CVA approach is extended to form a new monitoring index based on Hotelling's T 2 and Q monitoring indices and the canonical variate residuals-based monitoring index T d .According to [15], the results of the canonical variate analysis between the future and past observations are referred to as canonical variate residuals (CVR).CVR measures the distinctions between past and future observations and is potentially a more sensitive index for monitoring incipient faults than indices using only past measurements.The performance of the proposed monitoring index is demonstrated on an operational compressor and a continuous stirred tank reactor (CSTR) system.The experimental results indicate that the proposed health index can detect slowly developing faults earlier than Hotelling's T 2 and Q statistics while still maintaining an acceptable false alarm rate.
Another major task of data-driven MHMS is to find the influential variables that are most likely related to a detected fault.As an essential stage in process monitoring, data-driven fault identification techniques have evolved quickly owing to the prosperity of MSPM approaches.A reconstruction-based contribution method was proposed in [16] to improve the diagnosability of the traditional PCA-based contributions.A deviation-based PCA contribution method was put forward [17] to enable the monitoring of nonlinear systems.While useful, the traditional one-dimensional contribution charts only demonstrate the variable contributions at one time instant, and multiple contribution charts are required when applied to slowly developing faults.In an effort to solve this problem, the concept of using 2-D contribution maps for identifying process variables was proposed in [18].The concept of using CVA-based 2-D contribution maps for identifying process variables associated with specific fault occurrences was first proposed in [19].This method was later utilized in [5,20] to identify faulty variables responsible for compressor faults.CVA was utilized together with cause-and-effect relationships among process variables to perform fault-root cause analysis in [21].However, a causal relationship of the underlying process is commonly scarce for real industrial processes.CVA was also used as a fault classification tool in [22,23] for fault diagnosis.However, fault classification techniques normally require available historical failure data in respect to different types of faults and are therefore not ideal for real industrial systems, since the known event logs may not be available for some plants.Most of the aforementioned fault diagnostic techniques were validated on abrupt faults and CVA's applicability for fault identification of slowly developing faults has not been fully investigated.In this study, a canonical variate residuals-based (CVR-based) contribution plot method based on T d and Hotelling's Q statistics for isolating faulty variables (specifically for incipient fault identification tasks) has been developed.To the authors' best knowledge, this is the first time CVR-based contributions have been derived and utilized for the fault identification of incipient faults.
The major contributions of this paper are as follows: • The development of a new monitoring index T c based on statistics, T 2 , Q and T d .The combined index T c is seen to be more sensitive than T 2 and Q for slowly developing faults while still maintaining satisfactory missed detection rates.

•
The development of a CVR-based contribution method for the monitoring of slowly evolving faults.To our best knowledge, it is the first time that it is the first time that CVR-based contribution has been derived and used for fault identification.

•
The use of the proposed diagnostic method for incipient fault diagnosis using data captured from a CSTR simulation program and an operational industrial centrifugal compressor.
y a,t and y b,t are then normalized to the zero-mean vectors ŷa,t and ŷb,t in order to avoid the domination of variables with excessive values.Then, the normalized future and past vectors ŷa,t and ŷb,t are rearranged as follows: in order to generate the reshaped matrices Ŷa and Ŷb .In Equations ( 3) and ( 4), N = M − a − b + 1 and M denotes the length of y t .Then the covariance matrices of Ŷa and Ŷb , namely ∑ a,a and ∑ b,b as well as the cross-covariance matrix ∑ a,b can be computed from: The vector of canonical correlations achieved by performing singular value decomposition (SVD) on the Hankel matrix H: Suppose that two data sets Ŷa ∈ R na×N and Ŷb ∈ R nb×N are available for diagnosing possible anomalies.The remaining issue is to compute the diagnostic observers that can achieve satisfactory fault diagnostic performance with a given threshold.In conventional CVA-based approaches, only past data vectors ŷa,t are used to construct test statistics: 2.1.2.T 2 and Q Indices Two widely used indices, Hotelling's T 2 and the Q indices [24,25], are computed based on the state and residual space information z t and e t , respectively.

T d Index
Motivated by the fact that CVA is able to find maximum correlations between two data sets, practitioners can detect small changes by examining how far away future canonical variates are deviated from past canonical variates (e.g., by examining the usual correlation between past and future).This leads to a diagnostic observer called canonical residuals that quantifies the distinctions between the past and future measurements.Canonical residuals are generated as: where L T q denotes the first q rows of the projection matrix L T , and U T q .Similarly, J T q is the first q rows of the projection matrix J T , and J T = ∑ −1/2 a,a V T q .∑ q = diag λ 1 , λ 2 , . . ., λ q is a diagonal matrix with its diagonal elements being the first q canonical correlations calculated as Equation (6).Canonical residuals are measures of the discrepancies between the past and future measurements and are able to provide more effective feature representation of small shifts in the early stage of emerging faults compared with diagnostic statistics derived from the traditional CVA approach [26].
Since the condition monitoring data are mean-variance normalized, the mean of the canonical residuals r t is: The covariance of r t can be calculated as: The distinctions between the past and future measurements are centered around a zero mean under healthy conditions.Hence, diagnostic test statistics can be calculated as the multivariate standard distance of the discrepancy features from zero [27]: where c is a normalizing constant, and S = I − ∑ ∑ T is the covariance matrix of the test and the healthy data.The roots of the multivariate standard distance between two random vectors can be traced back to the results presented in [27], which is described as follows: Given two random vectors x 1 and x 2 , the univariate standard distance between the two vectors is defined as: where a is a vector of unit length and a T a = 1. a T x 1 and a T x 2 are the orthogonal projections of the vectors x 1 and x 2 on the linear space spanned by a, respectively.S is the covariance matrix of x 1 and x 2 .Thus, f (a) denotes the univariate standard distance between vectors x 1 and x 2 in the one-dimensional subspace spanned by a.According to [27], the multivariate standard distance between x 1 and x 2 is attained for a = c(x 1 − x 2 ) T S −1 (x 1 − x 2 ) and takes the value: 1/2 (16) Energies 2019, 12, 726 5 of 16

Combined Index T c
In this study, fault detection is carried out using a new health index T c that combines Hotelling's T 2 and Q statistics and the CVR-based monitoring index T d .
where σ T 2 , σ Q and σ T d denote the control limit of T 2 , Q and T d index, respectively.Equation ( 17) generalizes the combined expression by Alcala and Qin [16] to include T d .Fault detection is implemented by comparing the values of the new monitoring index with a pre-defined threshold.In this study, all control limits are calculated from an adaptive kernel density estimator based on a linear diffusion process [28].A non-parametric density estimator can provide an estimate of density for any given random data and has been widely applied for fault diagnosis.In the traditional CVA methods, fault thresholds are obtained based on the assumption that the density of the T 2 , Q and T d indices are Gaussian, which may not hold true in real-world applications due to the presence of system nonlinearities.Later on, a kernel density estimation method was put forward [24] to solve this problem.However, this method lacks local adaptivity, which may result in high sensitivity to outliers [28].Moreover, the kernel density estimation method involves a bandwidth selection procedure, which requires a preliminary normal model to be determined.The adaptive kernel density estimator which is used in this study completely avoids the bandwidth selection process and is thus strictly non-parametric and suitable for online monitoring.Furthermore, the adaptive density estimator improves local adaptivity as the estimator is regarded as a transition density of a linear diffusion process.After the probability density functions are estimated from the sample data of the T 2 , Q or T d indices, the threshold for individual health indicators is calculated from the probability density function (PDF) for a given significance level α as follows: where I denotes indices σ T 2 , σ Q or σ T d , p(I) denotes the PDF of a health indicator, and T 2 α is the calculated fault threshold.
The determination of an appropriate fault threshold is crucial because selecting a threshold too large will lead to the index being too insensitive, while selecting a threshold too small will lead to the index being over sensitive to outliers.The adaptive kernel density estimation approach used in this paper is a promising alternative to conventional estimators that involve a bandwidth selection process as the bandwidth is chosen automatically.For a fair comparison with the T 2 and Q indices, fault thresholds for T d are calculated with the same settings as those for T 2 and Q indices.
Figure 1 illustrates how the observation window of vectors [y a,t , y b,t ] is updated at each sampling time.For the CVR-based online monitoring, each new observation enters the future observation window of length b, while the past observation window slides by a single increment in order for updating the samples covered by the past window.The past and future observation window updates recursively as a new measurement becomes available.
The determination of an appropriate fault threshold is crucial because selecting a threshold too large will lead to the index being too insensitive, while selecting a threshold too small will lead to the index being over sensitive to outliers.The adaptive kernel density estimation approach used in this paper is a promising alternative to conventional estimators that involve a bandwidth selection process as the bandwidth is chosen automatically.For a fair comparison with the  and  indices, fault thresholds for  are calculated with the same settings as those for  and  indices.Figure 1 illustrates how the observation window of vectors  , ,  , is updated at each sampling time.For the CVR-based online monitoring, each new observation enters the future observation window of length , while the past observation window slides by a single increment in order for updating the samples covered by the past window.The past and future observation window updates recursively as a new measurement becomes available.

CVA-Based Contributions
The following definition of CVA-based variable contribution was proposed by Jiang et al. [19]:

CVA-Based Contributions
The following definition of CVA-based variable contribution was proposed by Jiang et al. [19]: where C T 2 and C Q denote the variable contribution (or fault score) calculated based on the state and residual space information in the CVA model, respectively.C i,T 2 is the contribution of variable ŷi to the monitoring statistic T 2 and C i,Q is the contribution of ŷi to the monitoring statistic Q. z j K j,i ŷa,i is the contribution of variable ŷi to the jth canonical variate z j .Similarly, e j G j,i ŷa,i denotes the contribution of variable ŷi to the jth canonical residual variate e j .The combined contribution according to [19] uses equal weights for the state and residual space contribution The proposed CVR-based contribution is calculated as follows: where , and denotes the state space contribution.
where C Q is the residual space contribution, and C T d ,Q denotes the proposed CVR-based combined contribution.

CSTR Fault Description
The proposed method is first evaluated using data created from a CSTR system.The CSTR Simulink model utilized in this study was generated by the authors of [14], which was designed especially for simulating incipient faults.The CSTR model is simulated using Matlab Simulink.Table 1 summarizes the process variables for this CSTR system.Among these variables, the system inputs are C i , T i and T ci , and the system outputs are C, T, T c and Q c .The schematic of the CSTR is shown in Figure 2. A detailed description of the process can be found in [14].The CSTR process' dynamic model is formulated as: where Q is inlet flow rate, ∆H r represents heat of reaction, U A is the heat transfer coefficient, ρ and ρ C are fluid density, C p and C pc are fluid heat capacity, and V and V c are tank and jacket volume, respectively.Healthy and unhealthy data sets were obtained from the CSTR model for 1200 min of operation.All data were obtained at a sampling rate of one sample per minute.As shown in Figure 3, the operating conditions were deliberately varied by perturbing the system inputs around their mean values every sixty samples.For faulty data sets, each testing dataset started with no fault, and the fault starts after 200 min of operation.The name of different process variables is summarized in Table 1.Three fault scenarios were utilized to assess the effectiveness of the proposed fault detection approach.The different types of faulty conditions considered are summarized in Table 1.The parameter  was set to one during normal operating conditions.During faulty conditions,  decayed gradually from one toward zero.All faults were introduced after 1400 min of normal operation.The faulty variables sample dataset for fault 1, 2, and 3 are shown in Figures 4-6.It is worth noting that fault 2 and fault 3 become visible at different times because the decaying rates were deliberately varied.Measured variables of the CSTR process are summarized in Table 2.
Table 1.Slowly evolving fault scenarios in the continuous stirred tank reactor (CSTR) system.Three fault scenarios were utilized to assess the effectiveness of the proposed fault detection approach.The different types of faulty conditions considered are summarized in Table 1.The parameter  was set to one during normal operating conditions.During faulty conditions,  decayed gradually from one toward zero.All faults were introduced after 1400 min of normal operation.The faulty variables sample dataset for fault 1, 2, and 3 are shown in Figures 4-6.It is worth noting that fault 2 and fault 3 become visible at different times because the decaying rates were deliberately varied.Measured variables of the CSTR process are summarized in Table 2.
Table 1.Slowly evolving fault scenarios in the continuous stirred tank reactor (CSTR) system.
Fault ID Fault Description Decaying rate Fault Type Three fault scenarios were utilized to assess the effectiveness of the proposed fault detection approach.The different types of faulty conditions considered are summarized in Table 1.The parameter a 1 was set to one during normal operating conditions.During faulty conditions, a 1 decayed gradually from one toward zero.All faults were introduced after 1400 min of normal operation.The faulty variables sample dataset for fault 1, 2, and 3 are shown in Figures 4-6.It is worth noting that fault  2.  Three fault scenarios were utilized to assess the effectiveness of the proposed fault detection approach.The different types of faulty conditions considered are summarized in Table 1.The parameter  was set to one during normal operating conditions.During faulty conditions,  decayed gradually from one toward zero.All faults were introduced after 1400 min of normal operation.The faulty variables sample dataset for fault 1, 2, and 3 are shown in Figures 4-6.It is worth noting that fault 2 and fault 3 become visible at different times because the decaying rates were deliberately varied.Measured variables of the CSTR process are summarized in Table 2.
Table 1.Slowly evolving fault scenarios in the continuous stirred tank reactor (CSTR) system.

Variable ID
Variable Units

. Compressor Fault Description
In order to further assess the ability of the proposed diagnostic technique to effectively detect incipient faults and identify faulty variables, the model was tested using data captured from an operational industrial compressor.This machine is a high-pressure centrifugal compressor running at a large refinery in Europe (hereafter referred to as compressor A).The measured time series from compressor A consisted of 2199 observations and 22 variables.Table 3 summarizes the names of different process variables.For this study, all data were captured at a sampling rate of one sample per hour.As shown in Figure 7, the root-cause variables are the two different bearing-vibration sensors; the machine continued to run until the 2199th sampling point.

CSTR Fault Detection
The CVR-based diagnostic approach is first trained using a data set collected from normal operating conditions.The scale of time lags  and  were estimated through the autocorrelation analysis [5] of the root summed squares of all variables in the training data set.Here the number of

CSTR Fault Detection
The CVR-based diagnostic approach is first trained using a data set collected from normal operating conditions.The scale of time lags a and b were estimated through the autocorrelation analysis [5] of the root summed squares of all variables in the training data set.Here the number of time lags a and b were set to five.Since the underlying process data is non-stationary and non-linear, and does not follow a Gaussian distribution, a kernel density estimator based on a linear diffusion process [28] was adopted here to determine the upper control limits of the test statistics.All upper control limits for healthy operational conditions in this investigation were calculated at the 99% confidence level (i.e., the probability the test statistics are smaller than the predefined threshold is 99%).A key step in CVA is to determine the order of the reduction, that is, its number of retained states q.In this study, the optimal number of retained states q was selected such that the false alarm rate is minimized during cross-validation.For the purpose of finding the optimal number of q which gives the lowest false alarm rate, a healthy dataset containing 1200 observations was used to test the trained CVA diagnostic model.The false alarm rate was plotted against different values of q in Figure 8.For low and high values of dimensionality, the false alarm rate is high due to large number of T c threshold violations.q = 15 was finally adopted in this work as it resulted in the lowest number of false positives.
where  denotes a monitoring index and  denotes the corresponding upper control limit.It is visible from Figure 9 that the combined index detects the fault at 1544 minutes of sampling time, providing ample time to plan maintenance activities, while  and  become sensitive to the fault only after 1614 minutes of sampling time.In Figures 10 and 11, both  and  struggle to cross the fault threshold, leading to a higher MDR (see Table 4), while  appears to be more sensitive to small changes at the initial stage of the fault.Table 4 summarizes the performance of the fault detection methods studied.The bold values show the fault cases where CVR presents a superior performance than  and  statistics.It is observed that the combined index  is more sensitive than  and  for slowly developing faults, leading to earlier fault detection times.Also, the  index resulted in lower missed detections than the other two statistics under faulty operating conditions, thereby making it a promising alternative to existing indices.
As mentioned previously, the performance of CVA is superior to other dimension reduction techniques when validated using a multiphase flow facility [3] working under changing operating conditions.The proposed diagnostic method inherits the strength of CVA in handling varying operating conditions, leading to low false alarm rates for all the faulty cases studied (see Table 5).The fault detection results are depicted in Figures 9-11.Two performance metrics: (1) detection time (DT) and ( 2) missed detection rate (MDR) are utilized to evaluate the performance of the proposed T c index and its counterparts.MDR is computed as: where I denotes a monitoring index and I threshold denotes the corresponding upper control limit.It is visible from Figure 9 that the combined index detects the fault at 1544 min of sampling time, providing ample time to plan maintenance activities, while T 2 and Q become sensitive to the fault only after 1614 min of sampling time.In Figures 10 and 11, both T 2 and Q struggle to cross the fault threshold, leading to a higher MDR (see Table 4), while T c appears to be more sensitive to small changes at the initial stage of the fault.Table 4 summarizes the performance of the fault detection methods studied.The bold values show the fault cases where CVR presents a superior performance than T 2 and Q statistics.It is observed that the combined index T c is more sensitive than T 2 and Q for slowly developing faults, leading to earlier fault detection times.Also, the T c index resulted in lower missed detections than the other two statistics under faulty operating conditions, thereby making it a promising alternative to existing indices.
As mentioned previously, the performance of CVA is superior to other dimension reduction techniques when validated using a multiphase flow facility [3] working under changing operating conditions.The proposed diagnostic method inherits the strength of CVA in handling varying operating conditions, leading to low false alarm rates for all the faulty cases studied (see Table 5).

Compressor Fault Detection
Similar to the procedure described in Section 3.2.1, the scale of time lags  and  were estimated through the autocorrelation analysis and were both set to 10 in this study.The optimal number of retained states  in the CVA model was estimated by inspecting the false alarm rate against different number of retained states.According to the results shown in Figure 12, the number

Compressor Fault Detection
Similar to the procedure described in Section 3.2.1, the scale of time lags a and b were estimated through the autocorrelation analysis and were both set to 10 in this study.The optimal number of retained states q in the CVA model was estimated by inspecting the false alarm rate against different number of retained states.According to the results shown in Figure 12, the number of q was set to 17 in the CVA diagnostic model.Figure 13 shows the results obtained in terms of fault detection.The combined monitoring index T c is able to distinguish normal operating conditions from real faults incurring dynamics anomalies and thereby results in the early detection of faults with a short time delay.In this case, T 2 struggles to cross the threshold between 1641 min samples and 1579 min samples, leading to a high MDR as shown in Table 4.The Q statistic, however, is insensitive to the fault and can only detect the fault at the late stage of degradation.The fault detection time and missed detection rate for different indices are demonstrated in Table 4. of  was set to 17 in the CVA diagnostic model.Figure 13 shows the results obtained in terms of fault detection.The combined monitoring index  is able to distinguish normal operating conditions from real faults incurring dynamics anomalies and thereby results in the early detection of faults with a short time delay.In this case,  struggles to cross the threshold between 1641 minute samples and 1579 minute samples, leading to a high MDR as shown in Table 4.The  statistic, however, is insensitive to the fault and can only detect the fault at the late stage of degradation.The fault detection time and missed detection rate for different indices are demonstrated in Table 4.   of  was set to 17 in the CVA diagnostic model.Figure 13 shows the results obtained in terms of fault detection.The combined monitoring index  is able to distinguish normal operating conditions from real faults incurring dynamics anomalies and thereby results in the early detection of faults with a short time delay.In this case,  struggles to cross the threshold between 1641 minute samples and 1579 minute samples, leading to a high MDR as shown in Table 4.The  statistic, however, is insensitive to the fault and can only detect the fault at the late stage of degradation.The fault detection time and missed detection rate for different indices are demonstrated in Table 4.

Fault Identification
After fault detection, the proposed CVR-based method is applied to identify the influential variables associated with the detected faults.The resultant contribution plots for the CSTR and compressor fault cases are depicted in Figure 14.In each contribution plot, the sampling time denotes

Fault Identification
After fault detection, the proposed CVR-based method is applied to identify the influential variables associated with the detected faults.The resultant contribution plots for the CSTR and compressor fault cases are depicted in Figure 14.In each contribution plot, the sampling time denotes the horizontal axis and the variable index denotes the vertical axis.The stronger the contribution of a variable is, the larger the fault-related deviations associated with the specific variable is.At each faulty condition, faulty variables will show continuously strong bands of contribution after the fault is detected by the combined health index.
CSTR fault case 1 simulates sensor drifts on the measured variable T c , and thus variable 10 is the only fault influential variable.Fault 2 and fault 3, however, simulate catalyst decay at different decaying rates; therefore the associated faulty variables are variables 7, 8 and 10.It is observed in Figure 14a that the contributions of variable 10 are higher than the normal variables for the CVR-based contribution, indicating that variable 10 is successfully identified as the faulty variable for fault 1.Based on the information provided by the CVR-based contributions for CSTR fault case 2, variables 7 and 10 show continuously strong bands of contribution throughout the degradation process, making them distinct from fault-free variables.Although variable 8 only demonstrates large contributions at around 2260-2280 min samples in Figure 14b, its contribution is still much higher on average over all faulty samples than the normal variables (see Figure 15; the fault scores of variable 8 are much higher than those of variables 1-6 and 9).Further investigation is required into the process so as to verify the observations.Energies 2019, 12, x FOR PEER REVIEW 13 of 16 contributions at around 2260-2280 minute samples in Figure 14b, its contribution is still much higher on average over all faulty samples than the normal variables (see Figure 15; the fault scores of variable 8 are much higher than those of variables 1-6 and 9).Further investigation is required into the process so as to verify the observations.It is observed in Figure 14c that variables 7, 8 and 10 are successfully identified as faulty variables because of their continuously strong contributions throughout the deterioration process.State space and residual space contribution plots for CSTR fault 3 are also shown in Figure 16a,b, respectively.Most faulty variables (except variable 8) are identified through Figure 16b, with variable 7 being the most influential variable.Figure 16a identifies all faulty variables, with variable 10 being the most influential variable.The combined contribution plot shown in Figure 14c identifies all faulty variables and enhances the contributions from variable 7, leading to a more accurate contribution map for CSTR fault 3.This observation highlights the advantages of the combined contribution plot in identifying influential variables compared with state space/residual space contribution map.State space contributions are calculated using the canonical residuals  and the first  canonical correlations ∑ as per Equation ( 22), while residual space contributions are calculated based on the last  −  columns of  as per Equation (20).Influential variables identified in the state space are associated with the large deviations of the states that present during healthy operational conditions.The residual space contributions, however, are related to the new states which are not described by the healthy CVR model.It is observed from Figure 14d   for CSTR fault 3.This observation highlights the advantages of the combined contribution plot in identifying influential variables compared with state space/residual space contribution map.State space contributions are calculated using the canonical residuals r t and the first q canonical correlations ∑ q as per Equation ( 22), while residual space contributions are calculated based on the last na − q columns of V as per Equation (20).Influential variables identified in the state space are associated with the large deviations of the states that present during healthy operational conditions.The residual space contributions, however, are related to the new states which are not described by the healthy CVR model.It is observed from Figure 14d that the influential variables are stage 3 drive end (DE) vibration sensors, which agrees with the time-domain observations.

Conclusions
The CVR-based diagnostic method proposed in this paper extended the concept of CVA in fault detection and identification of abrupt faults to the situation of diagnosis of slowly involving faults.The consideration of canonical variate residuals resulted in a more sensitive monitoring index compared with  and  statistics.When validated on simulation and industrial case studies, our proposed  index outperformed  and  statistics in terms of both fault detection time and missed detection rate.Moreover, by considering the deviations between past and future data in the canonical state space, the proposed CVR-based contribution plots successfully identified faulty variables for most of the fault cases studied.The importance of the combination of state and residual space contributions was also highlighted.
The CVR-based contribution method appeared to be less sensitive to small changes in the data because it tends to give low fault scores during the early degradation process.A consideration for future study is whether the fault scores at early degradation stages would be improved if deviations between past and future data in the residual space were used for the identification of faulty variables.The removal of the smearing effect caused by normal variables is also a future research direction.The challenge of using CVR for fault detection can be the change of the operating conditions, which are not easily discriminated from failures even though the overall false alarm rate is low.In the future, we will explore approaches that can distinguish between the change of operational conditions and system failures.

Conclusions
The CVR-based diagnostic method proposed in this paper extended the concept of CVA in fault detection and identification of abrupt faults to the situation of diagnosis of slowly involving faults.The consideration of canonical variate residuals resulted in a more sensitive monitoring index compared with  and  When validated on simulation and industrial case studies, our proposed  index outperformed  and  statistics in terms of both fault detection time and missed detection rate.Moreover, by considering the deviations between past and future data in the canonical state space, the proposed CVR-based contribution plots successfully identified faulty variables for most of the fault cases studied.The importance of the combination of state and residual space contributions was also highlighted.
The CVR-based contribution method appeared to be less sensitive to small changes in the data because it tends to give low fault scores during the early degradation process.A consideration for future study is whether the fault scores at early degradation stages would be improved if deviations between past and future data in the residual space were used for the identification of faulty variables.The removal of the smearing effect caused by normal variables is also a future research direction.The challenge of using CVR for fault detection can be the change of the operating conditions, which are not easily discriminated from failures even though the overall false alarm rate is low.In the future, we will explore approaches that can distinguish between the change of operational conditions and system failures.
are not easily discriminated from failures even though the overall false alarm rate is low.In the future, we will explore approaches that can distinguish between the change of operational conditions and system failures.

Figure 1 .
Figure 1.Illustration of how the future and past windows are updated when a new measurement becomes available.

Figure 1 .
Figure 1.Illustration of how the future and past windows are updated when a new measurement becomes available.

Figure 2 .
Figure2.Schematic of the continuous stirred tank reactor (CSTR) process[14]. is the inlet temperature of the cooling water;  is the outlet temperature of the th cooling water;  is the coolant flow rate;  is the temperature of the th reactor;  is the reactor temperature;  is the coolant flow rate;  is the concentration in the reactor.

Figure 3 .
Figure 3. Sample dataset (healthy) of system inputs from CSTR simulation.

Figure 2 . 16 Figure 2 .
Figure 2. Schematic of the continuous stirred tank reactor (CSTR) process[14].T c is the inlet temperature of the cooling water; T ci is the outlet temperature of the ith cooling water; Q c is the coolant flow rate; T i is the temperature of the ith reactor; T is the reactor temperature; Q c is the coolant flow rate; C i is the concentration in the reactor.

Figure 3 .
Figure 3. Sample dataset (healthy) of system inputs from CSTR simulation.

Figure 3 .
Figure 3. Sample dataset (healthy) of system inputs from CSTR simulation.
3 become visible at different times because the decaying rates were deliberately varied.Measured variables of the CSTR process are summarized in Table

Figure 7 .
Figure 7. Trend of two different bearing vibration sensor measurements of compressor A. DE is short for drive end.

Figure 7 .
Figure 7. Trend of two different bearing vibration sensor measurements of compressor A. DE is short for drive end.

16 Figure 8 .
Figure 8. False alarm rate for different number of retained states .

Figure 8 .
Figure 8. False alarm rate for different number of retained states q.

Figure 12 .
Figure 12.False alarm rate for different number of retained states .

Figure 12 .
Figure 12.False alarm rate for different number of retained states q.

Figure 12 .
Figure 12.False alarm rate for different number of retained states .

Figure 13 .
Figure 13.Fault detection results for compressor fault: T 2 (upper), Q (middle) and combined index T c (lower).

Figure 14 .
Figure 14.Fault identification results for (a) CSTR fault 1; (b) CSTR fault 2; (c) CSTR fault 3; (d) compressor fault.It is observed in Figure14cthat variables 7, 8 and 10 are successfully identified as faulty variables because of their continuously strong contributions throughout the deterioration process.State space and residual space contribution plots for CSTR fault 3 are also shown in Figure16a,b, respectively.Most faulty variables (except variable 8) are identified through Figure16b, with variable 7 being the most influential variable.Figure16aidentifies all faulty variables, with variable 10 being the most influential variable.The combined contribution plot shown in Figure14cidentifies all faulty variables and enhances the contributions from variable 7, leading to a more accurate contribution map Energies 2019, 12, x FOR PEER REVIEW 14 of 16

Figure 15 .
Figure 15.Averaged individual variable contributions under faulty conditions from CSTR fault 2.

Table 1 .
Slowly evolving fault scenarios in the continuous stirred tank reactor (CSTR) system.

Table 2 .
Measured variables of CSTR process.

Table 2 .
Measured variables of CSTR process.

Table 2 .
Measured variables of CSTR process.

Table 3 .
Measured variables of compressor A.
21 Stage 3 Thrust Position Axial Probe 1 11 Stage 1-2 DE Radial Vibration Overall X * 22 Stage 3 Thrust Position Axial Probe 2 * DE is short for drive end, and NDE is short for non-drive end.

Table 3 .
Measured variables of compressor A.
11 Stage 1-2 DE Radial Vibration Overall X * 22 Stage 3 Thrust Position Axial Probe 2 * DE is short for drive end, and NDE is short for non-drive end.

Table 4 .
Monitoring results for the CSTR and pump faults.

Table 4 .
Monitoring results for the CSTR and pump faults.

Table 5 .
False alarm rate of T c index.