A Novel Method for Gas Turbine Condition Monitoring Based on KPCA and Analysis of Statistics T² and SPE

Gas turbines are widely used all over the world. To ensure their normal operation, it is necessary to monitor their condition and to analyze the measured parameters for the state information they contain. A key problem in gas turbine condition monitoring is how to locate a fault accurately once a failure occurs. To solve this problem, this paper proposes a method that locates faults of gas turbine components by evaluating the sensitivity of the measured parameters to the fault. First, the measured parameters are decomposed by kernel principal component analysis (KPCA). Then, the T² and SPE statistics are constructed in the principal component space and the residual space, respectively, and their thresholds are calculated. The influence of each measured parameter on the fault is then analyzed and quantified, and the fault is located according to the results. The results show that the proposed method can diagnose and locate faults accurately.


Introduction
Gas turbines provide power for generators, ships, aircraft, etc., and must withstand high temperature and high pressure while working. These harsh working conditions inevitably degrade the performance of the components, and a fault occurs when the degradation becomes severe. It is essential to locate a fault promptly after it happens [1-4]. Currently, fault diagnosis methods fall into four categories: model-based methods, knowledge-based methods, data-driven methods, and technique-fusion methods [5,6]. A model-based method requires an accurate model of the turbine and uses on-line input parameters [5]. Silvio Simani and Farsoni Saverio [7] identified a fuzzy model based on the Takagi-Sugeno prototype to detect and isolate faults. Hector Sanchez and Teresa Escobet [8] established a model, proposed a method to check whether the measurements fall inside the model's output interval, and based a diagnosis on this check. A knowledge-based method essentially formulates the diagnostic problem as a pattern recognition problem [9,10]. Zhang, Bingham, and Gallimore [11] proposed two techniques to detect faults: one introduces y-indices based on a transposed formulation of the data matrix, and the other introduces residual errors (REs) and faulty sensor identification indices (FSIIs). A data-driven method requires a large amount of data, from which the underlying relationships must be extracted. Zhu, Ge, and Song [12] proposed a robust variable model driven by a hidden Markov model, in which a probabilistic model with Student's t mixture output was designed to tolerate outliers. Furthermore, Zhang Peng [13] studied the Kalman filter and applied it to fault location. He focused on establishing linear and nonlinear models of the turbine and, based on these models, constructed two fault location algorithms, one for the steady working state and one for the dynamic working state. Vasile Palade, Ron J. Patton, and Faisel J. Uppal [14] applied a neuro-fuzzy technique to actuator fault location in a gas turbine: a neuro-fuzzy model based on learning and adaptation of the TSK fuzzy model generates the residual, and a neuro-fuzzy classifier for the Mamdani model evaluates it. Che Changchang, Wang Huawei, and Ni Xiaomei [15] proposed a fault fusion diagnosis model based on deep learning. The model analyzes a large amount of performance data, obtains fault classification confidences by extracting hidden features from the data, and then fuses the decisions of multiple fault classification results. Tayarani-Bathaie and Khorasani [16] constructed two types of dynamic neural networks to learn the turbine dynamics. Different neural networks are trained to capture the dynamic relationships among the measurable variables of the turbine, and a multilayer perceptron network is then constructed to isolate the fault.
All model-based methods need models that accurately reflect the turbine state. However, because a turbine has many parts and works in a harsh environment, too many factors affect its working state, and it is very difficult to build high-precision models. In addition, data-driven approaches require sufficient samples to locate a fault, and the algorithm designer must know the fault generation mechanism and the relationships among the samples. These conditions are difficult to meet at the same time.
To avoid these problems and locate faults successfully, this paper proposes a fault location approach based on sensitivity analysis of the measured aerodynamic parameters. The approach is data-driven and does not need faulty samples. First, the measured data are collected in real time while the turbine is under test. The data are then decomposed by kernel principal component analysis, and two statistics are constructed: Hotelling's T² statistic (T²), the multivariate counterpart of the t-statistic, in the principal component space, and the squared prediction error (SPE) statistic in the residual space. Next, the thresholds of the statistics are calculated, and a fault is detected by comparing the T² statistic with its threshold. If a fault is detected, the partial derivatives of the T² and SPE statistics with respect to the measured parameters are calculated. The greater the partial derivative, the greater the influence of the corresponding measured parameter on the statistic. According to the working principle of gas turbines, if a component is faulty, the parameters at its outlet fluctuate first and the fluctuation then spreads to the other components, with the largest amplitude at the outlet of the failed component. The partial derivatives can therefore indicate how strongly each measured parameter is affected when a component fails.

Materials and Methods
Principal component analysis (PCA) is a data processing method suited to linear systems; it transforms correlated data into uncorrelated data through a series of orthogonal transformations. A gas turbine is a typical nonlinear system, and large errors may result if PCA is used directly to diagnose turbine faults. This paper therefore adopts kernel principal component analysis (KPCA) to detect gas turbine faults. By using a kernel function, KPCA can handle nonlinear systems well [17,18]. The KPCA procedure is as follows [19-23].
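For a given set of samples $x_1, x_2, x_3, \ldots, x_q \in \mathbb{R}^n$, a nonlinear transformation $\phi$ maps the samples into a higher-dimensional feature space $F$:
$$\phi: \mathbb{R}^n \rightarrow F, \quad x \mapsto \phi(x) \tag{1}$$
where $\phi(x)$ is the representation of a sample in the feature space. The covariance of $\phi(x)$ and its eigenvalue problem can be expressed as:
$$C^F = \frac{1}{q}\sum_{j=1}^{q}\phi(x_j)\phi(x_j)^T \tag{2}$$
$$\lambda v = C^F v \tag{3}$$
where $\lambda$ is an eigenvalue of $C^F$ and $v$ is the corresponding eigenvector. Taking the inner product of $\phi(x_k)$ with $\lambda v$ and $C^F v$, respectively:
$$\langle \phi(x_k), \lambda v \rangle = \langle \phi(x_k), C^F v \rangle, \quad k = 1, \ldots, q \tag{4}$$
The eigenvectors can be represented by a series of constants $\alpha_i$, as follows:
$$v = \sum_{i=1}^{q}\alpha_i \phi(x_i) \tag{5}$$
Combining Equations (2)-(5):
$$\lambda \sum_{i=1}^{q}\alpha_i \langle \phi(x_k), \phi(x_i) \rangle = \frac{1}{q}\sum_{i=1}^{q}\alpha_i \sum_{j=1}^{q}\langle \phi(x_k), \phi(x_j) \rangle \langle \phi(x_j), \phi(x_i) \rangle \tag{6}$$
Equation (6) simplifies to Equation (7):
$$q\lambda\alpha = K\alpha \tag{7}$$
In Equation (7), $K_{ij} = \langle \phi(x_i), \phi(x_j) \rangle$ and $\alpha = [\alpha_1, \ldots, \alpha_q]^T$. $K$ is a kernel function, which calculates the inner product of vectors in the high-dimensional feature space. To strengthen the ability of KPCA to deal with nonlinear problems, the Gaussian radial basis function is adopted:
$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{\sigma}\right)$$
where $\sigma$ is the width parameter of the kernel. The eigenvectors $v$ are normalized by Equation (8):
$$\langle v_k, v_k \rangle = 1 \tag{8}$$
Substituting Equations (5) and (7) into Equation (8) shows that the vector $\alpha$ is normalized by Equation (9):
$$q\lambda_k \langle \alpha^k, \alpha^k \rangle = 1 \tag{9}$$
Denoting the projection of the mapped data onto the $k$-th eigenvector as $t_k$:
$$t_k = \langle v_k, \phi(x) \rangle = \sum_{i=1}^{q}\alpha_i^k K(x_i, x) \tag{10}$$
where $\alpha_i^k$ is the $i$-th coefficient of the eigenvector of the matrix $K$ that corresponds to its $k$-th eigenvalue. The cumulative contribution rate of variance is used to determine the number of principal components retained in the feature space:
$$\frac{\sum_{i=1}^{l}\lambda_i}{\sum_{i=1}^{q}\lambda_i} \geq \varepsilon \tag{11}$$
where $l$ is the number of principal components and $\varepsilon$ is a constant between 0 and 1 whose value reflects the influence of noise. Equations (2)-(11) are the steps of kernel principal component analysis. To detect faults of gas turbine components, the T² and SPE statistics must be constructed, as shown below:
$$T^2 = [t_1, \ldots, t_l]\,\Lambda^{-1}\,[t_1, \ldots, t_l]^T \tag{12}$$
$$SPE = \lVert \phi(x) - \hat{\phi}_l(x) \rVert^2 = \sum_{j=1}^{m} t_j^2 - \sum_{j=1}^{l} t_j^2 \tag{13}$$
where $\Lambda$ is the diagonal matrix of the principal component eigenvalues, $t_k$ is the projection of a sample in the feature space, $\hat{\phi}_l(x)$ is the reconstruction of $\phi(x)$ from the $l$ retained components, and $m$ is the number of nonzero eigenvalues.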
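The KPCA steps in Equations (2)-(11) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `rbf_kernel`, `fit_kpca`, and `scores`, the kernel form exp(-||a - b||^2 / sigma), the default `eps=0.85`, and the eigenvalue cut-off are all assumptions. The sketch also mean-centers the kernel matrix in feature space, a common KPCA step that the text does not spell out.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix with entries exp(-||a - b||^2 / sigma) (assumed form)."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sigma)

def fit_kpca(X, sigma, eps=0.85):
    """Solve the kernel eigenproblem of Eq. (7) and choose l by the rule of Eq. (11)."""
    q = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    # Mean-centering in feature space (assumption: not stated explicitly in the text).
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    mu, A = np.linalg.eigh(Kc)            # eigenvalues of the kernel matrix, ascending
    mu, A = mu[::-1], A[:, ::-1]          # reorder to descending
    keep = mu > 1e-8 * mu[0]              # drop numerically zero eigenvalues
    mu, A = mu[keep], A[:, keep]
    A = A / np.sqrt(mu)                   # normalization of Eqs. (8)-(9): alpha^T Kc alpha = 1
    lam = mu / q                          # eigenvalues of the feature-space covariance
    ratio = np.cumsum(lam) / lam.sum()    # cumulative contribution rate
    l = min(int(np.searchsorted(ratio, eps)) + 1, lam.size)
    return {"X": X, "sigma": sigma, "K": K, "alpha": A, "lam": lam, "l": l}

def scores(model, X_new):
    """Project new samples onto the eigenvectors, as in Eq. (10)."""
    K = model["K"]
    Kt = rbf_kernel(np.atleast_2d(X_new), model["X"], model["sigma"])
    Kt = Kt - K.mean(axis=0, keepdims=True) - Kt.mean(axis=1, keepdims=True) + K.mean()
    return Kt @ model["alpha"]
```

With this normalization, the mean squared score of the training data along each component equals the corresponding eigenvalue, which is what allows the eigenvalue matrix to act as a scaling in the T² statistic.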
The threshold of the T² statistic, $T^2_{th}$, is:
$$T^2_{th} = \frac{l(n-1)}{n-l}F_\alpha(l, n-l) \tag{14}$$
where $F_\alpha(l, n-l)$ is the upper critical value of the F-distribution with $l$ and $n-l$ degrees of freedom at confidence level $\alpha$. The SPE statistic has a corresponding threshold $SPE_{th}$ (Equation (15)).
Based on the KPCA procedure in Equations (2)-(15), a fault diagnosis algorithm is designed to determine whether a turbine component is faulty. The measured parameters comprise the total temperature and total pressure at the outlet of each gas turbine component. These parameters are decomposed by kernel principal component analysis, and the T² statistic, the SPE statistic, and their corresponding thresholds are constructed. A fault is declared when the T² statistic exceeds its threshold.
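The detection step can be illustrated with the following self-contained sketch. It is written under several assumptions rather than taken from the paper: the function names are hypothetical, the kernel matrix is mean-centered, and the thresholds are empirical quantiles of the statistics on fault-free data, used here as a simple stand-in for the distribution-based thresholds so that the example needs only NumPy.

```python
import numpy as np

def rbf(A, B, sigma):
    """Gaussian RBF kernel matrix with entries exp(-||a - b||^2 / sigma) (assumed form)."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sigma)

def fit(X, sigma, l):
    """Fit KPCA on fault-free data X and keep l principal components."""
    K = rbf(X, X, sigma)
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    mu, A = np.linalg.eigh(Kc)
    mu, A = mu[::-1], A[:, ::-1]              # descending eigenvalues
    keep = mu > 1e-8 * mu[0]                  # discard numerically zero eigenvalues
    mu, A = mu[keep], A[:, keep]
    return {"X": X, "sigma": sigma, "K": K, "alpha": A / np.sqrt(mu),
            "lam": mu / len(X), "l": l}

def t2_spe(model, X_new):
    """T^2 in the principal space and SPE in the residual space for each new sample."""
    K = model["K"]
    Kt = rbf(np.atleast_2d(X_new), model["X"], model["sigma"])
    Kt = Kt - K.mean(axis=0, keepdims=True) - Kt.mean(axis=1, keepdims=True) + K.mean()
    T = Kt @ model["alpha"]                   # scores on all retained eigenvectors
    l = model["l"]
    t2 = (T[:, :l]**2 / model["lam"][:l]).sum(axis=1)
    spe = (T**2).sum(axis=1) - (T[:, :l]**2).sum(axis=1)
    return t2, spe

def empirical_thresholds(model, X_normal, level=0.99):
    """Quantile-based thresholds (an assumption, replacing the analytic thresholds)."""
    t2, spe = t2_spe(model, X_normal)
    return np.quantile(t2, level), np.quantile(spe, level)
```

A sample would then be flagged as faulty when its T² value exceeds the first threshold; the choice of `sigma`, `l`, and `level` is problem-dependent and is not prescribed by the text.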
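This section focuses on how to locate the fault once a failure has been detected. The sensitivity of each measured parameter to the fault is expressed by the partial derivatives of the T² and SPE statistics with respect to that parameter, and the fault is located according to these sensitivities. For the T² statistic, the greater the sensitivity of a parameter, the more likely the fault lies at the corresponding location. Because the kernel function is the key element of the sensitivity calculation, a scaling factor $v_k$ is introduced for each measured parameter and the kernel function is rewritten as:
$$K(x_i, x_j) = \exp\left(-\frac{\lVert v \circ x_i - v \circ x_j \rVert^2}{\sigma}\right) = \exp\left(-\frac{1}{\sigma}\sum_{k=1}^{n} v_k^2 (x_{i,k} - x_{j,k})^2\right)$$
where $v = [v_1, \ldots, v_n]^T$ is the scaling vector with $v_k = 1$ (not to be confused with the eigenvector $v$ of Equation (3)), $n$ is the number of categories of measured parameters, and $x_i$ is the $i$-th measurement vector consisting of the different measured parameters. The partial derivative of the kernel function with respect to $v_k$, evaluated at $v_k = 1$, is:
$$\frac{\partial K(x_i, x_j)}{\partial v_k} = -\frac{2}{\sigma}(x_{i,k} - x_{j,k})^2 K(x_i, x_j)$$
Its value indicates the effect of the $k$-th parameter on the kernel function; $x_{i,k}$ and $x_{j,k}$ are the $k$-th elements of the $i$-th and $j$-th measurement vectors. The partial derivative of a product of kernels follows from the product rule:
$$\frac{\partial \left[K(x_i, x_{new}) K(x_j, x_{new})\right]}{\partial v_k} = -\frac{2}{\sigma}\left[(x_{i,k} - x_{new,k})^2 + (x_{j,k} - x_{new,k})^2\right] K(x_i, x_{new}) K(x_j, x_{new})$$
where $x_{new}$ is a new vector of measured parameters. The partial derivatives of the statistics are defined as $C_{T^2,i,new}$ and $C_{SPE,i,new}$:
$$C_{T^2,i,new} = \frac{\partial T^2_{new}}{\partial v_i}, \qquad C_{SPE,i,new} = \frac{\partial SPE_{new}}{\partial v_i}$$
The values of $C_{T^2,i,new}$ and $C_{SPE,i,new}$ indicate the sensitivity of the statistics to the $i$-th measured parameter. Both are calculated by writing $T^2_{new}$ and $SPE_{new}$ in terms of the kernel values $K(x_j, x_{new})$ through Equations (10), (12), and (13) and then applying the two kernel derivatives above with the chain rule. Figure 1 shows the resulting algorithm for fault diagnosis and location.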
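The sensitivity of the T² statistic can be illustrated with a short NumPy sketch. It rests on assumptions that go beyond the text: the scaled kernel is taken as exp(-sum_k v_k^2 (x_ik - x_jk)^2 / sigma), the KPCA model (eigenvectors, eigenvalues, and centering terms) is held fixed while the scaling factors vary, and all function names are hypothetical. Only the T² sensitivity is shown; the SPE sensitivity would follow the same chain-rule pattern.

```python
import numpy as np

def scaled_kernel_vec(X, x_new, v, sigma):
    """Kernel values K(x_j, x_new) with a per-parameter scaling vector v (assumed form)."""
    d2 = ((v * (X - x_new))**2).sum(axis=1)
    return np.exp(-d2 / sigma)

def fit(X, sigma, l):
    """Fit a mean-centered KPCA model with l components (l must not exceed the kernel rank)."""
    q = len(X)
    d2 = ((X[:, None, :] - X[None, :, :])**2).sum(axis=2)
    K = np.exp(-d2 / sigma)
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    mu, A = np.linalg.eigh(Kc)
    mu, A = mu[::-1][:l], A[:, ::-1][:, :l]       # l largest eigenpairs
    return {"X": X, "sigma": sigma, "row_mean": K.mean(axis=0), "all_mean": K.mean(),
            "alpha": A / np.sqrt(mu), "lam": mu / q}

def t2(model, x_new, v):
    """T^2 of a new sample as a function of the scaling vector v (model held fixed)."""
    k = scaled_kernel_vec(model["X"], x_new, v, model["sigma"])
    kc = k - model["row_mean"] - k.mean() + model["all_mean"]
    t = kc @ model["alpha"]
    return float((t**2 / model["lam"]).sum())

def t2_sensitivity(model, x_new):
    """Analytic dT^2/dv_i at v = 1 for every measured parameter i."""
    X, sigma = model["X"], model["sigma"]
    k = scaled_kernel_vec(X, x_new, np.ones(X.shape[1]), sigma)
    kc = k - model["row_mean"] - k.mean() + model["all_mean"]
    t = kc @ model["alpha"]                               # scores, shape (l,)
    dk = -(2.0 / sigma) * (X - x_new)**2 * k[:, None]     # dK(x_j, x_new)/dv_i, shape (q, n)
    dkc = dk - dk.mean(axis=0, keepdims=True)             # derivative of the centered kernel
    dt = dkc.T @ model["alpha"]                           # dt_k/dv_i, shape (n, l)
    return (2.0 * t / model["lam"] * dt).sum(axis=1)      # chain rule through T^2
```

Ranking the measured parameters by the magnitude of these derivatives gives one possible reading of the sensitivity comparison described above; a finite-difference check of `t2` against `t2_sensitivity` is a quick way to validate the derivative.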
Processes 2019, 7, x FOR PEER REVIEW

Results
To verify the effectiveness of the proposed method, a twin-spool aviation gas turbine is adopted as the research object. It is well known that the engine operates under very harsh conditions [13,24] (high pressure, high temperature, high stress, etc.) and that the performance of gas turbine components (such as the compressor, rotor, and turbine) degrades as working hours increase [25-30]. The initial working parameters of this turbine are shown in Table 1. With these working parameters fixed, the state parameters of the engine at different flight altitudes and speeds are shown in Figures 2 and 3.
In the experiment, the measured parameters include the total temperatures at the outlets of the low pressure compressor, LPC (Tt25), the high pressure compressor, HPC (Tt3), the high pressure turbine, HPT (Tt45), and the low pressure turbine, LPT (Tt5), as well as the total pressures at the outlets of the LPC (Pt25), HPC (Pt3), HPT (Pt45), and LPT (Pt5). Two faults are considered, each introduced at the 2600th sampling instant: a misalignment of the LPC rotor and a crack in an LPT blade. The proposed method is used to detect and locate the faults. Figures 4-7 show the fault diagnosis results. In Figure 4, the T² statistic is smooth and below its threshold before the 2600th sample; when the fault occurs at the 2600th sample, the curve jumps sharply and exceeds its threshold. In Figure 5, the SPE statistic approaches its threshold at some instants before the fault occurs. The SPE statistic mainly contains noise information, and KPCA cannot eliminate the noise completely, so when the noise amplitude increases, the SPE statistic of Equation (13) may exceed its threshold of Equation (15). This does not affect the fault diagnosis of the components. Figures 6 and 7 show the fault detection results for the LPT, which are similar to those for the LPC. The fault location algorithm described above is then used to locate faults of the LPC, HPC, LPT, and HPT. The location results are shown in Table 2.
Table 2 shows that when any gas turbine component is faulty, the sensitivities of the parameters measured at the faulty component to the T² and SPE statistics are greater than those of the healthy components. Take the LPC as an example. If the efficiency coefficient of the LPC decreases by 1% due to misalignment of the LPC rotor, the parameters measured at the LPC outlet fluctuate first. The sensitivities of the total temperature and total pressure at the LPC outlet to the T² statistic are 0.3239 and 0.6271, clearly higher than those of the other measured parameters. Their sensitivities to the SPE statistic are both 0.125, also higher than those of the other parameters. The method can therefore attribute the fault to an individual component.
Figure 8 shows the sensitivity distribution spectra of the measured parameters while the gas turbine is working. Before the fault occurs, the sensitivity curves are flat and differ little from one another. When the LPC fault occurs, the curves representing the LPC sensitivities rise sharply within a short time to values significantly higher than the others. In addition, according to the pattern of failures caused by the degradation of gas turbine components, once the performance of a component has degraded to a certain extent and the component is about to fail, the degradation accelerates until the failure occurs. During this process, the parameters measured at the outlet of the deteriorating component deviate increasingly from their nominal values. In the sensitivity distribution spectrum, the sensitivities of these parameters increase continuously, so fault prediction can be realized by tracking the changes in sensitivity.
In addition, owing to the harsh working environment, sensor outputs may deviate seriously from the actual values, which may lead to misdiagnosis. It is therefore essential to keep the sensor outputs within a reasonable range. According to the working principle of gas turbines (taking an aero gas turbine engine as an example), the power and flow balance conditions are satisfied when the turbine is in normal operation, and all parameters remain unchanged or fluctuate within a small range. If the state of the turbine changes because the control parameters vary, all the aerodynamic parameters are bound to change greatly, which appears as an anomaly in all sensor measurements. If, instead, only a few measurements are abnormal, the working principle and the balance conditions imply that the anomalies are caused by noise or sensor faults, and the measurements must be restored. The process for detecting abnormal values and restoring the measurements is shown in Figure 9.
First, Grubbs' test is used to check whether a sensor parameter is abnormal. Note that the significance level α of the test must be chosen according to the variation of the aerodynamic parameters. If the checked parameter is abnormal, all the parameters are tested by Grubbs' test, and the number of abnormal parameters is counted. If this number is less than the number of aerodynamic parameters used for fault diagnosis, the sensors are faulty or the influence of noise has increased. In that case, the abnormal parameters must be restored by support vector regression (SVR) [31] so that the fault location can proceed smoothly.
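The screening logic described above can be sketched as follows. This is an illustrative outline only: the function names, the window layout (rows are samples, columns are sensor channels), and the labels returned by `classify` are assumptions. The critical value `g_crit` is passed in by the caller because it depends on the window length and the chosen level α and would normally be taken from a Grubbs table or computed from the t-distribution. The SVR-based restoration step is not shown.

```python
import numpy as np

def grubbs_statistic(x):
    """Return the Grubbs statistic G = max|x - mean| / s and the index of the suspect sample."""
    x = np.asarray(x, dtype=float)
    s = x.std(ddof=1)                       # sample standard deviation
    if s == 0.0:
        return 0.0, 0                       # constant signal: nothing to flag
    dev = np.abs(x - x.mean())
    idx = int(dev.argmax())
    return float(dev[idx] / s), idx

def abnormal_channels(window, g_crit):
    """Flag each sensor channel whose Grubbs statistic exceeds the critical value."""
    window = np.asarray(window, dtype=float)
    return np.array([grubbs_statistic(window[:, c])[0] > g_crit
                     for c in range(window.shape[1])])

def classify(flags):
    """Interpret the number of abnormal channels (labels are hypothetical)."""
    n_abnormal = int(np.sum(flags))
    if n_abnormal == 0:
        return "normal"
    if n_abnormal < len(flags):
        return "sensor_or_noise"            # only some channels abnormal: restore them
    return "operating_point_change"         # every channel changed together
```

For example, a window in which a single channel contains one large spike would be classified as `"sensor_or_noise"`, and only that channel would be passed on to the restoration step.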

Conclusions
In this paper, a novel method for locating faults of a gas turbine is proposed. Kernel principal component analysis is adopted to detect the occurrence of a fault. Based on the analysis of the fault indicators and the aerodynamic parameters, the partial derivatives of the T² and SPE statistics with respect to the aerodynamic parameters are calculated. The results represent the degree to which the fault influences these parameters, and the fault is located from the differences in influence. Four conclusions can be drawn: (1) KPCA is an effective way to detect gas turbine faults. The T² and SPE statistics and their corresponding thresholds must be constructed. By comparing the T² statistic with its threshold, on-line fault diagnosis of a gas turbine can be realized. Furthermore, both statistics help to locate the fault.
(2) Fault location is realized by calculating the partial derivatives of T² with respect to the aerodynamic parameters. The magnitudes of the partial derivatives represent the sensitivities of the aerodynamic parameters to the fault. Based on the balance conditions of gas turbine operation, the fault can be located according to these magnitudes.
(3) Sensitivity distribution spectra can be used to represent the performance degradation of the gas turbine and to identify a potential fault. When a fault caused by performance degradation is about to occur, the partial derivatives of the aerodynamic parameters associated with this fault change dramatically, and this change is clearly reflected in the sensitivity distribution spectra.
(4) The proposed method is used to locate gas turbine faults; how to identify the type of fault is not discussed. Whether the method can be used for fault identification needs to be verified in follow-up work.

Figure 1. Process of fault diagnosis and location.

Figure 2. Total temperatures of components.

Figure 4. Value of the T² statistic under the condition of an LPC (low pressure compressor) fault.

Figure 5. Value of the SPE statistic under the condition of an LPC fault.

Figure 6. Value of the T² statistic under the condition of an LPT fault.

Figure 7. Value of the SPE (squared prediction error) statistic under the condition of an LPT fault.

Figure 8. Sensitivity distribution spectra of the measured parameters.

Table 1. Initial working parameters of turbine.

Table 2. The sensitivity of measured parameters.