Long Short-Term Memory Network-Based Normal Pattern Group for Fault Detection of Three-Shaft Marine Gas Turbine

Fault detection and diagnosis can improve the safety and reliability of gas turbines. Current studies on gas turbine fault detection and diagnosis mainly focus on the case of abundant fault samples. However, fault data are rare or even unavailable for gas turbines, especially newly run gas turbines. Aiming to realize fault detection with only normal data, this paper proposes the concept of a normal pattern group. A group of long short-term memory (LSTM) networks is first used to characterize the mapping relationships among measurable parameters of healthy three-shaft gas turbines. Experiments show that the proposed method can detect all 13 common gas path faults of three-shaft gas turbines sensitively while maintaining a low false alarm rate. A comparison experiment with single normal pattern models verifies the necessity and superiority of using a normal pattern group. Meanwhile, a comparison between the LSTM network and other methods, including support vector regression, single-layer feedforward neural network, extreme learning machine and Elman recurrent neural network, verifies the superiority of the LSTM network in fault detection. Furthermore, a comparison experiment with four common one-class classifiers further verifies the superiority of the proposed method. This also indicates, to some extent, the benefit of fusing data-driven methods with gas turbine principles.


Introduction
Currently, the prognostics and health management (PHM) technique for gas turbines has become a hot research topic for monitoring health condition as well as ensuring safe and reliable operation. PHM converts the conventional "fail and fix" maintenance strategy to a more advanced condition-based maintenance strategy. PHM can provide accurate condition monitoring and detect faults sensitively and in a timely manner, and thus avoid serious faults and significantly reduce maintenance costs.
With the boom of artificial intelligence and big data techniques, data-driven intelligent PHM of gas turbines is becoming increasingly popular among various PHM methods. Many well-known gas turbine companies are attempting to use artificial intelligence and big data techniques in gas turbines. Rolls-Royce has proposed the concept of IntelligentEngine as the future development trend of the gas turbine industry. Pratt and Whitney offers the EngineWise service to provide intelligent health management and predictive maintenance for aeroengines. GE has also established the Predix platform for intelligent management of gas turbines. Through the Predix platform, the health conditions of GE's gas turbines are continuously monitored to detect the need for maintenance in real time.
Data-driven fault detection and diagnosis methods extract knowledge from historical data and do not require accurate nonlinear models [4,5]. Currently, data-driven methods are also becoming increasingly popular with researchers. Many data-driven methods, including Bayesian method [6,7], random forest [8], finite state machine [9], rough set [10][11][12], support vector machine [13], extreme learning machine [14], artificial neural network [15] etc., have been widely used in gas turbine fault detection and diagnosis. Mast et al. [16] proposed a Bayesian belief network-based fault diagnosis method for turbofan engines. Losi et al. [17] used Bayesian hierarchical models for gas turbine fault detection. Maragoudakis et al. [8] used random forest for fault identification of an industrial gas turbine. Li et al. [9] proposed a finite state machine-based method for fault diagnosis of a single-spool industrial gas turbine. Xu et al. [10] used fuzzy rough set for vibration fault diagnosis of aircraft engines. Wong et al. [18] used extreme learning machine for fault diagnosis of gas turbine generator systems. Fast et al. [19] used artificial neural networks for modelling and diagnosis of a single-spool gas turbine-based combined heat and power plant. Orozco et al. [20] used a single-hidden layer feed-forward neural network for diagnosis of an externally fired gas turbine. Liu et al. [21] proposed a method for performance prediction of a heavy-duty gas turbine based on high dimensional model representation and artificial neural network. Wang et al. [22] used support vector machine and fuzzy c-means clustering for fault diagnosis of an industrial single-spool gas turbine. Zhou et al. [23] used support vector machine for gas turbine fault diagnosis. Loboda et al. [24] used multilayer perceptron and radial basis network for both industrial gas turbines and aircraft gas turbines. The experimental results indicate that the radial basis network can obtain better performance. Loboda et al. [25] further proposed a probabilistic neural network-based method for fault diagnosis of industrial gas turbines and aircraft gas turbines and reported good detection performance. Yazdani and Montazeri-Gh [26] combined hybrid dimensionality reduction and fuzzy logic for fault diagnosis of the two-shaft industrial SGT 600 gas turbine. Fentaye et al. [15] used nested artificial neural networks for gas-path fault identification of a two-shaft gas turbine. Tahan et al. [27] proposed a multiple networks artificial neural network model for an industrial 18.7-MW twin-shaft gas turbine engine. Lu et al. [28] proposed a restricted-Boltzmann-based extreme learning machine for turbofan engine fault diagnosis.
Recently, deep learning [29] has been enjoying a boom. Deep learning has achieved tremendous success in computer vision [30], natural language processing [31], autonomous cars [32] etc. Many researchers have begun to apply deep learning in the field of industrial fault diagnosis [33]. Fu et al. [34] used grouped convolutional denoising autoencoders for aircraft engine fault detection and obtained good detection performance. Feng et al. [35] used information entropy and deep belief networks for aircraft turbofan engine fault diagnosis. Liu et al. [36] used a convolutional neural network for fault detection of industrial gas turbines and obtained better performance than a conventional artificial neural network and extreme learning machine. Mulewicz et al. [37] compared a deep convolutional neural network with two conventional methods, random forest and extreme gradient boosting (XGBoost), and reported that the deep convolutional neural network has better detection performance than the two conventional data-driven methods.
In industrial settings, fault data are usually scarce or even unavailable, especially for gas turbines that have just been put into operation and have only run for a short time. All of the above methods can obtain good performance when there are abundant historical fault data. However, in the case where fault data are unavailable, the above methods cannot realize fault detection due to the absence of fault information. This is the problem that this paper deals with.
Aiming to address the fault detection of three-shaft marine gas turbines in the case where only normal data are available at the beginning stage of operation, this paper proposes a normal pattern group-based fault detection method for the first time. A group of long short-term memory (LSTM) networks is used for fault detection for the first time. The proposed method achieves high detection accuracy for fault data while maintaining a low false alarm rate for normal data, and thus effectively solves the problem of fault detection in three-shaft marine gas turbines when no historical fault data are available. The main contributions of this paper are summarized as follows.
(1) Firstly, the concept of normal pattern group is proposed for three-shaft marine gas turbine fault diagnosis. Through the normal pattern group, the intrinsic mapping relationships among measurable parameters of healthy three-shaft marine gas turbines are characterized by a group of normal pattern models.
(2) Secondly, a group of long short-term memory (LSTM) networks is used in three-shaft marine gas turbine diagnosis. The superiority of the LSTM network in gas turbine fault detection is verified through comparison with other methods, including support vector regression (SVR), extreme learning machine (ELM), single-hidden layer feedforward neural network (SLFN) and Elman recurrent neural network (ERNN). To the best of our knowledge, this is the first time that the LSTM network has been used in fault detection of three-shaft marine gas turbines and that its superiority has been verified.
(3) Thirdly, a boxplot-based collaborative decision-making strategy for the normal pattern group is proposed. Through collaborative decision-making of the normal pattern group, accurate anomaly detection and a low false alarm rate are realized. The normal pattern group is compared with single normal pattern models and common one-class classifiers, and its superiority is verified.
The rest of this paper is organized as follows. Section 2 elaborates the procedure of the LSTM network-based normal pattern group fault detection method. Section 3 carries out detailed experiments to verify the superiority of the proposed method. Section 4 concludes the paper and outlines future research directions.

Normal Pattern Group-Based Fault Detection
In industrial settings, the fault data of gas turbines, especially gas turbines that have just been put into operation and have only run for a short period, are quite rare or even unavailable. Current studies mainly focus on the case of abundant fault data. In the case of no available fault data, these methods cannot detect faults due to the lack of fault information. Thus, this paper studies the fault detection of three-shaft marine gas turbines in the case where only normal data are available.
The gas turbine follows basic physical laws, such as the conservation of mass and energy, and uses the Brayton cycle as its basic thermodynamic cycle. Thus, there exist inherent mapping relationships among all measurable parameters when the gas turbine operates normally. Accordingly, this paper establishes a series of normal pattern models to characterize these intrinsic mapping relationships, proposes the concept of normal pattern group, and detects anomalies by detecting changes in the mapping relationships.
The normal pattern group is a group of normal pattern models. For a system with $m$ input measurements ($x_1, x_2, \ldots, x_m$) and $n$ output measurements ($y_1, y_2, \ldots, y_n$), this paper establishes $n$ normal pattern models, with each model using one output measurement as its output and the remaining $n-1$ output measurements together with the $m$ inputs as its inputs. The architecture of the normal pattern group is illustrated in Figure 1. Mathematically, the normal pattern group can be expressed as Equation (1).
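As a sketch of what this description amounts to, the normal pattern group can be written in the following static form. The symbol $f_i$ for the $i$-th normal pattern model is an assumption introduced here for illustration, and a dynamic model such as an LSTM would additionally take past values of its inputs.

```latex
y_i = f_i\left(x_1, \ldots, x_m,\; y_1, \ldots, y_{i-1},\; y_{i+1}, \ldots, y_n\right),
\qquad i = 1, 2, \ldots, n
```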

Long Short-Term Memory Network
The normal pattern group method requires the identification of a group of normal pattern models, namely the mappings in Equation (1), using normal historical data of gas turbines. The gas turbine is a nonlinear dynamic system with many dynamic behaviors, such as rotor inertia, heat inertia and volume dynamics. These dynamic behaviors usually manifest as time delays. Artificial neural networks (ANNs) have a strong ability to represent nonlinearity, and thus are used to identify these mappings. Among various ANN methods, the long short-term memory (LSTM) network [38] is one of the most effective methods for dealing with dynamic data. By introducing a forget gate, an input gate and an output gate, the LSTM network can address the long-term dependency problem and effectively deal with dynamic information. The LSTM network has been successfully used in various fields, such as time series forecasting [39][40][41], remaining useful life prediction of industrial machines [42], machine translation [43], named entity recognition [44] etc. LSTM has also been widely used in the identification of various dynamic systems. The studies in [45][46][47] used LSTM to identify various dynamic systems and reported that the LSTM network can characterize dynamic systems well and obtain much better performance than conventional methods. Therefore, this paper uses the LSTM network to identify the nonlinear mapping relationships in Equation (1). The structure of LSTM is shown in Figure 2, which includes three gates, namely the forget gate, input gate and output gate [39]. The principle of LSTM is elaborated as follows.
(1) Forget gate $f_t$ represents the ratio of historical information to be retained.
(2) Input gate $i_t$ represents the ratio of current information to be inputted.
(3) Output gate $o_t$ is the ratio of information to be output by the current LSTM unit. Through the output gate, the current cell state $C_t$ is converted to the current LSTM output.
In Equations (2)-(7) [...]
Through the LSTM network, the dynamic behaviors of three-shaft gas turbines can be effectively characterized and the nonlinear mapping relationships in Equation (1) can be identified precisely.
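To make the role of the three gates concrete, the following is a minimal NumPy sketch of one time step of a standard LSTM cell. It follows the widely used textbook formulation, which is assumed (not confirmed) to match the equations of this paper. The function name `lstm_step`, the weight layout and the sizes are illustrative choices only.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, squashes values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell (textbook form, assumed here).

    W[g] has shape (hidden, hidden + inputs) and b[g] has shape (hidden,)
    for each gate key g in "f" (forget), "i" (input), "c" (candidate), "o" (output).
    """
    z = np.concatenate([h_prev, x_t])       # previous output and current input
    f_t = sigmoid(W["f"] @ z + b["f"])      # share of the old cell state to keep
    i_t = sigmoid(W["i"] @ z + b["i"])      # share of new information to admit
    c_hat = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat        # element-wise (Hadamard) products
    o_t = sigmoid(W["o"] @ z + b["o"])      # share of the cell state to expose
    h_t = o_t * np.tanh(c_t)                # current LSTM output
    return h_t, c_t

# Illustrative sizes: 9 model inputs (2 inputs + 7 other outputs), 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden = 9, 4
W = {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for g in "fico"}
b = {g: np.zeros(n_hidden) for g in "fico"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):      # run over a short input sequence
    h, c = lstm_step(x_t, h, c, W, b)
```

In practice the weights are learned by a deep learning library rather than set by hand; the sketch only shows how the gates combine the previous state with the current input.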

Collaborative Decision for Fault Detection
After the LSTM networks are trained, the normal pattern group is established. This section then applies boxplot to the normal pattern group and designs a collaborative decision strategy for fault detection.
Boxplot is a method for graphically depicting groups of numerical data through their quartiles. Its principle is shown in Figure 3, where the lower quartile $Q_1$ is usually the 25th percentile and the upper quartile $Q_3$ is usually the 75th percentile. The interquartile range (IQR) is the distance between the upper and lower quartiles and is computed by Equation (10). For a residual vector, boxplot gives the upper threshold $u_{\max}$ and the lower threshold $u_{\min}$ by Equations (11) and (12), respectively. The data beyond the interval $[u_{\min}, u_{\max}]$ are regarded as outliers.

The normal pattern group includes the $n$ normal pattern models shown in Equation (1). The corresponding $n$ residual vectors are obtained as the fitted values minus the corresponding real values. Each residual vector has an upper threshold and a lower threshold determined by boxplot. Let the number of samples in the training set be $N$. Given a confidence level $CI$, such as 95%, it is assumed that there are $[CI \cdot N]$ training samples that have no more than $m$ residuals beyond the boxplot thresholds, where $[\cdot]$ is the integer-valued function. Then a new instance is regarded as normal with confidence level $CI$ if it has no more than $m$ residual values beyond the boxplot thresholds.

Specifically, the fault detection process for a new sample is illustrated in Figure 4. For a new instance, $n$ fitted values are first computed by the $n$ trained LSTM networks, namely the normal pattern group, and then $n$ residual values are computed as the $n$ fitted values minus the corresponding $n$ actual values. The $n$ residual values are compared with the corresponding $n$ boxplot thresholds to get the threshold binary group, which is composed of $n$ binary values (0 or 1). An example of a threshold binary group is $\{0, 1, 0, 0, 0, \ldots, 0\}$ with $n$ entries. If the number of 1s in the threshold binary group exceeds $m$, the instance is detected as a fault instance; otherwise it is detected as a normal instance.
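The decision rule can be sketched in a few lines of NumPy. The whisker factor of 1.5 is the conventional boxplot choice and is an assumption here, since Equations (10)-(12) are not reproduced in this text. The function names and the synthetic residuals are illustrative only.

```python
import numpy as np

def boxplot_thresholds(train_residuals, whisker=1.5):
    """Lower and upper thresholds per model from residuals of shape (N, n).

    whisker=1.5 is the conventional boxplot factor (assumed, not taken from the paper).
    """
    q1 = np.percentile(train_residuals, 25, axis=0)
    q3 = np.percentile(train_residuals, 75, axis=0)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr

def threshold_binary_group(residuals, u_min, u_max):
    """1 where a residual lies outside [u_min, u_max], otherwise 0."""
    return ((residuals < u_min) | (residuals > u_max)).astype(int)

def is_fault(residuals, u_min, u_max, m=1):
    """Flag a sample as faulty when more than m of its residuals overshoot."""
    return threshold_binary_group(residuals, u_min, u_max).sum(axis=-1) > m

# Synthetic stand-in for the training residuals of n = 8 normal pattern models.
rng = np.random.default_rng(0)
train_residuals = rng.normal(size=(1000, 8))
u_min, u_max = boxplot_thresholds(train_residuals)

healthy = np.zeros(8)                                        # no overshoot
one_outlier = np.array([50.0, 0, 0, 0, 0, 0, 0, 0])          # one overshoot, tolerated for m = 1
faulty = np.array([50.0, -50.0, 50.0, 0, 0, 0, 0, 0])        # three overshoots
```

With `m=1`, a single overshooting residual is tolerated, which trades a small loss in sensitivity for fewer false alarms on normal data.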

Application in Three-Shaft Marine Gas Turbine Fault Detection
This section applies the proposed LSTM-based normal pattern group to fault detection of three-shaft marine gas turbines. A three-shaft marine gas turbine has two compressors, one combustion chamber (CC) and three turbines. The two compressors are the low-pressure compressor (LPC) and the high-pressure compressor (HPC). The three turbines are the high-pressure turbine (HPT), the low-pressure turbine (LPT) and the power turbine (PT). Its typical configuration is shown in Figure 5. For the studied three-shaft marine gas turbines, there are 10 measurable parameters, shown in Table 1 [48].
The ambient temperature $t_1$ and fuel flow rate $g_f$ directly affect the operational state of gas turbines. Thus, these two parameters are regarded as the input measurements of three-shaft marine gas turbines. The changes of the other eight measurable parameters listed in Table 1 are caused by the changes of $t_1$ and $g_f$. Thus, the eight measurable parameters are regarded as the output measurements of three-shaft marine gas turbines. According to Equation (1), we can establish the normal pattern group of three-shaft marine gas turbines in Equation (13). In Equation (13), there are eight normal pattern models, namely $n$ equals 8. The detailed procedure of normal pattern group-based gas turbine fault detection is illustrated in Figure 6, which includes the following three steps.
Step 1: data preprocessing. This step divides the normal data into three parts. The first 70% is the training set, the following 15% is the validation set and the remaining 15% is the test set.
Step 2: training and validation. This step trains eight LSTM networks using the training set to identify the eight nonlinear mapping relationships in Equation (13). The hyperparameters of the eight LSTM networks are tuned on the validation set. After the eight LSTM networks are trained, detection thresholds are computed through the collaborative decision strategy in Section 2.3.
Step 3: test and anomaly detection. A new sample is inputted to the trained LSTM-based normal pattern group, and eight residual values are obtained. Then its health condition is determined through the collaborative decision strategy in Section 2.
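Step 1 and the data layout needed for Step 2 can be sketched as follows. The column order (two inputs first, then eight outputs) and the helper names are assumptions made for illustration, and random numbers stand in for the simulated normal data.

```python
import numpy as np

def chronological_split(data, train_pct=70, val_pct=15):
    """Split rows in order into training, validation and test parts."""
    n = len(data)
    i = n * train_pct // 100
    j = n * (train_pct + val_pct) // 100
    return data[:i], data[i:j], data[j:]

def pattern_group_datasets(data, n_inputs=2):
    """Return one (features, target) pair per output column.

    Each target is one output measurement; its features are the inputs
    plus all remaining output measurements.
    """
    pairs = []
    for k in range(n_inputs, data.shape[1]):
        features = np.delete(data, k, axis=1)   # drop the target column
        pairs.append((features, data[:, k]))
    return pairs

# Stand-in for 20,000 normal samples with 10 measurable parameters.
rng = np.random.default_rng(0)
normal_data = rng.normal(size=(20000, 10))

train, val, test = chronological_split(normal_data)
train_pairs = pattern_group_datasets(train)     # eight regression problems
```

Each of the eight pairs would then be used to fit one regressor, so that the group as a whole covers every output measurement.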

Data Description
The nonlinear component model is widely used for gas path fault simulation [28], gas path fault diagnosis [23,49,50], automatic control [51] and characteristics analysis [52,53] of gas turbines. Many researchers have developed mature and standard modelling methods for gas turbines [54,55], used the established nonlinear component models for gas path fault simulation, fault detection and fault diagnosis, and achieved good performance. Thus, this paper uses the nonlinear component model of a three-shaft marine gas turbine developed in [54] for fault data simulation. Gas path faults are among the most frequent faults and can cause serious damage [1]. Common gas path faults include fouling, erosion and foreign object damage, and can cause a drop in flow capacity and isentropic efficiency. Many studies [1,28,56] have developed standard and widely accepted ways to simulate gas path faults of gas turbines. Following [56], this paper simulates 13 common gas path faults: the fouling of LPC, the foreign object damage (FOD) of LPC, the fouling of HPC, the FOD of HPC, the fouling of HPT, the erosion of HPT, the FOD of HPT, the fouling of LPT, the erosion of LPT, the FOD of LPT, the fouling of PT, the erosion of PT and the FOD of PT.
In the simulation, the input parameters of the simulation model are ambient temperature and fuel flow rate. The input parameters for normal data have 20,000 samples, shown in Figure 7.

In the following experiments, this paper uses the first 70% of normal data as the training set to train algorithms, the following 15% of normal data as the validation set for parameter tuning and the remaining 15% of normal data for performance evaluation on normal data. All the fault data of the 13 categories are used for evaluating the fault detection performance on fault data. Details of the generated simulation data for fault detection are given in Table 2.

$n_1$ and $n_2$ are the numbers of normal data in the test set and of fault data, respectively, $I_t$ is the number of actual normal data that are detected as normal data, and $J_t$ is the number of actual fault data that are detected as fault data.

Experiment of LSTM Network-Based Normal Pattern Group
This section performs an experiment with the normal pattern group to verify its effectiveness. First, eight LSTM networks are trained using the training data to identify the eight normal pattern models in Equation (13). The LSTM networks are implemented with the Keras library in Python. The identification of the eight normal pattern models is a typical regression task. Mean squared error is the most common loss function for regressors.

After network training and validation, the eight residual vectors of the normal pattern group are computed as the fitted values minus the corresponding actual values. Boxplot is used to determine the upper threshold and the lower threshold of the eight residual vectors. The normalized threshold of each boxplot is shown in Figure 12 and the corresponding detection thresholds are listed in Table 3. It is observed from Figure 13 that the percentages of training samples with a threshold overshoot number of 0 and of 1 are the largest. As many as 94.99% of the training samples have a threshold overshoot number of no more than 1. Allowing a larger threshold overshoot number covers only slightly more training samples, while the ability to detect fault samples can decrease significantly. Therefore, we set the parameter $m$ in Figure 4 to 1, which ensures that about 95% of the training samples are classified correctly. A new instance is first inputted to the trained normal pattern group to obtain eight residual values. If more than one of the eight residual values exceeds its corresponding boxplot threshold, the instance is detected as a fault instance; otherwise it is detected as a normal instance.
After establishing the LSTM-based normal pattern group and determining the fault detection strategy, the test set of normal data and the 13 categories of fault data are used for fault detection. First, the test set of normal data is used to evaluate the detection performance on normal data. The fitting results and residuals for the test set of normal data are shown in Figure 14. To evaluate the fitting performance better, mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean square error (RMSE) are used. Their definitions are given in Equations (15)-(17). MAE describes the mean fitting error, RMSE is sensitive to extreme fitting errors and MAPE describes the mean percentage error. For all three metrics, smaller is better. The three metrics describe the fitting performance from different perspectives. MAE, MAPE and RMSE of the training set, validation set and test set are shown in Table 4.
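For reference, the three metrics in their usual form can be computed as below. This is a minimal sketch under the assumption that MAPE is reported in percent; the paper's own definitions may differ in scaling.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_pred - y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (y_true must be nonzero)."""
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))

def rmse(y_true, y_pred):
    """Root mean square error, which penalizes large errors more strongly."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

# Small made-up example, not data from the paper.
y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 190.0, 400.0])
scores = {"MAE": mae(y_true, y_pred),
          "MAPE": mape(y_true, y_pred),
          "RMSE": rmse(y_true, y_pred)}
```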
From Figure 14 and Table 4, it is observed that the fitted values are close to the actual values in the test set. RMSE, MAE and MAPE are all very small, which means that the LSTM network can fit the normal data of the three-shaft marine gas turbine well. Next, the collaborative fault detection strategy is used for fault detection. The fault detection accuracy of the LSTM-based normal pattern group on normal data and fault data is summarized in Table 5. The results in Table 5 show that the proposed method can detect all 13 categories of faults sensitively while maintaining a low false alarm rate for normal data. Thus, through the proposed LSTM network-based normal pattern group and the designed collaborative decision-making strategy, faults can be sensitively detected while robustness to normal data is maintained.

Comparison with Single Normal Pattern Methods
The proposed normal pattern group is a combination of the eight normal pattern models in Equation (13). To verify the necessity and superiority of the normal pattern group, this section compares it with the eight single normal pattern models in Equation (13). The results are shown in Table 6.
From Table 6, it is observed that none of the eight normal pattern models can detect all 13 faults sensitively. Each of the eight normal pattern models obtains an accuracy of less than 0.8 for some categories of faults. For example, the $n_H$ normal pattern model, the $n_L$ normal pattern model and the plc normal pattern model all obtain very poor accuracy (less than 0.6) for the HPT erosion fault. The P normal pattern model and the plt normal pattern model both obtain accuracy of less than 0.6 for the HPT FOD fault. Through the collaborative decision of the normal pattern group, the proposed method effectively improves the detection performance on fault data while maintaining a low false alarm rate for normal data. Thus, the proposed normal pattern group significantly improves the fault detection performance compared with the eight single normal pattern models.
SVR uses the kernel method to map the original data to a high-dimensional space, so that an approximately linear regression can be performed in this space. The radial basis function (RBF) kernel is the most common kernel function. SLFN, ELM and ERNN are three kinds of neural networks. SLFN is a static neural network with three layers, namely the input layer, hidden layer and output layer. SLFN is usually trained through backpropagation. Its structure is shown in Figure 15. ELM has the same structure as SLFN. ELM randomly generates the weights and biases between the input layer and the hidden layer and determines the weights of the output layer by computing the Moore-Penrose generalized inverse matrix instead of learning iteratively through error backpropagation. ELM can be trained faster than SLFN. ERNN introduces a one-step time delay to characterize dynamic behaviors, and its structure is shown in Figure 16. ERNN is also trained through error backpropagation.
In this paper, these methods are used to identify the normal pattern group in Equation (13). They are trained using data from the training set, and their hyperparameters are tuned on the validation set. Their fault detection performances are shown in Table 7.
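To illustrate how ELM avoids iterative training, the following is a minimal NumPy sketch of an ELM regressor. The class name `ELMRegressor`, the tanh activation, the hidden layer size and the toy target function are all assumptions for illustration and are not taken from the experiments.

```python
import numpy as np

class ELMRegressor:
    """Single-hidden-layer network with random, fixed hidden weights."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden layer output; tanh is an illustrative activation choice.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # Input-to-hidden weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights come from the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy regression problem, not gas turbine data.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

model = ELMRegressor(n_hidden=50).fit(X, y)
train_rmse = np.sqrt(np.mean((model.predict(X) - y) ** 2))
```

Because only a linear least-squares problem is solved, training is a single matrix operation, which is why ELM is faster to train than a backpropagation-trained SLFN.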

From Table 7, it is observed that the proposed LSTM network-based fault detection method outperforms the other methods. For the test set of normal data, LSTM improves the accuracy by 0.0596 compared with ELM, by 0.1533 compared with SVR, by 0.0383 compared with SLFN and by 0.0170 compared with ERNN. For the fault data, LSTM ensures a fault detection accuracy of at least 0.9936 for each fault class. By contrast, ELM, SLFN and ERNN obtain almost as high accuracy as LSTM, but SVR obtains an accuracy of less than 0.9 for some categories of faults, including the LPT fouling fault, HPT fouling fault, HPT erosion fault, LPC fouling fault and HPC fouling fault. Thus, LSTM ensures a detection accuracy of more than 0.9 for both normal data and fault data and has more reliable fault detection performance. It is also observed that ELM, ERNN and SLFN outperform the SVR method on the test set of normal data and on some types of faults. This shows that the neural network-based methods, including ELM, ERNN and SLFN, have better fault detection performance than SVR. Meanwhile, ERNN outperforms SLFN and ELM. This is because ERNN considers the time-delayed relationships among gas turbine measurements to some extent. Compared with ERNN, LSTM captures time-delayed relationships better through its input gate, forget gate and output gate, and can characterize time-delayed relationships with much longer time lags. Thus, the proposed LSTM-based method obtains better fault detection performance than ERNN and the other methods in Table 7.

Comparison with One-Class Classifiers
Currently, one-class classifiers from the machine learning field are also widely used for industrial anomaly detection in cases where only normal data are available. These methods have been widely used in industrial fault detection [64,65], spam detection [66], etc. Thus, this paper compares the proposed normal pattern group method with these common one-class classifiers to further verify its superiority. The compared methods include one-class support vector machine (OCSVM) [67,68], local outlier factor (LOF) [69], isolation forest [70] and principal component analysis (PCA) [71].
OCSVM uses a kernel function to map the original normal data to a high-dimensional space, where it tries to find a hyperplane that keeps the normal data as far from the origin as possible. Let the distance between the hyperplane and the origin be $\rho$; then the samples whose distance from the origin is smaller than $\rho$ are detected as abnormal samples. Common kernel functions include the RBF kernel, linear kernel, sigmoid kernel etc., and the RBF kernel is the most widely used one. LOF detects anomalies by comparing the density of a given sample with the sample density in its neighborhood. If its density is clearly smaller than the density in its neighborhood, the sample is detected as an abnormal sample. Isolation forest isolates samples by constructing trees, and abnormal samples are usually isolated first. PCA detects anomalies through the compression and reconstruction of data. PCA is trained using normal data, so the reconstruction errors of normal samples are expected to be small and the reconstruction errors of fault samples are expected to be large. Square prediction error (SPE) and $T^2$ statistics [71] are two common ways of determining thresholds in PCA-based fault detection.
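The PCA variant with the SPE statistic can be sketched with NumPy as follows. The threshold here is simply the empirical 99th percentile of the training SPE, which is an assumption made for brevity rather than the analytical control limit that may be used in practice. The synthetic data, function names and number of components are likewise illustrative.

```python
import numpy as np

def fit_pca(X, n_components):
    """Standardize X and return (mean, std, loading matrix P)."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    Z = (X - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T

def spe(X, mean, std, P):
    """Squared reconstruction error of each sample after projection onto P."""
    Z = (np.atleast_2d(X) - mean) / std
    residual = Z - Z @ P @ P.T
    return np.sum(residual ** 2, axis=1)

# Synthetic "normal" data: five correlated variables driven by two latent factors.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 1.0]])
X_normal = rng.normal(size=(2000, 2)) @ A + 0.05 * rng.normal(size=(2000, 5))

mean, std, P = fit_pca(X_normal, n_components=2)
threshold = np.percentile(spe(X_normal, mean, std, P), 99)   # assumed empirical limit

# A sample whose first variable no longer agrees with the others.
x_fault = X_normal[0] + np.array([3.0, 0.0, 0.0, 0.0, 0.0])
fault_detected = spe(x_fault, mean, std, P)[0] > threshold
```

The offset breaks the correlation structure learned from normal data, so the sample cannot be reconstructed well from two components and its SPE rises above the threshold.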
In this paper, OCSVM, LOF and isolation forest were implemented with the scikit-learn library [72,73] in Python. The PCA-based fault detection method was coded with the NumPy library [74] in Python. This paper uses three kernel functions for the OCSVM method, namely the radial basis function (RBF) kernel, linear kernel and sigmoid kernel. For the PCA-based fault detection method, square prediction error (SPE) and $T^2$ statistics [71] are both used in the experiment. The parameters of these methods were selected on the validation set. The corresponding comparison results are shown in Table 8.
From Table 8, it is observed that the four one-class classifiers are not sensitive to some categories of faults. Isolation forest, PCA, LOF and OCSVM all perform poorly (accuracy less than 0.8) for some fault categories. Meanwhile, isolation forest, LOF, OCSVM and PCA with $T^2$ statistics obtain an accuracy of only about 0.8 for the test set of normal data. Among these one-class classifiers, only PCA with SPE statistics obtains good accuracy for the test set of normal data. The proposed method ensures a detection accuracy of at least 0.9936 for each fault class while maintaining an accuracy of more than 0.9 for normal data. The proposed method incorporates prior knowledge of the gas turbine, and thus is sensitive to all faults and maintains a low false alarm rate for normal data. Thus, the proposed method significantly outperforms common one-class classifiers in fault detection.

Conclusions and Future Work
Fault detection of three-shaft marine gas turbines is of great significance for increasing operational reliability and reducing maintenance costs. Current research mainly focuses on the situation where abundant fault data are available. However, fault data are scarce or even unavailable, especially for newly run gas turbines. Aiming at the case where only normal data are available, this paper proposes a long short-term memory (LSTM) network-based normal pattern group for fault detection of three-shaft gas turbines. Through experiments on a three-shaft marine gas turbine, the following conclusions can be drawn.
Firstly, this paper characterizes the healthy state of three-shaft marine gas turbines using a normal pattern group composed of a group of normal pattern models. A group of long short-term memory (LSTM) networks is used to identify these normal pattern models and detect anomalies. Experimental results show that the proposed method can detect all 13 common gas path faults of three-shaft gas turbines sensitively while maintaining a low false alarm rate.
Secondly, the proposed normal pattern group method is compared with eight single normal pattern models to verify its superiority. Experimental results show that the proposed method significantly outperforms the eight single normal pattern models in terms of fault detection performance.
Thirdly, the proposed normal pattern group method is compared with some common one-class classifiers, including one-class support vector machine, principal component analysis, isolation forest and local outlier factor, to further verify its superiority. Experimental results show that the proposed method outperforms all of the compared one-class classifiers. This also indicates, to some extent, that introducing appropriate prior knowledge can improve the fault detection performance of gas turbines compared with purely data-driven one-class classifiers.
In the future, the proposed normal pattern group method can be applied to other types of gas turbines after analyzing the mapping relationships among the corresponding measurement parameters. In addition, more data-driven methods will be explored for fault detection of gas turbines. The authors also hope that the LSTM network-based normal pattern group can be applied to fault detection of industrial systems other than marine gas turbines.
faults occur. Thus, accurate fault detection can be realized through the normal pattern group defined in Equation (1).

Figure 1. Architecture of normal pattern group.
are the bias terms. The weight matrices and bias terms of the LSTM network are learned automatically from training data via the backpropagation through time (BPTT) strategy. The operation $\odot$ is the element-wise product (also known as the Hadamard product), $x_t$ is the current input data and $h_{t-1}$ is the LSTM unit output at the previous moment. The functions $\sigma(\cdot)$ and $\tanh(\cdot)$ are nonlinear activation functions defined as follows.
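As an illustration of how these quantities interact, the following is a minimal NumPy sketch of one LSTM step. It assumes the standard LSTM formulation and the standard definitions $\sigma(z) = 1/(1+e^{-z})$ and $\tanh(z) = (e^{z}-e^{-z})/(e^{z}+e^{-z})$; the function names, the stacked weight layout `W` and the gate ordering are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell (assumed formulation).

    W has shape (4 * n_hidden, n_hidden + n_input) and stacks the weights of
    the forget, input, candidate and output branches (hypothetical layout);
    b has shape (4 * n_hidden,).
    """
    n_hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0:n_hidden])                 # forget gate
    i = sigmoid(z[n_hidden:2 * n_hidden])      # input gate
    g = np.tanh(z[2 * n_hidden:3 * n_hidden])  # candidate cell state
    o = sigmoid(z[3 * n_hidden:])              # output gate
    c_t = f * c_prev + i * g                   # '*' is the element-wise (Hadamard) product
    h_t = o * np.tanh(c_t)                     # unit output at the current moment
    return h_t, c_t
```

In practice the weights and biases would be learned with BPTT by a deep learning framework rather than set by hand; the sketch only shows the forward computation of a single time step.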

Figure 4. Collaborative decision-making strategy for fault detection.

Figure 5. Typical configuration of a three-shaft marine gas turbine.
(a) and Figure 7 (b), which covers a wide range of operating conditions. The input parameters for fault data have 1800 samples, shown in Figure 7 (c) and Figure 7 (d). The input parameters of fault data are fed to the component model five times to simulate fault data of five severities. Thus, each fault category has 9000 samples, with each fault severity containing 1800 samples. All the simulated normal data are shown in Figure 8. For the simulated fault data, due to the page limit, this paper only visualizes one fault category, namely the LPC fouling fault, in Figure 9. The LPC fouling fault includes five severity levels, fault severity 1 to fault severity 5, where fault severity 5 denotes the most serious fault level. Only fault severity 1 and fault severity 5 are shown in Figure 9.

Figure 7. Input parameters of normal data and fault data: (a) Ambient temperature of normal data; (b) Fuel flow rate of normal data; (c) Ambient temperature of fault data; (d) Fuel flow rate of fault data.
networks, and thus mean squared error is used as the loss function of the LSTM networks. During the training process, the validation set is used to tune the hyperparameters of the LSTM networks. The fitted results are shown in Figure 10 and Figure 11. It is observed that the fitted values are quite close to the actual values. This shows that the LSTM network can characterize the normal pattern of gas turbines well.

Figure 10. Actual data versus estimated data in training set.

Figure 11. Actual data versus estimated data in validation set.

Figure 12. Boxplot of residuals in training set.
normal pattern group, each training instance has eight residual values. According to the boxplot thresholds of the training set, we can count the number of samples that have no residual values beyond the corresponding boxplot threshold. Similarly, we can count the number of samples that have $z$ ($z = 1, 2, \ldots, 8$) residual values beyond the corresponding boxplot threshold. The percentage of each group is then computed by dividing its count by the total number of samples in the training set, as shown in Figure 13.
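The counting procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the boxplot thresholds are the usual Tukey whiskers (first and third quartiles extended by 1.5 times the interquartile range), which this excerpt does not specify, and it uses synthetic residuals in place of the real training residuals. The function names are hypothetical.

```python
import numpy as np

def boxplot_thresholds(residuals):
    """Per-model lower and upper limits (Tukey's 1.5 * IQR rule, an assumption)."""
    q1 = np.percentile(residuals, 25, axis=0)
    q3 = np.percentile(residuals, 75, axis=0)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def overshoot_percentages(residuals, lower, upper):
    """Fraction of samples with z = 0, 1, ..., n_models residuals beyond the thresholds."""
    outside = (residuals < lower) | (residuals > upper)  # shape (n_samples, n_models)
    z = outside.sum(axis=1)                              # overshoot number per sample
    counts = np.bincount(z, minlength=residuals.shape[1] + 1)
    return counts / residuals.shape[0]

# Synthetic stand-in for the eight residuals of each training instance.
rng = np.random.default_rng(0)
train_residuals = rng.normal(size=(1000, 8))
lower, upper = boxplot_thresholds(train_residuals)
percentages = overshoot_percentages(train_residuals, lower, upper)
```

Here `percentages[z]` is the share of training samples with exactly `z` residuals beyond their thresholds, which is the quantity plotted for each overshoot number.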

Figure 13. Percentage of different threshold overshoot numbers in training set.

Figure 14. Actual data versus estimated data in test set.

Table 3. Detection threshold of normal pattern group.

Table 4. RMSE, MAE and MAPE of normal pattern group.

Table 5. Fault detection accuracy of the proposed normal pattern group method.
Comparison results are shown in Table 6, where $n_H$, $n_L$, $P$, phc, plt, plc, tpt and tlt each denote the normal pattern model that uses the parameter of the same name as the output of the LSTM network. The bold values denote the best detection accuracy in Table

Table 6. Fault detection accuracy comparison with single normal pattern method.

Table 7. Fault detection accuracy comparison with other methods.