A Review on Gas Turbine Gas-Path Diagnostics: State-of-the-Art Methods, Challenges and Opportunities

Gas-path diagnostics is an essential part of gas turbine (GT) condition-based maintenance (CBM). There exists extensive literature on GT gas-path diagnostics and a variety of methods have been introduced. The fundamental limitations of the conventional methods, such as the inability to deal with nonlinear engine behavior, measurement uncertainty, simultaneous faults, and the limited number of sensors available, remain the driving force for exploring more advanced techniques. This review aims to provide a critical survey of the existing literature produced in the area over the past few decades. In the first section, the issue of GT degradation is addressed, aiming to identify the types of physical faults that degrade gas turbine performance, which gas-path faults contribute more significantly to the overall performance loss, and which specific components often encounter these faults. A brief overview is then given of the inconsistencies in the literature on gas-path diagnostics, followed by a discussion of the various challenges to successful gas-path diagnostics and the major desirable characteristics that an advanced fault diagnostic technique should ideally possess. The available fault diagnostic methods are then thoroughly reviewed, and their strengths and weaknesses summarized. Artificial intelligence (AI) based and hybrid diagnostic methods have received a great deal of attention due to their promising potential to address the above-mentioned limitations while providing accurate diagnostic results. Moreover, the validation techniques that system developers have used in the past to evaluate the performance of their proposed diagnostic algorithms are discussed. Finally, concluding remarks and recommendations for further investigations are provided.


Introduction
In today's competitive business world, one way to increase the profitability of machinery or a process plant is to reduce its operational and maintenance expenses while increasing productivity. The gas turbine (GT) is one of the most expensive devices in aircraft and industrial applications, where reliability and availability are the two most desirable attributes. In the past several decades, trillions of dollars were invested globally in the operation and maintenance of GTs [1,2]. Owing to the growing role of GTs in fast-growing industries, this market trend is expected to continue into the foreseeable future. According to the International Air Transport Association (IATA) report, in 2014, the world fleet count was 24,597 aircraft. In that fiscal year, airlines globally spent $62.1 billion on Maintenance, Repair, and Overhaul (MRO), of which about 40% was for engine maintenance. By 2024, engine MRO spending is expected to exceed $36 billion, growing at a rate of 3.8% per annum [3]. One can see how large these expenses would be if they were extended to include all types of GT applications. Studies on the GT market indicate that the market for other engine groups is much bigger than that for aircraft engines due to rapid industrialization across the globe and the rising demand for power generation, mechanical drives and propulsion [2,4,5,6].
The GT fuel consumption and the likely increase in fuel price is another critical issue. For example, the US Department of Defense (DOD) alone consumes 4.6 billion gallons of fuel annually, which is 93% of the US government fuel consumption and the 34th largest fuel consumption in the world, of which about 85% is for Air Force and Navy uses [7,8]. On the other hand, in combined cycle power plants (CCPPs), the fuel cost covers 75% of the total life-cycle cost (LCC) [6]. Therefore, operating the GT as close to its clean conditions as possible may have a significant contribution to reducing the engine operating expenses. This can be achieved via an improved maintenance policy assisted by more advanced engine health monitoring (EHM) systems [9].
Gas turbine maintenance and operation costs are highly influenced by the performance of the engine. Overall engine performance relies on the performance of the gas-path components (mainly the compressor(s) and turbine(s)), and these components are major problem areas due to their exposure to different internal and external degradation causes [10]. Some of the major and most likely problems are a drop in compressor efficiency due to fouling, erosion, or object damage; a loss in turbine efficiency due to blade erosion and blade creep with subsequent tip of probe and shroud damage; a decrease in air flow capacity due to fouling; and an increase in flow capacity due to turbine erosion. However, these faults are not directly measurable. Gas-path diagnostic technology therefore analyses the engine performance, identifies potential faults, and provides an early warning before these faults develop into more complex problems. An effective and reliable gas-path diagnostic tool that can detect, isolate, and assess potential problems based on the measurement deviations, and suggest solutions well before they escalate, is therefore essential. Such a tool supports the investment by ensuring high levels of GT reliability and availability along with the best operating performance. A variety of gas-path diagnostic methods have been introduced so far, ranging from the traditional model-based (MB) methods (such as Kalman Filter (KF) and Gas Path Analysis (GPA)) to the most advanced artificial intelligence (AI) based ones (such as Artificial Neural Network (ANN), Expert system (ES), Fuzzy logic (FL), Bayesian belief network (BBN), Deep learning (DL), and Genetic Algorithm (GA)) [9,11]. In recent years, attention has also been paid to hybrid methods [12].
This paper aims to discuss the main gas-path faults that influence GT performance, the challenges that researchers in this field have experienced so far in developing an effective fault diagnostic system, and some of the most desirable attributes that an advanced system should ideally possess. The available MB, AI-based, and hybrid methods are thoroughly reviewed, and their advantages and disadvantages are highlighted with regard to how effectively they perform the diagnostic tasks, address the challenges, and fulfill the desirable attributes. Finally, some of the most commonly used approaches for validating diagnostic methods are discussed, followed by conclusions and future research directions.

Gas Turbine Performance Degradation
GT performance can be degraded temporarily or permanently. The former can be partially recovered during operation and engine overhaul, while the latter requires replacement [13]. Fouling, erosion, corrosion, and blade tip clearance are among the temporary degradation causes, whereas airfoil distortion and untwist and platform distortions lead to permanent deterioration (meaning that residual deterioration exists even after a major overhaul). Deterioration can also be categorized as recoverable (with washing), non-recoverable (cannot be recovered by washing during operation but recoverable during overhaul), and permanent (recoverable neither by washing nor during overhaul) [14]. With respect to the service period of the engine or the time frame over which the deterioration evolves, performance deterioration can also be classified into short-term/rapid and long-term/gradual deterioration [15]. Short-term/rapid deterioration happens early in the life of the GT engine as it starts its operation, or may be the result of a single event, such as object damage, at any time during the engine's operation. Long-term deterioration, in contrast, develops more gradually due to the ingestion and accumulation of different contaminants and/or high operating temperature.
As shown in Figure 1, these physical faults cause changes in one or more of the performance parameters which describe an individual gas-path component's performance. The performance parameters generally include compressor flow capacity, compressor isentropic efficiency, turbine flow capacity, and turbine isentropic efficiency. Changes in the performance parameters cause consequent changes in the measurement parameters (temperature, pressure, shaft speed, and fuel flow), which are the fault indicators or symptoms in engine health monitoring.

Fouling
Fouling is the adherence of different contaminants (such as sand, dust, dirt, ash, oil droplets, water mists, hydrocarbons and industrial chemicals) on the surface of gas-path components [17,18]. It leads to an increase in surface roughness and a change in airfoil shapes [19]. The end result is performance deterioration. Compressor fouling causes a decrease in flow capacity and isentropic efficiency [20]. However, as shown in Table 1, there is no consensus on the magnitude of the percentage deviation of those parameters. For instance, according to Saravanamuttoo and Lakshminarasimha [21], compressor fouling may result in a 5% loss in flow capacity and a 2.5% loss in isentropic efficiency. Based on site test data, Diakunchak [18] reported a compressor fouling with 5% flow capacity and 1.8% isentropic efficiency reduction. In another study, it has been reported that the change in flow capacity due to compressor fouling is equal to 1.25 times the associated change in efficiency [13]. On the other hand, model simulation results reported by Aretakis et al. [22] showed that flow capacity deviation by 3.1% reduced the isentropic efficiency by 0.906%. However, all studies agreed that fouling influences the flow capacity more than the efficiency.
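The spread among these sources can be made concrete by computing the ratio of flow-capacity loss to efficiency loss implied by each pair of figures quoted above. The following is a minimal sketch in Python; the dictionary keys are simply labels for the cited references, the entry for [13] encodes the stated 1.25 ratio rather than measured values, and the function and variable names are illustrative.

```python
# Reported compressor fouling deviations quoted in the text:
# (flow capacity loss in %, isentropic efficiency loss in %)
reported = {
    "[21]": (5.0, 2.5),
    "[18]": (5.0, 1.8),
    "[13]": (1.25, 1.0),   # stated only as a ratio: flow change = 1.25 x efficiency change
    "[22]": (3.1, 0.906),
}

def flow_to_efficiency_ratio(flow_loss, eff_loss):
    """Ratio of flow-capacity loss to isentropic-efficiency loss."""
    return flow_loss / eff_loss

ratios = {ref: flow_to_efficiency_ratio(*vals) for ref, vals in reported.items()}

for ref, ratio in ratios.items():
    print(f"{ref}: flow/efficiency loss ratio = {ratio:.2f}")

# Check whether every source reports a larger flow-capacity loss than efficiency loss.
all_flow_dominant = all(r > 1.0 for r in ratios.values())
print("Flow capacity dominates in all sources:", all_flow_dominant)
```

The ratios range from 1.25 to roughly 3.4, which shows both the lack of consensus on magnitude and the agreement on direction.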

Table 1. Compressor fouling consequences (ΓC: compressor flow capacity; ηC: compressor isentropic efficiency).

Compressor Fouling Consequences | Ref.
ΓC ↓ by 5%, ηC ↓ by 2.5%, and power output ↓ by 10% | [21,23]
ΓC ↓ by 5%, ηC ↓ by 1.8%, power output ↓ by 7%, and heat rate ↑ by 2.5% | [18]
A 1% reduction in ΓC resulted in a 0.8% ηC reduction | [13]
ΓC ↓ by 3.1% and ηC ↓ by 0.906% | [22]
Power output reduces between 2% (under favorable conditions) and 15 to 20% (under adverse conditions) | [24]
ΓC ↓ by 5%, fuel consumption ↑ by 2.5%, and power output ↓ by 8% | [25]

Compressor fouling is responsible for 70 to 85% of the total performance loss of a GT [18]. According to Diakunchak [18], a 5% flow capacity reduction and a 1.8% isentropic efficiency reduction due to compressor fouling could result in a 7% loss in power output and a 2.5% increase in heat rate. Lakshminarasimha et al. [23] reported that a 5% reduction in mass flow rate and a 2.5% reduction in efficiency due to compressor fouling could result in a 10% reduction in power output, in agreement with the result in [21]. According to Meher-Homji and Bromley [24], compressor fouling could result in a loss of power output as high as 20% under adverse conditions. The automatic engine control system immediately compensates for these changes by increasing the fuel consumption. A 2.5% increase in fuel consumption due to a 5% flow capacity reduction was reported by Zwebek and Pilidis [25]. Compressor fouling could also decrease blade tip clearance [26] and surge margin [27] and increase turbine entry temperature (TET) [28].
Different studies on multistage axial compressor fouling reported that only the first few stages are subjected to fouling, and that the level of fouling is not uniform across stages [29,30]. An experimental study on a 16-stage axial compressor [31] showed that fouling affects the first 5 to 6 stages and that the degree of fouling diminishes from the suction end to the delivery end. A similar study by Aker and Saravanamuttoo [29] revealed that the first 40-50% of the stages of a 16-stage axial compressor are exposed to fouling. Although the first few stages of the axial compressor are subjected to the highest amount of foulant, during compressor washing the deposit moves to the rear stages and accumulates there, thereby influencing the power output [32]. The degree of compressor fouling and the extent of its impact on engine component performance depend on several factors, including the number of stages, surface roughness, airfoil loading, and the nature of the contaminant [33].
Fouling-based performance deterioration can be reversed by compressor washing using water and/or detergents [24]. There are two types of compressor washing, namely, online and offline [34]. The former is performed during operation, while the latter requires the GT to be shut down and cooled. These washing regimes are discussed in detail in [35]. Although the initial fouling deposit does not cause immediate degradation, once it has accumulated, removing the deposit is time-consuming and costly [36]. Online washing is important to minimize the foulant deposit and reduce the frequency of offline washing. Online washing alone cannot completely remove fouling, whereas offline washing can. The frequency of both online and offline washing and the interval between them depend on the operating condition of the engine [37]. The washing process should follow an optimized schedule that takes economic and safety issues into account [38]. This is because frequent washing increases downtime and maintenance cost, and may sometimes lead to premature blade surface erosion. On the other hand, a long interval between washes may result in incomplete performance recovery. Fouling-based performance deterioration is mostly recoverable if offline washing is performed when the reduction in compressor flow capacity reaches about 2-3% [39].

Erosion
Erosion is the gradual loss of material from the surface of gas-path components caused by the ingestion of contaminants such as sand, dust, dirt, ash, carbon particles, and water droplets [40]. Among these causes, sand is the most common because it occurs in most GT application areas. The particulates that cause erosion are usually 20 μm or more in diameter [18]. Erosion can attack all the gas-path components, although the degree of influence is higher for turbines than for compressors. It can result in an overall performance loss of about 5% [41]. Like fouling, performance deterioration due to erosion can be represented by flow capacity and isentropic efficiency changes. Efficiency decreases during both compressor and turbine erosion because of an increase in blade surface roughness and tip clearance and changes in airfoil profile. Flow capacity, in contrast, decreases upon compressor erosion and increases upon turbine erosion [42]. According to Ref. [43], the ratio of the change in flow capacity to the change in efficiency is 2:1. The effect of erosion is smaller for industrial GTs than for aircraft engines due to the presence of a more effective air filtration system [44].

Corrosion
Corrosion is an irreversible deterioration of components as a result of oxidation or chemical interaction with inlet air contaminants (sodium and potassium salts, mineral acids and other chemically reactive elements including sodium, potassium, lead, and vanadium) and combustion gases (for instance sulfur oxides) [45,46]. It can be classified as cold or hot corrosion [47]. Corrosion due to airborne contaminants in combination with water is called cold or wet corrosion and especially affects the compressor airfoils [46]. Hot corrosion occurs due to combustion gases containing certain contaminants and/or molten salts, and especially affects the turbines [48]. Corrosion due to hot gas contaminants is more severe and is highly influenced by the gas temperature [45]. Salt is the main cause of corrosion in both compressor and turbine components [49]. Corrosion decreases compressor flow capacity, compressor isentropic efficiency, and turbine isentropic efficiency, and increases turbine flow capacity [50]. Corrosion effects can be prevented by a proper coating [40].

Foreign Object Damage (FOD)/Domestic Object Damage (DOD)
Gas-path components are subject to damage from foreign objects ingested into the engine (such as birds or other wildlife, stones, frost, snow, ice, and runway gravel) or from domestic objects (broken-off engine parts such as blade sections, or large carbon particles from the fuel nozzles). Foreign object damage (FOD) is one of the most common problems, usually in aircraft engines [13]. The damage from foreign objects varies from a non-recoverable deterioration to a catastrophic failure, as in the case of blade-off or large object ingestion [18]. It appears as a rapid shift in the gas-path measurements. In addition, engine vibration may arise from unbalanced material loss or from aerodynamic excitation caused by blade distortion due to FOD [13]. FOD influences component isentropic efficiency more than flow capacity because of its impact on blade surface roughness and distortion [50]. The magnitude of the loss depends on the type and nature of the FOD/DOD. If the damage causes material loss on the blade surface, the flow capacity will increase; if the foreign object blocks the gas path, the opposite will be experienced [51].

Increase in Blade Tip Clearance
This fault refers to an increase in the clearance between the tips of the moving blades and the casing, or between the tips of the stationary blades and the rotating hub, due to the removal of material caused by particulate ingestion, thermal and centrifugal expansion, and erosion [13,14]. It can also be caused by rotor assembly vibration due to excess speed during the starting cycle [18] or by rubs between the stator assembly and rotor assembly due to thermal and centrifugal expansions [52]. It causes a non-recoverable performance deterioration. The increase in clearance increases leakage and thereby causes performance deterioration [53]. The performance deterioration due to this fault can be represented by efficiency and flow capacity reductions [54]. For example, it has been reported that an increase in tip clearance of 0.8% could result in reductions of up to 3% in flow capacity and 2% in isentropic efficiency [55]. According to Diakunchak [39], a 1% increase in blade tip clearance would lead to over 1% loss in power output and overall efficiency. A 1% to 3.5% increase in blade tip clearance would also cause up to a 15% drop in the stage pressure ratio, as reported by Kurz and Brun [45]. Table 2 summarizes contaminant types and their effects on the physical and thermodynamic characteristics of the gas-path components of GTs.

Fault Diagnostics
There is inconsistency in the literature on the terminology and definition of fault diagnostics. Some of the commonly used terms are fault diagnostics [60,61], fault detection and isolation (FDI) [62,63], fault detection and diagnostics (FDD) [64,65], fault detection, isolation, and identification (FDII) [66], fault detection, isolation and accommodation (FDIA) [67,68], fault detection, isolation and recovery (FDIR) [69] and identification and fault diagnostics [70]. This makes it difficult to understand the goals of the contributions and to compare the different techniques. For example, the definition of the term "isolation" in FDI and FDII differs in some papers. In the former case, it refers to the process of determining the fault type and location followed by estimating its level, whereas in the latter case it does not include the fault level estimation. However, the broader research community, including the military and other industry sectors, defines fault diagnostics as the procedure of detecting, isolating and identifying an impending or incipient failure condition, during which the affected component is still operational, even in a degraded mode [71]. Each element in the fault diagnostic process is further defined as:

• Fault detection: Detecting the presence of an abnormal behavior, which may gradually lead to the failure of the system or part of it.
• Fault isolation: Determining the type and location of the fault(s).
• Fault identification: Estimating the magnitude of the fault(s).

Figure 2 shows the general conceptual model of performance-analysis-based GT fault diagnostics, adapted from [72]. Usually, complete fault diagnostics requires three basic activities: data acquisition, data processing, and diagnostics. Each of these phases is equally significant and critical in the attempt to provide a reliable and practically useful decision support mechanism. Data acquisition is the process of collecting and storing the engine performance data necessary for fault diagnosis. The second step, data processing, involves two basic activities: data screening and analysis. Data screening is the process of filtering outliers and reducing noise, followed by validation, through an appropriate screening technique. This helps to minimize the effect of measurement uncertainties on the fault diagnostic result. Feature extraction starts from establishing a baseline that represents clean-condition operation. Since measurement deviations could be due to load or ambient condition changes, establishing the baseline requires correcting the measurements against these variations so that the deviations due to actual engine faults and sensor problems can be determined. Regardless of the other effects, the measurement deviations due to performance degradation provide relevant information about the nature of the fault signatures in engine gas-path fault diagnostics. Fault diagnosis is the decision-making step in which algorithms are applied to detect, isolate and identify various faults.

Figure 2. General conceptual model of performance-analysis-based GT fault diagnostics (adapted from [72]). It shows the gas turbine gas-path diagnostic steps: ingestion of gas-path degradation causes, performance deterioration, measurement deviation, and fault diagnostics.
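The data screening and baseline-deviation steps described above can be illustrated with a short example. This is a minimal sketch only, assuming a median-absolute-deviation rule for outlier screening and a known clean-engine baseline value at the same corrected operating point; the function names, the cutoff of 3.5, and the sample readings are hypothetical and are not taken from the reviewed literature.

```python
import statistics

def screen_outliers(samples, cutoff=3.5):
    """Drop samples whose modified z-score (based on the median absolute
    deviation) exceeds the cutoff. The cutoff value is an assumption."""
    med = statistics.median(samples)
    mad = statistics.median(abs(s - med) for s in samples)
    if mad == 0:
        return list(samples)
    return [s for s in samples if 0.6745 * abs(s - med) / mad <= cutoff]

def percent_deviation(measured, baseline):
    """Measurement deviation from the clean-engine baseline, in percent."""
    return 100.0 * (measured - baseline) / baseline

# Hypothetical compressor discharge temperature readings (K) with one spike.
readings = [652.1, 651.8, 652.4, 700.0, 652.0, 651.9]
clean = screen_outliers(readings)
baseline_temperature = 648.0  # assumed clean-engine value at the same operating point

delta = percent_deviation(statistics.mean(clean), baseline_temperature)
print(f"kept {len(clean)} of {len(readings)} samples, deviation = {delta:.2f}%")
```

In practice the baseline would come from an engine model or from corrected clean-engine data rather than a fixed constant, so that load and ambient effects are removed before the deviation is interpreted as a fault signature.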
Fault detection is a critical step in the process of fault diagnostics. Trend shift detection and binary decision approaches are the two commonly applied techniques [73]. Detection is performed based on the residuals, that is, the differences between the predicted and observed measurements (Figure 3). Ideally, the residuals should be very close to zero when the engine is clean and deviate noticeably from zero when a fault occurs in the system. In reality, however, due to measurement non-repeatability and model uncertainty, a suitable threshold should be selected to avoid false alarms. Once an appropriate threshold has been selected, all the measurement residuals are expected to lie below the threshold when the engine is running in a clean condition. Conversely, when any kind of abnormal condition occurs, one or more measurement residuals will probably exceed the selected threshold(s). In the case of the binary decision, on the other hand, the residual is considered as a signal which is zero when the system is functioning properly and different from zero when some abnormal behavior is observed. After a successful fault detection process, the location of the fault and its type should be determined. This process may include separating different sensor faults [74], distinguishing sensor faults from actual component faults, and classifying different component faults [62]. As in detection, measurement residuals can be used in the isolation process based on proper threshold selection [75], or the fault isolation problem can be treated as a classification problem, as reported in [61,76,77]. However, the fault detection and isolation activities do not provide quantitative information about the health status of the engine, whereas a maintenance decision requires an understanding of the severity of the deterioration. Usually, a component's isentropic efficiency and flow capacity deviations (health indices) are used to represent the health status of engine gas-path components. The progressive deviations of these parameters can be estimated using the measurement deviations. The available methods are reviewed in the method review section.
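The threshold-based detection logic described above can be sketched as follows, assuming that the threshold for a measurement is set to a multiple of the residual standard deviation observed under clean operation. The factor of three, the function names, and the numerical values are illustrative assumptions rather than a recommendation from the reviewed literature.

```python
import statistics

def select_threshold(clean_residuals, k=3.0):
    """Threshold as k times the standard deviation of residuals recorded
    while the engine is known to be clean (k = 3 is an assumed choice)."""
    return k * statistics.pstdev(clean_residuals)

def detect_fault(observed, predicted, threshold):
    """Binary decision: True when the residual magnitude exceeds the threshold."""
    residual = observed - predicted
    return abs(residual) > threshold

# Hypothetical residuals (observed minus model-predicted value) for a clean engine.
clean_residuals = [0.2, -0.1, 0.0, 0.3, -0.2, 0.1, -0.3, 0.0]
threshold = select_threshold(clean_residuals)

# A small residual inside the noise band, then a residual well above the threshold.
print(detect_fault(observed=652.3, predicted=652.0, threshold=threshold))
print(detect_fault(observed=656.5, predicted=652.0, threshold=threshold))
```

A larger multiplier lowers the false alarm rate at the cost of delaying or missing the detection of small faults, which is the trade-off behind the threshold selection discussed above.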

Challenges of Successful GT Fault Diagnostics
In performance analysis-based engine gas-path diagnostics, there are different factors influencing the attempt to obtain sufficiently accurate and practically useful solutions. The most significant challenges are summarized as follows.
1. Nonlinearity of the diagnostic problem. The relationship between dependent parameters (measurements) and independent parameters (performance parameters) is highly non-linear. The complexity of the nonlinearity increases when two or more components are affected simultaneously and/or sensor and component faults exist together. The proposed diagnostic system should thus be capable of dealing with the non-linear nature of the engine behavior.

2. Measurement uncertainty. In reality, the data obtained from real engine operation cannot be error-free [78]. This error may come from the sensor itself (due to improper installation, miscalibration or malfunctioning), the operating environment, or the operator. Measurement uncertainties provide incorrect information about the nature of the fault signatures, thereby causing misinterpretation during engine health assessment. Noise and bias are the two categories of measurement uncertainty [79]. Noise is the non-repeatability of a measurement due to the engine's harsh operating environment, whereas bias refers to a sensor fault, defined by the National Bureau of Standards (NBS) as the difference between the average measurement and the actual value [78]. Bias is a fixed error (it can be higher or lower than the actual value) that usually occurs as a result of a flaw in the sensor itself. Sometimes, these uncertainties may reach a level comparable to the actual measurement deviations caused by component deterioration. If this effect is ignored during the diagnostic method development, the solution will be unrealistic. Moreover, engine fault diagnosis using uncertain measurements may give an erroneous result, particularly in MB methods. Therefore, either the sensor problem should be treated and corrected prior to the component fault diagnosis, or the component fault diagnostic technique should tolerate these effects.

3. Availability of limited sensors. GT engines are equipped with different sensors for different purposes such as process control, health monitoring, and diagnostics. Measurement parameters which are essential for engine performance analysis are known as standard measurements [80]. These include, for instance, pressure, temperature, fuel flow rate, and spool speed. The deviations of these measurements provide relevant information about the nature and severity of components' performance deterioration. Careful measurement selection is crucial for effective fault diagnostics, especially in the case of MB methods. On the one hand, an accurate gas-path analysis requires a large number of measurements since the engine model is developed based on several instrumentation suites. In order to satisfy the requirement for a determinate equation, the number of measurements (the dependent parameters) has to be at least equal to the number of performance parameters (the independent parameters). On the other hand, in real engine service, the number of instruments available is limited due to weight and bulk issues (particularly in aircraft and marine applications), sensor noise and bias problems, the need to reduce sensor installation and maintenance costs, and the absence of gas generator turbine inlet sensors (since they cannot withstand the very high operating temperature) [81,82]. It is also impractical to measure the air flow rate due to the absence of the technology. Therefore, the diagnostic system is expected to provide the required solution using the limited information obtained from a minimum set of measurements.

4. Simultaneous faults. The performance of a gas-path fault diagnostics scheme is highly influenced by the number of simultaneous faults [83]. This is because, when two or more components/sensors are affected together, there is a chance of producing similar or obscure fault signatures, thereby masking or compensating for each other's effects. For example, in the case of double component faults (DCFs), when one of the components is lightly affected, the combined effect may produce a pattern that can be confused with that of a single component fault (SCF). Likewise, if both components are severely affected, they may produce a pattern similar to that of a triple component fault (TCF), and as a result, a DCF may wrongly be classified as a TCF or vice versa [83]. In general, concurrent component faults, concurrent sensor faults, or concurrent sensor and component faults may all occur as multiple fault scenarios during the engine lifetime.

5. Operating condition variations. Due to load and/or ambient condition variations, the engine operating point may not be fixed. Therefore, operating point changes should be taken into account for practicability. A common way to avoid the influence of operating condition variations is to form a "baseline" model, compute measurement deviations, and use them as network inputs instead of the measurements themselves. Usually, this requires a model of the normal state to determine the "baseline" [74,84]. Different GTs have different baselines based on their configuration and application environment. Hence, for a reliable fault diagnosis, accurate baseline establishment is critical.

6. Lack of standards in defining and representing fault diagnostic problems [85]. In the literature, there is no consistency in defining and representing GT fault diagnostic problems. The majority of the available methods in the open domain were developed on different platforms with different levels of complexity and were assessed with different performance evaluation metrics. This inconsistency causes difficulties in exchanging diagnostic ideas, in fusing information between the fault diagnostic results of different engine systems, and in making a one-to-one comparison of different techniques.

7. Unavailability of data in the required type, quality and quantity. Fault diagnostic method developers require relevant and reliable operational data, which can sufficiently represent the healthy and unhealthy engine conditions, to demonstrate and verify new algorithms. However, because of the very limited access to engine operational data (owing to proprietary and liability issues) and the lack of deteriorated engine data due to frequent washing actions, it is difficult to obtain the required data [81]. Performance data can be generated either by intentionally ingesting different physical fault causes/contaminants into the operating GT or by implanting artificial fault patterns into the engine performance model [86]. The former alternative is not recommended since it is not technically and economically feasible. The latter, which is the most widely used alternative in this field, requires an accurate model.

8. Absence of diagnostic method validation techniques. GT users need a practical tool to evaluate the performance and effectiveness of a newly proposed algorithm before incorporating it into their plant. Up to now, there are no standards to effectively evaluate the technical and economic feasibility of new algorithms [81]. The general procedures used by the research community so far will be presented later in this paper.
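The measurement-uncertainty and limited-sensor challenges above can be illustrated with a linearized sketch in which the measurement deviations Δz are related to the performance-parameter deviations Δx through an influence coefficient matrix H, so that Δz = H·Δx. This linearization is a simplifying assumption made only for the example; as the first challenge notes, the true relationship is nonlinear. The matrix entries, the fault magnitudes, and the sensor bias are hypothetical, chosen to show that two measurements suffice for two unknowns and that an uncorrected bias propagates directly into the estimated fault.

```python
def solve_2x2(h, dz):
    """Solve H * dx = dz for a 2x2 system by Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    if abs(det) < 1e-12:
        raise ValueError("H is singular: the measurements cannot separate the faults")
    dx0 = (dz[0] * h[1][1] - h[0][1] * dz[1]) / det
    dx1 = (h[0][0] * dz[1] - dz[0] * h[1][0]) / det
    return [dx0, dx1]

def predict(h, dx):
    """Forward model: measurement deviations caused by parameter deviations."""
    return [sum(h_ij * x_j for h_ij, x_j in zip(row, dx)) for row in h]

# Hypothetical influence coefficients (% measurement change per % parameter change).
# Rows: two measurements; columns: compressor flow capacity, compressor efficiency.
H = [[0.8, 0.3],
     [0.2, -0.9]]

true_dx = [-3.0, -1.0]      # assumed fouling: 3% flow capacity loss, 1% efficiency loss
dz = predict(H, true_dx)    # noise-free measurement deviations

estimate_clean = solve_2x2(H, dz)

# Add a fixed 0.5% bias to the second sensor and estimate again.
dz_biased = [dz[0], dz[1] + 0.5]
estimate_biased = solve_2x2(H, dz_biased)

print("estimate without bias:", [round(v, 3) for v in estimate_clean])
print("estimate with bias:   ", [round(v, 3) for v in estimate_biased])
```

With fewer measurements than unknowns the system would be underdetermined, and with a singular H the two fault signatures would be indistinguishable, which is why measurement selection matters for MB methods.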

Desirable Attributes of a Fault Diagnostic System
According to previous studies on machinery health monitoring and diagnostics, including GTs [87,88,89], an effective fault diagnostic system is ideally expected to have the following characteristics. These desirable attributes could also be used as selection criteria or as standards for comparing various diagnostic approaches.

i. Fault diagnostic accuracy: For a correct maintenance decision, the fault diagnostic technique should be able to detect, isolate and identify gas-path faults successfully. A fault detection task can commit two types of errors: false alarms and missed detections. Both detection errors are equally harmful. A false alarm leads to an increased maintenance cost, which is the opposite of the aim of fault diagnostics. Conversely, a missed detection may cause a significant performance loss or even system/component failure. Hence, in the detection step, the so-called normal class has to be distinguished from the abnormal class with reasonably acceptable accuracy. This is very important to avoid unnecessary or unexpected downtime and enhance reliability. In addition to fault detection, the diagnostic system should successfully determine the fault type and location. In particular, a GT fault isolation algorithm is expected to separate sensor faults from actual engine component faults, followed by classification of the different component faults. All the possible single and multiple sensor and/or component fault cases are required to be isolated correctly using the minimum instrumentation suite. For a final maintenance decision, an accurate fault-level estimation is highly desirable so that the operator can plan a strategic schedule of possible maintenance actions.

ii. Robustness: For practical implementation, diagnostic systems are required to be robust/tolerant against measurement uncertainties.

iii. Explanation facility: To support engine users in the maintenance decision process, the fault diagnostic tool is required to be able to explain the nature of the faults (i.e., their root cause, current situation, and propagation) and to justify its recommendations.

iv. Simplicity/user-friendliness: The method should be simple to use and easy for operators to understand so that an urgent decision can be made without the presence of an expert. It should thus be capable of providing a user-friendly interface.

v. Adaptability: GT performance is sensitive to ambient condition changes and load variations. Therefore, a performance-based GT fault diagnosis system should be able to adapt to those variations so as to maintain its performance.

vi. Memory and computational requirements: The storage capacity and computational requirements (computational speed, time, and complexity) are the two basic features of a GT fault diagnosis algorithm, particularly for online applications.

vii. Reliability: This concerns the practicability of the method for an engine with a limited number of sensors and with measurement errors. The method should also be simple and cost-effective, with minimum downtime for repair and maintenance.

viii. Comprehensiveness: This is the measure of the ability of the method to incorporate improvements when necessary and to be interfaced with other engine health management systems through data fusion in order to obtain a complete condition-based maintenance framework.

ix. Flexibility: This measures how readily the method's configuration can be optimized and the system adapted or extended to work on different engines or on the same engine running at different operating conditions. A low set-up time is desirable for this feature.

State-of-the-Art: GT Gas-Path Diagnostic Methods
In the field of GT diagnostics, several methods have been devised by engine manufacturers and the research community over the years [90]. As shown in Table 3, different authors have categorized these methods into different groups. In the present review, based on the type of information used in modeling, the available methods are categorized into two main groups: MB and AI-based. Accordingly, a state-of-the-art review of the gas-path diagnostic methods under each group has been undertaken. Their working principles, applications for gas-path diagnostics, capability of addressing the challenges (Section 3.1) and fulfilling the desirable attributes (Section 3.2), and their advantages and limitations are reviewed and summarized.

Model-Based Diagnostic Methods
MB diagnostic methods are the first-generation GT CBM methods, and they rely on a thermodynamic model of the engine. In this approach, the relationship between the gas-path measurements and the performance parameters is determined by explicit mathematical and thermodynamic equations. Gas-path analysis (GPA) and the Kalman filter (KF) are the two most intensively investigated MB methods [91]. Engine manufacturers and military sectors have been using these methods for the past four decades [95].

Gas-Path Analysis
GPA is a mathematical procedure that is used to diagnose gas-path components based on measurement deviations. In this strategy, the diagnostic problem requires the search for a best match between the measurement changes and the associated performance parameter changes that cause them. According to [96,97], the thermodynamic relationship between gas-path measurements and component performance parameters can be expressed as:
$$\vec{Z} = h(\vec{X}, \vec{w})$$
where $\vec{Z} \in R^{M}$ is the measurable parameter vector and M is the number of measurement parameters, $\vec{X} \in R^{N}$ is the component performance parameter vector and N is the number of performance parameters, $\vec{w}$ is the ambient condition and power setting parameter vector (called the input vector), and $h(\cdot)$ is a vector-valued function, usually non-linear, determining the relationship between the dependent and independent parameters.
Linear GPA (LGPA)
LGPA was first introduced by Urban [96] under the assumption of a steady-state process with no ambient condition or load variations and negligible measurement uncertainty effects (Equation (3)). The relationship between the dependent and independent parameter changes was assumed to be linear. Mathematically, it can be expressed as:
$$\Delta\vec{Z} = H\,\Delta\vec{X} \qquad (3)$$
where $\Delta\vec{Z}$ is the vector of measurement deltas, $H$ is the so-called influence coefficient matrix (ICM), and $\Delta\vec{X}$ is the vector of performance parameter deltas.
The estimation of $\Delta\vec{X}$ is a reverse process performed using the inverse of the linear ICM, which is referred to as the fault coefficient matrix (FCM), as given in Equation (4):
$$\Delta\vec{X} = H^{-1}\,\Delta\vec{Z} \qquad (4)$$
The relationship between the ICM and the FCM in matrix form can be presented as:
$$\mathrm{FCM} = \mathrm{ICM}^{-1} = H^{-1}$$
Based on the number of dependent and independent parameters, the estimation of FCM will have three different cases [81].
• Case 1 (M = N): When the number of measurements equals the number of performance parameters, the numbers of unknowns and equations are equal, and the problem is determinable. In this case, the ICM is a square matrix and, provided it is non-singular, invertible.
• Case 2 (M > N): When the number of measurements is greater than the number of performance parameters to be estimated, the problem is over-determined. In this case, the solution can be found by least-squares estimation, replacing $H^{-1}$ with the so-called pseudo-inverse.
• Case 3 (M < N): In real GT operation, neglecting the effect of sensor noise and bias leads to an unrealistic solution. Conversely, considering all these issues, including model uncertainty, results in an underdetermined set of equations. A suitable solution for this scenario is given by Volponi [81].
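The three cases can be illustrated with a minimal NumPy sketch. The ICM values and fault deltas below are hypothetical numbers chosen only for illustration, and the function name `lgpa_estimate` is an assumption, not part of any cited tool. For M < N the sketch falls back on the minimum-norm pseudo-inverse solution, which is only a stand-in and not the approach of Volponi [81].

```python
import numpy as np

def lgpa_estimate(H, dz):
    """Estimate performance deltas dx from measurement deltas dz = H @ dx."""
    H = np.asarray(H, dtype=float)
    dz = np.asarray(dz, dtype=float)
    m, n = H.shape
    if m == n:
        # Case 1: square ICM; solving the system is equivalent to applying FCM = H^-1
        return np.linalg.solve(H, dz)
    # Case 2 (M > N): least-squares solution via the pseudo-inverse.
    # Case 3 (M < N): the pseudo-inverse returns the minimum-norm solution
    # (an illustrative stand-in only).
    return np.linalg.pinv(H) @ dz

# Hypothetical ICM: 3 measurements (rows) x 2 performance parameters (columns)
H = np.array([[1.0, 0.5],
              [0.2, 1.5],
              [0.8, -0.4]])
dx_true = np.array([-1.0, -2.0])   # assumed percentage deltas of two health parameters
dz = H @ dx_true                   # noise-free measurement deltas
dx_hat = lgpa_estimate(H, dz)
print(dx_hat)
```

With noise-free, consistent measurement deltas the over-determined estimate recovers the assumed fault exactly; adding noise to `dz` shows how the least-squares solution degrades.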
After Urban, LGPA has been studied by several researchers, such as those in [41,98,99,100]. During the early days of gas-path diagnostics, it was used by engine manufacturers like Rolls-Royce [101]. It has been shown that for deviation values higher than 1%, the LGPA provides an unreliable solution [102]. The reliability of this method is highly influenced by the accuracy of the ICM, the level of noise and bias, and the instrumentation suite considered [91].
In a real GT engine, the assumption of a linear relationship between measurements and performance parameters becomes increasingly unrealistic, especially when the component deterioration level exceeds the value assumed for LGPA and/or as the number of gas-path faults increases [9]. The non-linear GPA (NLGPA) scheme is capable of handling the nonlinearity of the engine behavior. The thermodynamic relationship between the dependent and independent parameters for non-linear engine behavior is given as Equation (6) [81]:
$$\Delta\vec{Z} = H\,\Delta\vec{X} \qquad (6)$$
where:
• $\Delta\vec{Z}$ is the vector of measurement deltas, expressed as percentage deviations from the baseline values;
• $\Delta\vec{X}$ is the performance parameter delta vector, expressed in the same way;
• $H$ is the ICM, which determines the relationship between $\Delta\vec{Z}$ and $\Delta\vec{X}$. Each element is the percentage delta in a measurement parameter for the corresponding percentage change in a performance parameter. For an infinitesimal change in the independent parameters, the corresponding ICM is the Jacobian.
Then, the corresponding performance change can be computed using the equation:
$$\Delta\vec{X} = H^{-1}\,\Delta\vec{Z}$$
To account for the non-linear behavior of the engine, an iterative Newton-Raphson method can be applied to the LGPA until the solution converges [99]. This is done by minimizing the error objective function (Equation (8)), which is the difference between the predicted measurement vector ($\hat{\vec{Z}}$) and the actual measurement vector ($\vec{Z}$). For the first iteration, a small delta on the component performance is introduced and the corresponding ICM is generated. The FCM is then determined by inverting the ICM. The performance parameter deviation vector is computed by multiplying the FCM by the measurement deviations of the deteriorated engine. From the calculated results, a new ICM and FCM are generated, and the procedure is repeated until the solution converges. The output of each iteration serves as the baseline for the next one.
The convergence of the solution can be evaluated using the error root mean square (RMS) value as given in Equation (9) [103]. When the RMS value reaches the target value, the iteration is terminated. The iterative procedure is illustrated in Figure 4.
Figure 4. Schematic illustration of Newton-Raphson based on gas path analysis (GPA) methods (adapted from [104]).
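The iterative procedure can be sketched as follows. This is a minimal illustration under stated assumptions: `engine_model` is a hypothetical two-parameter nonlinear function standing in for a real thermodynamic engine model, the ICM is obtained by finite differences, absolute rather than percentage deltas are used, and the RMS of the measurement residual serves as the stopping criterion.

```python
import numpy as np

def engine_model(x):
    """Hypothetical nonlinear map from health parameters x to measurements z."""
    x1, x2 = x
    return np.array([x1 * x2 + 0.5 * x1**2,
                     np.exp(0.3 * x1) + x2**2])

def numerical_icm(h, x, step=1e-6):
    """ICM (Jacobian of h at x) approximated by forward finite differences."""
    z0 = h(x)
    H = np.empty((z0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += step
        H[:, j] = (h(xp) - z0) / step
    return H

def nlgpa(h, z_meas, x0, tol=1e-8, max_iter=50):
    """Newton-Raphson GPA: relinearize around the latest estimate until the RMS error is small."""
    x = np.asarray(x0, dtype=float).copy()
    for iteration in range(max_iter):
        residual = z_meas - h(x)                  # actual minus predicted measurements
        rms = np.sqrt(np.mean(residual**2))
        if rms < tol:
            return x, iteration, rms
        H = numerical_icm(h, x)                   # new ICM at the current baseline
        x = x + np.linalg.solve(H, residual)      # apply the FCM (H^-1) to the residual
    rms = np.sqrt(np.mean((z_meas - h(x))**2))
    return x, max_iter, rms

x_clean = np.array([1.0, 1.0])      # assumed healthy baseline
x_faulty = np.array([0.97, 0.95])   # assumed degraded condition
z_meas = engine_model(x_faulty)     # noise-free "measured" values
x_hat, n_iter, rms = nlgpa(engine_model, z_meas, x_clean)
print(x_hat, n_iter, rms)
```

Each pass uses the previous estimate as the new baseline, which mirrors the description above; with measurement noise the RMS target would have to be set above the noise floor.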
The NLGPA approach was introduced by Escher [99]. Since then, several improved diagnostic algorithms have been contributed by other authors [9]. Its effectiveness is highly influenced by the number and location of measurements on the gas path. Ogaji et al. [86] used this approach to investigate the effect of measurement selection on engine fault diagnostic accuracy and suggested the best measurement sets corresponding to different fault scenarios. Recently, Li [105] developed a novel GT performance and health status estimation method for a single-shaft aero turbojet engine using adaptive GPA. He used nine gas-path measurements to assess five performance parameters. The test results showed that the proposed method is capable of identifying gas-path faults accurately even in the presence of measurement noise. The diagnostic effectiveness of three different GPA methods has been investigated by Stamatis [106] using different test fault cases for a double-shaft GT engine. Similarly, the fault diagnostic effectiveness of GPA and AI approaches has been compared, and their pros and cons identified, by Kong [93] based on case studies. Larsson [107] developed a systematic design procedure to construct a non-linear MB fault diagnosis method for industrial GTs. In another study, Jasmani et al. [80] devised a new measurement parameter selection scheme by combining an analytical approach with the measurement subset concept. Likewise, Chen et al. [108] proposed an approach that can select the optimal number of engine measurements for engine GPA purposes. However, GPA techniques can diagnose GT faults if, and only if, noise and bias do not exist [93].

The Kalman Filter
KF is an MB iterative (recursive) algorithm that uses a set of equations and consecutive data inputs to estimate the true value of the system parameter being measured when the measured values contain a certain amount of uncertainty. It was initially developed by Rudolf Kalman [109] in 1960 and is basically a predictor-corrector technique by which the state of a system at time $t_k$ is determined using only the state estimate at the previous time step $t_{k-1}$ and the measurement at $t_k$. The discrete time KF [109] and the continuous time KF [110] are the two types of KF algorithms [111]. The complete KF procedure is composed of two phases: the prediction phase and the correction (or measurement update) phase. In the prediction phase, the KF produces estimates of the current state variables, along with their uncertainties. Once the outcome of the next measurement is observed, in the correction phase, these estimates are updated using a weighted average, with more weight being given to estimates with higher certainty. Figure 5 represents the block diagram of the discrete time KF method. The problem is defined mathematically as follows:
System equation:
$$X_k = \Phi X_{k-1} + G u_{k-1} + w_{k-1}$$
Measurement equation:
$$Z_k = H_k X_k + v_k$$
where $X \in R^{N}$ is the system state vector, k is the time index, $\Phi \in R^{N \times N}$ is the state transition matrix, $u \in R^{M}$ is the control vector, G is the input translation matrix, $w_k$ is the system error (process noise) vector, $Z \in R^{M}$ is the measurement vector at time k, $v_k \in R^{M}$ is the measurement error (noise) vector, and $H_k \in R^{M \times N}$ is the measurement (model) matrix.
The aim of the KF is to estimate the system state X based on prior system knowledge and the available noisy measurements, as a linear combination of all observations up to time k. The following assumptions should be satisfied:
o The system noise and the measurement noise have zero mean:
$$E[w_k] = 0 \qquad (14)$$
$$E[v_k] = 0$$
where $E[\cdot]$ represents the expectation operator.
o The initial system state, system noise, and measurement noise are uncorrelated.
o The system noise and measurement noise are white, independent, and Gaussian distributed with known covariance matrices.
According to [100,113], a complete discrete KF scheme to solve this problem consists of the following five equations:
1. State estimate extrapolation:
$$\hat{X}_k^{-} = \Phi \hat{X}_{k-1} + G u_{k-1} \qquad (18)$$
2. Covariance of the estimation error (state covariance extrapolation):
$$P_k^{-} = \Phi P_{k-1} \Phi^{T} + Q \qquad (19)$$
3. Kalman gain (KG) computation:
$$K_k = P_k^{-} H_k^{T} \left(H_k P_k^{-} H_k^{T} + R\right)^{-1} \qquad (20)$$
4. State estimate update:
$$\hat{X}_k = \hat{X}_k^{-} + K_k \left(Z_k - H_k \hat{X}_k^{-}\right) \qquad (21)$$
5. Error covariance update:
$$P_k = \left(I - K_k H_k\right) P_k^{-} \qquad (22)$$
Here, $P$ is the covariance of the estimation error, $Q$ and $R$ are the covariance matrices of the system noise and the measurement noise, respectively, and the superscript "−" denotes a predicted (a priori) quantity.
Equations (18) and (19) represent the prediction part of the algorithm, while Equations (20)-(22) represent the correction part. The prediction part simply consists of the dynamic model, which predicts the state of the system at the next time step from the most recent estimate. The correction part takes the error between the actual measurement and the predicted output and uses it to correct the state estimate, giving the best estimate of the system state $X_k$ based on the observations available at time k. The mixture of prediction and correction is determined by the Kalman gain. In the scalar case, it is the ratio of the error in the estimate to the sum of the errors in the estimate and in the measurement. The gain determines the extent to which the filter follows the model or the measurement, i.e., the relative weight given to the new measured value and to the previous estimate when forming the new estimate. The overall result is the best guess of the parameter to be determined, obtained by combining these two different sources of information. Once the current estimate is determined, the error in the estimate is also updated for use in the next time step.
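The predict/correct cycle described above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard discrete-time KF recursion, not the implementation used in any of the cited diagnostic tools; the function name `kf_step`, the omission of the control input, and all numerical values (a constant scalar health-parameter delta of −2% observed through noise) are assumptions.

```python
import numpy as np

def kf_step(x_est, P, z, Phi, H, Q, R):
    """One predict/correct cycle of the discrete-time KF (control input omitted)."""
    # Prediction: state and covariance extrapolation
    x_pred = Phi @ x_est
    P_pred = Phi @ P @ Phi.T + Q
    # Correction: Kalman gain, state update, covariance update
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)         # weight the innovation by the gain
    P_new = (np.eye(x_est.size) - K @ H) @ P_pred
    return x_new, P_new, K

rng = np.random.default_rng(0)
true_delta = -2.0                  # assumed constant health-parameter delta (%)
Phi = np.eye(1)                    # the state is modeled as constant
H = np.eye(1)                      # the state is measured directly
Q = np.array([[1e-6]])             # assumed small process noise covariance
R = np.array([[0.25]])             # measurement noise variance (std = 0.5)

x_est = np.zeros(1)                # initial guess: no degradation
P = np.eye(1)                      # initial uncertainty
for _ in range(200):
    z = np.array([true_delta]) + rng.normal(0.0, 0.5, size=1)
    x_est, P, K = kf_step(x_est, P, z, Phi, H, Q, R)

print(x_est, P, K)
```

As measurements accumulate, the error covariance and the gain shrink, so the filter trusts its own estimate more and each new noisy measurement less. The choice of `Q` and `R` here is arbitrary, which reflects the tuning difficulty discussed for KF methods.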
KF methods were introduced as a fault isolation and assessment technique in the late 1970s, and the overall architecture is shown in Figure 6 [114]. They were brought into practice with the aim of overcoming the two main GPA limitations: poor robustness against measurement uncertainties and the underdetermined problem caused by the limited number of measurements. The success attained in these early programs encouraged the use of these techniques in subsequent years [97,100,115]. The linear KF (LKF) has reliability limitations on non-linear gas-path diagnostic problems. However, modified versions of this method, i.e., nonlinear KFs (NLKF) such as the extended KF (EKF) and the iterated EKF (IEKF), can address the problem by linearizing the model about the current mean and covariance using a Taylor series expansion [116,117]. The well-known engine manufacturers (General Electric, Pratt & Whitney, and Rolls-Royce) have been utilizing modified KF-based fault diagnostic methods since 1987 [100]. The KF is also integrated into currently available GT gas-path diagnostic tools such as Auto Analysis, MAPIII, TEAMIII, a self-tuning onboard real-time model (STORM), a state variable engine model (SVM), GEM, COMPASS, an engine health management (EHM) and ADEM [9,111]. KF-based fault diagnostic techniques are effective for engine problems where performance influence coefficients are available as the model [111]. However, those methods have reliability limitations. Most of the MB techniques that cope relatively well with measurement noise and bias were developed using this technique [118]. The potential of the KF for single gas-path component fault isolation was evaluated by Volponi et al. [113]. Multiple KF models were used for sensor and actuator fault detection and isolation, together with component fault detection, in an aircraft engine by Takahisa et al. [119].
The effectiveness of the KF for sensor selection in reliable engine performance diagnostics was also investigated by Simon and Rinehart [120] in comparison with a maximum a posteriori (MAP) estimator. They considered a linear engine model affected by single component faults and sensor biases. The fault detection and classification performance of the method using seven, eight, and nine sensors associated with eight health parameters was tested. Borguet et al. [121] attempted to deal with one of the difficulties of MB methods, i.e., the existence of model biases, using simulated transient data. A modular KF-based single and double fault FDI algorithm was proposed by Meskin et al. [122] for a jet engine application. Recently, the sensor FDII performance of a multiple hybrid KF-based system was investigated by Pourbabaee et al. [66]. In this method, a nonlinear mathematical model of the system and multiple piecewise linear (PWL) models are combined to accomplish the sensor FDI task, followed by estimation of the fault level using a modified generalized likelihood ratio (GLR) method. The capability of the EKF to solve underdetermined engine diagnostic problems was also evaluated by Lu et al. [123]. They compared the performance of three different EKF estimators: basic EKF, underdetermined EKF, and resultant EKF. The test results indicated that the method was able to solve the underdetermined problem with better accuracy and robustness than the conventional linear KF scheme.

Advantages and Limitations of MB Methods
Every method has its own advantages and limitations. Table 4 describes the advantages and limitations of the GPA and KF methods related to GT fault diagnostics. In general, MB methods have more advantages in terms of early fault detection and online fault diagnostics. They can also perform both quantitative and qualitative fault diagnosis with adequately good accuracy. Moreover, they apply the real gas-path physics and have low model complexity and computational time. Nevertheless, they suffer from model uncertainties, measurement noise, and sensor bias (even if this problem is partially addressed by KFs), and from smearing effects, which may lead to misinterpretation and false alarms. They require a large number of measurements on the gas path to provide accurate diagnostic solutions. Installing additional sensors is almost impossible due to the reasons mentioned in Section 3.1. Besides, since very limited information is available in the public domain due to proprietary issues, an accurate GT performance model is very difficult to obtain.

Table 4. Advantages and limitations of GPA and KF methods for GT fault diagnostics.

GPA
Advantages:
- It is a cost-effective method with good early fault detection capability [11].
- It has high computational speed [91].
Limitations:
- The diagnostic accuracy relies on the accuracy of the engine performance model [62].

KF
Advantages:
- It can provide sufficiently accurate estimation results for linear problems.
- It has relatively low computational complexity.
- It has low storage and computing requirements.
- Since it is model-based, the physical knowledge of the gas-path system can be applied to solve the diagnostic problem.
- Unlike GPA, it accounts for measurement uncertainty. The actual sensor noise can be represented by a white Gaussian distribution, as required by the KF. KFs are good at outlier reduction and noise minimization [111].
- It copes with sensor noise and bias [9].
- Unlike GPA, it has the potential to solve underdetermined diagnostic problems [123].
Limitations:
- Even the extended KF (EKF) based methods can only handle problems with a limited amount of nonlinearity. The estimates for nonlinear diagnostic problems are often biased and suboptimal [124].
- Prior knowledge and tuning: The effectiveness of the KF is affected by the unknown performance deterioration and measurement noise covariance matrices. Choosing the appropriate covariance matrices (called tuning) for optimized KF performance, based on prior knowledge, is a largely trial-and-error and challenging task [125].
- "Smearing" effect: Although in practice, most of the time, only a limited number of gas-path components and sensors are affected, the KF tends to spread ("smear") the faults over the performance parameters and measurement parameters of multiple components. An attempt to estimate all component faults and sensor faults together using the available measurements results in a highly underdetermined problem [95].
- Solution convergence problems due to model uncertainty and large sensor noise [126].

AI based Methods
The drawbacks of the MB methods have pushed the research community to focus on AI methods. According to Konar [127], AI is defined as "the simulation of human intelligence on a machine, in order to make the machine efficient to identify and use the right piece of 'Knowledge' at a given step of solving a problem". There are many different AI methods such as ANN, DL, BBN, ES, FL, and GA. The most powerful and popular fault diagnostic algorithms belong to the AI family of methods [12]. Demonstrating and validating AI-based algorithms requires operational data of appropriate quality, quantity, and type or, in the absence of operational data, model simulation data. Figure 7 illustrates the conceptual framework of AI-based engine fault diagnostics. It has two parts: developing the diagnostic mechanism and implementing it. Developing the method includes acquiring and preprocessing the required data, training the algorithm using the processed data, and evaluating its performance by applying an appropriate evaluation approach, for instance, using blind test case data as proposed by Simon [128]. The potential of AI methods for GT FDII has been widely studied over the past several years. A comprehensive survey of these methods, including their strengths and weaknesses, is presented hereafter.

Artificial Neural Networks
An ANN is an artificial structure that processes information in a way loosely modeled on biological neurons, although the paradigm itself is mathematical. Its model contains sets of neurons connected to each other in layers. Knowledge is acquired from input information (examples) through a learning process, and the weights of the connections between neurons are used to store the acquired knowledge. There are three popular ANN learning paradigms, namely supervised, unsupervised, and self-supervised [125]. If training takes place using input and output examples, it is called supervised; if it is performed using information derived from the input data only, it is called unsupervised; and if the same data are used as both the input and the target output, the learning is called self-supervised. There are various ANN algorithms in the literature, such as the multilayer perceptron (MLP), autoassociative neural network (AANN), radial basis function network (RBFN), probabilistic neural network (PNN), and self-organizing map (SOM). These methods have been applied to solve different engineering problems, including prediction, pattern recognition/classification, and clustering [129]. They are capable of providing efficient and reliable models if a sufficient amount of data is available [90]. The ANN is a powerful tool in GT modeling for performance prediction and diagnostics due to its capability to handle the nonlinearity of the engine behavior [130]. This is done without the need for the complex thermodynamic equations that relate the dependent and independent parameters.

Multilayer Perceptron
An MLP network is a feed-forward neural network consisting of an input layer and an output layer with one or more hidden layers in between [131]. It is called feed-forward because information from the input neurons is passed to the neurons of the next layer, which compute an output based on the logistic equation and pass it to the following layer, and so on until the output layer. The input layer receives the input data, while the output layer, located at the end, computes the output value using the information coming from the hidden layers. The optimal number of hidden layers and neurons is determined based on a convergence criterion and the characteristics of the input-output mapping relationship.
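The layer-by-layer propagation can be sketched as a short forward pass. This is an illustrative, untrained network: the layer sizes, the random weights, and the input values are assumptions, and the training step (e.g., backpropagation) is deliberately left out.

```python
import numpy as np

def logistic(a):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, weights, biases):
    """Propagate an input vector through every layer of a feed-forward MLP."""
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights, biases):
        a = logistic(W @ a + b)    # weighted sum of the previous layer, then activation
    return a

rng = np.random.default_rng(0)
layer_sizes = [4, 6, 3]            # assumed: 4 measurement deltas, 6 hidden neurons, 3 outputs
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

x = np.array([0.1, -0.2, 0.05, 0.3])   # hypothetical normalized measurement deltas
y = mlp_forward(x, weights, biases)
print(y)
```

In a diagnostic setting, the weights would be fitted on labeled fault data so that each output neuron indicates one fault class or one health-parameter estimate.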
Over the past decades, several studies on GT diagnostics based on MLPs have been carried out [113]. An ANN-based, user-friendly GT fault identification system was provided by Kong et al. [132]. A multiple fault detection system trained on simulation data was developed by Matuck et al. [133] using this approach. They considered single, double, and triple component faults together with sensor noise. However, this work was limited to fault detection only. Fast et al. [90] proposed a GT fault diagnostic scheme using an MLP in order to optimize the compressor washing schedule. They also indicated that an ANN is a suitable approach for developing performance prediction and fault diagnostic techniques if operational data are available in the required quality and quantity. To answer the question of why neural networks are more popular than the other AI methods, Patan et al. [134] studied two different feed-forward MLP algorithms, taking into account the nonlinear behavior of the GT together with modelling uncertainty. It was pointed out that those fault diagnostic algorithms had better early detection ability with fewer false alarms, a higher fault classification rate, and more efficient fault identification than the other AI techniques. Recently, Tayarani-Bathaie et al. [135], Mohammadi et al. [136], Kiakojoori and Khorasani [137], and Vanini et al. [62] proposed dynamic neural network (DNN) fault diagnostic techniques for aircraft engine applications. More recently, an ensemble GT fault diagnosis system was devised by Amozegar and Khorasani [138] using different types of MLP networks. Nested MLP networks were also used for a fault detection and isolation application by Tahan et al. [139]. However, these methods are limited to single and double faults only.
Moreover, they used efficiency and flow capacity deltas separately as a single component fault, or in a pair as a double component fault, although different studies on GT performance degradation, such as [19,21], indicated that deterioration is most significantly represented by changes in these parameters together.

Autoassociative Neural Networks
AANNs, also known as auto-encoder, bottleneck, replicator, or sand-glass type networks [140], are an important family of ANNs with three hidden layers: a mapping layer, a bottleneck layer, and a de-mapping layer (Figure 8). They can reduce high-dimensional data to a lower dimension with insignificant information loss, based on the concept of principal component analysis (PCA). Usually, the input and output layers contain equal numbers of neurons. The bottleneck layer is the middle layer with the smallest number of neurons, where the important features of the input data are captured. The number of neurons in each hidden layer depends on the problem type and system complexity. More detailed information in this regard is available in [141]. Compression and decompression are the two sub-networks of the general AANN structure: the former is used to compress high-dimensional input data into low-dimensional features, and the latter tries to reconstruct the original data from the compressed version with minimum information loss. AANNs are widely used and very suitable for sensor data validation applications. Kramer [142] introduced an AANN-based technique for sensor validation that is capable of coping with the nonlinearity of the data. He used the network residuals to detect and estimate sensor faults. An AANN-based sensor validation technique for a turbofan engine was proposed by Guo et al. [143]. Lu et al. [144,145] evaluated the performance of AANNs for sensor noise reduction and for bias detection and correction. While training the noise filtering networks, they used noisy data as the input and noise-free data as the output. The networks, therefore, tried to provide an output as close to the desired noise-free data as possible. In addition, the effect of the number of measurements on the accuracy of the proposed methods was tested using 4 and 9 parameters, and almost similar success rates were achieved.
AANNs are better at outlier removal and noise reduction than the conventional filtering techniques [111]. A multiple sensor fault detection and isolation method using a bank of AANNs together with an MLP-based fault identification technique was developed by Zedda and Singh [118] for a low-bypass-ratio turbofan engine. It has been shown that AANNs are capable of successful sensor failure diagnosis, even in the presence of component faults. The effectiveness of using multiple hierarchical AANNs to diagnose single and double sensor faults in a 2-shaft industrial GT engine was analyzed by Ogaji et al. [74]. They used three networks: the first to separate faulty and fault-free measurements, the second to differentiate sensor and component faults, and the last to estimate and accommodate the magnitude of the sensor faults. This kind of division is important for sharing the diagnostic tasks among networks, which may improve the accuracy significantly. A combined discrete wavelet transform and AANN-based diagnostic system was also developed by Tamiru et al. [146] for oil system, vibration system, and control system diagnostics. Recently, an AANN was used for single and double sensor and component fault diagnosis by Vanini et al. [75]. However, the sensor validation performance of AANN-based techniques is influenced by the noise level and the threshold characteristics. Minimizing the number of false alarms and the number of missed detections are equally important, but the two goals conflict (i.e., decreasing the number of false alarms by increasing the threshold level may increase the rate of missed detections) [111].
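The residual-and-threshold idea behind AANN-based sensor validation can be sketched with a linear stand-in. The example below is not a trained AANN: it uses a PCA projection (via SVD) as a linear two-neuron "bottleneck", and the sensor data, mixing matrix, bias size, and threshold rule are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical healthy data: 5 correlated sensors driven by 2 latent variables
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0, 1.0]])
healthy = rng.normal(size=(500, 2)) @ mixing + 0.01 * rng.normal(size=(500, 5))

# "Training": the mean and the two leading principal directions form the bottleneck
mean = healthy.mean(axis=0)
_, _, Vt = np.linalg.svd(healthy - mean, full_matrices=False)
V = Vt[:2].T

def residual(x):
    """Input minus its reconstruction through the 2-dimensional bottleneck."""
    centered = x - mean
    return centered - centered @ V @ V.T

# Threshold set from healthy residuals; raising it trades false alarms for missed detections
healthy_norms = np.linalg.norm(residual(healthy), axis=1)
threshold = 3.0 * healthy_norms.max()

def is_faulty(x):
    """Flag a sample whose reconstruction residual exceeds the threshold."""
    return np.linalg.norm(residual(x)) > threshold

faulty = healthy[0].copy()
faulty[0] += 0.5                                      # assumed bias on the first sensor
suspect = int(np.argmax(np.abs(residual(faulty))))    # sensor with the largest residual
print(is_faulty(healthy[0]), is_faulty(faulty), suspect)
```

A smaller bias or a larger threshold multiplier would let the fault go undetected, while a multiplier close to one would start flagging healthy samples, which is the trade-off between missed detections and false alarms noted above.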

Probabilistic Neural Network
A probabilistic neural network (PNN) is a multilayer feed-forward network for statistical pattern classification based on the Bayes classification strategy, with radial basis functions (RBFs) as kernels [147]. The architecture of a typical PNN, as shown in Figure 9, consists of three layers: an input layer, a pattern layer, and an output/summation layer. The input layer passes the input patterns to be classified to each of the nodes in the pattern layer. The pattern layer contains one neuron for each pattern/example in the training data set. The neurons in the pattern layer compute their responses based on Equation (23) and feed them into the output layer neurons (Equation (24)). The output neurons stand for the desired output groups into which the network is expected to classify the input patterns. All the connection weights between the pattern layer and the output layer have a value of 1. PNNs can be characterized as follows: they are simple in design, have a training nature similar to that of the backpropagation algorithm, have quite a large pattern layer, and have low computational speed [124]. If $X \in R^{M \times N}$, with elements $X_{ij}$ ($i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$), is the training input matrix with N input parameters and M cases, and K is the number of target classes, then for an input pattern x, assuming a Gaussian probability density function (PDF), the pattern layer and output layer neurons' outputs can be computed, respectively, as [148]:
$$\phi_i(x) = \exp\left(-\frac{\lVert x - X_i \rVert^{2}}{2\beta^{2}}\right) \qquad (23)$$
$$f_k(x) = \frac{1}{(2\pi)^{N/2}\,\beta^{N}\,M_k} \sum_{i \in C_k} \phi_i(x) \qquad (24)$$
where $X_i$ is the i-th training pattern (the i-th row of X), $C_k$ is the set of the $M_k$ training patterns belonging to class k ($k = 1, 2, \ldots, K$), and β is the smoothing parameter.
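The pattern-layer and summation-layer computations can be sketched as follows, assuming the standard Gaussian (Parzen window) form of the PNN. The training patterns, class labels, and smoothing parameter are hypothetical, and the constant normalization factor of the Gaussian is dropped because it is the same for every class and does not affect the decision.

```python
import numpy as np

def pnn_classify(x, X_train, y_train, beta=0.5):
    """Classify x with a Gaussian-kernel PNN; returns the predicted class and class scores."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    # Pattern layer: one Gaussian response per stored training pattern
    sq_dist = np.sum((X_train - x) ** 2, axis=1)
    pattern_out = np.exp(-sq_dist / (2.0 * beta ** 2))
    # Summation layer: average the responses of the patterns in each class
    classes = np.unique(y_train)
    scores = np.array([pattern_out[y_train == c].mean() for c in classes])
    # Decision: the class with the largest estimated density
    return classes[np.argmax(scores)], scores

# Hypothetical two-class training set (e.g., 0 = no fault, 1 = fault)
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [-0.2, 0.1],
                    [2.0, 2.0], [2.1, 1.8], [1.9, 2.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label_a, scores_a = pnn_classify(np.array([0.05, 0.1]), X_train, y_train)
label_b, scores_b = pnn_classify(np.array([2.0, 1.9]), X_train, y_train)
print(label_a, label_b)
```

Because every training pattern is stored as a pattern neuron, "training" is immediate, but the cost of classifying a new sample grows with the size of the training set.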
The capability of the PNN for GT fault diagnostics was checked for the first time by Eustace and Merrington [149] by applying it to a GE low-bypass F404 military engine. Romessis and Mathioudakis [150] also used this method for sensor fault diagnosis in a deteriorated engine condition. In another study, PNNs were used for sensor fault diagnostics in a jet engine for on-board application [151]. Like the other ANNs, the PNN applies pattern recognition to fault isolation and identification tasks [152]. It uses a probabilistic measure to decide the type and location of the fault and to assess its magnitude. Nested PNNs have been used by Ogaji et al. [83] for sensor and component fault diagnostics in a 2-shaft aircraft engine. For this purpose, five PNNs are used: the first separates the fault and no-fault patterns, the second is dedicated to sensor and component fault classification, and the remaining three are used for component fault classification. Then, sets of radial basis networks are integrated to quantify the magnitude of the faults. The results revealed that the proposed scheme was capable of diagnosing all the considered fault scenarios with sufficiently high accuracy. Recently, the fault classification performance of the PNN was compared with that of the MLP and RBF networks by Loboda and Robles [153], who obtained similar accuracies. In general, as per this review, most of the previous PNN-based GT diagnostic techniques were utilized for fault classification tasks.

Radial Basis Function Networks
A radial basis function (RBF), also called a kernel function, is a multivariate approximation function whose value depends only on the distance from the origin or a center c, i.e., the Euclidean distance [152]. It performs a nonlinear transformation over the input vector before it is fed for classification. By using such a nonlinear function, it is possible to convert a linearly nonseparable problem to a linearly separable one. RBF increases the dimensionality of the feature vector. The most commonly used types of RBFs include Gaussian, multiquadric, and inverse multiquadric [154].
As shown in Figure 10, the general structure of an RBF network (RBFN) is composed of three layers: an input layer, a hidden layer, and an output layer [131]. Like other ANNs, the number of nodes of the input layer is equal to the dimension of the input vector. The task of the hidden layer neurons is to project the input vector into a higher dimensional vector. Hence, the number of neurons in the hidden layer must be greater than that of the input layer. This is because if the feature vectors are linearly non-separable in the input dimensional space, then it is more likely that those feature vectors will be linearly separable when they are cast into a higher dimensional space. Once the feature vectors are linearly separable in the M-dimensional space of the hidden layer, the linear combination of the outputs of the hidden layer is likely to give the class to which each vector belongs. The class combination is decided by the connection weights from the hidden layer nodes to the output layer nodes. In general, there are two important tasks in this process: determining the receptor and the distribution of the function, and determining the connection weights. Every neuron in the hidden layer represents an RBF. Due to its useful analytic properties, in addition to localization, the Gaussian function is more commonly used [154]. The number of nodes at the output layer is the same as the number of desired classes. As in MLP networks, RBFN output units use linear summation functions. Although RBF networks can perform different tasks such as function approximation, pattern classification, and dynamic system modelling, they have been most widely used in function approximation [125]. RBF networks can approximate an arbitrary function from the training input dataset. The use of RBFs involves many advantages over backpropagation based feedforward neural networks, for example, rapid training, very low computational expense, good interpolation, generality, and simplicity [125]. They are highly localized and need a huge quantity of training data, thereby creating a large number of nodes that allows for rapid training. In addition, they are reported to work better than other training techniques and are able to approximate any continuous function [155]. However, after training, their computational speed to perform classification or approximation is low.
The Euclidean distance is computed from the point being evaluated to the center of each neuron, and an RBF is applied to the distance to compute the weight for each neuron [152]. The further a neuron is from the point being evaluated, the less influence it has. Leonard et al. [156] suggested a K-means clustering technique to determine the cluster centers, a K-nearest heuristic technique to determine the width of the RBF, and multiple linear regression to determine the connection weights of the layers.
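Assume a set of training feature vectors and suppose that C classes are required. For a Gaussian function, the network output can be computed as [157]:

$$y_k(\mathbf{x}) = \sum_{j} w_{kj}\,\exp\left(-\frac{\lVert \mathbf{x}-\boldsymbol{\mu}_j \rVert^{2}}{2\sigma_j^{2}}\right) \qquad (25)$$

where yk is the kth output, wkj is the weight of the connection between the jth hidden unit and the kth output unit, µj is the receptor/center of the jth function, and σj is its width, as shown in Equation (26):

$$\sigma_j = \sqrt{\frac{1}{p}\sum_{i=1}^{p}\sum_{k=1}^{r}\left(\mu_{ki}-\mu_{kj}\right)^{2}} \qquad (26)$$

where p is the number of nearest RBF classes, r is the number of entries, and µki and µkj represent the kth entries of the receptors of the ith and jth hidden units (in Equation (26), the index k runs over the entries rather than the outputs).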
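The forward pass of a Gaussian RBF network and a nearest-neighbour width heuristic can be sketched as follows. This is an illustrative sketch only: the factor 2σ² in the exponent, the RMS form of the width heuristic, the function names `rbf_widths` and `rbf_forward`, and the toy centers and weights are assumptions, and the connection weights are simply given rather than fitted by regression.

```python
import math

def rbf_widths(centers, p=2):
    """Width of each hidden unit: RMS distance to its p nearest other centers.

    Assumes p <= len(centers) - 1.
    """
    widths = []
    for j, cj in enumerate(centers):
        # Squared Euclidean distances from center j to every other center.
        d2 = sorted(
            sum((a - b) ** 2 for a, b in zip(ci, cj))
            for i, ci in enumerate(centers) if i != j
        )
        widths.append(math.sqrt(sum(d2[:p]) / p))
    return widths

def rbf_forward(x, centers, widths, weights):
    """Network outputs y_k; weights[k][j] links hidden unit j to output k."""
    # Hidden layer: Gaussian response of each unit to the distance from its center.
    hidden = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * s ** 2))
        for c, s in zip(centers, widths)
    ]
    # Output layer: linear summation of the hidden responses.
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]

# Hypothetical centers (e.g. from K-means) and a single output unit.
centers = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
widths = rbf_widths(centers, p=2)
outputs = rbf_forward([0.0, 0.0], centers, widths, weights=[[1.0, 0.0, 0.0]])
print(widths, outputs)
```

In a full implementation the centers would typically come from a clustering step and the output weights from a linear regression on the hidden-layer responses, as suggested by Leonard et al. [156].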
RBFNs have often been used for data cleaning prior to fault diagnosis. In [157], the proposed RBFN based noise filtering technique reduced the measurement noise by 75-81%, which is much better than the conventional linear filters. The problem of measurement outliers and noise was addressed by Roy et al. [158] using an RBFN. The results showed that 59-73% of the data outliers and noise were removed, which is much better than the traditional filtering methods. Ogaji et al. [5] used sets of RBFNs for gas-path component fault approximation. The proposed method comprises three steps: fault detection, fault isolation, and fault identification. The first two tasks were performed using sets of PNNs. After the detection and isolation stages, RBFN based techniques were applied to estimate the level of component faults using their corresponding fault patterns coming from the associated isolation networks. The applicability and performance of RBFN for GT fault identification were compared with an MLP by Loboda et al. [159]. They concluded that the RBF network gave slightly more accurate results than the conventional MLP network; however, the former requires much more storage capacity and computational time. Recently, a similar work has been conducted using RBFs [160]. As per [158], RBFNs work better than many other training techniques and require much less training time and cost than backpropagation based algorithms.

Self-Organizing Map
A self-organizing map (SOM) is an unsupervised learning neural network algorithm that transforms high-dimensional input data into one or two-dimensional outputs [161]. As seen in Figure  11, its structure typically consists of an input layer and a Kohonen layer connected with a set of weights. The colored groups on the feature map indicate the neighbor nodes arranged according to their similarity. While training, the weights are adjusted based on the input data samples, with no target vector available, in an unsupervised manner. The detailed mathematical expression of an SOM model and its learning mechanism can be found in [162]. Previously, it was mainly utilized for data clustering/classification and visualization [161,163]. It has also been applied for many other practical applications including rotating machinery diagnostics [164][165][166] and industrial and medical diagnostics [167][168][169]. Due to the lack of labeled engine operational data, implementation of such unsupervised learning networks in the process of gas turbine fault diagnostics may yield significant benefits.
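A minimal sketch of the unsupervised weight adjustment described above is given below for a one-dimensional Kohonen layer. The linear weight initialization, the linearly decaying learning rate and neighbourhood radius, the function names, and the toy two-cluster data are assumptions chosen for brevity, not the formulation of [162].

```python
import math

def best_matching_unit(x, weights):
    """Index of the map node whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])))

def train_som(data, n_nodes=4, epochs=50, lr0=0.5, radius0=2.0):
    """Train a 1-D Kohonen map (n_nodes >= 2); returns the node weight vectors."""
    dim = len(data[0])
    lo = [min(row[d] for row in data) for d in range(dim)]
    hi = [max(row[d] for row in data) for d in range(dim)]
    # Linear initialisation between the data minimum and maximum (an assumption).
    weights = [[lo[d] + (hi[d] - lo[d]) * j / (n_nodes - 1) for d in range(dim)]
               for j in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                    # learning rate shrinks over time
        radius = max(radius0 * decay, 0.5)  # neighbourhood shrinks over time
        for x in data:
            bmu = best_matching_unit(x, weights)
            for j, w in enumerate(weights):
                # Nodes close to the winner on the map are pulled more strongly.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Hypothetical unlabeled measurement deltas forming two groups.
data = [[0.0, 0.0], [0.1, 0.05], [0.05, 0.1],
        [1.0, 1.0], [0.9, 0.95], [0.95, 0.9]]
som = train_som(data)
print(best_matching_unit([0.0, 0.0], som), best_matching_unit([1.0, 1.0], som))
```

No target vector is used at any point; samples that are similar end up mapped to the same or neighbouring nodes, which is what makes the map usable for clustering and visualization.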
Roemer [170] proposed a modular mechanism that can detect and classify developing engine faults in a Rolls-Royce F405 gas turbine. He integrated both supervised and unsupervised (SOM) neural network learning paradigms for life, vibration, and performance monitoring of the target engine. Kim et al. [171] devised a SOM-based clustering technique to diagnose three different faults (combustor liner burn-through, bleed band leakage, and EGT sensor failures) in a mid-sized jet propulsion engine utilizing measurement residuals from three sensors (core speed, exhaust gas temperature, and fuel flow). Come et al. [172] applied SOM for aircraft engine data visualization in combination with two other modules: one to normalize the effect of ambient condition variations on the measurements (based on a linear regression approach) and the other for fault detection (based on the joint use of recursive least squares (RLS) and GLR algorithms). In another study, Cottrell et al. [173] used SOM to visualize the health evolution of an aircraft engine based on data preprocessed through a General Linear Model (GLM). A fault diagnostic algorithm for gas turbine fuel systems was also introduced by Cao et al. [162] based on an improved SOM approach. In this analysis, eight different fault cases were taken into account, which may cause the failure of three components: oil gauge, needle valve, and delivery valve. The performance of a hierarchical clustering (HC) and SOM based fault detection and diagnosis method was evaluated by Zhang et al. [64] using measurement deviations obtained from groups of 19 and 16 sensors of a single-shaft industrial gas turbine. As demonstration cases, sensor faults, bearing tilt pad wear, and early-stage pre-chamber burnout were considered. It has been stated that the developed tools are being used as part of a practically incorporated engine health monitoring system for industrial gas turbine engines operating across the globe.

Deep Learning
Deep learning (DL) is a sub-field of machine learning (ML) that is used to learn feature hierarchies of data through many layers of non-linear information processing [174]. The learning can be supervised, unsupervised or hybrid [174]. In the past few years, DL has attracted remarkable research attention with proven computational performance in several application domains including signal and information processing [175], big data analysis [176], speech recognition [177], medical image analysis [178], biomedicine [179], system health management [180] and others [181].
Despite its success in the other domains, few studies have been published on the application of DL for gas turbine fault diagnostics. A combined autoencoder and Gaussian distribution based engine gas-path anomaly detection algorithm was proposed by Luo and Zhong [182] for civil aircraft engine applications. The autoencoder is used to denoise data uncertainties and extract the important features for the Gaussian distribution-based anomaly detection module. In this assessment, measurement deviations from 15 sensors are used in order to detect anomalies from the target engine gas path components, although no information is given about the fault scenarios considered. Yan and Yu [183] introduced a DL-based anomaly detection technique for a heavy-duty industrial gas turbine combustor. In their proposed method, stacked denoising autoencoder (SDAE) [184] is used as a data processor to avoid measurement noise and extract features for the combustor fault classification. Then extreme learning machine (ELM) neural network-based classifier is developed based on the features learned from the SDAE module. Recently, Xuyun et al. [185] used multiple convolutional denoising autoencoders to develop a fault detection technique for aircraft engines. The reported results in the above-discussed studies encourage more utilization of DL techniques for gas turbine fault diagnostics than the conventional approaches.

Bayesian Belief Network
A Bayesian Belief Network (BBN) is an AI based method that uses a graphical representation of a probability distribution to capture the cause and effect relationships among predisposing factors, faults, and symptoms [125]. The graph consists of nodes, which represent a set of random variables, and directed edges, which indicate their dependencies. The degree of relationship between the variables is expressed in terms of conditional probability. Figure 12 presents an example of a BBN structure for gas turbine fault diagnostics. In this structure, the parent nodes are dedicated to the engine performance parameters and the child nodes to the measurement parameters. The performance parameters and the measurement parameters of the case engine are related through sets of directed connections along with their associated probability values. According to the BBN approach, the engine gas path diagnostic problem can be expressed mathematically as:

$$P(x \mid z) = \frac{P(z \mid x)\,P(x)}{P(z)} \qquad (27)$$

where P(x|z) is the probability of x given z, P(z|x) is the probability of z given x, x is the independent parameter (performance parameter), z is the dependent parameter (measurement parameter), P(x) is the probability of the independent parameter x, and P(z) is the probability of the dependent parameter z. The application of BBN for gas turbine diagnostics was started in the early 1990s by Breese et al. [187]. Their proposed technique relies on a model-based method that integrates a BBN with an expert system. It was implemented to assess failures of the engine oil cooling system, bearing, and bearing temperature sensors. A few years later, Palmer [188] developed a BBN based fault diagnostic system for the CF6 engine, although model details were not provided. A more detailed BBN based aircraft engine gas path fault diagnostic procedure was provided later by Kadamb [186], Romessis et al. [189], Mathioudakis et al. [190], and Romessis and Mathioudakis [191] with the aid of an engine performance model. They also showed the capability of their proposed method to deal with engine diagnostic problems in which the measurements are fewer than the performance parameters to be assessed. Lee et al. [192] suggested an offline fault diagnosis method based on hierarchically arranged multiple BBN models for industrial gas turbine engines under steady-state operating conditions. It has been reported that the proposed method is capable of carrying out both qualitative and quantitative diagnostics under measurement uncertainty.
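As a worked illustration of the Bayes relation underlying the BBN approach, the sketch below updates the probability of each engine condition x after a single measurement symptom z is observed. The condition names, prior probabilities, and likelihood values are hypothetical numbers chosen for the example and are not taken from any of the cited studies.

```python
def posterior(prior, likelihood, observation):
    """Apply P(x|z) = P(z|x) P(x) / P(z) over a discrete set of conditions x."""
    # Numerator for each condition: P(z|x) * P(x).
    joint = {x: likelihood[x][observation] * p for x, p in prior.items()}
    # Evidence P(z), obtained by summing the numerators over all conditions.
    evidence = sum(joint.values())
    return {x: value / evidence for x, value in joint.items()}

# Hypothetical prior probabilities of the engine conditions.
prior = {"healthy": 0.90, "compressor_fouling": 0.07, "turbine_erosion": 0.03}
# Hypothetical probabilities of observing a high EGT delta under each condition.
likelihood = {
    "healthy": {"egt_high": 0.05},
    "compressor_fouling": {"egt_high": 0.70},
    "turbine_erosion": {"egt_high": 0.60},
}
post = posterior(prior, likelihood, "egt_high")
print(post)
```

With these numbers the evidence is P(z) = 0.045 + 0.049 + 0.018 = 0.112, so the posterior probability of compressor fouling rises from 0.07 to 0.049/0.112 = 0.4375. A full BBN chains many such conditional probability tables along the directed edges of the graph.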
In the presence of large data samples, a BBN can be trained in a supervised or unsupervised manner, although the majority of the past attempts focused on the supervised one [193]. However, developing a BBN classifier based on expert knowledge is highly complex, time consuming, and subject to errors, as it requires a training data sample classified according to the expert's prior knowledge, which is used to model the BBN structure and generate its conditional probability table (CPT). In many gas turbine applications, it is difficult to obtain the required large data samples with the fault class information. For the isolation of engine fault classes from data that have no labels assigned, the unsupervised learning-based BBN might thus be preferable. Different algorithms exist for this purpose, such as the score-based, the constraint-based, and a hybrid of these two [194].

Expert Systems
Expert systems (ESs) are software programs that capture human expert knowledge in the form of facts and rules in order to solve problems or give advice as a human expert would [91]. The architecture of ESs (Figure 13) comprises four basic elements: user interface (used to acquire information and display results), inference engine (deals with all the reasoning operations of the system based on known facts and rules), knowledge base (contains facts and rules about the problem to offer the appropriate decision) and developer (stores information about current education) [93]. The knowledge from the expert is first prepared in the form of a knowledge base by a knowledge engineer or programmer. The knowledge base contains data and facts in the specific area of application or knowledge domain. The information in the knowledge base is intended to replace the human expert. The user interface presents questions to users, accepts information from them, and then provides answers and sometimes the reasoning for those answers too. The inference engine has the job of matching the user's input from the user interface with the data contained in the knowledge base to find appropriate answers. This is done using inference rules, which describe how different items of data relate to each other, and sometimes using probabilistic rules. ESs are programmed with a series of logical rules to find a solution. Very basic ESs use Boolean logic or decision trees. Boolean logic has only two possible values (true or false, yes or no, etc.), which makes it difficult to represent real-life problems. To avoid these limitations, ESs typically use inference rules and chaining to reach a conclusion. Inference rules are written as IF-THEN statements which describe rules for a knowledge domain.
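The interplay of knowledge base, IF-THEN inference rules, and chaining can be sketched with a few lines of forward chaining. The rules and fact names below are hypothetical and serve only to show the mechanism; they are not diagnostic rules from any of the ES tools cited in this review.

```python
# Knowledge base: each rule is (set of IF conditions, THEN conclusion).
# All rule contents are illustrative assumptions.
RULES = [
    ({"egt_high", "fuel_flow_high"}, "efficiency_loss"),
    ({"efficiency_loss", "compressor_pressure_ratio_low"}, "suspect_compressor_fouling"),
    ({"suspect_compressor_fouling"}, "recommend_compressor_wash"),
]

def forward_chain(facts, rules):
    """Inference engine: fire rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its IF conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

# Facts that the user interface would collect from the operator or sensors.
observed = {"egt_high", "fuel_flow_high", "compressor_pressure_ratio_low"}
conclusions = forward_chain(observed, RULES)
print(sorted(conclusions - observed))
```

Chaining lets the conclusion of one rule become a condition of another, so the knowledge base can be extended rule by rule without changing the inference engine.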
Numerous studies have been done using ESs in previous years, and many different ES based GT diagnostic techniques are available in the open literature [91,195,196]. They can be broadly categorized into rule-based, MB, and case-based techniques [91]. The earlier forms of ES based GT gas-path diagnostics applied pattern recognition/matching techniques by comparing patterns of measurement deltas with performance parameter deltas/fault signatures obtained from original equipment manufacturers (OEM) [9]. As listed in [91], numerous ES based tools specific to different GT models and configurations were introduced by GT manufacturers and researchers, such as TEXMAS (Turbine Engine EXpert Maintenance Advisor System), HELIX (HELicopter Integrated eXpert), XMAN (A Tool for Automated Jet Engine Diagnostics), TIGER (Testability Insertion Guidance Expert System), IFDIS (Interactive Fault Diagnosis and Isolation System), and SHERLOCK. However, knowledge from domain specific experts is usually inexact, and reasoning on knowledge is often imprecise. An ES that deals with uncertainty and has proved to be very efficient in fault diagnosis is the Bayesian Belief Network (BBN). However, these systems require precise inputs and rely entirely on the knowledge of experts and an extensive database of rules.

Fuzzy Logic Methods
Fuzzy logic (FL) is a nonlinear mapping of an input feature vector into a scalar output [197]. It is one of the most widely used AI methods to approximate the relationship between dependent and independent parameters based on a set of IF-THEN statements. The general FL approach consists of four basic components: Fuzzy Rules (sets of IF-THEN statements), Fuzzifier (the mechanism which maps numbers of input signals into the fuzzy set), Inference Engine (the technique used to determine the ways in which the fuzzy sets are combined with each other) and Defuzzifier (the mechanism used to calculate the output values) [93]. The schematic representation of a rule-based FL system is shown in Figure 14. For gas turbine diagnostics, sets of measurement parameter deltas are used as an input to the FL system in order to compute the performance parameter deltas. After the earliest use of FL by Fuster et al. [198] in 1997, several FL based gas turbine diagnostics techniques have been devised by other researchers. Among these, Marinai [124,199] contributed a diagnostic model for a Rolls-Royce Trent 800 engine that can isolate both single and multiple component faults in the presence of sensor noise and bias. Simulated data for clean and faulty GT cases were used to test the fault detection performance of the model and the results showed that the detection based on filtered data was very accurate with negligible missed alarms and no false alarms. However, the investigation of the method for multiple fault diagnosis was limited to dual component faults only. Similarly, Ganguli [200], developed GT measurements' trend shift detection mechanism using median filters and FL. The test results revealed that the detection based on filtered data was very accurate with negligible missed alarms and no false alarms. 
In order to undertake the problem of availability of limited numbers of sensors on a real GT service, Ganguli [197] developed a FL based single fault isolation system for a jet engine using only four commonly available sensors. The proposed method can isolate 95% of the faults successfully. The accuracy increased with the number of sensors and reached 100% for eight sensors. He has also stated that FLs can work with poor quality data. In another study, Ogaji et al. [201] proposed a diagnostic system for a modern military turbofan engine that can identify single component faults with an accuracy of 92.5%. Recently, Kyriazis et al. [202] developed a FL based GT compressor fault diagnostic system. Its effectiveness was compared with pattern recognition and PNN methods. The results showed that the FL method has as good generality and effectiveness in fault diagnostics as the other two methods.
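The four FL components described above (fuzzifier, fuzzy rules, inference engine, and defuzzifier) can be illustrated with a deliberately small single-input sketch that maps a measurement delta to a performance parameter delta. The triangular membership functions, the three rules, the numerical values, and the weighted-average defuzzification (a common simplification in which each rule has a crisp consequent) are all assumptions for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Fuzzifier: hypothetical fuzzy sets for an EGT delta input (in %).
INPUT_SETS = {
    "small": (-1.0, 0.0, 1.0),
    "medium": (0.0, 1.0, 2.0),
    "large": (1.0, 2.0, 3.0),
}
# Fuzzy rules: IF EGT delta is <set> THEN efficiency loss is <value> (in %).
RULE_OUTPUT = {"small": 0.0, "medium": 1.5, "large": 3.0}

def estimate_efficiency_loss(egt_delta):
    """Infer an efficiency loss estimate from a crisp EGT delta."""
    # Inference: the firing strength of each rule is the membership of its input set.
    firing = {name: tri(egt_delta, *abc) for name, abc in INPUT_SETS.items()}
    total = sum(firing.values())
    if total == 0.0:
        raise ValueError("input lies outside the range covered by the fuzzy sets")
    # Defuzzifier: weighted average of the rule outputs.
    return sum(firing[name] * RULE_OUTPUT[name] for name in firing) / total

print(estimate_efficiency_loss(0.5), estimate_efficiency_loss(1.5))
```

An input of 0.5 belongs equally to the small and medium sets, so the estimate falls midway between the two rule outputs. In a practical system several measurement deltas would enter the rules jointly, and the rule base would come from expert knowledge or simulated fault signatures.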

Genetic Algorithm
A genetic algorithm (GA) is a method which mimics nature on the basis of Darwin's evolutionary theory of "survival of the fittest" [197]. It begins with a population of randomly generated structures, where each structure encodes a candidate solution to the task, and proceeds to evolve it over generations. In each generation, the quality of every individual within the population is assessed by its fitness, and multiple individuals are selected from the current population (based on their fitness) and modified (recombined and possibly randomly mutated). Since the fitness is a function of the objective function (OF) to be optimized, the optimum solution is reached when the OF approaches zero and the best fitness is associated with the value of one. It is an iterative procedure, each iteration of which applies three operators: selection, crossover, and mutation [203]. During each generation, the GA improves the structures in its current population by performing selection followed by crossover followed by mutation. It looks for the best solutions rated against the fitness criteria. Thus, it avoids local optima and searches for a global optimum. During selection, it duplicates structures with higher fitness and deletes structures with lower fitness. Crossover recombines elements of good chromosomes from different genomes so that good components of good structures combine to yield better structures. Mutation, applied with a small pre-specified probability, creates new structures that are similar to the current structures. Figure 15 shows the schematics of the cyclic process of the three GA operators. GAs differ from the traditional optimization techniques in the following three significant points [204]. Firstly, since they search in parallel from a population of points, they have the ability to avoid being trapped in a local optimal solution. Secondly, GAs work on the chromosome, which is an encoded version of the potential solution parameters, rather than optimizing the parameters themselves. Thirdly, GAs use a fitness score obtained from the objective functions (OFs), without relying on over-engineered black-box mathematics. In the end, like the other methods, the user typically chooses the best structures of the last population as the final solution. A more detailed description of this method can be found in [205].
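The selection, crossover, and mutation cycle can be sketched as a short real-coded GA. The binary tournament selection, uniform crossover, Gaussian mutation, elitism, the mapping fitness = 1/(1 + OF), and the toy objective are assumptions made for this illustration and are not the specific operators of [203] or [205].

```python
import random

def fitness(objective_value):
    """Assumed mapping: fitness tends to one as a non-negative OF tends to zero."""
    return 1.0 / (1.0 + objective_value)

def run_ga(objective, n_params, bounds=(-5.0, 5.0), pop_size=30,
           generations=60, mutation_rate=0.1, seed=1):
    """Minimise a non-negative objective; returns (best, best_objective, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial population of randomly generated candidate solutions.
    pop = [[rng.uniform(lo, hi) for _ in range(n_params)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = min(pop, key=objective)
        history.append(objective(elite))
        new_pop = [elite[:]]  # elitism: the best structure always survives
        while len(new_pop) < pop_size:
            # Selection: binary tournaments favour structures with a lower OF.
            p1 = min(rng.sample(pop, 2), key=objective)
            p2 = min(rng.sample(pop, 2), key=objective)
            # Crossover: each gene is inherited from one of the two parents.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Mutation: small random perturbation with a pre-specified probability.
            child = [min(hi, max(lo, g + rng.gauss(0.0, 0.3)))
                     if rng.random() < mutation_rate else g for g in child]
            new_pop.append(child)
        pop = new_pop
    best = min(pop, key=objective)
    return best, objective(best), history

def toy_objective(params):
    """Hypothetical OF with its minimum (zero) at params = [1.0, 1.0]."""
    return sum((g - 1.0) ** 2 for g in params)

best, best_of, history = run_ga(toy_objective, n_params=2)
print(best, best_of, fitness(best_of))
```

Because the best individual is copied unchanged into every new generation, the recorded best OF can never increase from one generation to the next. Repeating the run with different seeds is a simple way to check the consistency of the result.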
GA is often applied as an effective optimization tool to obtain a set of component parameters that produce a set of predicted dependent parameters, through a nonlinear GT model that leads to predictions that best match the measurements [91]. The solution is obtained when the OF (which is the measure of the difference between predicted and measured values) achieves the minimum value. A simplified illustration of GA based GT fault diagnostics strategy is given in Figure 16.

Figure 16. Schematic diagram of GA based diagnostic strategy (adapted from [203]).
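According to Zedda and Singh [118] and Singh [1], when the measurement noise is assumed to follow a Gaussian distribution, the suitable OF to be optimized is given as:

$$J(\mathbf{x},\mathbf{w}) = \sum_{j=1}^{M} \frac{\left[z_j - h_j(\mathbf{x},\mathbf{w})\right]^{2}}{\left[z_{odj}(\mathbf{w})\,\sigma_j\right]^{2}} \qquad (28)$$

Alternatively, if the absolute deviation is considered (this is suitable when the measurement error distribution is assumed to be other than Gaussian and when modeling errors are inevitably present), Equation (29) is more suitable:

$$J(\mathbf{x},\mathbf{w}) = \sum_{j=1}^{M} \frac{\left|z_j - h_j(\mathbf{x},\mathbf{w})\right|}{z_{odj}(\mathbf{w})\,\sigma_j} \qquad (29)$$

where J is the OF, M is the number of measurements, zj is the value of the jth measurement, h is a vector valued function, x is the vector of performance parameters, w is the vector of power setting parameters, zodj is the value of the jth measurement in the off-design clean condition, and σj is the standard deviation of the jth measurement (noise value).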
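The two objective functions can be evaluated with a few lines of code once the model predictions are available. The sketch below assumes the normalized form in which each residual between the measured and predicted values is scaled by the product of the clean off-design value and the noise standard deviation; the function names and all numerical values are hypothetical, and the engine model h is replaced by a fixed list of predicted values.

```python
def objective_gaussian(z, h_pred, z_od, sigma):
    """Sum of squared normalised residuals (suited to Gaussian measurement noise)."""
    return sum(((zj - hj) / (zodj * sj)) ** 2
               for zj, hj, zodj, sj in zip(z, h_pred, z_od, sigma))

def objective_absolute(z, h_pred, z_od, sigma):
    """Sum of absolute normalised residuals (more robust to non-Gaussian errors)."""
    return sum(abs(zj - hj) / (zodj * sj)
               for zj, hj, zodj, sj in zip(z, h_pred, z_od, sigma))

# Hypothetical values for two measurements.
z = [101.0, 498.0]      # measured values z_j
h_pred = [100.0, 500.0] # model predictions h_j(x, w) for a candidate solution x
z_od = [100.0, 500.0]   # clean off-design values z_odj
sigma = [0.01, 0.002]   # relative noise standard deviations sigma_j

j_gauss = objective_gaussian(z, h_pred, z_od, sigma)
j_abs = objective_absolute(z, h_pred, z_od, sigma)
print(j_gauss, j_abs)
```

Here the normalized residuals are 1 and -2, so the squared form gives 1 + 4 = 5 and the absolute form gives 1 + 2 = 3. The squared form penalizes the larger residual more heavily, which is why the absolute form is less sensitive to occasional large errors.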
The application of GAs to GT fault diagnosis was started in 1999 by Zedda [95] and has been investigated by several scholars since then [206]. The problem of obtaining an accurate MB GT fault diagnosis method in the presence of a limited number of measurements, far fewer than the number of performance parameters, was addressed by Zedda and Singh [118] using GA. The performance of this method was evaluated and tested on Rolls-Royce (RR) RB-199, RR RB-211 and ICR-WR21 engine types. For this purpose, the performance models for each case engine were developed utilizing the well-known GT engine performance code TURBOMATCH. Sampath et al. [203] developed a GA-based sensor and component fault diagnostic scheme that can deal with the nonlinear nature of the diagnostic problem for a double-shaft GT engine application. In this work, the effects of sensor noise and bias and of the number of sensor and component faults on the accuracy of component fault diagnosis were analyzed using simulation data from a GT performance modelling tool called Rolls-Royce Aerothermal Performance (RRAP). Six gas-path components (inner and outer fans, high-pressure compressor (HPC), high-pressure turbine (HPT), low-pressure turbine (LPT), and nozzle), affected individually or in pairs (as single component fault (SCF) and double component fault (DCF) cases), were analyzed using 16 gas-path measurements in the presence of two and four concurrent sensor faults. For this purpose, six SCF and 15 DCF classes were considered. A GA based GT fault diagnosis method was used to find an optimal combination of a set of performance parameters and the corresponding set of best match measurement parameters through a non-linear performance simulation model [108]. A generalized GA based GT fault classification method was proposed by Loboda et al. [207] using a thermodynamic model that is applicable to multiple operating point conditions for both steady state and transient cases.
Li and Pilidis [103] and Li et al. [208] applied GA for a GT performance adaptation in order to assess the engine's health status. For this purpose, the information from the measurements was used to estimate the component faults at a specified design point and off-design operating condition.
Recently, the fault diagnosis effectiveness of an NLGPA and a GA based method, applied to a 2-spool turbofan engine, was compared by Kong et al. [209]. They showed that the diagnostics based on GA are better than the NLGPA, particularly when sensor noise and bias are considered. In a similar manner, Kong [93] investigated the diagnostic effectiveness of GA in comparison with NLGPA and fuzzy-neuro techniques, taking into account measurement uncertainty and sensor fault effects. The test results indicated that the GA based method showed a more reliable accuracy than the NLGPA, especially in the presence of sensor noise and bias.

Hybrid AI Methods
Since it is impossible to find a single technique that can address all the concerns of diagnostics, it is attractive to combine two or more methods in an attempt to offset the limitations of one with the advantages of another [1]. With regard to this, many combined AI techniques have been proposed by different authors, including genetic-neural networks, genetic-fuzzy logic, fuzzy-neural networks (FNNs), and neural-fuzzy systems [210,211]. Green and Allen [212] discussed the benefits of combining ANN with other AI methods for diagnostics and prognostics. GA is applied as an effective optimization tool to obtain a set of component parameters that produce a set of predicted dependent parameters, through a non-linear GT model that leads to predictions which best match the measurements [91]. In FL-based techniques, the optimal selection of fuzzy sets with the appropriate membership functions (MFs) is essential. However, in the conventional FL applications, there is no defined function to determine the number of fuzzy sets and MFs [111]. It is indicated in [111] that GAs are capable of selecting the optimal number of fuzzy sets and MFs automatically, thereby enhancing the performance of the FL. Kobayashi and Simon [213] investigated the effectiveness of an MB hybrid neural network and GA technique for a turbofan engine application. The neural network scheme was utilized to diagnose GT component faults, while the GA scheme was used for sensor fault evaluation. A hybrid ANN and GA based model for sensor and component fault diagnostics was also developed and implemented on the advanced cycle Intercooled Recuperated (ICR) WR21 engine by Sampath and Singh [211]. In this model, the task of the ANN module was to pre-process and validate the GT data, whereas the task of the GA module was to isolate and identify faults.
It has been shown that the accuracy, reliability, and consistency of the results obtained from the hybrid technique are better than those of the ANN- and GA-based algorithms. Simani and Fantuzzi [155] applied an RBFN for GT fault identification, integrated with an MB KF fault diagnosis scheme. In this algorithm, the KF-based scheme was responsible for detecting and isolating the faults using measurement residuals, while the RBFN quantified the magnitude of the faults using a pattern recognition approach.

Strengths and Weaknesses of the Gas-Path Diagnostic Methods
Table 5 presents a summary of the advantages and limitations of the AI-based methods reviewed above. This may help newcomers to the field select an appropriate diagnostic approach for a variety of engine diagnostic problems. In general, unlike MB methods, AI-based methods can handle the effect of sensor noise and bias, the possible existence of multiple simultaneous faults, fault identification with a limited instrumentation suite, and the nonlinear relationship between the measurement parameters and the performance parameters. However, most AI-based techniques cannot give confidence limits on the output. In addition, they are not capable of diagnosing faults outside the domain of the data to which they have been exposed during training. Table 6 summarizes these methods with regard to their capability of meeting the challenges discussed in Section 3.1 and fulfilling the desirable attributes of an effective diagnostic system stated in Section 3.2. It gives a clear view of the pros and cons of the aforementioned diagnostic methods, which is important when choosing an appropriate diagnostic system for a particular diagnostic problem.

ANN
Advantages:
- They are relatively easy to implement for both engine performance modeling and diagnostics [214].
Limitations:
- Since they are unable to perform reliably outside the range of the data they are trained on, a huge amount of data that sufficiently represents the engine condition over its entire life is required.
- They require high computational time (during training).
- If operational data is used to develop the methods, retraining will be required after an engine overhaul and/or when the operating condition changes [5].
- Diagnostic error increases with the number of operating points. However, this problem can be solved by correcting the data for ambient condition and load variations [5].
- The process in the hidden layers is not visible (they are black-box models).

FL
Advantages:
- They are able to deal with the non-linear nature of gas-path problems [11].
- They can cope with measurement noise and sensor bias [91].
- Like ANNs, FL systems can perform data fusion [2].
- Like ANNs, FL-based systems can be used for online diagnosis owing to their fast computation in inference mode [11].
Limitations:
- Fuzzy rules depend on the knowledge of subject experts, and diagnosis accuracy depends on the available rules.

GA
Advantages:
- It is able to deal with measurement noise and bias.
- It provides good results when integrated with other MB as well as DD systems [1].
- Like ANN, it can deal with engine diagnostic problems when only a limited instrumentation suite is available.
- It can optimize the engine performance functions without the need to solve complex equations, unlike other mathematical optimization approaches [208].
- It can perform both qualitative and quantitative diagnostics [5].
- It can perform simultaneous fault analysis [215].
- It can deal with the non-linear nature of the engine behavior [208].
Limitations:
- It requires longer computational time than traditional optimization techniques, especially as the population and generation numbers grow [105].
- As the number of simultaneous faults increases, the convergence time increases [125].
- Multiple runs are often required to check the consistency of the GA optimization results [5].
BBN
Advantages:
- It is graphical, so the model variables and the diagnostic results are easy to visualize.
- It is capable of performing multiple simultaneous fault analysis [188].
- It can perform data fusion [2].
- It can provide better, more flexible and robust diagnostic solutions [191].
- Unlike ANN, it requires a short retraining time because gas turbine model hardware changes can be easily entered [124].
- Unlike ANN, BBN can include generic faults that are not considered during the training process of the diagnostic system [188].
- Once the BBN has been built using a gas turbine simulation model, the simulation model no longer needs to be run, which improves computational speed [124].
- It is more realistic to express a diagnosis as the probability that a fault has occurred than as a deterministic answer [10].
Limitations:
- Gathering the information required to set up a BBN model is tedious and time consuming [125].
- It requires a well-trained developer to set it up [188].
- As the number of nodes and edges increases, the model complexity and the computational and storage requirements increase [216].
- Measurement uncertainties and operating condition variations affect its accuracy [217].

Expert System (ES)
Advantages:
- It is preferable when the diagnostic problem is well understood and stable, and human experts are available to develop the knowledgebase.
- It is relatively simple to develop and easy to understand.
- It is more suitable for stable and predictable GT operating conditions, if potential faults can be defined easily.
- The knowledgebase does not represent the knowledge of a single human expert but that of a group as a whole. This helps to eliminate individual biases [59].
- It is particularly valuable for applications in remote areas where maintenance experts are not available nearby [59].
Limitations:
- The knowledgebase may suffer from measurement uncertainties and incomplete or missing data. This problem could have a significant effect on the reliability and accuracy of the answers provided by the system.
- As with any software program, missing rules or incorrect data processed by the inference engine degrade its accuracy and reliability.
- It is highly dependent on the knowledgebase, which covers only a small amount of knowledge, and it is incapable of dealing with problems outside this domain. If the experts are wrong, the output will also be wrong. Thus, it requires up-to-date knowledge from the human experts along with a significant set of rules.
- The knowledge acquisition task is time consuming.
- It requires a high establishment cost due to the lack of human experts [59].
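
The probabilistic style of diagnosis credited to BBNs in the summary above can be illustrated with a single application of Bayes' rule. The sketch below is not a full belief network and is not taken from any of the reviewed works: the engine states, the prior probabilities, and the symptom likelihoods are all hypothetical values, chosen only to show how an observed symptom turns prior beliefs into fault probabilities rather than a yes/no answer.

```python
# Hypothetical numbers throughout: none of these probabilities come from the review.

def posterior(priors, likelihoods):
    """Apply Bayes' rule over mutually exclusive engine states.

    priors[state]      = P(state) before any measurement is seen
    likelihoods[state] = P(observed symptom | state)
    """
    joint = {s: priors[s] * likelihoods[s] for s in priors}
    evidence = sum(joint.values())  # P(observed symptom)
    return {s: joint[s] / evidence for s in joint}


# Assumed prior beliefs about the engine state.
priors = {"healthy": 0.90, "compressor_fouling": 0.07, "turbine_erosion": 0.03}

# Assumed probability of observing an exhaust gas temperature rise in each state.
egt_rise_likelihood = {
    "healthy": 0.05,
    "compressor_fouling": 0.80,
    "turbine_erosion": 0.60,
}

belief = posterior(priors, egt_rise_likelihood)
for state, p in sorted(belief.items(), key=lambda kv: -kv[1]):
    print(f"{state}: {p:.3f}")
```

With these assumed inputs, the symptom shifts most of the probability mass away from the healthy state, yet no state is ruled out. This graded output is what the probabilistic methods offer over a deterministic classifier.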

GT Diagnostic Methods Validation Techniques
In the field of GT diagnostics, method validation is a critical issue. GT users need a practical tool to evaluate the performance and effectiveness of an algorithm before they decide to incorporate it into their plants. However, there is no standardized commercial tool for this purpose. The evaluation approaches that the research community has used so far are:

1. Performance Metrics Approach: Performance metrics can be used to measure the detection, isolation, and identification performance of a fault diagnostic algorithm [218]. The detection metric measures how accurately the detection algorithm detects abnormal operating conditions. The isolation metric evaluates how successfully the isolation part of the diagnostic framework distinguishes the fault types and their locations. The identification metric measures how accurately the diagnostic system estimates the magnitude of the faults. A more detailed description, together with sample performance metrics, is available in [219]. The majority of the fault diagnostic methods available in the open domain are evaluated using this approach. Simon et al. [220] present a performance metric associated with the fault detection and isolation reliability of four different methods for aircraft engines.

2. Benchmark Fault Cases Approach: The generally accepted and implemented solution for obtaining the performance data required for diagnostic method development and validation is to implant fault cases, corresponding to the faults that may occur, into the GT performance model [113,221]. However, there are some inconsistencies concerning the range of performance parameter losses used to represent different gas-path faults [222]. The issue of using benchmark fault cases has also been thoroughly studied by the OBIDICOTE (On Board Identification, Diagnosis and Control of GT Engines) Project conducted by the European research community [223]. The project identified a set of benchmark fault cases, which several researchers have since used to evaluate their engine fault diagnostic methods [191,211,224]. The effectiveness of using sets of benchmark fault cases to evaluate the performance of a diagnostic system was further investigated by the engine health management industry review (EHMIR) established under The Technical Cooperation Program (TTCP) [224]. TTCP is a collaboration forum for defense science and technology (DST) between five nations, namely, the UK, the USA, Canada, Australia and New Zealand. The effort of this forum resulted in a reference engine problem, together with a recommended evaluation environment for different diagnostic algorithms. Based on the recommendation of the aircraft engine health monitoring community, NASA's research team recently developed a public benchmarking software tool for evaluating the performance of gas-path fault diagnostic techniques, referred to as the Propulsion Diagnostic Method Evaluation Strategy (ProDiMES), which is applicable to aircraft engines [220]. Four different methods are included in this software: Weighted Least Squares, PNN, Performance Analysis Tool, and Generalized Observer/Estimator. Moreover, GT condition monitoring and diagnostics has been studied for many years by the research teams of the Laboratory of Thermal Turbomachines of the National Technical University of Athens (LTT/NTUA) [12] and Cranfield University (CU) [1], which have proposed many different techniques. These groups showed the effectiveness of using benchmark fault cases to develop and evaluate the performance of diagnostic algorithms.

3. Comparison of Methods Approach: As a third alternative, there are also some research works based on self-conducted comparative evaluation (a one-to-one comparison of methods) to assess the diagnostic performance of different schemes [113,221]. In this category, previously published methods are used as benchmarks against which the performance of the newly developed algorithm is compared. However, this approach has limitations because most of the available GT diagnostic methods targeted different engine problems and degrees of complexity under a variety of diagnostic conditions (i.e., operating modes, measurement system, deterioration profile, etc.) [224].
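
The first two approaches above can be combined in a small numerical sketch: fault cases are implanted into a performance model, a diagnostic method is run on the simulated measurements, and detection, isolation, and identification metrics are computed. Everything in the example is an assumption made for illustration. The linear influence-coefficient matrix `H`, the component labels, the noise level, the detection threshold, and the least-squares estimator standing in for the diagnostic method are all hypothetical. The metric definitions are generic ones and are not necessarily those used in [218,219].

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical influence-coefficient matrix (invented values): maps 3 health-parameter
# deviations (%) to 4 measurement deviations (%).
H = np.array([
    [0.9, 0.2, -0.1],
    [-0.3, 1.1, 0.4],
    [0.5, -0.6, 1.2],
    [0.1, 0.7, -0.8],
])
COMPONENTS = ["compressor", "hp_turbine", "lp_turbine"]  # hypothetical labels
NOISE_STD = 0.1   # assumed sensor noise, % of reading
THRESHOLD = 0.5   # assumed detection threshold on the estimated deviation, %


def implant(fault_idx, magnitude):
    """Simulate noisy measurement deviations for one implanted fault case."""
    x = np.zeros(H.shape[1])
    if fault_idx is not None:  # None marks a healthy case
        x[fault_idx] = magnitude
    return H @ x + rng.normal(0.0, NOISE_STD, size=H.shape[0])


def diagnose(z):
    """Stand-in diagnostic method: least-squares estimate of health deviations."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    k = int(np.argmax(np.abs(x_hat)))  # component with the largest estimate
    return bool(abs(x_hat[k]) > THRESHOLD), k, float(x_hat[k])


def evaluate(cases):
    """Compute detection, isolation, and identification metrics over all cases."""
    hits = misses = false_alarms = healthy = isolated = 0
    sq_errors = []
    for fault_idx, magnitude in cases:
        detected, k, estimate = diagnose(implant(fault_idx, magnitude))
        if fault_idx is None:
            healthy += 1
            false_alarms += detected
        elif detected:
            hits += 1
            if k == fault_idx:  # fault attributed to the right component
                isolated += 1
                sq_errors.append((estimate - magnitude) ** 2)
        else:
            misses += 1
    rmse = float(np.sqrt(np.mean(sq_errors))) if sq_errors else float("nan")
    return {
        "detection_rate": hits / max(hits + misses, 1),
        "false_alarm_rate": false_alarms / max(healthy, 1),
        "isolation_accuracy": isolated / max(hits, 1),
        "identification_rmse": rmse,
    }


# Benchmark set: 50 healthy cases plus 10 noisy repeats of each fault magnitude.
cases = [(None, 0.0)] * 50
cases += [(i, m) for i in range(len(COMPONENTS))
          for m in (-1.0, -2.0, -3.0) for _ in range(10)]

metrics = evaluate(cases)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every method is scored on the same implanted cases, replacing `diagnose` with another algorithm gives a like-for-like comparison, which is the motivation behind shared benchmark sets such as those discussed above.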

Conclusions and Future Research Directions
Gas turbine (GT) gas-path fault diagnostics is a key element of an overall engine condition-based maintenance (CBM) system, providing enhanced safety, reliability, and availability along with optimal operation and maintenance costs. In consideration of this role, an effective and reliable gas-path diagnostic technique is critical. This paper discussed various issues related to gas-path diagnostics, including engine physical faults, the challenges and desirable attributes of a gas-path diagnostic system, state-of-the-art methods, and verification and validation approaches. Past efforts on gas-path diagnostics have focused on data filtration, sensor validation, and component fault diagnostics. A variety of methods associated with these aspects, from the conventional methods to the most sophisticated artificial intelligence (AI) based ones, were reviewed. Owing to their capability to handle the existing challenges and meet the majority of the desirable attributes, AI methods have received more attention in recent efforts. In particular, artificial neural networks (ANNs) have been widely used for both qualitative and quantitative diagnostic applications, although the majority of the investigations were limited to single sensor fault and/or single component fault analysis. However, to remove the barriers between system developers and engine users, and to attract users to invest in and incorporate gas-path diagnostic technologies into their plants, there are two main issues to which the gas-path diagnostics research community should give attention: improving the effectiveness and reliability of the available fault diagnostic systems, and developing practical tools to evaluate the effectiveness of the proposed techniques. With that in mind, the following further studies should be carried out.

• The need for a standardized gas-path diagnostic problem definition. According to this survey, there is no consensus among researchers in defining and representing gas-path diagnostic problems (terminologies, component fault representation, ranges of sensor/component faults, and the number and type of faults, corresponding to different engine configurations, that may occur during the engine's lifetime). This inconsistency may confuse new researchers in the field and create barriers to exchanging gas-path diagnostic ideas/solutions and to performing a one-to-one comparison of different algorithms.

• The review on GT performance deterioration revealed that the degradation profile corresponding to each gas-path fault is not consistent in the literature. This may lead to an incorrect representation of component deterioration, and thereby to unreliable fault diagnostic results. Hence, more investigations are needed in this regard.

• Most of the techniques devised for simultaneous fault analysis were restricted to qualitative solutions (i.e., detecting and isolating the fault without estimating its level, which is a very important step in the maintenance decision). Moreover, the accuracy of the limited quantitative approaches available requires improvement for multiple fault scenarios. The development of an effective gas-path diagnostic system that can perform both qualitative and quantitative diagnostics of both single and multiple fault scenarios thus needs further investigation.

• Development of efficient hybrid methods. Most of the available gas-path diagnostic methods are based on a single technique, and it is difficult to find a single technique that can address all gas-path diagnostic challenges while also providing accurate diagnostic results. It is recommended to combine two or more methods based on their merits.

• Development of integrated platforms. Although a large number of diagnostic methods have been devised so far, the majority of those methods considered different platforms with different levels of complexity and were applied to the monitoring of different engine systems (such as sensor, component, vibration, controller, and fuel and oil systems). The integration of a variety of methods into a diagnostic tool capable of addressing the problems of the entire GT system is required.

• Establishment of a practical approach to verification and validation. Engine users need practical tools to objectively assess the effectiveness (i.e., the technical and economic feasibility) of newly proposed solutions and determine their advantages over the existing maintenance practices before incorporating them into their plants. However, there are no internationally accepted standards or unified frameworks that can be applied for this purpose. Hence, the establishment of practical verification and validation approaches requires attention in this field.

• Development of user-friendly gas-path diagnostic software. Regarding engine performance simulation, some powerful commercial software tools are available. In contrast, apart from those based on traditional techniques, there are no advanced diagnostic software tools based on AI methods. Therefore, user-friendly gas-path diagnostic software that can acquire, preprocess and validate performance data, assess the condition of engines and suggest the appropriate maintenance actions is required.

Acknowledgments: The authors would like to acknowledge Universiti Teknologi PETRONAS (UTP) and Mälardalen University (MDH) for supporting this research financially as well as for all the facilities provided.

Conflicts of Interest: The authors declare no conflict of interest.