A Survey on Active Fault-Tolerant Control Systems

Faults and failures in system components are two main causes of instability and degraded control performance. In recent decades, fault-tolerant control (FTC) approaches have been introduced to improve the resiliency of control systems against faults and failures. In general, FTC techniques are classified into active and passive approaches. This paper reviews the causes of faults and failures in control systems and discusses the latest solutions introduced to make control systems resilient. Recent achievements in fault detection and isolation (FDI) approaches and active FTC designs are investigated. Furthermore, several aspects of the various FTC techniques are compared in detail to clarify their advantages and disadvantages and to motivate researchers to further develop FTC and FDI approaches.


Introduction
In classical control theory, it is assumed that all components work properly and precisely. However, experience has shown that this assumption cannot be guaranteed at all times, and on many occasions system components experience faults or failures. Accumulated faults endanger the stability and performance of the controller to a degree that robust control theories cannot handle. With the increasing demand for reliable and safe controllers, fault-tolerant control (FTC) systems have become one of the most attractive topics in advanced control theory and have received a great deal of attention among researchers. The ongoing achievements in this field have led to several valuable review articles that survey the most recent techniques.
One of the earliest review papers in the field of FTC was published in 1991 by Stengel [1], which investigated the basic concepts of FTC and the application of artificial intelligence in FTC systems. In 1997, Patton presented a comprehensive review of FTC techniques and analyzed the key issues of FTC design [2]. Lunze and Richter presented an introductory tutorial on reconfiguration-based FTC design and reviewed the state-of-the-art achievements in the field [3]. Alwi et al. reviewed different kinds of possible faults and failures in control systems and briefly surveyed fault detection and isolation (FDI) and FTC approaches [4]. The survey papers in [5][6][7][8] reviewed the development of active and passive FTC systems and investigated their challenges and advantages. In [9], FDI and FTC approaches in aerospace systems were briefly reviewed, and the combination of active and passive FTC was investigated. Some survey papers reviewed FTC design for a specific application, for instance, FDI and FTC approaches for attitude control in spacecraft [10], single-rotor aircraft (e.g., helicopters) [11], electric speed drive systems [12,13], photovoltaic (PV) systems [14], and power electronics systems [15,16]. In [17,18], FDI approaches were extensively investigated and classified into four subcategories: model-based, signal-based, knowledge-based, and hybrid/active approaches.
Despite the valuable efforts in recent decades to provide comprehensive reviews of FTC and FDI approaches, most of these works reviewed only hardware-redundancy-based FTC approaches. At the same time, analytical redundancy, which has received a great deal of attention in recent years, has not been investigated, to the best of our knowledge. In addition, most of the works reviewed FDI and FTC separately, and the link between active FTC and FDI needed to obtain a unified active FTC system was not technically investigated. Furthermore, the ongoing achievements in this field and the increasing need to develop reliable control systems are further reasons to review the latest works. These reasons motivated the current work. In this paper, the latest achievements in the field of active FTC systems are reviewed, and the control performance of passive and active FTC is technically compared. Also, unlike the previous survey papers, the design philosophy of active FTC systems based on real-time FDI is extensively investigated. Furthermore, the recent trend of applying analytical redundancy to design controllers that are resilient against faults is investigated.
The remainder of this work is organized as follows. Section 2 discusses the definition, causes, and types of faults. Section 3 reviews the latest works in the field of passive and active FTC systems and scrutinizes the advantages and possible challenges in their design and application. Section 4 reviews the latest achievements in the field of FDI. The conclusion and remarks are presented in Section 5.

Fault Definition, Causes, and Types
In this section, the basic concepts of fault, failure, and FTC are explained so that readers can readily differentiate them from uncertainty, disturbances, and robust control.
Faults and failures in the components of a control system can endanger system stability and degrade its performance. A fault in a dynamical system can be described as a deviation of the system structure or system parameters from the nominal situation [17]. The overall effect of a single fault can vary from performance degradation to total failure [8]. It is worthwhile to distinguish faults from system uncertainties and external disturbances. Faults are those elements that should be detected and whose effects should be removed by remedial actions. Disturbances and model uncertainties are nuisances that are known to exist but whose effects on system performance are handled by appropriate measures such as filtering or robust design. In theory, it has been demonstrated that controllers can be designed to attenuate disturbances and tolerate model uncertainties up to a certain size, whereas faults are more severe changes whose effects on the plant behavior cannot be suppressed by a fixed controller. The difference between faults and failures should also be clarified. A fault changes the characteristics of a component such that its mode of operation or performance is altered in an undesired way. Hence, the required specifications for system performance are no longer met. In general, a fault can be "worked around" by fault-tolerant control so that the faulty system remains operational. In contrast, a failure describes the inability of a system or component to accomplish its function. The system or component has to be shut off because the failure is an irrecoverable event. Therefore, redundancy is the main solution in the presence of a failure. Based on their location, faults can be classified into sensor, actuator, and plant (component or parameter) faults [17].

Fault Types and Causes
In general, most classical control techniques assume that all components in the system work correctly, and the controller is designed based on this assumption. Hence, a fault in a system component sends incorrect information to the controller, which is then misled by the false data it receives. Figure 1 shows possible faults in a control system. As can be seen in Figure 1, faults can be classified into three categories based on the affected component:
• Plant Faults: These faults change the dynamic I/O properties of the system.
• Sensor Faults: The plant properties are not affected, but the sensor readings have substantial errors.
• Actuator Faults: The plant properties are not affected, but the influence of the controller on the plant is interrupted or modified.
Faults and failures in a control system may occur due to various reasons, such as

• Interruptions in communication between the actuator/sensor and the control unit due to severe vibration, improper connections, metal flakes separating, and short circuits.
• Noise effects on the actuator/sensor due to environmental noise such as a magnetic field.
• Denial of service for a period of time due to limited processor speed and network bandwidth [19,20].
• A drop in the supply voltage/current of an electrical actuator/sensor, since these normally need a separate power supply [21].
• False data injection by a malicious intruder. In this kind of fault, an attacker penetrates the system communication and injects false data into actuators/sensors to mislead the control system [22][23][24][25][26][27].
• Actuator runaway/hardover, where the control surface moves at its maximum rate limit and reaches its saturation limit. Actuator runaway can occur when a failure in an electronic component sends a random large signal to the actuator, causing it to deflect at its maximum rate to its maximum deflection. This kind of failure has been reported as the main reason for several aircraft crashes, such as Flight 85 (B-747, Alaska, 2002) [30], where the lower rudder runaway led to full left deflection and caused excessive roll of the aircraft.
• Actuator lock/stuck, which can be caused by a mechanical jam due to lack of lubrication or by ice, for example. This type of failure led to flight incidents such as Flight 1080 (Lockheed L-1011, San Diego, 1977) [30] and Flight 96 (DC-10, Ontario, 1972) [30].
Faults can be categorized based on their severity as well. In this study, we categorized faults into three major groups:

• Abrupt Faults: Abrupt faults can be defined as changes in parameter values that are faster than the nominal process dynamics. Since tracking fast changes based on residuals is difficult, detecting these abrupt changes is a great challenge for most fault detection algorithms [17]. Reference [31] considered three types of abrupt faults: severe vibrations, metal flakes separating, and short circuits.
• Incipient Faults: The problem with incipient faults is their small effect on the residuals, which can remain hidden from the detection system. The sources of these faults are sensor/actuator inaccuracy or partial failure.
• Intermittent Faults: This kind of fault is a malfunction that occurs at irregular intervals. It is common in most systems and can be caused by various contributing factors, e.g., improper connection of electrical wires to the sensors or actuators. The complexity of the system increases the chance of intermittent faults. Due to their inconsistent nature, detecting intermittent faults is a great challenge for most detection algorithms.
Figure 2 presents a graphical description of abrupt, incipient, and intermittent faults in the system.
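The three fault profiles described above can be illustrated with a short numerical sketch. The function names, fault times, magnitudes, and the choice of a step, a ramp, and a pulse train as representative shapes are assumptions made for illustration only; they are not taken from the reviewed works.

```python
import numpy as np

def abrupt_fault(t, t_fault, magnitude):
    """Step-like fault: zero before t_fault, constant magnitude afterwards."""
    return np.where(t >= t_fault, magnitude, 0.0)

def incipient_fault(t, t_fault, slope):
    """Slowly growing (ramp) fault that starts at t_fault."""
    return np.where(t >= t_fault, slope * (t - t_fault), 0.0)

def intermittent_fault(t, windows, magnitude):
    """Fault that is active only inside the given (start, end) intervals."""
    f = np.zeros_like(t, dtype=float)
    for start, end in windows:
        f[(t >= start) & (t < end)] = magnitude
    return f

# Hypothetical time grid and fault parameters.
t = np.linspace(0.0, 10.0, 1001)
f_abrupt = abrupt_fault(t, t_fault=4.0, magnitude=1.0)
f_incipient = incipient_fault(t, t_fault=4.0, slope=0.1)
f_intermittent = intermittent_fault(
    t, windows=[(2.0, 3.0), (6.0, 6.5)], magnitude=1.0
)
```

Shortly after onset, the ramp in `f_incipient` is much smaller than the step in `f_abrupt`, which is one way to see why incipient faults can stay below a residual threshold for some time.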

Definition of Fault-Tolerant Controller
FTC approaches are developed to improve the safety and reliability of control systems against faults and failures. A control system that can automatically compensate for the effect of a fault (and sometimes a failure) in the system components while maintaining system stability and the desired level of overall performance is called an FTC system [4,6,7,11]. Generally, based on their dependency on fault information, FTC systems can be categorized into two main classes: passive FTC and active FTC. A passive FTC system does not rely on fault information to control the system and is closely related to robust control, where a fixed controller is designed to be robust against a predefined fault in the system [4,8]. In general, redundancy is integrated into passive fault-tolerant control designs to make them resilient against faults [7].
In contrast with passive FTC systems, active FTC systems respond to the fault that has actually occurred in the system. In such control systems, an FDI unit is used to find the fault location and measure its size; a supervisory controller then decides how to modify the control structure and parameters to compensate for the fault. Such modifications range from control reconfiguration [32,33] to redundancy management [34] and analytical redundancy [35][36][37][38][39][40][41]. Active and passive approaches use different techniques for the same purpose; however, because of the differences in their design approaches, each has some unique properties.
In the following section, a brief description of passive FTC and active FTC is presented.

Passive and Active FTC
This section reviews the recent research works in the field of passive and active FTC theory and highlights their advantages and disadvantages.
FTC techniques can be divided into two main categories: active and passive [5,42]. Active FTC uses detection techniques to find the fault; a supervisory system then decides how to modify the control structure and parameters to compensate for the effect of the fault [7]. In passive FTC, by contrast, a robust compensator is used to reduce the fault effects or at least to stabilize the system in the presence of a fault.

Passive FTC
Passive FTC systems do not rely on fault information, and their design is directly integrated with the concept of redundancy. Hardware redundancy in passive FTC systems can be defined as the use of identical components driven by the same input signal, so that the output of the duplicate can be compared with that of the main component and the system can switch between redundant actuators when performance degrades, thereby mitigating the fault effect [17]. As can be seen in Figure 3, in passive FTC design, redundancy can be provided in the controller, actuators, plant components, and sensors, and the FTC system can switch to the redundant units in the presence of a fault.
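As an illustration of how redundant hardware outputs can be compared, the following sketch applies median voting to three redundant sensor readings. Median voting is a technique substituted here for illustration; it is not claimed to be the comparison scheme of [17], and the function name `vote` and the tolerance value are hypothetical.

```python
from statistics import median

def vote(readings, tolerance):
    """Return (voted_value, suspect_indices) for redundant sensor readings.

    The voted value is the median of the readings; any channel that deviates
    from the median by more than `tolerance` is reported as suspect.
    """
    voted = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]
    return voted, suspects

# Hypothetical triple-redundant measurement with one faulty channel.
voted, suspects = vote([10.1, 9.9, 42.0], tolerance=0.5)
```

With an odd number of channels, a single faulty reading cannot move the median outside the range of the healthy readings, which is why the extra hardware buys fault tolerance at the cost, weight, and space penalties discussed in this section.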
The main challenges of passive FTC can be summarized as follows:
(1) Extreme dependency on hardware redundancy: although redundant hardware improves the reliability of the system, it also increases the cost, the required space (product size), and the weight of the product. Key components clearly need redundancy to avoid breakdown, but applying redundancy to the whole system would be costly and difficult given weight and space limits.
(2) Passive FTC strategies rely on the assumption that the closed-loop system maintains its asymptotic stability under the specified fault/failure scenarios. However, this assumption may not be sufficient to prevent system breakdown in the presence of a large number of unforeseen faults.
(3) Because passive FTC design must consider the normal and fault/failure conditions simultaneously, it is more conservative in terms of performance than active FTC design. In other words, passive FTC systems focus on the robustness of the system across all scenarios rather than on optimal performance for each scenario; e.g., to guarantee stability in the presence of a fault, the settling time of the controller may be increased even in the normal situation.

Active FTC
In contrast with passive FTC systems, active FTC systems react to each fault differently. This reaction is based on the control approach used in the active FTC design and information received from the detection system. Generally, an active FTC design has three main steps: (1) Detection, (2) Supervision, (3) Control. Figure 4 shows the three main steps and their roles in designing active FTC systems. Generally, in designing an efficient active FTC system, three major factors should be considered: First, the detection unit should be accurate. False fault alarm and inaccurate fault measurement have a direct impact on the performance of the active FTC system. This inaccuracy will lead to a negative reaction to the fault and would even endanger the system stability. Second, the designed active FTC should be robust against the imperfect fault detection information. Third, the time spent for fault recovery should be less than the available time for recovery. In other words, control reconfiguration/fault compensation should be fast enough to guarantee system stability and performance.
In fact, the most important part of an active FTC system is its FDI unit; thus, we categorize active FTC systems based on the FDI approach used in their design. Different approaches for FDI design are reviewed in the following section.
Active FTC approaches are mainly categorized based on the FDI unit used in their design. However, the strategy used for the compensation of fault might be different. Here, a brief review of different fault compensation approaches used in active FTC design is presented.

Switching-Based Active FTC
This kind of controller relies on a set of predefined candidate controllers, and the system switches among them based on the fault type and severity. Figure 5 shows the overall block diagram of a switching-based active FTC system. An important factor in designing a switching-based controller is the dwell time [66]. The dwell time is the lower bound on the length of the time interval between consecutive switching instants. It should be noted that the upper bound is the detection interval (DI), which is the length of time in which the controller performance does not change after the fault occurrence. Allerhand and Shaked introduced an active FTC technique that accounts for the dwell time between switches and guarantees the stability of the system by solving linear matrix inequalities [67]. In [68], a switching-based controller that requires no extra models or filters was developed. In that design, bounds on the states were guaranteed during the switching delays.
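The role of the dwell time as a lower bound between consecutive switches can be sketched as follows. The class `DwellTimeSwitch`, its interface, and the numeric values are hypothetical and only illustrate the bookkeeping; the sketch does not implement the LMI-based stability conditions of [67].

```python
class DwellTimeSwitch:
    """Selects a candidate controller index while enforcing a minimum dwell time."""

    def __init__(self, dwell_time, initial_mode=0):
        self.dwell_time = dwell_time
        self.mode = initial_mode
        self.last_switch = float("-inf")  # no switch has happened yet

    def update(self, t, requested_mode):
        """Switch to requested_mode only if the dwell time has elapsed."""
        elapsed = t - self.last_switch
        if requested_mode != self.mode and elapsed >= self.dwell_time:
            self.mode = requested_mode
            self.last_switch = t
        return self.mode

# Hypothetical sequence of (time, mode requested by the FDI/supervisor).
sw = DwellTimeSwitch(dwell_time=1.0)
requests = [(0.0, 0), (0.5, 1), (0.8, 2), (1.6, 2), (2.0, 0), (2.7, 0)]
history = [sw.update(t, m) for t, m in requests]
# history == [0, 1, 1, 2, 2, 0]
```

The requests at t = 0.8 and t = 2.0 are deferred because less than one time unit has passed since the previous switch; they take effect only once the dwell time has elapsed.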

Hierarchical Structure Active FTC
Hierarchical structures are applied in the integration of FDI and FTC in active FTC systems. In this strategy, after the fault has been detected and isolated, the controller can be reconfigured either by adaptive control strategies [69,70] or by receding horizon control [71].

Safe Parking Active FTC
The concept of safe parking was first introduced by Gandhi and Mhaskar [72]. It is based on the idea of holding the system at a suitable temporary equilibrium (safe parking) point in the presence of a fault until the active controller drives the system states to the nominal equilibrium point. Later, this work was further developed to choose a safe parking point using the FDI information [73]. Similarly, Paoli and Lafortune used this concept to propose a safe controllability method [74].

Analytical Feedback Compensation Active FTC
Analytical feedback compensation strategies are based on the real-time fault detection and isolation [75][76][77][78]. These approaches need very accurate FDI information with minimum delay. The overall structure of the analytical feedback compensation method for an active FTC design is depicted in Figure 6.

In [75], an adaptive neural network (NN) approach was used to detect faults in the pressure valves of a proton exchange membrane (PEM) fuel cell. A nonlinear observer based on the nonlinear model of the system was designed and combined with an NN whose gains were updated using an extended Kalman filter (EKF). This approach helped to find different faults in real time with sufficient accuracy, and the fault data were used as a feedback signal to eliminate the fault from the actuators.
In [76], an online recursive identification method was used for FDI and was integrated into a proportional-integral-derivative (PID) controller through a feedback signal to compensate for the fault analytically. In [77], an active FTC system based on a wavelet neural network (WNN) was designed for a multi-agent leader-following system. In that work, a robust leader-follower controller based on graph theory was designed for the multi-agent system; the WNN-based FDI was then used to compensate for actuator faults through a feedback structure.
In [78], an integral-type robust sliding mode controller was designed based on the feedback data from the iterative learning FDI unit which could map analytical redundancy in an optimal manner.
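The common idea behind these works, feeding the estimated fault back into the control signal, can be sketched on a scalar plant. The plant parameters, the additive actuator bias, the proportional feedback gain, and the idealized FDI output (exact but delayed by a few samples) are all assumptions made for illustration; none of them is taken from [75][76][77][78].

```python
def simulate(compensate, steps=200, fault_step=50, fdi_delay=3):
    """Regulate x to zero under an additive actuator bias fault.

    Plant (assumed): x[k+1] = a*x[k] + b*(u[k] + f_a[k]).
    """
    a, b, K = 0.9, 0.1, 2.0
    x = 1.0
    trajectory = []
    for k in range(steps):
        f_a = 2.0 if k >= fault_step else 0.0
        # Hypothetical FDI output: exact fault size, available after a delay.
        f_hat = 2.0 if k >= fault_step + fdi_delay else 0.0
        u = -K * x                 # nominal state feedback
        if compensate:
            u -= f_hat             # analytical feedback compensation
        x = a * x + b * (u + f_a)  # the faulty actuator adds f_a to the command
        trajectory.append(x)
    return trajectory

x_uncompensated = simulate(compensate=False)
x_compensated = simulate(compensate=True)
```

Without compensation the state settles at a nonzero offset caused by the bias, whereas with compensation it returns to zero once the delayed estimate becomes available. The short interval before the estimate arrives illustrates why these schemes need accurate FDI information with minimum delay.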

Hybrid FTC
Hybrid FTC systems were introduced to leverage the advantages of passive and active FTC at the same time [59]. In this approach, the passive controller is used as a safe controller until a reliable controller based on the information received from the FDI unit becomes available. As a result, the controller has more time to obtain accurate fault information, and optimal control reconfiguration can be performed without concerns about system safety.

Fault Detection and Isolation
In the previous sections, the definitions of fault, failure, FDI, and FTC were discussed, and the reasons for designing a reliable controller were investigated. As discussed in the previous section, the performance of an active FTC system relies on the accuracy of the FDI unit used in its design; thus, an accurate FDI design capable of online fault detection and isolation is a must for active FTC design. This section reviews state-of-the-art FDI and active FTC techniques and investigates their advantages and disadvantages. Furthermore, solutions to improve their performance are suggested.
Although the terms fault isolation and fault detection are sometimes used synonymously, fault detection means determining that a problem has occurred, whereas fault isolation pinpoints the size and location of the fault [79].
The first step in designing FDI is to design an observer to estimate the system states and output. For simplicity, consider a linearized state-space model of a system as follows:

$$\dot{x}(t) = A x(t) + B u(t) + E_a f_a(t) + E_c f_c(t) + d(t), \qquad y(t) = C x(t) + E_s f_s(t) + D(t), \tag{1}$$

where $x(t) \in \mathbb{R}^{n}$ is the system state, $u \in \mathbb{R}^{m}$ is the control input, $y(t) \in \mathbb{R}^{p}$ is the output, and $f_a \in \mathbb{R}^{l_a}$, $f_c \in \mathbb{R}^{l_c}$, and $f_s \in \mathbb{R}^{l_s}$ are actuator, component, and sensor faults, respectively. $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, and $C \in \mathbb{R}^{p \times n}$ are the matrices of the state-space system, $E_a$, $E_c$, and $E_s$ denote fault distribution matrices of compatible dimensions, and $d(t)$ and $D(t)$ are unknown disturbances and uncertainties in the system states and the output. The observer for the system described in (1) can be defined as

$$\dot{\hat{x}}(t) = A \hat{x}(t) + B \hat{u}(t) + H \big( y(t) - C \hat{x}(t) \big), \tag{2}$$

where $\hat{x}(t)$ and $\hat{u}(t)$ are the estimates of the system states and control input, and $H$ is the observer gain, to be tuned by the designer to reduce the error or residual $e(t)$ in the system. In other words, the residual shows the inconsistency between the measured data and the expected data and is obtained through a recursive estimation process as follows:

$$e(t) = y(t) - C \hat{x}(t). \tag{3}$$

This well-known framework is the generalized likelihood ratio approach that was introduced by Willsky and Jones [80] and has been used with different observer approaches for estimating the system states and then detecting faults. Based on this framework, a fault detector is designed that raises an alarm if the residuals exceed a predefined threshold. Figure 7 shows the general structure of the FDI system.
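The observer, residual, and threshold logic can be sketched in discrete time as follows. This is a minimal illustration under stated assumptions: the matrices, the observer gain, the sensor bias fault, the bounded measurement disturbance, and the fixed threshold are all hypothetical, and a discrete-time model is used in place of the continuous-time one for simplicity.

```python
import numpy as np

# Assumed discrete-time model x[k+1] = A x[k] + B u[k], y[k] = C x[k] + f_s + v.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
# Assumed observer gain; A - H C has its eigenvalues inside the unit circle.
H = np.array([[0.5], [0.1]])

def run(steps=200, fault_step=100, fault_size=0.5, threshold=0.1):
    """Return one alarm flag per step from a residual threshold test."""
    x = np.zeros((2, 1))
    x_hat = np.zeros((2, 1))
    alarms = []
    for k in range(steps):
        u = np.array([[1.0]])
        f_s = fault_size if k >= fault_step else 0.0   # sensor bias fault
        v = 0.01 * np.sin(0.3 * k)                     # small bounded disturbance
        y = C @ x + f_s + v
        residual = y - C @ x_hat                       # measured minus expected
        alarms.append(abs(residual.item()) > threshold)
        x_hat = A @ x_hat + B @ u + H @ residual       # observer update
        x = A @ x + B @ u                              # plant update
    return alarms

alarms = run()
first_alarm = alarms.index(True)
```

In this setting the residual stays below the threshold while only the small disturbance is present and crosses it at the sample where the sensor bias appears. A fixed threshold is the simplest choice and trades false alarms against missed detections.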
In this study, FDI approaches are classified based on observer design in three main categories: model-based approaches, knowledge-based approaches, and combined model-knowledge-based approaches. Figure 8

Model-Based FDI
Model-based FDI is one of the oldest strategies in fault diagnosis and was introduced in 1971 [81]. Detailed investigations of model-based methods can be found in well-written books [82,83] and survey papers [84][85][86][87]. Model-based techniques require a mathematical model of the operating system (plant). This model can be nonlinear, linearized, or simplified, and it can be obtained through physical approaches or system identification methods; an observer is then designed based on this model to estimate the system output and monitor the consistency between the estimated output and the actual system output. A fault can be detected and isolated by subtracting the system output from the predicted output. Different model-based approaches have been used to design the observer, such as the Kalman filter [88][89][90][91][92][93], $H_\infty$ [94][95][96][97][98], and the sliding mode observer (SMO) [99][100][101]. Here, we categorize model-based approaches according to the approach used in designing the FDI system.

Kalman Filter-Based
Kalman filter observers are efficient recursive filters that estimate the internal states of a linear dynamic system from a series of noisy measurements by minimizing the mean square estimation error. This minimization is based on the linear quadratic Gaussian (LQG) optimization problem. Manandhar et al. used a Kalman filter to estimate the system states, and a $\chi^2$ detector then raised a fault alarm in the presence of a fault or a false data injection attack in the system [88]. Other statistical tools can also be used to raise an alarm when a particular fault occurs, e.g., the generalized likelihood ratio [80], multiple hypothesis tests [102], and cumulative sum algorithms [103]. Figure 9 shows the general structure of the Kalman filter observer in designing the FDI system.
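A minimal sketch of a Kalman filter combined with a $\chi^2$ test on the innovation is given below. The scalar model, noise variances, fault size, and the 99% threshold for one degree of freedom are assumptions chosen for illustration and are not the setup used in [88].

```python
import numpy as np

CHI2_99_1DOF = 6.635  # 99% quantile of the chi-square distribution, 1 degree of freedom

def kf_chi2_detector(y, a=0.95, q=0.01, r=0.04):
    """Scalar Kalman filter; returns one alarm flag per measurement.

    Assumed model: x[k+1] = a*x[k] + w, y[k] = x[k] + v, with Var(w)=q, Var(v)=r.
    """
    x_hat, P = 0.0, 1.0
    alarms = []
    for yk in y:
        x_pred = a * x_hat                 # time update
        P_pred = a * a * P + q
        innovation = yk - x_pred
        S = P_pred + r                     # innovation variance
        g = innovation**2 / S              # normalized innovation squared
        alarms.append(bool(g > CHI2_99_1DOF))
        K = P_pred / S                     # measurement update
        x_hat = x_pred + K * innovation
        P = (1.0 - K) * P_pred
    return alarms

# Hypothetical data: a sensor bias of 3.0 appears at step 150.
rng = np.random.default_rng(0)
steps, fault_step = 300, 150
x, y = 0.0, []
for k in range(steps):
    x = 0.95 * x + rng.normal(0.0, 0.1)        # process noise std = sqrt(q)
    bias = 3.0 if k >= fault_step else 0.0
    y.append(x + bias + rng.normal(0.0, 0.2))  # measurement noise std = sqrt(r)
alarms = kf_chi2_detector(y)
```

Note that this sketch keeps updating the filter with faulty measurements after the alarm; a practical design would typically decide how to treat flagged samples. Under the Gaussian assumption, the statistic `g` exceeds the threshold in about 1% of fault-free samples, which is the design false alarm rate.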

Kalman filters are commonly used for linear systems, and they yield an unbiased, minimum-variance estimate only if the measurement noise satisfies the Gaussian assumption. To overcome these limitations, modified versions have been developed, e.g., the EKF [91,92], the unscented Kalman filter (UKF) [93], and the hybrid Kalman filter (HKF) [89].
One way to apply the Kalman filter to a nonlinear system is to linearize the system around its current state estimate using a Taylor series expansion and to modify the filter so that it uses this linearized version of the system as its model; the result is called the EKF [105]. An adaptive two-stage EKF with covariance matrix adaptation was designed to detect faults in the inertial measurement unit (IMU) sensor of an aircraft [91]. In [92], an adaptive EKF was used for FDI in a series-connected battery pack to prevent overcharge and over-discharge. In that approach, the states of each battery cell were estimated using the EKF, and a fault was identified by comparing the measured values with the estimated values.
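One EKF predict/update cycle can be sketched for a scalar system as follows. The dynamics $x_{k+1} = x_k - \Delta t\, x_k^3$, the measurement $y_k = x_k^2$, and all numeric values are hypothetical and serve only to show where the Jacobians enter; they are unrelated to the systems in [91,92].

```python
def ekf_step(x_hat, P, y, q, r, dt=0.1):
    """One EKF cycle for an assumed scalar nonlinear system.

    Dynamics:    x[k+1] = x[k] + dt * (-x[k]**3) + w,  Var(w) = q
    Measurement: y[k]   = x[k]**2 + v,                 Var(v) = r
    Returns (updated estimate, updated variance, residual).
    """
    # Predict with the nonlinear model; F is its Jacobian at the current estimate.
    x_pred = x_hat + dt * (-x_hat**3)
    F = 1.0 - 3.0 * dt * x_hat**2
    P_pred = F * P * F + q

    # Update; Hj is the Jacobian of the measurement model at the prediction.
    Hj = 2.0 * x_pred
    S = Hj * P_pred * Hj + r
    K = P_pred * Hj / S
    residual = y - x_pred**2        # compared with a threshold for FDI
    x_new = x_pred + K * residual
    P_new = (1.0 - K * Hj) * P_pred
    return x_new, P_new, residual

# A measurement that matches the prediction leaves the estimate unchanged.
x_new, P_new, residual = ekf_step(x_hat=1.0, P=0.5, y=0.81, q=0.01, r=0.1)
```

The residual returned by each step plays the same role as in the linear case: it is the quantity compared with the measured values to identify a fault.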
However, the computational burden of the EKF is heavier than that of the Kalman filter, and the linearization is sometimes inaccurate, which can cause the filter to converge to a wrong solution. To tackle these deficiencies, the unscented transformation was introduced as an alternative to linearizing the nonlinear system. The unscented transformation is a deterministic sampling technique that picks a minimal set of sample points (called sigma points) around the mean. By propagating the sigma points through the nonlinear functions, a new mean and covariance estimate is formed; the resulting filter is called the UKF [106]. In addition, the UKF removes the need to compute Jacobians directly: the required statistics are obtained by propagating the sigma points, without analytical differentiation, which reduces the computational burden for complex functions. However, similar to the EKF, this filter may become unstable in some highly nonlinear systems and converge to a wrong solution. In [93], a UKF was used to detect faults in the reaction wheels of a spacecraft, where particle swarm optimization was used off-line to tune the initial parameters of the UKF. Pourbabaee et al. introduced a multiple-model (MM) HKF approach for FDI that incorporates a nonlinear mathematical model of the system with several piecewise linear models [89].
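The unscented transformation itself can be sketched in a few lines. The scaling parameters `alpha`, `beta`, and `kappa` follow the commonly used scaled formulation, and the default values below are assumptions chosen to keep the weights well conditioned; they are not taken from [93] or [106].

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1.0, beta=0.0, kappa=1.0):
    """Propagate (mean, cov) through f using 2n+1 sigma points."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * cov)   # lower-triangular matrix square root
    # Rows are sigma points: the mean, then mean +/- each column of L.
    sigma = np.vstack([mean, mean + L.T, mean - L.T])
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    y = np.array([f(s) for s in sigma])       # propagate each sigma point
    y_mean = wm @ y
    diff = y - y_mean
    y_cov = (wc[:, None] * diff).T @ diff
    return y_mean, y_cov

# Hypothetical example: polar-to-Cartesian conversion of an uncertain (r, theta).
m = np.array([1.0, 0.5])
P = np.array([[0.01, 0.0], [0.0, 0.04]])
polar_to_xy = lambda s: np.array([s[0] * np.cos(s[1]), s[0] * np.sin(s[1])])
xy_mean, xy_cov = unscented_transform(m, P, polar_to_xy)
```

No Jacobian of `polar_to_xy` is needed; only function evaluations at the sigma points are used. For a linear map the transformation reproduces the exact mean and covariance, which is a convenient sanity check.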

Unknown Input Observers
An unknown input observer (UIO) is a type of observer whose state estimation error vector $e(t)$ converges to zero asymptotically, independent of the unknown input (disturbance) in the system. The UIO is primarily used for state estimation in linear systems [107]. The Luenberger observer is one of the main UIO observers used for FDI [108]. The major advantage of this kind of observer is its ability to completely decouple the estimated states from the unknown inputs under certain conditions. This kind of observer is called a UIO because its state estimation error is free from any unknown inputs [83].
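One of the decoupling conditions commonly stated in the UIO literature can be checked numerically. The sketch below assumes a linear model in which the unknown input enters the state equation through a distribution matrix `E` (a symbol not defined in this survey) and uses the standard construction in which the estimate is formed as an observer state plus a gain times the output. It checks only the rank condition rank(CE) = rank(E) and the resulting decoupling; the accompanying detectability condition is not checked. The matrix name `H_uio` is hypothetical and unrelated to the observer gain used elsewhere in this section.

```python
import numpy as np

def uio_decoupling(C, E):
    """Return (H_uio, T) such that T @ E = 0, where T = I - H_uio @ C.

    Raises ValueError if rank(C E) != rank(E), in which case the unknown
    input cannot be decoupled from the estimation error.
    """
    CE = C @ E
    if np.linalg.matrix_rank(CE) != np.linalg.matrix_rank(E):
        raise ValueError("rank(CE) != rank(E): unknown input cannot be decoupled")
    H_uio = E @ np.linalg.pinv(CE)
    T = np.eye(E.shape[0]) - H_uio @ C
    return H_uio, T

# Hypothetical example: three states, two measured, one unknown input.
C = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
E = np.array([[1.0], [0.0], [1.0]])
H_uio, T = uio_decoupling(C, E)
```

Because `T @ E` vanishes, the unknown input does not appear in the estimation error dynamics, which is the complete decoupling property described above.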
Recently, standard UIO approaches have been extended to nonlinear systems using linear parameter-varying (LPV) representations. These developments fall into two groups: (1) using the linear matrix inequality (LMI) technique to obtain the conditions required of UIOs for LPV systems [109][110][111][112][113], and (2) integration with other robust control techniques to improve FDI performance [114]. For example, an $H_\infty$ filter was integrated with a UIO to detect a failure in the inertial measurement unit (IMU) of a robot manipulator [114]. The $H_\infty$ filtering helped to guarantee the boundedness of the estimation error against input noise.

Robust Fault Detection
In the FDI process, the residual signals should be robust to uncertainties while remaining sensitive to faults in order to achieve accurate fault detection. To this end, robust control techniques have been combined with model-based observers to improve the robustness of the FDI. Among robust approaches, $H_\infty$ and sliding mode techniques have received a great deal of attention in designing robust FDI systems. In [94,96,98], the $H_\infty$ norm was used to reflect the maximum influence of disturbances, and residual generation in the FDI process was formulated as an $H_\infty$ optimization problem. In addition, the norms $H_\infty$, $H_-$, and $H_2$ can be used to measure the sensitivity of the residuals to faults; an optimization problem that trades off sensitivity to faults against robustness to disturbances can then be defined using the performance indices $H_\infty/H_\infty$, $H_-/H_\infty$, and $H_2/H_\infty$. Based on this theory, the FDI design can be converted into a multi-objective optimization problem [97,115,116].
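One common way to write the $H_-/H_\infty$ trade-off is sketched below. The notation is assumed for illustration and is not defined in the reviewed works: $G_{rd}$ and $G_{rf}$ denote the transfer matrices from disturbance and fault to the residual $r$, $\bar{\sigma}$ and $\underline{\sigma}$ are the largest and smallest singular values, $\Omega$ is the frequency range of interest, and $\gamma$ and $\beta$ are design bounds.

```latex
r(s) = G_{rd}(s)\, d(s) + G_{rf}(s)\, f(s),
\qquad
\|G_{rd}\|_{\infty} = \sup_{\omega} \bar{\sigma}\big(G_{rd}(j\omega)\big),
\qquad
\|G_{rf}\|_{-} = \inf_{\omega \in \Omega} \underline{\sigma}\big(G_{rf}(j\omega)\big),

\max_{\text{observer gain}} \ \beta
\quad \text{subject to} \quad
\|G_{rd}\|_{\infty} < \gamma,
\qquad
\|G_{rf}\|_{-} > \beta .
```

A small $\gamma$ bounds the worst-case effect of disturbances on the residual, while a large $\beta$ bounds the worst-case fault sensitivity from below; the two requirements generally conflict, which is why the design becomes a multi-objective problem.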
It should be mentioned that the online implementation of the above optimization problem depends on the complexity of its Riccati equation, which directly affects the computation time. Recursive algorithms have been proposed to simplify the online implementation and reduce the computational load. One of these is the Krein space technique; a Krein space is a special type of indefinite-metric space that shares many properties with Hilbert space while preserving the required characteristics of the $H_2$ and $H_\infty$ problems [117]. In particular, for estimation problems (Kalman-like filters), it allows recursive algorithms to be applied to solve $H_2$ and $H_\infty$ problems [118]. Due to these advantages, Krein space techniques have been used broadly in designing optimal FDI based on different recursive algorithms [119][120][121][122].
The sliding mode observer (SMO) has been widely used for state estimation of uncertain and nonlinear systems in recent decades. This wide application is due to the robustness of the SMO to uncertainties and its capability of reconstructing uncertainties based on the concept of the equivalent injection of the exogenous input [123]. In this concept, the SMO estimates the system states and output using an exogenous input that forces the estimation error to converge to zero in finite time [101,123]. Walcott and Zak introduced one of the first Lyapunov-based SMOs for dynamic systems with bounded disturbances [124], which was later extended to a more general class of nonlinear systems in [125,126]. Drakunov and Utkin introduced the first SMO based on the equivalent control concept [127], which was later applied to a class of nonlinear systems in triangular input form [128]. A sliding mode term was incorporated into the high-gain observer (HGO) to design a robust nonlinear observer for a class of nonlinear Lipschitz systems [129], where the unknown disturbances can be identified by sliding surfaces.
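The equivalent injection idea can be sketched on a scalar system. The plant $\dot{x} = -a x + u + f$ with $y = x$, the switching gain, the first-order low-pass filter used to approximate the equivalent injection, and the forward-Euler integration are all assumptions made for illustration; they are not taken from [123] or the other cited works.

```python
import math

def simulate_smo(t_end=5.0, dt=1e-4, a=1.0, rho=2.0, tau=0.1,
                 t_fault=2.0, f_size=0.5):
    """Return the fault estimate at t_end from a scalar sliding mode observer.

    Plant (assumed):    x'     = -a*x + u + f,        y = x
    Observer (assumed): x_hat' = -a*x_hat + u + nu,   nu = rho*sign(y - x_hat)
    The low-pass filtered injection nu approximates the fault f once sliding.
    """
    x, x_hat, f_hat = 1.0, 0.0, 0.0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        u = math.sin(t)
        f = f_size if t >= t_fault else 0.0
        e_y = x - x_hat                               # output estimation error
        nu = rho * math.copysign(1.0, e_y) if e_y != 0.0 else 0.0
        x += dt * (-a * x + u + f)                    # plant (Euler step)
        x_hat += dt * (-a * x_hat + u + nu)           # observer (Euler step)
        f_hat += dt / tau * (nu - f_hat)              # filtered injection
    return f_hat

f_estimate = simulate_smo()
```

The switching gain `rho` must exceed the fault magnitude for sliding to be maintained. The filter time constant `tau` trades chattering in the estimate against delay, which is related to why a small, slowly growing fault can be hard to distinguish from the injection ripple.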
The implicit or explicit use of differentiation in SMOs restricts them to systems of relative degree one. High-order sliding mode observers were introduced to remove this relative degree restriction and obtain better FDI accuracy [123][130][131][132][133]. The application of sliding mode observers has been widely investigated for linear time-invariant (LTI) systems, and the conditions required for an SMO were defined as the matching and minimum phase conditions [134]. However, satisfying these two conditions is difficult when it comes to FDI for linear discrete time-varying (LDTV) systems; hence, high-order SMOs were proposed to relax these restrictions [123]. Based on these achievements, different SMO approaches have been adapted for FDI in linear and nonlinear systems [132,133].
The main advantage of the SMO is its strong robustness to bounded uncertainties and disturbances. However, this strong robustness makes it insensitive to incipient faults, which are small during their initial phase; consequently, the system might be vulnerable to incipient faults. To tackle this problem, in [135,136], a Luenberger observer was integrated into the SMO to improve its ability to detect incipient sensor faults. Incorporating adaptive thresholds in the residual analysis is another way to improve the performance of the SMO [99].
Overall, the dependency of the model-based FDI approaches on the accuracy of the dynamic model can be considered their main weakness; the remaining advantages and disadvantages of the model-based FDI techniques are summarized in Table 1.

Knowledge-Based Approaches
In contrast to model-based approaches, which require a known model of the system, knowledge-based approaches do not depend on the system model but need a large volume of historical data on the system performance. Various artificial intelligence methods have been applied to detect faults from the historical data sets of industrial systems. Figure 10 shows the overall block diagram of a knowledge-based FDI algorithm. Most knowledge-based FDI approaches formulate the diagnostic problem as a pattern recognition problem, which can be solved by using either statistical or non-statistical techniques. Thus, we classify the knowledge-based FDI approaches into statistical-analysis-FDI and non-statistical-analysis-FDI.

Statistical-Analysis-FDI
In the statistical FDI framework, most of the methods are based on principal component analysis (PCA), independent component analysis (ICA), partial least squares (PLS), statistical pattern classifiers, and support vector machine (SVM) algorithms. These algorithms must be trained on a large amount of historical data of the system performance before they can be used for FDI.
PCA is one of the most popular monitoring techniques. It finds factors with a much lower dimension than the original data set, so that the main variations in the original data can be captured [18]. FDI designs based on PCA methods have been successfully applied to complex systems, e.g., diagnosing faults in diesel engines [137], rolling bearings [138], electrical power drives [139], and power inverters [140]. However, the fact that PCA can only be applied to Gaussian-distributed data limits its use for non-Gaussian processes.
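As a rough illustration of PCA-based monitoring, the sketch below fits a PCA model on fault-free data and flags samples whose Hotelling T² or squared prediction error (SPE) exceeds a control limit. It is a generic sketch rather than the method of any cited work: the synthetic five-sensor process in `make_data`, the function names, and the use of empirical training quantiles as control limits are all assumptions made for illustration.

```python
import numpy as np

def monitoring_statistics(model, X):
    """Return Hotelling T^2 and SPE (Q) statistics for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                       # scores in the principal subspace
    t2 = np.sum(T**2 / model["var"], axis=1)
    E = Z - T @ model["P"].T                 # part not explained by the model
    spe = np.sum(E**2, axis=1)
    return t2, spe

def fit_pca_monitor(X_train, n_components, quantile=0.99):
    """Fit a PCA monitoring model on fault-free data (rows are samples)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    # Right singular vectors of the scaled data are the principal directions.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    model = {
        "mean": mean,
        "std": std,
        "P": Vt[:n_components].T,
        "var": s[:n_components] ** 2 / (len(Z) - 1),
    }
    t2, spe = monitoring_statistics(model, X_train)
    # Assumption: empirical quantiles of the training statistics are used
    # as control limits instead of theoretical distributions.
    model["t2_limit"] = np.quantile(t2, quantile)
    model["spe_limit"] = np.quantile(spe, quantile)
    return model

def detect(model, X):
    """Boolean alarm per sample: either statistic above its limit."""
    t2, spe = monitoring_statistics(model, X)
    return (t2 > model["t2_limit"]) | (spe > model["spe_limit"])

def make_data(rng, n, bias=0.0):
    """Hypothetical process: five sensors driven by two latent factors."""
    latent = rng.normal(size=(n, 2))
    loadings = np.array([[1.0, 1.0, 0.0, 0.0, 1.0],
                         [0.0, 0.0, 1.0, 1.0, 1.0]])
    X = latent @ loadings + 0.1 * rng.normal(size=(n, 5))
    X[:, 0] += bias                          # additive fault on sensor 0
    return X
```

In this setup a bias on one sensor breaks the correlation structure learned from healthy data, so it appears mainly in the SPE statistic, while T² reacts to unusually large variations inside the principal subspace.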
PLS is another statistical tool for FDI in industrial systems. Recent achievements in the implementation of PLS-based FDI can be found in [141][142][143][144][145]. In [141], a dynamic PLS algorithm was used for dynamic modeling and was able to capture the dynamic correlation between the measurement block and the quality data block; then, for FDI purposes, a total PLS (T-PLS) algorithm was introduced to further decompose the obtained PLS structure. The proposed T-PLS method was able to detect quality-related anomalies in the system process. Ding et al. introduced an improved PLS-based FDI scheme based on the key performance indicator [142]. In comparison with standard PLS, their approach significantly reduced the computation load. An FDI scheme based on a combination of the PLS method and pseudo-sample projection for complex nonlinear systems was presented in [143]. Further investigations to improve PLS-based FDI methods were presented in [144,145].
Among statistical approaches, ICA has a key role in real-time monitoring because it does not require the latent variables to follow a Gaussian distribution. In [146], a kernel-ICA-based FDI scheme was introduced for non-Gaussian nonlinear systems. ICA has been successfully implemented for designing FDI in industrial processes, e.g., rolling-element bearings [147] and induction motors [148]. Recently, an FDI design based on a combination of ICA and PCA was introduced in [149]. In that approach, because PCA cannot deal with non-Gaussian processes, ICA was used for monitoring the non-Gaussian part of the process, and PCA was used for monitoring the Gaussian part. Then, a Bayesian network classifier was used to detect the fault in the system.
Among statistical FDI techniques, the SVM is relatively new, and owing to its ability to work with both large and small numbers of input features, it has the potential to generalize better. Another advantage of the SVM is its ability to work with both Gaussian and non-Gaussian data. In [150], most of the SVM schemes for FDI design published up to 2006 were reviewed. Recent studies of SVM-based FDI can be found in [151][152][153][154][155][156]. In [151], an SVM-based FDI scheme was developed by integrating a kernel function with cross-validation. The authors showed that their proposed FDI algorithm is more accurate than conventional PLS algorithms. In [152], a genetic algorithm was used to tune the SVM parameters to improve FDI performance. Yi and Etemadi introduced an FDI design for a photovoltaic system based on multi-resolution signal decomposition and a two-stage SVM [153]. In their design, the multi-resolution signal decomposition algorithm was used for feature extraction, and the two-stage SVM was used for decision making. Various applications of the SVM for FDI design have been investigated, e.g., wireless sensor networks [154], ultrasonic flow meters [155], and ship propulsion systems [156].
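To make the SVM-based classification step more concrete, the following sketch trains a linear soft-margin SVM that separates healthy from faulty feature vectors. It deliberately swaps in a plain linear SVM trained by full-batch subgradient descent on the hinge loss, not the kernel, genetic-algorithm, or two-stage schemes cited above. The two-dimensional synthetic features in `make_features`, the hyperparameters, and all function names are assumptions for illustration only.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Minimise lam/2*||w||^2 + mean(hinge loss); labels y are -1 or +1."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0               # samples violating the margin
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, X):
    """Return +1 (faulty) or -1 (healthy) for each row of X."""
    return np.where(X @ w + b >= 0.0, 1.0, -1.0)

def make_features(rng, n_per_class):
    """Hypothetical features: healthy near (0, 0), faulty near (3, 3)."""
    healthy = rng.normal(loc=0.0, scale=0.5, size=(n_per_class, 2))
    faulty = rng.normal(loc=3.0, scale=0.5, size=(n_per_class, 2))
    X = np.vstack([healthy, faulty])
    y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])
    return X, y
```

In a realistic FDI pipeline the feature vectors would come from a preprocessing stage such as the signal decomposition mentioned above, and a kernel SVM from an established library would normally replace this hand-written solver.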

Non-Statistical-Analysis-FDI
The artificial neural network (ANN) is a powerful tool for the approximation of nonlinear systems and for adaptive learning. Owing to this ability, the ANN is the most popular non-statistical data-driven tool for designing FDI systems. Based on the learning strategy, ANN-based FDI can be classified into supervised and unsupervised approaches. An unsupervised ANN uses historical data to learn to emulate the normal behavior of the system; its output is then compared with the behavior of the real-time process to check whether any deviation from the expected behavior occurs. A supervised ANN is trained with knowledge of both the normal and the faulty conditions, which is then used for FDI in the real-time process. ANN-based FDI has been applied in various applications, e.g., combustion engines [157], railway track circuits [158], wind turbine drive-trains [159], and microgrid systems [160,161].
Fuzzy logic (FL) is another non-statistical approach that can be used for FDI. In FL-based FDI, the feature space is partitioned into fuzzy sets, and fuzzy rules are then designed based on human reasoning to analyze the system behavior. Figure 11 shows the overall block diagram of an FL-based FDI system. FL has been successfully implemented in FDI system design. For example, in [162], an FL-based FDI design was introduced for the detection of intermittent loss of firing pulses in the power switches of a pulse-width modulation voltage source inverter (PWM-VSI). In this design, the fault modes obtained from the current analysis of the PWM-VSI were used to extract the fuzzy approximation model. Among various FL approximation models, the Takagi-Sugeno (T-S) model is the most efficient and has received a great deal of attention [122,163,164,165]. In particular, in [163][164][165], a T-S dynamic modeling technique was used to design the FDI system for general nonlinear processes. To this end, a universal T-S fuzzy observer-based residual generator was created by using fuzzy Lyapunov functions; then, it was integrated with an embedded dynamic threshold for real-time FDI in an industrial process. Luo et al. introduced a T-S approach for FDI in a class of fuzzy network systems with multiplicative noises on the measurements and states [122]. In their design, the T-S FL FDI was constructed to generate the residual signal, and an adjustable threshold was defined for efficient detection of the fault.
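The partition-and-rules idea can be sketched with a very small zero-order Sugeno-type evaluator that maps two normalized residual features to a fault degree between 0 and 1. Everything here is hypothetical and chosen only for illustration: the two inputs (residual magnitude `m` and residual trend `d`), the membership breakpoints, the four rules, and their consequents are not taken from the cited designs.

```python
import numpy as np

def memberships(m, d):
    """Fuzzify residual magnitude m and residual trend d (both normalised)."""
    m, d = abs(m), abs(d)
    return {
        # np.interp holds the end values outside the given breakpoints.
        "m_small": np.interp(m, [0.0, 0.5], [1.0, 0.0]),
        "m_medium": np.interp(m, [0.0, 0.5, 1.0], [0.0, 1.0, 0.0]),
        "m_large": np.interp(m, [0.5, 1.0], [0.0, 1.0]),
        "d_small": np.interp(d, [0.0, 1.0], [1.0, 0.0]),
        "d_large": np.interp(d, [0.0, 1.0], [0.0, 1.0]),
    }

def fault_degree(m, d):
    """Weighted average of rule consequents (zero-order Sugeno inference)."""
    mu = memberships(m, d)
    rules = [
        # (firing strength, consequent); AND is implemented with min.
        (mu["m_small"], 0.0),                        # small residual: healthy
        (min(mu["m_medium"], mu["d_small"]), 0.3),   # medium and steady
        (min(mu["m_medium"], mu["d_large"]), 0.7),   # medium and growing
        (mu["m_large"], 1.0),                        # large residual: faulty
    ]
    strengths = np.array([s for s, _ in rules])
    outputs = np.array([c for _, c in rules])
    return float(strengths @ outputs / strengths.sum())
```

For instance, `fault_degree(0.5, 1.0)` fires only the "medium and growing" rule and therefore returns its consequent, 0.7. A decision stage would then compare this degree with a threshold.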
The summarized advantages and disadvantages of the reviewed knowledge-based FDI strategies are presented in Table 2.

Combined Model-Knowledge-Based Approach
Model-based and knowledge-based approaches each have distinct advantages and restrictions. In particular, model-based FDI approaches can diagnose a fault with minimal computational load, which makes them suitable for real-time applications. However, the accuracy of the detection depends on the accuracy of the mathematical model of the system. On the other hand, knowledge-based approaches do not depend on the system model, which makes them suitable for complex industrial systems whose models are unavailable or difficult to obtain. However, knowledge-based approaches need a large amount of training data, suffer from a high computational load, and may not be able to detect undefined fault types. To leverage the advantages of these two types of FDI approaches and reduce their dependency on model accuracy and computational power, a combination of the two methods was suggested. Talebi et al. introduced an integrated recurrent ANN and nonlinear observer to design an FDI system for sensor and actuator faults in a satellite [166]. The general structure of a combined model-based and ANN-based FDI strategy is presented in Figure 12.
Abbaspour et al. used an EKF approach to improve the accuracy and response speed of the ANN observer; then, by integrating the ANN observer with a nonlinear observer, an FDI system for the detection of sensor and actuator faults in nonlinear systems was developed [27,167]. A combination of the SVM and a model-based observer was introduced by Sheibat et al. to design FDI for chemical reactors [168]. They found that using only the SVM for FDI would make fault detection difficult because of the highly nonlinear behavior of the system and its transitional dynamics. Thus, they used a model-based observer built on a simplified initial model, and the SVM was used to correct for the uncertainties and nonlinearities of the system behavior. Their proposed approach was effective in the detection and isolation of faults. Depending on the design procedure and criteria, different combinations of model-based and knowledge-based approaches can be applied to achieve an effective FDI, e.g., PLS with an inverse dynamic model observer [169], and a hidden Markov model with an ICA approach [170].
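The general idea of correcting a simplified model with a data-driven term can be sketched as follows. This is a strongly simplified stand-in rather than any of the cited designs: a static input-output map replaces the dynamic observer, an ordinary least-squares fit on polynomial features replaces the SVM or ANN, and the plant, the threshold rule, and all names are assumptions made for illustration.

```python
import numpy as np

def plant(u, rng, sensor_bias=0.0):
    """Hypothetical true process: mildly nonlinear, with measurement noise."""
    noise = 0.05 * rng.normal(size=u.shape)
    return 2.0 * u + 0.5 * u**2 + noise + sensor_bias

def simplified_model(u):
    """Model-based part: a deliberately simplified linear model."""
    return 2.0 * u

def features(u):
    """Regressors for the data-driven correction term."""
    return np.column_stack([np.ones_like(u), u, u**2])

def fit_correction(u, y):
    """Learn the model mismatch from fault-free data by least squares."""
    mismatch = y - simplified_model(u)
    coef, *_ = np.linalg.lstsq(features(u), mismatch, rcond=None)
    return coef

def residual(u, y, coef=None):
    """Residual of the model alone, or of the model plus learned correction."""
    r = y - simplified_model(u)
    if coef is not None:
        r = r - features(u) @ coef
    return r
```

With the correction applied, the fault-free residual shrinks to roughly the noise level, so a threshold such as four standard deviations of the training residual can separate a sensor bias that the simplified model alone would mask behind its own modeling error.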

Conclusions
FTC systems, as an emerging and attractive topic in the field of automatic control, have received increasing attention in recent years. In this work, the concepts of fault, failure, and fault-tolerant controllers were investigated and illustrated. Passive and active FTC approaches were analyzed and compared, and the reasons for the growing interest in active FTC among researchers were summarized. An active FTC approach is generally more efficient in dealing with different types of faults; however, the controller performance depends primarily on the ability of its FDI unit to provide timely and accurate fault information. Thus, FDI systems were investigated in depth, and we categorized them, based on the approaches used in their design, into three main categories: model-based, knowledge-based, and combined model-knowledge-based approaches. The model-based approaches are simple to implement; however, their performance is highly dependent on the accuracy of the mathematical model of the system. The knowledge-based approaches do not depend on the mathematical model of the system; however, they need a large amount of historical data on the system performance for training purposes. The combined model-knowledge-based approach depends less on model accuracy and needs less training data; however, the design complexity increases, and the designer should have knowledge of both approaches to design an efficient system. Furthermore, the recent achievements in FDI approaches and active FTC techniques were reviewed, and their advantages and disadvantages were discussed.
Author Contributions: A.A. prepared an initial draft and overall structure of this survey paper for fault tolerant control techniques and invited other authors to contribute and expand the research scope. S.M. and A.S. were involved in reviewing the most updated approaches in the field of FTC. K.K.Y. supervised the research and co-wrote the paper. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.