Fault Detection and Isolation for Redundant Inertial Measurement Unit under Quantization

Fault detection and isolation with a redundant strapdown inertial measurement unit is critical for ensuring the reliability of guidance and navigation systems in both aeronautics and astronautics. Although the parity space approach is widely used, under pulse quantization it cannot detect the soft fault, which affects navigation performance. This paper develops three-channel filters to detect the soft fault and analyzes their implementation theoretically. The constraint conditions on the filter parameters are explored and the influence of different weight ratios is analyzed. A Monte Carlo simulation is carried out to verify the validity of the fault detection and isolation method. The simulation results and their analysis provide a theoretical reference for fault detection and isolation with a redundant strapdown inertial measurement unit.


Introduction
In many aerospace applications, the reliability of inertial sensors is critically important [1]. Two methods are used to enhance the reliability of a navigation system: one is to increase the precision of a single inertial device, for which the achievable gain is limited; the other is to integrate multiple inertial sensors (more than three) in an appropriate redundant configuration, which greatly improves the reliability and precision of the navigation system [2]. The Delta 4 rocket of the USA uses two sets of laser strapdown inertial measurement units (IMUs) in a relative-rotation installation. The European Space Agency's (ESA) Ariane 5 rocket uses two sets of laser gyroscope strapdown IMUs in a coaxial installation. Japan's H-2A rocket uses a strapdown IMU configuration of three orthogonal axes plus one inclined axis. To provide redundant measurement information and improve the reliability and accuracy of their navigation systems, most of China's new launch vehicles also use redundant strapdown IMUs; for example, the CZ-6 rocket uses double-eight strapdown inertial sensors, and the YZ-1 rocket uses ten redundant strapdown inertial sensors. However, if an inertial sensor fails during flight, the faulty data, once fused, may contaminate the whole navigation solution, which degrades navigation performance and seriously threatens flight safety. For example, the inertial navigation system of China's CZ-3B rocket failed in 1996. The rocket and the satellite mounted on it were destroyed because of the sensor failure, which caused heavy casualties and property damage. Therefore, timely and accurate fault detection and isolation (FDI) of the redundant IMU in a rocket is necessary.
Many scholars have developed variants of the parity space approach (the generalized likelihood method [3,4], the optimal parity vector method [5][6][7] and the singular value decomposition (SVD) method [8]) for the fault detection and isolation of redundant IMUs; these variants differ mainly in how the decoupling matrix is constructed. Current research focuses mainly on improving the parity space approach in order to enhance the navigation performance of the redundant IMU. For example, Kwang [9] and Lee [10] proposed a new FDI method, based on the extended parity space approach, that is suitable for two faulty sensors. In 2012, Lee [11] applied the parity space analysis method to principal component analysis (PCA) using six single-axis gyroscopes arranged along the conical surface of a strapdown inertial navigation system (SINS), but this method cannot reliably detect the soft fault (a fault of small magnitude that affects navigation performance). Seung [12] used a hybrid fault detection method combining the parity equation and the discrete wavelet transform for the inertial navigation sensors of an unmanned aerial vehicle (UAV) and achieved good detection results. Élcio [13] proposed a fault detection method based on PCA and SVD, mainly for minimal-redundancy systems, which divides FDI into two parts and achieves good results on low-level and step-bias faults. Zhang [14] found that the decoupling matrix vectors of the optimal parity method are linearly dependent, which leads to a high false alarm rate. To reduce the high false alarm rate and high isolation error rate of the SVD method presented in [8], Wen [15] improved the decoupling matrix construction method with the parity vector unit and obtained a lower false alarm rate.
However, the existing approaches cannot detect the soft fault correctly after quantization. A laser gyroscope converts the angle increment into a number of pulses, which is its measurement output. This output is not a real angular velocity but the angular increment over the sampling period, quantized in pulses. Similarly, the accelerometer's output is not a real-time apparent acceleration but the velocity increment over the sampling period, again in quantized form. The quantization of the laser gyroscope and the accelerometer may disturb the error structure of the original information and introduce many transient errors during low maneuvers, which makes fault detection difficult. Yi [16] only introduced the low-pass filter concept and combined it with the generalized likelihood test (GLT) method for fault diagnosis.
To date, no researchers have put forward a complete filtering method with a theoretical analysis for the FDI of a redundant strapdown inertial measurement unit (RIMU). This paper develops three-channel filters and analyzes their implementation theoretically. The method can also be applied to micro-grid applications such as faults in solar and wind systems [17][18][19][20][21]. The paper proposes the constraint conditions on the filter parameters and analyzes the effect of different weight ratios on performance. The paper is organized as follows: Section 2 provides the theoretical background of the parity space approach, analyzes the existing problems and presents the constraint conditions on the parameters of the three-channel filters. Section 3 gives the simulation results and their analysis. Section 4 presents the conclusions.

The Parity Space Approach
A RIMU usually has more than three inertial sensors (such as rate gyroscopes, accelerometers and magnetometers). This paper focuses on IMUs with at least five inertial sensors for fault detection and isolation. The measurement equation is defined as

Z = HX + ε,

where Z ∈ R^{n×1} is the measurement, H ∈ R^{n×3} is the observation matrix defined by the installation structure of the RIMU, X ∈ R^{3×1} is the inertial state, n is the number of gyroscopes, and ε ∈ R^{n×1} is the measurement noise vector, which satisfies E(ε) = 0 and E(εε^T) = σ^2 I. We define the decoupling matrix V, which satisfies the following conditions:

VH = 0, VV^T = I.

The decoupling matrix can be solved from the observation matrix with the Potter algorithm [22], which requires V to be an upper triangular matrix with positive diagonal elements, so that the orthogonality conditions fully determine its elements. With the decoupling matrix V, the following set of linearly independent parity equations is obtained: p = VZ, where p is the parity vector. A detection function DF_d is computed from the parity vector and compared with a threshold T_d:
If DF_d > T_d, we decide that a fault has occurred.
If DF_d ≤ T_d, we decide that no fault has occurred.
Here, T_d is a time-varying threshold set beforehand by the Monte Carlo method. For example, we generate 100 detection function sequences based on a fault-free digital flight trajectory. For each sampling time, we sort the values in ascending order and take the (100 − ζ)th value as the threshold for a given false alarm rate ζ (expressed in percent). After a fault is detected, a fault isolation function DFI_j is evaluated for each sensor, where V_j is the jth column of the decoupling matrix V. We calculate DFI_j (j = 1, 2, ..., m) and find the maximum, say DFI_k. Then, we assume that the kth sensor has a fault.
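The parity-space procedure above can be illustrated with a minimal Python sketch. Several parts of it are assumptions rather than the formulas of this paper: the decoupling matrix is taken from an SVD null-space basis instead of Potter's algorithm (it still satisfies VH = 0 and VV^T = I, but is not upper triangular), the detection function is assumed to be DF_d = p^T p, and the isolation function is assumed to be DFI_j = (p^T V_j)^2 / (V_j^T V_j), the forms commonly used with the generalized likelihood test. The function names and the conical six-sensor geometry are hypothetical.

```python
import numpy as np


def decoupling_matrix(H, tol=1e-10):
    """Return V with V @ H = 0 and V @ V.T = I (left null space of H).

    Assumption: SVD-based construction, not Potter's algorithm.
    """
    U, s, _ = np.linalg.svd(H, full_matrices=True)
    rank = int(np.sum(s > tol))
    return U[:, rank:].T  # shape (n - rank, n)


def detect_and_isolate(V, Z, threshold):
    """Return (DF, DFI, index of the suspected sensor or None)."""
    p = V @ Z                                    # parity vector, independent of X
    DF = float(p @ p)                            # assumed detection function
    DFI = (p @ V) ** 2 / np.sum(V ** 2, axis=0)  # assumed isolation function
    suspect = int(np.argmax(DFI)) if DF > threshold else None
    return DF, DFI, suspect


def monte_carlo_threshold(fault_free_DF, n_exceed):
    """Per-sample threshold from fault-free runs (rows: runs, columns: times).

    With 100 runs and n_exceed = zeta, this picks the (100 - zeta)th
    smallest value at every sampling time.
    """
    ordered = np.sort(np.asarray(fault_free_DF), axis=0)
    return ordered[ordered.shape[0] - n_exceed - 1]


# Hypothetical geometry: six sensing axes evenly spaced on a cone.
cone = np.deg2rad(54.7356)
azimuth = np.deg2rad(60.0 * np.arange(6))
H = np.column_stack([np.sin(cone) * np.cos(azimuth),
                     np.sin(cone) * np.sin(azimuth),
                     np.full(6, np.cos(cone))])
V = decoupling_matrix(H)

Z = H @ np.array([0.1, -0.2, 0.3])  # noise-free measurement of some state X
Z[2] += 1.0                         # bias fault injected on the third sensor
DF, DFI, suspect = detect_and_isolate(V, Z, threshold=0.1)
```

Because p = VZ = V(ε + fault), the state X drops out of the parity vector, which is why the detection function can be thresholded without knowing the trajectory.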

Problem Brought by Quantization
The output of the laser IMU is a number of quantized pulses; the quantization process is shown in Figure 1. The measurement Z is divided by the pulse equivalent ∆, which is the numerical value of the angular velocity corresponding to one pulse. The integer part is the measurement output Q, and the remainder is carried over to the next sampling period. The error between Q × ∆ and Z is the quantization error. Taking a gyroscope as an example, the pulse equivalent is set to 1″ (4.8 × 10^−6 rad) and the sampling period T is 20 ms. When the angular increment in a sampling period is less than one pulse equivalent, the output of that period is 0, and the angular increment of period k is accumulated into period k + 1 for output. One pulse in a sampling period corresponds to an angular rate of 1″/T = 50 °/h. That is to say, after quantization, the instantaneous error can be magnified by a factor of more than 50, which causes great difficulty in fault detection.
Quantization errors therefore make it difficult to detect faults correctly. The fault-tolerant air data inertial reference system (FT/ADIRS) developed by Honeywell is the most successful engineering product. Its core component is a strapdown inertial reference system consisting of six GG1320 laser gyroscopes and six accelerometers (Hexad system) [23]. We take the hexad system as an example, as shown in Figure 2. After quantization, we obtain the inertial sensor data, with a gyroscope bias of 0.05 °/h. We add faults of different amplitudes to the 1st gyroscope at 500 s. The simulation results are shown in Figure 3.
Figure 1. The quantization process.
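The quantization process described above (integer number of pulses output, remainder carried to the next period) can be sketched in Python as follows. This is an illustrative model only; the function names are hypothetical, and truncation toward zero is assumed for the "integer part".

```python
import math


def quantize_increments(increments, pulse):
    """Convert per-period increments into integer pulse counts.

    The part of each increment that does not fill a whole pulse is kept
    as a residual and added to the next sampling period.
    """
    residual = 0.0
    pulses = []
    for dz in increments:
        total = residual + dz
        q = math.trunc(total / pulse)   # integer part is the output Q
        residual = total - q * pulse    # remainder accumulates
        pulses.append(q)
    return pulses, residual


def pulse_rate_deg_per_hour(pulse_arcsec, period_s):
    """Angular rate represented by one pulse per sampling period."""
    arcsec_per_s = pulse_arcsec / period_s
    return arcsec_per_s / 3600.0 * 3600.0   # arcsec/s -> deg/s -> deg/h


# Increments of 0.4 pulse per period: output is 0, 0, then 1 pulse.
counts, left_over = quantize_increments([0.4, 0.4, 0.4], pulse=1.0)
rate = pulse_rate_deg_per_hour(1.0, 0.02)   # 1 arcsec pulse, 20 ms period
```

The helper `pulse_rate_deg_per_hour` reproduces the figure quoted in the text: a single 1″ pulse in a 20 ms period represents 50 °/h, far above a 0.05 °/h bias, so a small fault is buried in the pulse-to-pulse fluctuation of the raw output.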
As Figure 3a shows, the detection function value does not exceed the threshold at any time in the fault detection phase when the fault amplitude is 5 °/h. However, when the fault amplitude increases to 100 °/h, the fault can be detected with this method. At that amplitude, the detection performance is essentially the same as with unquantized measurements. That is to say, with this pulse equivalent, the quantization error raises the detectable fault amplitude by a factor of about 20 (from 5 °/h to 100 °/h). This paper develops the three-channel filter to solve the problems brought about by quantization.

Fault Classification
Only when the amplitude of a gyroscope's fault increases to 100 °/h is the detection performance improved. There are three categories of fault according to amplitude: hard fault, medium fault and soft fault [24]. The faults of a laser gyroscope fall into the categories shown in Table 1 (gyroscope bias: 0.05 °/h).

Fault Type    Fault Magnitude
Hard fault    >100

A hard fault has a comparatively large amplitude and primarily affects flight control performance. A medium fault has a medium amplitude and affects pilot display performance. A soft fault has a comparatively small amplitude and affects navigation performance.

The Three-Channel Filter
For a navigation system, the tolerance to a fault depends on its amplitude. The inertial sensor output is integrated, so its errors accumulate over time. A hard fault must be detected immediately. A medium fault allows the fault tolerance system a short time delay. A soft fault allows a somewhat longer delay. Therefore, faults of different amplitudes call for different detection strategies. We use the three-channel filter to detect the three categories of fault in order to achieve better performance. The fault detection method based on the three-channel filter is shown in Figure 4.
As shown in Figure 4, the channels operate independently and do not affect one another. The faults are first classified according to their amplitude and then handled by different fault detection channels. The original channel filter detects the hard fault, the first-order channel filter detects the medium fault, and the second-order channel filter detects the soft fault.

The Filter Design
Based on an analysis of the low-pass filter, we identify the two filter properties that constrain fault detection performance: the cut-off frequency and the rise time. The cut-off frequency mainly affects the false alarm rate and the false isolation rate, while the rise time mainly determines the delay of the FDI system. However, a low cut-off frequency and a short rise time are conflicting requirements; therefore, we take the following compromise approach and analyze the filter parameters theoretically.

Original Channel
Because a hard fault has a large amplitude and must be detected without time delay, the original channel does not preprocess (filter) the measurements.

First-Order Channel Filter
The first-order channel filter uses a first-order low-pass filter with time constant T, G(s) = 1/(Ts + 1). The rise time of the system represents its rapidity.
The low-pass filter eliminates high-frequency noise so that only low-frequency signals pass. When the signal frequency is below the cut-off frequency, the amplitude attenuation and phase shift are relatively small; when the signal frequency exceeds the cut-off frequency, both become quite large. A small cut-off frequency is therefore desirable for noise suppression. To balance it against the rise time, suppose there is an index function y(T), where k_1 and k_2 are the weights of the rise time and the cut-off frequency, respectively, which are decided by the required performance of the FDI system. A reasonable choice of k_1 and k_2 makes the value of y(T) minimal. Differentiating y(T) with respect to the time constant T and letting y′(T) = 0, we obtain the time constant in terms of α, where α expresses the ratio of cut-off frequency to rise time in the first-order channel filter. It also indirectly indicates the ratio of false alarm rate to delay time for the medium fault.
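As an illustration of the first-order channel, the following Python sketch discretizes a unit-gain first-order low-pass filter with time constant T at the sampling period used in this paper (20 ms). The discretization (exact exponential step) and the function name are assumptions made for the example, and the value 0.434 is interpreted here as a time constant in seconds, which the text does not state explicitly.

```python
import math


def first_order_lowpass(samples, time_constant, dt, initial=0.0):
    """Discrete first-order low-pass filter, unit DC gain.

    y[k] = a * y[k-1] + (1 - a) * x[k], with a = exp(-dt / time_constant).
    """
    a = math.exp(-dt / time_constant)
    y = initial
    out = []
    for x in samples:
        y = a * y + (1.0 - a) * x
        out.append(y)
    return out


# Step response with the first-order parameter quoted in the text.
step = [1.0] * 500                       # 10 s of data at 20 ms
response = first_order_lowpass(step, time_constant=0.434, dt=0.02)
```

A larger time constant lowers the cut-off frequency (smoother output, fewer false alarms) but lengthens the rise time (longer detection delay), which is the trade-off weighted by k_1 and k_2.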

Second-Order Filter Channel
The second-order channel filter is a critically damped second-order system with the transfer function G(s) = 1/(Ts + 1)^2. The time constant of the second-order channel filter should be determined by the fault tolerance time of the soft fault. The smaller the time constant is, the faster the second-order channel filter responds. Thus, we should select a reasonable time constant that gives the channel filter both a short response time and a low cut-off frequency. The weights of the performance indices of the response time and the cut-off frequency are k_3 and k_4, respectively. The rise time of the critically damped system is proportional to the time constant T, and the cut-off frequency is inversely proportional to T. The index function of the second-order channel filter is built from these two quantities with the weights k_3 and k_4. Proceeding as for the first-order channel filter, we obtain the time constant in terms of β. According to the required performance of the FDI method, we select k_3 and k_4 reasonably and obtain a proper time constant, where β expresses the ratio of cut-off frequency to rise time in the second-order channel filter. It also indirectly indicates the ratio of false alarm rate to delay time for the soft fault.
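A critically damped second-order low-pass filter with time constant T is equivalent to two identical first-order lags in series, which gives a simple discrete sketch. The cascade realization, the discretization and the function name are assumptions for illustration, not the implementation used in this paper.

```python
import math


def critically_damped_lowpass(samples, time_constant, dt, initial=0.0):
    """Discrete approximation of G(s) = 1 / (T s + 1)^2.

    Implemented as two cascaded first-order sections sharing the same
    coefficient a = exp(-dt / time_constant).
    """
    a = math.exp(-dt / time_constant)
    stage1 = initial
    stage2 = initial
    out = []
    for x in samples:
        stage1 = a * stage1 + (1.0 - a) * x        # first lag
        stage2 = a * stage2 + (1.0 - a) * stage1   # second lag
        out.append(stage2)
    return out


# Unit step response; the continuous-time response is
# 1 - (1 + t/T) * exp(-t/T), about 0.264 at t = T.
response = critically_damped_lowpass([1.0] * 5000, time_constant=1.0, dt=0.001)
```

Compared with a single first-order lag of the same time constant, the cascade rises more slowly but attenuates high-frequency quantization noise more strongly, consistent with assigning this channel to the soft fault.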

Brief Description
The previous two subsections present the formulas for the channel filter parameters, which are determined mainly by the weight ratio of cut-off frequency to rise time. The channel filter parameters are therefore chosen mainly by analyzing changes in this ratio. Taking the second-order channel filter as an example, a large ratio leads to a large filter parameter, which means that the FDI system will have a lower false alarm rate and false isolation rate but a longer delay time. To detect the soft fault, it is reasonable for the second-order channel filter to have a large parameter.
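The effect of the weight ratio on the time constant can be illustrated numerically. The sketch below assumes a particular index function, y(T) = k_rise · c · T + k_cut / T, where c · T is the rise time and 1/T the cut-off frequency; this form, the coefficient c = ln 9 (the 10% to 90% rise time of a first-order lag) and all names are assumptions chosen for illustration and are not taken from this paper. Under that assumption, setting y′(T) = 0 gives T = sqrt(ratio / c) with ratio = k_cut / k_rise.

```python
import math


def index_function(T, k_rise, k_cut, rise_coeff):
    """Assumed trade-off index: weighted rise time plus weighted cut-off."""
    return k_rise * rise_coeff * T + k_cut / T


def optimal_time_constant(ratio, rise_coeff):
    """Minimizer of the assumed index, with ratio = k_cut / k_rise."""
    return math.sqrt(ratio / rise_coeff)


c = math.log(9.0)                        # rise-time coefficient (assumption)
for ratio in (0.5, 5.0, 50.0):
    T_opt = optimal_time_constant(ratio, c)
    y_opt = index_function(T_opt, 1.0, ratio, c)
    print(f"ratio={ratio:5.1f}  T_opt={T_opt:.3f}  y={y_opt:.3f}")
```

Whatever the exact index function, the qualitative behaviour matches the text: giving more weight to the cut-off frequency (a larger ratio) yields a larger time constant, and hence lower false alarm and false isolation rates at the cost of a longer delay.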

Experiments Scheme
Experiments with step, ramp and square wave faults are carried out to evaluate the performance of the method. We add the faults to the 1st gyroscope at 500 s. The gyroscope error contains bias, scale factor error, installation error and random walk. The parameters used in the experiments are shown in Table 2.

The Comparison of Different Parameters
According to the analysis in Section 2, the parameters of the channel filter are not unique. Therefore, a comparison is made using step faults with amplitudes of 5 °/h and 0.5 °/h in order to verify the correctness of the method proposed in this paper. The two fault amplitudes are added to the laser gyroscope. The parameters selected for the first-order and second-order channel filters are 0.434 and 3.85 (from Table 5), respectively.
Figure 5 shows that the fault detection function of the original channel filter never exceeds its threshold value, indicating that no fault has been detected. Figure 6 shows that the fault detection function exceeds the threshold value after 500 s, indicating that the first-order channel filter has detected a fault. Furthermore, its fault isolation function separates after 500 s, indicating that this channel filter is also able to isolate the fault. Figure 6 also shows that the fault is both detected and isolated correctly. The comparison of Figures 6 and 7 indicates that the first-order channel filter has a relatively short detection delay but a lower detection accuracy (the curve is rougher), whereas the second-order channel filter has a long detection delay but a high detection accuracy (the curve is smoother). Therefore, it is concluded that a short fault detection delay and a high detection accuracy are conflicting requirements.
Figures 8-10 show that the original channel filter and the first-order channel filter cannot detect a fault of such a small amplitude. Figure 10 shows that the fault detection function has exceeded the threshold value before 500 s, indicating that it has detected a fault.
Figures 5-10, which use the same set of channel filter parameters, show that the first-order channel filter and the second-order channel filter can detect and isolate the fault of large amplitude. However, the original channel filter and the first-order channel filter cannot detect the fault of small amplitude; only the second-order channel filter can.

The Monte Carlo Simulation
The first-order channel filter and the second-order channel filter use the same principles for parameter selection, so we take the second-order channel filter as an example. The time constant, fault magnitude and sampling time are considered in the Monte Carlo analysis.
For different parameters of the second-order channel filter (based on the ratio of cut-off frequency to rise time), the false alarm rate, false isolation rate and positive detection rate are computed from 1000 Monte Carlo runs. The statistical results are shown in Table 3 (typical fault). In this table, the false alarm rate is defined as the probability of an alarm without fault injection; the false isolation rate is defined as the probability of correctly detecting a fault but not isolating it correctly. PCD_1 refers to the probability that a fault is detected within 1 s. PCD_2 refers to the probability that a fault is detected within 1 to 10 s. PCD_3 refers to the probability that a fault is detected within 10 to 100 s. PCD_4 refers to the probability that a fault is detected after more than 100 s. Table 3 leads to the conclusion that as the parameter of the second-order channel filter increases, its false alarm rate and false isolation rate decrease, but the channel filter can no longer detect a fault within a short time. Hence, we design and select the appropriate time constant according to the actual conditions of the RIMU.
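The statistics defined above can be computed from a set of Monte Carlo runs as in the following Python sketch. The trial representation and the function names are hypothetical, and two choices are assumptions because the paper's formulas are not reproduced here: the false isolation rate and the PCD values are normalized by the number of fault-injected runs, and the bin boundaries (1 s, 10 s, 100 s) are treated as inclusive upper limits.

```python
from typing import Optional, Sequence, Tuple

# (fault_injected, detection_delay_s or None if no alarm, isolated_correctly)
# For a fault-free run, any non-None delay counts as a false alarm.
Trial = Tuple[bool, Optional[float], bool]


def _ratio(count: int, total: int) -> float:
    return count / total if total else float("nan")


def _delay_bin(delay: float) -> int:
    """Index 0..3 for PCD_1..PCD_4."""
    for i, upper in enumerate((1.0, 10.0, 100.0)):
        if delay <= upper:
            return i
    return 3


def fdi_statistics(trials: Sequence[Trial]) -> dict:
    fault_free = [t for t in trials if not t[0]]
    faulty = [t for t in trials if t[0]]

    false_alarms = sum(1 for t in fault_free if t[1] is not None)
    wrong_isolations = sum(1 for t in faulty if t[1] is not None and not t[2])

    bins = [0, 0, 0, 0]
    for _, delay, _ in faulty:
        if delay is not None:
            bins[_delay_bin(delay)] += 1

    return {
        "false_alarm_rate": _ratio(false_alarms, len(fault_free)),
        "false_isolation_rate": _ratio(wrong_isolations, len(faulty)),
        "pcd": [_ratio(b, len(faulty)) for b in bins],
    }


# Toy data: 10 fault-free runs (1 alarm) and 4 runs with an injected fault.
trials = [(False, None, False)] * 9 + [(False, 3.0, False)] + [
    (True, 0.5, True), (True, 5.0, True), (True, 50.0, False), (True, None, False)]
stats = fdi_statistics(trials)
```

In an actual study each trial would come from one simulated flight with the chosen filter time constant; the toy data only show how the counts are turned into rates.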
With the parameter of the second-order channel filter fixed at 3.85, we conduct 1000 Monte Carlo runs to verify how well the same parameter detects faults of different amplitudes.
Table 4 shows that the same parameter of the second-order channel filter has different FDI effects on faults of different amplitudes. A fault of smaller amplitude leads to a worse FDI effect. In other words, to detect or isolate a fault of smaller amplitude, the FDI method may need a larger parameter.
The amplitude of the square wave fault is 5 °/h, and its frequency is 5 Hz. The FDI results are shown in Figures 14-16.
Figure 14 shows that the fault detection function of the original channel filter never exceeds its threshold value, indicating that no fault has been detected. Figures 15 and 16 show that the fault detection function exceeds the threshold at 500 s, indicating that the first-order channel filter and the second-order channel filter have detected the fault.
Take the second-order channel filter as an example (1000 Monte Carlo runs). The statistical results are shown in Table 7.
Figure 11 shows that the ramp fault detection function of the original channel filter never exceeds its threshold value, indicating that no fault has been detected. Figure 12 shows that the fault detection function exceeds the threshold near 600 s, indicating that the first-order channel filter has detected a fault. Its fault isolation function also separates near 600 s. Furthermore, Figure 13 shows that the fault detection function exceeds the threshold, also near 600 s, indicating that the second-order channel filter has detected a fault. Its fault isolation function separates near 600 s and the curve is smoother.
Similarly, we take the second-order channel filter as an example (1000 Monte Carlo runs). The statistical results are shown in Table 6.
... tolerance, and we can choose a bigger parameter for the second-order channel filter according to the application environment.
(2) The relationship between time constant and the effect on detecting step fault is also studied in this paper using enumeration.Additionally, the parameters for detecting or isolating the typical step fault in different sampling times are recommended.
(3) The extended experiments have shown that the method can also detect and isolate ramp and square wave faults. The time constant of the filters should be increased to achieve a low false alarm rate and a low false isolation rate. Furthermore, the time constants for the ramp fault and the square wave fault require further investigation.

Figure 4. The fault detection method based on the three-channel filter.


Table 3. The relationship between time constant and the effect on detecting faults (step fault 5 °/h).

Table 4. Calculation results of Monte Carlo simulations.

Table 5 recommends the parameters of the channel filters for detecting or isolating a typical fault at different sampling times (1000 Monte Carlo runs; the step fault is 5 °/h; the false alarm rate and the false isolation rate are each 0.05).

Table 5. Parameters recommended for fault detection and isolation (FDI) with the typical fault in different sampling times.

Table 6. The relationship between time constant and the effect on detecting faults (ramp fault).