Infrared-Inertial Navigation for Commercial Aircraft Precision Landing in Low Visibility and GPS-Denied Environments

This paper proposes a novel infrared-inertial navigation method for the precise landing of commercial aircraft in low visibility and Global Position System (GPS)-denied environments. Within a Square-root Unscented Kalman Filter (SR_UKF), inertial measurement unit (IMU) data, forward-looking infrared (FLIR) images and airport geo-information are integrated to estimate the position, velocity and attitude of the aircraft during landing. Homography between the synthetic image and the real image which implicates the camera pose deviations is created as vision measurement. To accurately extract real runway features, the current results of runway detection are used as the prior knowledge for the next frame detection. To avoid possible homography decomposition solutions, it is directly converted to a vector and fed to the SR_UKF. Moreover, the proposed navigation system is proven to be observable by nonlinear observability analysis. Last but not least, a general aircraft was elaborately equipped with vision and inertial sensors to collect flight data for algorithm verification. The experimental results have demonstrated that the proposed method could be used for the precise landing of commercial aircraft in low visibility and GPS-denied environments.


Introduction
Landing is the most accident-prone phase of flight for both military and civil aircraft. This is due to the manoeuvring sequence required to exhaust a large amount of aircraft kinetic energy in a relatively small area. Fixed-wing aircraft usually descend smoothly at a constant angle, pointing in the direction of the runway centerline, and touch down at the beginning of the runway. If low visibility conditions (e.g., fog or haze) are encountered, the pilots have no choice but to manipulate the aircraft to land using navigation instruments. If the conventional radio navigation systems are disturbed or disabled, they can mislead the pilots and cause a Controlled Flight Into Terrain (CFIT) accident. In addition, most airports are equipped with simple and coarse radio beacons rather than expensive and precise ground-based guidance systems. Nowadays neither avionics systems, nor airport infrastructures are perfectly designed to support precision landing. Faced with these challenges, an autonomous, accurate and affordable landing navigation mechanism is extremely necessary for most fixed-wing aircrafts.
The traditional landing aid systems include the Instrument Landing System (ILS), and Global Position System (GPS). However, these systems themselves have deficiencies for fixed-wing aircraft precision landing. ILS can only guide an aircraft to the decision height (DH, usually DH = 100 ft) and

Methodology
This section explicitly details the framework of the proposed visual-inertial navigation approach. Vision observations with the camera pose deviations are designed elaborately, then the visual-inertial fusion based on SR-UKF is constructed, and its observability is analyzed.

Framework of Infrared-Inertial Landing Navigation
In general, a complete landing procedure of a commercial aircraft includes two parts: an instrument flight segment and a natural vision segment. The instrument portion of an instrument landing procedure ends at the Decision Altitude (DA), and the visual segment begins just below DA and continues to the runway. Prior to reaching DA, the pilot's primary references for maneuvering the airplane are the aircraft instruments and onboard navigation system. As the pilot approaches the DA, he or she looks for the approach lighting system, if there is one, as well as the runway threshold and touchdown zone lights, markings, surfaces. These visual references help the pilot align the aircraft with the runway and provide position and distance remaining information. At 100 feet above the Threshold Elevation (THRE), the visual transition point, the pilot makes a determination about whether the flight Sensors 2019, 19, 408 4 of 24 visibility is sufficient to continue the approach and distinctly identify the required visual references using natural vision. If the requirements identified above are met, the pilot may continue descending below DA down to 100 feet height above THRE [39]. Otherwise, the pilot should pull up the aircraft at once, as shown in Figure 1. In order to land commercial airplane safely in GPS-denied and low visibility environments, the pilot needs to obtain accurate navigation information, especially the flight altitude. In the present paper, the proposed method is aimed at landing above 100 feet. pilot should pull up the aircraft at once, as shown in Figure 1. In order to land commercial airplane safely in GPS-denied and low visibility environments, the pilot needs to obtain accurate navigation information, especially the flight altitude. In the present paper, the proposed method is aimed at landing above 100 feet. Among several flight parameters, the flight height is one of the most important ones for the pilot's decision in a landing procedure. Usually the height measured by barometer or radio altimeter is inaccurate, while the altitude captured by GPS is unreliable. In this paper, the FLIR camera and IMU can complement each other in nature and fuse well in a filtering framework. This paper proposes a novel visual-inertial landing navigation approach based on the SR-UKF, in which visual observation and inertial measurements are integrated to estimate aircraft landing motion. This novel visual-inertial navigation system (VINS) is composed of a FLIR camera, an IMU, a barometer (BARO), a radio altimeter (RALT) and a processing unit that is in charge of motion estimation of the aircraft.
As shown in Figure 2, the inertial measurements are used to propagate the system states, whereas the homography is chosen as the visual observation. The proposed visual-inertial integration can be used for commercial aircraft precise landing in GPS-denied and low visibility. This method involves three key issues: process modeling, measurement modeling, and its observability. Firstly, this novel vision observation is designed in Section 2.2. Then the visual-inertial navigation based on SR-UKF is proposed in Section 2.3. Finally, the observability of the proposed algorithm is analyzed in Section 2.4. Among several flight parameters, the flight height is one of the most important ones for the pilot's decision in a landing procedure. Usually the height measured by barometer or radio altimeter is inaccurate, while the altitude captured by GPS is unreliable. In this paper, the FLIR camera and IMU can complement each other in nature and fuse well in a filtering framework. This paper proposes a novel visual-inertial landing navigation approach based on the SR-UKF, in which visual observation and inertial measurements are integrated to estimate aircraft landing motion. This novel visual-inertial navigation system (VINS) is composed of a FLIR camera, an IMU, a barometer (BARO), a radio altimeter (RALT) and a processing unit that is in charge of motion estimation of the aircraft.
As shown in Figure 2, the inertial measurements are used to propagate the system states, whereas the homography is chosen as the visual observation. The proposed visual-inertial integration can be used for commercial aircraft precise landing in GPS-denied and low visibility. pilot should pull up the aircraft at once, as shown in Figure 1. In order to land commercial airplane safely in GPS-denied and low visibility environments, the pilot needs to obtain accurate navigation information, especially the flight altitude. In the present paper, the proposed method is aimed at landing above 100 feet. Among several flight parameters, the flight height is one of the most important ones for the pilot's decision in a landing procedure. Usually the height measured by barometer or radio altimeter is inaccurate, while the altitude captured by GPS is unreliable. In this paper, the FLIR camera and IMU can complement each other in nature and fuse well in a filtering framework. This paper proposes a novel visual-inertial landing navigation approach based on the SR-UKF, in which visual observation and inertial measurements are integrated to estimate aircraft landing motion. This novel visual-inertial navigation system (VINS) is composed of a FLIR camera, an IMU, a barometer (BARO), a radio altimeter (RALT) and a processing unit that is in charge of motion estimation of the aircraft.
As shown in Figure 2, the inertial measurements are used to propagate the system states, whereas the homography is chosen as the visual observation. The proposed visual-inertial integration can be used for commercial aircraft precise landing in GPS-denied and low visibility.  This method involves three key issues: process modeling, measurement modeling, and its observability. Firstly, this novel vision observation is designed in Section 2.2. Then the visual-inertial navigation based on SR-UKF is proposed in Section 2.3. Finally, the observability of the proposed algorithm is analyzed in Section 2.4. This method involves three key issues: process modeling, measurement modeling, and its observability. Firstly, this novel vision observation is designed in Section 2.2. Then the visual-inertial navigation based on SR-UKF is proposed in Section 2.3. Finally, the observability of the proposed algorithm is analyzed in Section 2.4.

Homography between Synthetic and Real Images
Before proposing the measurement model, we need to analyze the vision measurement mechanism. During the landing, the aircraft descends along the glide slope, and the optical axis of the FLIR camera is aligned with the airport runway. The camera pose is composed of the calibrated IMU/camera relative pose and the measured IMU pose. Ideally the measured camera pose should be equal to the real camera pose. The synthetic image is derived by the terrain data and the measured camera pose, and the real image is captured by the FLIR camera. Therefore, in the image plane the synthetic runway features should be in coincidence with the real detected features accurately. However, the random errors of inertial sensors bring a deviation between the measured camera pose and the real camera pose and further lead to the mismatch between the synthetic runway features and the real runway features. The relationships between the measured camera pose Φ n ,P n and the real camera pose (Φ n , P n ) in the navigation reference frame are described as follows: whereΦ n andP n denote the measured attitude and position of the FLIR camera in the navigation reference frame, respectively, Φ n and P n represent the real attitude and position of the FLIR camera in the navigation reference frame separately, ∆ψ n and ∆P n are the attitude and position measurement deviations of the FLIR camera in the navigation reference frame individually. As shown in Figure 3, at the time t the transformation from the synthetic image to the real image satisfies the homography H R M (t), so the synthetic runway features and the real runway features can be understood as two independent visual projections of the same runway from the geodetic coordinate system to the pixel coordinate system, respectively, derived by the real camera pose and the estimated pose. R R M (t) and T R M (t) represent the relative rotation and translation of the FLIR camera from the measured pose to the real pose separately. N M (t) is the unit normal vector of the airport plane with respect to the FLIR camera in the measured pose, d M (t) denotes the distance from the airport plane to the optical center of the FLIR camera in the measured pose.

Homography between Synthetic and Real Images
Before proposing the measurement model, we need to analyze the vision measurement mechanism. During the landing, the aircraft descends along the glide slope, and the optical axis of the FLIR camera is aligned with the airport runway. The camera pose is composed of the calibrated IMU/camera relative pose and the measured IMU pose. Ideally the measured camera pose should be equal to the real camera pose. The synthetic image is derived by the terrain data and the measured camera pose, and the real image is captured by the FLIR camera. Therefore, in the image plane the synthetic runway features should be in coincidence with the real detected features accurately. However, the random errors of inertial sensors bring a deviation between the measured camera pose and the real camera pose and further lead to the mismatch between the synthetic runway features and the real runway features. The relationships between the measured camera pose ( ), whereˆn Φ andˆn Ρ denote the measured attitude and position of the FLIR camera in the navigation reference frame, respectively, n Φ and n P represent the real attitude and position of the FLIR camera in the navigation reference frame separately, where R C n b is the attitude matrix of the FLIR camera in the real pose, and M C n b is the attitude matrix of the FLIR camera in the measured pose. Obviously, the homography matrix contains the deviation between the real camera pose and the measured camera pose, which can be calculated by the line features of synthetic runway and real runway. Furthermore, the synthetic runway features can be derived by geo-information and inertial measurements, and the real runway features can be extracted from FLIR images in real-time.

Synthetic Runway Features
In the proposed VINS, a FLIR camera and an IMU are installed on the aircraft. As shown in Figure 4, these reference frames obey the right-hand rule in this paper.
where R n b C is the attitude matrix of the FLIR camera in the real pose, and M n b C is the attitude matrix of the FLIR camera in the measured pose. Obviously, the homography matrix contains the deviation between the real camera pose and the measured camera pose, which can be calculated by the line features of synthetic runway and real runway. Furthermore, the synthetic runway features can be derived by geo-information and inertial measurements, and the real runway features can be extracted from FLIR images in real-time.

Synthetic Runway Features
In the proposed VINS, a FLIR camera and an IMU are installed on the aircraft. As shown in Figure 4, these reference frames obey the right-hand rule in this paper.   image are derived by the runway geographic information and the measured pose of IMU. This vision projection process involves five coordinate transformations as follows: (1) Transformation between geodetic and Cartesian coordinates in the ECEF reference frame The geodetic coordinate that contains longitude L i , latitude λ i and ellipsoidal height h i of any point can be transformed to the Cartesian coordinate in the ECEF reference frame by the following equation: where R n is the radius of curvature in the prime vertical, and e is the first eccentricity of the Earth.
(2) From {E} to {G} Any known point in the ECEF can be projected into the geographic coordinate system with the IMU center as its origin: where E P f denotes the Cartesian coordinates of any point f on the runway surface, E P a represents the Cartesian coordinates of the IMU. In order to facilitate the coordinate transformation, the geographic coordinate system {G} is selected as the navigation coordinate system {N}.
The navigation coordinate system {N} has the same origin with the body coordinate system {B}, the former rotates yaw-pitch-roll angle round X N − Y N − Z N axis to the latter in sequence, as follows: where C n b denotes the attitude matrix.
The rigid connection between aircraft body and camera contains a relative rotation R C B and translation T C B that has been accurately calibrated before flight: According to the pinhole imaging model [42], the homogeneous coordinate projection of any point in the pixel coordinate system is: where Z c is the normalization coefficient, dx and dy represent the pixel sizes in image u and v axes respectively, (u 0 , v 0 ) are the coordinates of the principal point, s is the skew parameter, and f is the focal length of the FLIR camera. Equations (10)- (14) give a complete transformation from the runway plane to the pixel plane of airborne FLIR camera, as shown in Figure 5. Therefore, a marking point E P f on the airport runway can be projected onto the pixel plane as a point P P f ∈ 2 . Line features of airport runway can be generated by the projection model combining IMU pose and runway geo-information. Consequently, the pixel coordinate of line features can be described as:

Real Runway Features
Visible images have high spatial resolution and rich texture details, but these images can be easily influenced by severe conditions, such as poor illumination, fog, and other effects of bad weather. Visible images capture reflected light, whereas infrared images capture thermal radiation. In general, infrared images are resistant to these disturbances. In the present paper, we adopt the SWIR camera to capture FLIR images with important airfield features in low visibility. However, infrared images typically have defects of low resolution and poor texture [31]. Existing runway detection algorithms [32][33][34] cannot satisfy the requirements of robustness and accuracy in airborne extract runway features from FLIR images accurately and robustly. Improvements have been made on the basis of our recently proposed method [35]. In the presented paper, the detection result of the previous image is used as the prior knowledge of the next image to detect and extract four runway edges instead of left and right edges from the FLIR image, as shown in Figure 6.  Line features of airport runway can be generated by the projection model combining IMU pose and runway geo-information. Consequently, the pixel coordinate of line features can be described as: where P P f = r s c s T is the pixel coordinate projected from the starting point E P s , P P t = r t c t T is the pixel coordinate projected from the terminal point E P t .

Real Runway Features
Visible images have high spatial resolution and rich texture details, but these images can be easily influenced by severe conditions, such as poor illumination, fog, and other effects of bad weather. Visible images capture reflected light, whereas infrared images capture thermal radiation. In general, infrared images are resistant to these disturbances. In the present paper, we adopt the SWIR camera to capture FLIR images with important airfield features in low visibility. However, infrared images typically have defects of low resolution and poor texture [31]. Existing runway detection algorithms [32][33][34] cannot satisfy the requirements of robustness and accuracy in airborne extract runway features from FLIR images accurately and robustly. Improvements have been made on the basis of our recently proposed method [35]. In the presented paper, the detection result of the previous image is used as the prior knowledge of the next image to detect and extract four runway edges instead of left and right edges from the FLIR image, as shown in Figure 6.
This improved method adopts a coarse-to-fine hierarchical idea in which the runway region of interest (ROI) is preliminarily estimated in the FLIR image and the runway edges are finely extracted from the ROI. At the coarse layer, the runway ROI can be calculated by the aircraft pose parameters and airport geo-information in the first few frames. Then, the detected runway is used as the prior knowledge of the next image. Meanwhile, considering the errors of aircraft pose parameters, the runway ROI based on special confidence interval can be estimated. The higher the confidence level is, the larger the runway ROI will be. Therefore, surrounding useless objects and complex background texture can be excluded from ROI so as to reduce interference and image processing time. Especially the errors transfer equations of vision projection model can be given as follows: where ∆r is the error of pixel row and ∆c is the error of pixel column. J r is the Jacobian of row pixel r with respect tox, and J c is the Jacobian of column pixel c with respect tox. ∆L a , ∆λ a , ∆h a , ∆ψ, ∆θ, and ∆φ are the measurement deviations of longitude L a , latitude λ a , ellipsoidal height h a , yaw ψ, pitch θ, and roll φ of the IMU respectively.
infrared images typically have defects of low resolution and poor texture [31]. Existing runway detection algorithms [32][33][34] cannot satisfy the requirements of robustness and accuracy in airborne extract runway features from FLIR images accurately and robustly. Improvements have been made on the basis of our recently proposed method [35]. In the presented paper, the detection result of the previous image is used as the prior knowledge of the next image to detect and extract four runway edges instead of left and right edges from the FLIR image, as shown in Figure 6.  At the fine layer, EDLines detector [43] is used to extract straight line segments from ROI, then fragmented line segments generated by EDLines are linked into complete runway edge lines based on the morphology of synthetic runway in the ROI. Due to the less texture and low resolution of the FLIR image, the detected edges are divided into small segments and scattered in the ROI disorderly. However, each synthetic runway line has a neighborhood which is determined by the pixel errors ( r − ∆r ≤r ≤ r + ∆r, c−∆c ≤ĉ ≤ c + ∆c) of its endpoints. If one of the fragmented line segments locates in the neighborhood of any synthetic runway line and the angle between them is less than 3 • , it belongs to the set of the synthetic runway line candidates. Therefore, in the ROI four sets of lines are extracted from the detected line segments individually, and other lines are abandoned. In view of these facts, our method calculates the weight of each line segments according to its length and width. In each set, a number of points are randomly selected from these small line segments according to the line weight value. Obviously, the large weight line segment contributes greatly to the fitting of the line segments. Finally, each set of the line segments can be fitted into an edge line by using the RANSAC method. The detection and extraction results of runway features are given in Section 3.2.

Visual-Inertial Navigation
The UKF adopts a deterministic sampling technique to estimate the state and covariance of the non-linear models directly. Compared with the EKF, the UKF can predict the state of the non-linear system more accurately rather than calculate the Jacobian and Hessian matrices of the process and measurement models. However, the UKF need calculate the square root of state covariance matrix during sigma points update, it may occasionally generate a negative definite state covariance matrix which will cause the program to abort. The SR-UKF requires less numerical computations and has more accuracy by using a Cholesky factorization of the error covariance matrix in propagation directly [44]. The proposed visual-inertial navigation approach adopts SR-UKF to integrate nonlinear visual observation and inertial measurements to estimate aircraft motion.

Process Modeling
Firstly, we define the system state as: where ψ n ∈ 3 , δv n ∈ 3 and δp n ∈ 3 are the attitude, velocity and position errors of INS respectively. ε n ∈ 3 denotes the gyroscope drift, ∇ n ∈ 3 represents the accelerometer bias. Then the continuous-time system process model is given by: .
Considering the discrete-time, the model can be written as follows:

Vision Measurement Model
Because the homography matrix contains the deviation of aircraft pose, four groups of possible solutions can be obtained by decomposing the homography matrix according to the traditional method [40,41], and then a set of solutions which are closest to the true value, i.e., the deviation of aircraft pose, can be selected by prior knowledge as UKF measurement. However, the homography matrix decomposition not only increases computation, but also introduces computation errors. In this paper, the measured homography matrix is transformed into one-dimensional column vector, which is used as visual measurement to participate in UKF.
Suppose thatĤ R M ∈ 3×3 and H R M ∈ 3×3 are the measurement and the estimation of the homography, thenĤ R M and H R M can be converted into two column vectors vecĤ R M ∈ 9 and vecH R M ∈ 9 respectively. Considering the measurement noises of the homographyĤ R M ∈ 3×3 , the nonlinear vision measurement model is formalized as: where v f lir ∈ 9 is assumed to be a zero-mean Gaussian noise.
(1)Ĥ ev sv Calculation The homographyĤ R M ∈ 3×3 can be calculated by the feature matching between synthetic images and real images, which is described in Section 2.3.2. The detailed algorithm for homography calculation refers to [42] which gives the transformation rule for lines. A line transforms as: where (l R , l M ) is a line pair between the synthetic image and the infrared image. The main line features include the four edges of runway at least that support the calculation of the homography with eight degrees of freedom.
(2) H R M Estimation M C n b is the estimated attitude matrix from the body frame to the navigation frame, and M P n is the estimated position of the body, while R C n b is the measured attitude matrix, and R P n is the measured position of the body. According to Equations (2)-(6), H R M can be calculated as follows: So vecH R M can be expressed as the function of attitude error ψ n and position error δP n through the conversion H R M → vecH R M .

Other Observations
Besides the above visual measurements, the proposed landing navigation can integrate with other common observations such as air pressure height and radio altitude. These measurement models can be written as follows:ĥ whereĥ imu is the altitude measured by IMU,ĥ hpr indicates the air pressure height measured by the barometer, andĥ ralt represents the radio altimeter. v hpr and v ralt are all assumed to be zero-mean Gaussian white noise. By combining FLIR vision, air pressure height and radio altimeter, the nonlinear measurement model is presented as: The multi-source information fusion framework based on SR_UKF consists of the process model and measurement model, which realizes the integration of inertial measurements, infrared image, airport geo-reference, air data and radio altitude.

Observability
Observability is an inherent characteristic of the proposed VINS; it provides an understanding of how well states of a system can be inferred from the system output measurements. Recently there has been many works in studying the observability of VINSs [36][37][38]. We apply the non-linear observability analysis proposed by Herman and Krener in [36] and refer to the work of Kelly [37] and Weiss [38] for details about how to apply this method to a system similar to ours. In the following, the observability analysis of the core system is established by studying the observability matrix rank based on Lie derivatives.

Nonlinear Observability
Considering the state space as an infinitely smooth manifold X of dimension n, the nonlinear system is described by the following model: where χ ∈ n is the state vector, u i ∈ 1 , i = 0 · · · p denotes the control input, u 0 = 1, and y = [y 1 , · · · , y m ] T ∈ m is the measurement vector with y k = h k (χ), k = 1, · · · , m. The zeroth-order Lie derivative is the function itself, i.e., L 0 h(χ) = h(χ). The first-order Lie derivative of h with respect to f i at χ ∈ X is: The recursive Lie derivative is defined as: The k-th derivative of h along f i is: Based on the preceding expression for the Lie derivative, the observability matrix is defined as: If the observability matrix O is full rank, the system is locally weakly observable.

Observability Analysis
In order to reveal the observability of our proposed system, we use the motion state instead of the state errors. The state errors are approximations where second and higher order terms are omitted under the assumption of a small error state [38]. However, the observability analysis on the full nonlinear system prevents information loss.
First, we define the system state vector of the core system as follows: Then the nonlinear kinematic equations of the core system for computing the Lie derivatives is rearranged as: where C q n b is the rotational matrix corresponding to the quaternion q n b , Ξ(q) is the quaternion multiplication matrix for the quaternion of rotation q with . q = 0.5Ξ(q)ω, ω m denotes the angular velocity vector, a m is the accelerate vector.
A well-known result that we will use in the observability analysis of (31) is the following: when four and more known features are detected in FLIR image frame, the infrared camera pose is observable. According to Equation (2), the measurements can be summarized as: where R 0 = sv C b n , p 0 = sv p n , N n = −1·R 0 ·e 3 , d n = −1·e 3 T ·p 0 , and T 0 = R 0 ·p 0 ·N n T /d n . Furthermore, we enforce the unit-quaternion constraints by employing the following additional measurement equation: (1) Zeroth-Order Lie Derivatives: Define the zeroth-order Lie derivative of h 1 and h 2 , which are simply the measurement functions themselves, i.e.,: Their gradients are: (2) First-Order Lie Derivatives: The first-order Lie derivatives of h 1 and h 2 with respect to f 0 are computed as (34): Their gradients are: (3) Second-Order Lie Derivatives: The second-order Lie derivative of h 1 with respect to f 0 is computed as (36): The gradient is: where We obtain the observability matrix O by stacking the gradient matrices above: where the complete matrix has size 5 × 5. Considering the system state of aerial vehicle in landing phase, the attitude is relatively stable without any complex maneuver, i.e., pitch θ ∈ [2 • , 4 • ], roll φ ∈ [−1 • , 1 • ], and angle velocity vector ω m is minor. In the observability matrix O, these matrices q n b T , q n b T Ξ(q), and D 1 (p n ) are full rank. After applying block Gaussian elimination to removing any rows of the matrix O that consist entirely of zeros, a row-reduced form of the matrix O having the same rank is given by: which has full column rank, so the proposed system is proven to be observable.

Experimental Section and Discussion
In this section, we designed a flight data acquisition platform and adopted real flight data to verify the accuracy and robustness of the proposed method.

Experiments Preparation
The flight data was gathered at a general aviation airport (Pucheng, China) under different weather conditions such as fog, haze, cloud and sunny. As shown in Figure 7, the general aircraft (Y-12F) was equipped with an image sensing suite, an INS (Applanix AV510), a flight parameter recorder (FPR, AMPEX miniR 700), a flight video recorder (FVR, VM-4), a barometer (BARO, XSC-6E) and a radio altimeter (RALT, Honeywell KRA405b). An image sensing suite (ISS) mounted on the aircraft radome contains a SWIR camera (NIP PHK03M100CSW0) and a visible light camera. Furthermore, an INS, FPR, and FVR were installed on the deck of aircraft cabin. The flight data mainly included FLIR video (frame rate 24 Hz), inertial measurements (update 100 Hz), air pressure height (update 16 Hz) and radio altimeter (update 20 Hz) which were labeled by recorders with time stamp to synchronize measurements. In addition, a DGPS ground station (Trimble R5) was used for DGPS-inertial integration navigation to provide the ground truth.
To get accurate motion estimations, precise FLIR camera parameters and camera/INS relative pose are needed. Classical calibration method based on chessboard pattern [45] is adopted to obtain intrinsic parameters of the FLIR camera. The world coordinates of FLIR camera and INS are individually measured by an electronic total station, then the FLIR camera/INS relative pose can be calculated through vector relation between them [10]. The calibrated parameters of INS and FLIR camera are shown in Table 1.  on the aircraft radome contains a SWIR camera (NIP PHK03M100CSW0) and a visible light camera. Furthermore, an INS, FPR, and FVR were installed on the deck of aircraft cabin. The flight data mainly included FLIR video (frame rate 24 Hz), inertial measurements (update 100 Hz), air pressure height (update 16 Hz) and radio altimeter (update 20 Hz) which were labeled by recorders with time stamp to synchronize measurements. In addition, a DGPS ground station (Trimble R5) was used for DGPS-inertial integration navigation to provide the ground truth.  The flight data is stored in a flight data simulator (FDS), which can play back the whole flight process for the algorithm design and verification. Moreover, the geographic data of the airport and its surrounding has been surveyed accurately. In this paper, experiments are run on an embedded computer board (Nvidia Jetson TX2) with six ARM CPU cores, 256 Pascal GPU cores, 8 GB memory. The block diagram of the experimental platform is shown in Figure 8. The embedded computer receives the airborne sensors data from the FDS and simultaneously reads the airport geographic information stored in the solid-state disk (SSD), then outputs the aircraft motion states through multi-source information fusion. To get accurate motion estimations, precise FLIR camera parameters and camera/INS relative pose are needed. Classical calibration method based on chessboard pattern [45] is adopted to obtain intrinsic parameters of the FLIR camera. The world coordinates of FLIR camera and INS are individually measured by an electronic total station, then the FLIR camera/INS relative pose can be calculated through vector relation between them [10]. The calibrated parameters of INS and FLIR camera are shown in Table 1.  The flight data is stored in a flight data simulator (FDS), which can play back the whole flight process for the algorithm design and verification. Moreover, the geographic data of the airport and its surrounding has been surveyed accurately. In this paper, experiments are run on an embedded computer board (Nvidia Jetson TX2) with six ARM CPU cores, 256 Pascal GPU cores, 8 GB memory. The block diagram of the experimental platform is shown in Figure 8. The embedded computer receives the airborne sensors data from the FDS and simultaneously reads the airport geographic information stored in the solid-state disk (SSD), then outputs the aircraft motion states through multi-source information fusion.

Runway Detection Experiment
An ideal line segment detector could process any images regardless of its orientation or size, and extract line segments in real-time without parameters tuning. Among existing algorithms, EDLines detector [19] and Line Segments Detector (LSD) [20] can satisfy these requirements. However, EDLines runs up to 11 times faster than LSD [19], which makes it more suitable for real-time runway detection. As shown in Figure 9, line segments are extracted from the ROI by LSD and EDLines detector, respectively.

Runway Detection Experiment
An ideal line segment detector could process any images regardless of its orientation or size, and extract line segments in real-time without parameters tuning. Among existing algorithms, EDLines detector [19] and Line Segments Detector (LSD) [20] can satisfy these requirements. However, EDLines runs up to 11 times faster than LSD [19], which makes it more suitable for real-time runway detection. As shown in Figure 9, line segments are extracted from the ROI by LSD and EDLines detector, respectively. In this paper, a complete landing process in fog is used for verifying the proposed algorithm. Experimental results contain two parts: runway detection and motion estimation. Some runway detection results are shown in Figure 10, From top to bottom the three rows represent three typical scenarios captured at flight altitudes of 200 ft, 100 ft and 60 ft, and the three columns from left to right denote the coarse layer, the fine layer, and the final results, respectively.  In this paper, a complete landing process in fog is used for verifying the proposed algorithm. Experimental results contain two parts: runway detection and motion estimation. Some runway detection results are shown in Figure 10, From top to bottom the three rows represent three typical scenarios captured at flight altitudes of 200 ft, 100 ft and 60 ft, and the three columns from left to right denote the coarse layer, the fine layer, and the final results, respectively.
At the coarse layer, the ROI is marked in red at the left column. At the fine layer of our improved method, some line segments in ROI are detected and highlighted in red, and the trapezoid of runway contour is labeled in green at the middle column. These line segments are fitted into the final runway features which are shown in red at the right column. In addition, the statistics of runway detection listed in Table 2 show that the ratio of pixels in ROI to total pixels in CCD is less than 25%. Obviously, the proposed method is faster than others [33,34] which process the whole image, and its robustness is significantly improved.  In this paper, a complete landing process in fog is used for verifying the proposed algorithm. Experimental results contain two parts: runway detection and motion estimation. Some runway detection results are shown in Figure 10, From top to bottom the three rows represent three typical scenarios captured at flight altitudes of 200 ft, 100 ft and 60 ft, and the three columns from left to right denote the coarse layer, the fine layer, and the final results, respectively.

Motion Estimation Experiment
As shown in Figure 11, the approach and landing trajectories from the two landing navigation methods are presented. The red curve represents INS/DGPS data, and the green is the motion estimation of the proposed method. The blue pattern denotes the airport runway area. The aircraft descended from 500 feet to 47 feet, through three typical altitudes of 200 feet, 100 feet and 60 feet, flying for 59.45 s. Five recorded time points are marked in this figure. In our experiments, the results of INS/DGPS integration are selected as ground truth.
The proposed algorithm is compared with three other methods such as INS/GPS integration [46], EPnP based method [26] and INS/GPS/BARO/RALT integration [47]. To be consistent with the specifications of the sensor manufactures, the comparison results of position errors, velocity errors, and attitude errors are shown in Figure 12. ∆X e , ∆X n , and ∆X u denote the measurement errors of the aircraft position in the eastward, northward, and upward, respectively. ∆ψ, ∆θ, and ∆φ represent the measurement errors of the aircraft yaw, pitch, and roll separately. ∆V e , ∆V n , and ∆V u are the east, north, and azimuth measurement errors of the aircraft velocity severally. As shown in Figure 12, the motion errors of INS/GPS/BARO/RALT integration are obviously larger than those of the others, while the motion errors of the proposed algorithm are smaller than those of the others. Because the EPnP-based algorithm adopts pure image features to calculate the position and orientation of the camera relative to the runway, the accuracy is greatly limited by the relative distance between the camera and the runway. It is difficult to accurately extract the features of the runway terminal in the 500-200 feet stage. Besides, the errors effect of the runway features in the 100-47 feet stage is greater due to the high ratio of runway features to image. The accuracy of motion estimation is higher only in the 200-100 feet stage. Figure 11, the approach and landing trajectories from the two landing navigation methods are presented. The red curve represents INS/DGPS data, and the green is the motion estimation of the proposed method. The blue pattern denotes the airport runway area. The aircraft descended from 500 feet to 47 feet, through three typical altitudes of 200 feet, 100 feet and 60 feet, flying for 59.45 s. Five recorded time points are marked in this figure. In our experiments, the results of INS/DGPS integration are selected as ground truth. The proposed algorithm is compared with three other methods such as INS/GPS integration [46], EPnP based method [26] and INS/GPS/BARO/RALT integration [47]. To be consistent with the specifications of the sensor manufactures, the comparison results of position errors, velocity errors, and attitude errors are shown in Figure 12. Meanwhile, the data update rate is limited by the camera frame rate, which is lower than the INS update rate. In addition, the accuracy of motion estimation based on INS/GPS/BARO/RALT cannot be further improved due to the larger measurement errors of barometer and radio altimeter. However, this paper improves the existing runway detection algorithm to avoid the problem that the features of runway terminal are difficult to accurately detect, which can obtain accurate runway features. After the integration of vision measurements and inertial data, the update rate of motion estimation is also improved. Even if in low-visibility environments the motion estimations of the proposed method are still accurate enough, which is benefited from the accurate visual observations. Meanwhile, the data update rate is limited by the camera frame rate, which is lower than the INS update rate. In addition, the accuracy of motion estimation based on INS/GPS/BARO/RALT cannot be further improved due to the larger measurement errors of barometer and radio altimeter. However, this paper improves the existing runway detection algorithm to avoid the problem that the features of runway terminal are difficult to accurately detect, which can obtain accurate runway features. After the integration of vision measurements and inertial data, the update rate of motion estimation is also improved. Even if in low-visibility environments the motion estimations of the proposed method are still accurate enough, which is benefited from the accurate visual observations. In addition, the RMS errors of different motion estimation are listed in Table 3. The attitude, velocity and position errors of INS/GPS/BARO/RALT integration are slightly larger than those of the others, while these errors of the proposed algorithm are smaller than those of the others. Among several flight parameters, the height observation is one of the most important for flight safety in landing phase. The flight altitude from GPS is usually inaccurate and unreliable, while the height channel of INS trends to diverges caused by the absence of damping. In general, air pressure height or radio altitude is adopted to damp the height channel of INS, but their accuracy is too low to meet the precision landing. The proposed algorithm that absorbs the advantages of vision and inertial sensors can not only improves the estimation accuracy but also guarantees high update rate.

As shown in
In Figure 13, the flight height in landing obtained by different methods is represented. The RMS errors of flight height in the landing phase are shown in Table 4. Radio altimeter and barometer are not only of low update rate but also of poor accuracy, which is not suitable for landing navigation.
Although INS/GPS mode has high update rate of height data, its accuracy is poor compared with DGPS/INS. The EPnP-based method has higher accuracy than INS/GPS mode, but EPnP has lower update rate than INS/GPS mode due to its use of pure vision navigation. Obviously, the height precision obtained from the proposed INS/FLIR method is the smallest, it can replace INS/DGPS mode to meet precision landing demands. In Figure 13, the flight height in landing obtained by different methods is represented. The RMS errors of flight height in the landing phase are shown in Table 4. Radio altimeter and barometer are not only of low update rate but also of poor accuracy, which is not suitable for landing navigation. Although INS/GPS mode has high update rate of height data, its accuracy is poor compared with DGPS/INS. The EPnP-based method has higher accuracy than INS/GPS mode, but EPnP has lower update rate than INS/GPS mode due to its use of pure vision navigation. Obviously, the height precision obtained from the proposed INS/FLIR method is the smallest, it can replace INS/DGPS mode to meet precision landing demands.

Discussions
The proposed method has high precision up to the DGPS/INS level in low visibility. Firstly, the homography can be served as an ideal visual observation without error accumulation. Meanwhile, owing to the improved runway detection method, it can efficiently overcome the defects of infrared images and smoothly run in a landing scene with large scale and less texture. Compared to ILS and GPS, our method merely takes advantage of an infrared camera to cooperate with airborne navigation sensors, e.g. IMUs, to achieve autonomous motion estimation with low cost, robustness

Discussions
The proposed method has high precision up to the DGPS/INS level in low visibility. Firstly, the homography can be served as an ideal visual observation without error accumulation. Meanwhile, owing to the improved runway detection method, it can efficiently overcome the defects of infrared images and smoothly run in a landing scene with large scale and less texture. Compared to ILS and GPS, our method merely takes advantage of an infrared camera to cooperate with airborne navigation sensors, e.g., IMUs, to achieve autonomous motion estimation with low cost, robustness and accuracy. In particular, the accuracy of our method has reached the level of the DGPS/INS for precision approach and landing.
In the proposed method, the main factors that affect the accuracy of aircraft motion estimation include sensors calibration errors, terrain database precision, spatiotemporal consistency, and runway detection quality. The errors can be partially eliminated by strict sensors calibration [10,45], high precision terrain database and time synchronization [38]. However, the accuracy of runway detection has a great influence on the proposed method, which can be guaranteed by the algorithm itself. The size of synthetic runway neighborhood directly affects the accuracy of the fitted straight line features.
If the neighborhood is too small, the line features will not be found. If the neighborhood is too large, the interference features will increase significantly. In this paper, the pixel errors (∆r, ∆c) are set to be 2σ, which are trade-off settings.

Conclusions and Future Works
The paper proposed a novel visual-inertial navigation method to provide drift-free pose estimation for fixed-wing aircrafts landing, in which inertial measurements, infrared observations and geo-information are organically fused in the UKF. In addition, the proposed method has been proven to be observable by nonlinear observability. Comprehensive experiments with real flight data have verified the accuracy and robustness of the proposed method.
In the future, there are still some research tasks to do for further improvement. (1) For stronger adaptability, we will adopt a multispectral image fusion method [48,49] to enhance the sensitivity in more weather conditions such as rain, snow, or dust. (2) Deep-learning methods [50] can be tried to detect semantic objects with known geo-references around the runway in infrared images, which should not only increase the quantity of vision features to improve the system precision, but also intensify the robustness to detect and recognize different airports. (3) For convenience, the online technique for calibration of the camera to an inertial system [51,52] can also be used to substitute the complicated hand-eye calibration.
Author Contributions: L.Z. proposed this visual-inertial navigation method, and wrote the source code and the manuscript together; Z.Z. made contribution to algorithm design, paper written and modification; L.H. took part in the algorithm verification; P.W. was responsible for data collection and experiments; W.N. supplied help on experiments and paper revision.