Vision/INS Integrated Navigation System for Poor Vision Navigation Environments

In order to improve the performance of an inertial navigation system, many aiding sensors can be used. Among these aiding sensors, a vision sensor is of particular note due to its benefits in terms of weight, cost, and power consumption. This paper proposes an inertial and vision integrated navigation method for poor vision navigation environments. The proposed method uses focal plane measurements of landmarks in order to provide position, velocity and attitude outputs even when the number of landmarks on the focal plane is not enough for navigation. In order to verify the proposed method, computer simulations and van tests are carried out. The results show that the proposed method gives accurate and reliable position, velocity and attitude outputs when the number of landmarks is insufficient.


Introduction
The inertial navigation system (INS) is a self-contained dead-reckoning navigation system that provides continuous navigation outputs with high bandwidth and short-term stability. Because it is a dead-reckoning system, its errors accumulate and the accuracy of the navigation output degrades as time passes. In order to improve the performance of the INS, a navigation aid can be integrated into the INS. The GPS/INS integrated navigation system is one of the most widely used integrated navigation systems [1,2]. However, the GPS/INS integrated navigation system may not produce reliable navigation outputs, since the GPS signal is vulnerable to interference such as jamming and spoofing [3,4]. In recent years, many alternatives to GPS, such as vision, radar, laser, ultrasonic sensors, UWB (Ultra-Wide Band) and eLoran (enhanced Long range navigation), have been studied in order to provide continuous, reliable navigation outputs [4].
Vision sensors have recently been used for navigation of vehicles such as cars, small low-cost airborne systems and mobile robots due to their benefits in terms of weight, cost and power consumption [5][6][7]. Navigation methods using vision sensors can be classified into three categories [4,7]. The first method determines the position of the vehicle by comparing the image measured by a camera with a stored image or stored map information [8]. The second method, called landmark-based vision navigation, determines position and attitude by calculating directions to landmarks from the measured image of the landmarks [9,10]. The third method, called visual odometry, determines the motion of the vehicle from successive camera images [11]. Among these three methods, the landmark-based approach is known to have the advantages of bounded navigation parameter error and simple computation [7].
Several methods have been proposed to integrate an inertial navigation system with a vision navigation system [12][13][14][15][16]. The method in [12] uses gimbal angle and/or bearing information calculated from camera images. In this case, the integrated navigation method may not give an optimal navigation output, since the inputs to the integration filter are processed outputs rather than raw measurements from the vision sensor. When visual odometry is used for the integrated navigation system as in [13], the error of the navigation output from the vision navigation system increases with time. The integrated navigation methods proposed in [14][15][16] use the position and attitude, velocity or heading information from the vision navigation system. These methods may not give a reliable navigation output when the number of landmarks in the camera image is not sufficient for the vision navigation system to produce an output.
This paper proposes an inertial and vision integrated navigation method for poor vision environments, in which position and attitude outputs cannot be obtained from a vision navigation system due to the limited number of landmarks. The proposed method uses focal plane measurements of landmarks in the camera together with INS outputs. Since no navigation output from the vision navigation system is needed, the proposed method can give an integrated navigation output even when the number of landmarks in the camera image is not sufficient for a vision navigation output. In addition, since the integration method uses raw measurements in the integration filter, the navigation output may have better performance. In Section 2, a brief description of landmark-based vision navigation is given. The proposed integration method is presented in Section 3. Results of computer simulations and vehicle experiments are given in Section 4. The concluding remarks and further studies are mentioned in Section 5.

Landmark-Based Vision Navigation
Vision navigation output is computed from the projected landmarks on the focal plane in landmark-based vision navigation [7,13,14]. Figure 1 shows projected landmarks on the focal plane when the pinhole camera model is adopted. The $x_c$ axis of the camera frame is aligned with the optical axis of the camera. The $y_c$ and $z_c$ axes are in the horizontal and vertical directions of the focal plane, respectively. The focal plane is placed at a distance of the focal length, $f$, on the $x_c$ axis.

As shown in Figure 1, the landmark at position $P_k^c = (X_k^c, Y_k^c, Z_k^c)$ is projected onto the point $p_k^c = (f, u_k, v_k)$ on the focal plane in the camera frame. Equations (1) and (2) represent the relationship between the measurements on the focal plane and the landmark coordinate values.

$$u_k = f\,\frac{Y_k^c}{X_k^c} \tag{1}$$

$$v_k = f\,\frac{Z_k^c}{X_k^c} \tag{2}$$
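The projection described above can be sketched in a few lines. This is a minimal illustration, assuming the standard pinhole relations $u_k = f\,Y_k^c/X_k^c$ and $v_k = f\,Z_k^c/X_k^c$, which follow from the axis definitions in this section; the function name and the sample numbers are hypothetical and do not come from the paper.

```python
from typing import Tuple


def project_to_focal_plane(landmark_c: Tuple[float, float, float],
                           f: float) -> Tuple[float, float]:
    """Project a landmark given in the camera frame onto the focal plane.

    The camera x axis is the optical axis, so X is the depth of the landmark.
    """
    X, Y, Z = landmark_c
    if X <= 0.0:
        # A landmark behind or on the camera plane cannot appear in the image.
        raise ValueError("landmark must lie in front of the camera (X > 0)")
    u = f * Y / X  # horizontal focal-plane coordinate
    v = f * Z / X  # vertical focal-plane coordinate
    return u, v


# Hypothetical landmark 10 m ahead, 2 m to the side, 1 m above (z points down).
u, v = project_to_focal_plane((10.0, 2.0, -1.0), f=0.05)
print(u, v)
```

Each visible landmark therefore contributes exactly two scalar measurements, which is the quantity the rest of the paper counts.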
Equation (3) is the navigation equation used to obtain the navigation output from landmark measurements on the focal plane of the camera.

$$P_k^n = P_u^n + r_k\,C_b^n\,C_c^b\,p_k^c \tag{3}$$

where the subscript $k$ denotes the index of the landmarks and $P_k^n$ is the known position vector of the $k$th landmark. $b$, $c$ and $n$ denote the body frame, the camera frame and the navigation frame, respectively. $P_u^n$ is the vehicle's three-dimensional position vector in the navigation frame. $r_k$ is the distance ratio of the projected landmark on the focal plane to the actual landmark in the camera frame. $C_b^n$ and $C_c^b$ are the direction cosine matrix from the body frame to the navigation frame and the direction cosine matrix from the camera frame to the body frame, respectively. Here, $C_c^b$ is a constant matrix since the camera is fixed to the body.
It can be seen from Equation (3) that measurements of at least three landmarks are required in order to determine a navigation output of six variables, namely the three-dimensional position and attitude [14]. In this paper, a poor vision environment is one in which more than zero and fewer than three landmarks, i.e., one or two, are available.
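The three-landmark requirement can be checked with a simple count. This is a sketch, assuming that each visible landmark contributes its two focal-plane coordinates $(u_k, v_k)$ as independent scalar measurements and that the landmark geometry is not degenerate:

$$\underbrace{2n}_{\text{measurements}} \;\ge\; \underbrace{6}_{\text{position and attitude unknowns}} \quad\Longrightarrow\quad n \ge 3$$

With $n = 1$ there are 2 equations for 6 unknowns, and with $n = 2$ there are 4, so a stand-alone vision solution is underdetermined in both cases.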

Vision/INS Integrated Navigation System

Process Model of the Kalman Filter
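The process model of the Kalman filter is given in Equation (4).

$$\delta\dot{x}(t) = F(t)\,\delta x(t) + w(t) \tag{4}$$

where $w(t)$ is the process noise vector with covariance $Q(t)$. Equation (4) can be rewritten as Equation (5).

$$\begin{bmatrix} \delta\dot{x}_{nav} \\ \delta\dot{x}_{sen} \end{bmatrix} = \begin{bmatrix} F_{11} & F_{12} \\ 0_{9\times 9} & 0_{9\times 9} \end{bmatrix} \begin{bmatrix} \delta x_{nav} \\ \delta x_{sen} \end{bmatrix} + \begin{bmatrix} w_{nav} \\ w_{sen} \end{bmatrix} \tag{5}$$

where $w_{nav}$ and $w_{sen}$ are the navigation parameter error noise vector and the sensor error noise vector, respectively. The state vector $\delta x$ is composed of the navigation parameter error vector $\delta x_{nav}$ and the sensor error vector $\delta x_{sen}$. $0_{m\times n}$ denotes an $m$ by $n$ zero matrix. The navigation parameter error vector is composed of the position error, velocity error and attitude error of the INS, as given in Equation (6).

$$\delta x_{nav} = [\delta P_N \;\; \delta P_E \;\; \delta P_D \;\; \delta V_N \;\; \delta V_E \;\; \delta V_D \;\; \phi_N \;\; \phi_E \;\; \phi_D]^T \tag{6}$$

where $\delta P$, $\delta V$ and $\phi$ are the position error, the velocity error and the attitude error expressed as a rotation vector, respectively. The subscripts $N$, $E$ and $D$ denote the north, east and down axes of the navigation frame, respectively. The sensor error vector includes six inertial sensor errors and three vision sensor errors, as in Equation (7).

$$\delta x_{sen} = [\nabla_x \;\; \nabla_y \;\; \nabla_z \;\; \varepsilon_x \;\; \varepsilon_y \;\; \varepsilon_z \;\; \delta u \;\; \delta v \;\; \delta f]^T \tag{7}$$

where $\nabla$ and $\varepsilon$ are the accelerometer error and the gyro error, respectively. $\delta u$ and $\delta v$ are the errors of the coordinate values on the focal plane and $\delta f$ is the focal length error. The subscripts $x$, $y$ and $z$ denote the roll, pitch and yaw axes of the body frame, respectively. Submatrix $F_{11}$ in Equation (5) is given in Equation (8).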
In Equation (8), $\Omega_{en}^n$, $\Omega_{ie}^n$ and $\Omega_{in}^n$ are the skew-symmetric matrices of the vehicle's craft rate in the navigation frame, of the earth rate in the navigation frame, and of the rotation rate of the navigation frame relative to the inertial frame expressed in the navigation frame, respectively. $[f^n \times]$ is the skew-symmetric matrix of the vehicle's specific force in the navigation frame. Submatrix $F_{12}$ in Equation (5) is given in Equation (9).
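The accelerometer error and the gyro error are modeled as random constants, as given in Equations (10) and (11), respectively.

$$\dot{\nabla} = 0 \tag{10}$$

$$\dot{\varepsilon} = 0 \tag{11}$$

The vision sensor errors are also modeled as random constants, as given in Equations (12)-(14).

$$\delta\dot{u} = 0 \tag{12}$$

$$\delta\dot{v} = 0 \tag{13}$$

$$\delta\dot{f} = 0 \tag{14}$$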
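The block structure of the process model can be illustrated with a short sketch. It assumes an 18-element state (nine navigation errors followed by nine sensor errors) and a first-order discretization $\Phi \approx I + F\,\Delta t$, which is a common choice but is not stated in the paper. The `F11` and `F12` values below are placeholders for illustration only, not the matrices of Equations (8) and (9), and all function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]

N_NAV = 9  # position (3), velocity (3) and attitude (3) errors
N_SEN = 9  # accelerometer (3), gyro (3) and vision (du, dv, df) errors
N = N_NAV + N_SEN


def zeros(rows: int, cols: int) -> Matrix:
    """Return a rows-by-cols matrix of zeros."""
    return [[0.0] * cols for _ in range(rows)]


def build_system_matrix(F11: Matrix, F12: Matrix) -> Matrix:
    """Assemble F = [[F11, F12], [0, 0]].

    The lower block rows stay zero because every sensor error is modeled
    as a random constant (its time derivative is zero).
    """
    F = zeros(N, N)
    for i in range(N_NAV):
        for j in range(N_NAV):
            F[i][j] = F11[i][j]
        for j in range(N_SEN):
            F[i][N_NAV + j] = F12[i][j]
    return F


def first_order_transition(F: Matrix, dt: float) -> Matrix:
    """First-order discretization (an assumption): Phi ~= I + F * dt."""
    return [[(1.0 if i == j else 0.0) + F[i][j] * dt for j in range(N)]
            for i in range(N)]


# Placeholder blocks, NOT Equations (8) and (9):
# here the position error rate simply equals the velocity error.
F11 = zeros(N_NAV, N_NAV)
for axis in range(3):
    F11[axis][3 + axis] = 1.0
F12 = zeros(N_NAV, N_SEN)

Phi = first_order_transition(build_system_matrix(F11, F12), dt=0.01)
print(Phi[N_NAV][N_NAV:])  # first sensor-error row, sensor-error columns
```

Because the lower block of `F` is zero, the sensor-error rows of `Phi` are rows of the identity matrix: the time update carries the sensor error estimates forward unchanged, and only measurement updates can change them.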

Measurement Model of the Kalman Filter
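The measurement equation for the Kalman filter is given in Equation (15).

$$\delta z(t) = H(t)\,\delta x(t) + v(t) \tag{15}$$

where $H(t)$ is the observation matrix and $v(t)$ is the measurement noise vector with covariance $R(t)$. The measurement vector $\delta z(t)$ is given in Equation (16).

$$\delta z(t) = [\delta u_1 \;\; \delta v_1 \;\; \delta u_2 \;\; \delta v_2 \;\; \cdots \;\; \delta u_n \;\; \delta v_n]^T \tag{16}$$

where $\delta u_k$ and $\delta v_k$ denote the differences between the INS-based estimates and the measurements on the focal plane for the $k$th landmark, and $n$ denotes the number of landmarks on the focal plane of the camera. The INS-based estimates for each element in Equation (16) are calculated from the position and attitude outputs of the INS and the position information of the landmarks. The observation matrix is given in Equation (17).
Each submatrix in the observation matrix can be obtained by computing the Jacobian. The submatrix $H_1$ is given in Equation (18).

$$H_1 = \begin{bmatrix} \partial u_1/\partial x & \partial u_1/\partial y & \partial u_1/\partial z \\ \partial v_1/\partial x & \partial v_1/\partial y & \partial v_1/\partial z \\ \vdots & \vdots & \vdots \\ \partial u_n/\partial x & \partial u_n/\partial y & \partial u_n/\partial z \\ \partial v_n/\partial x & \partial v_n/\partial y & \partial v_n/\partial z \end{bmatrix} \tag{18}$$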

In Equation (18), $[x \;\; y \;\; z]^T$ is the position vector in the navigation frame. The submatrix $H_2$ is given in Equation (19).
In Equation (19), $[\alpha \;\; \beta \;\; \gamma]^T$ is the attitude vector expressed in Euler angles in the navigation frame. The attitude error in the process model, Equation (6), is represented as a rotation vector, whereas the attitude error in Equation (19) is represented in Euler angles. The relationship between the rotation vector and the Euler angles is expressed in Equation (26) [18].
The submatrix $H_3$ is given in Equation (21).
In Equation (21), $[X_k \;\; Y_k \;\; Z_k]^T$ is the position vector of the $k$th landmark in the camera frame and can be expressed as in Equation (22).

$$[X_k \;\; Y_k \;\; Z_k]^T = C_b^c\,C_n^b\,(P_k^n - P_u^n) \tag{22}$$

Also, the row $[\partial v_k/\partial x \;\; \partial v_k/\partial y \;\; \partial v_k/\partial z]$ in Equation (18) follows from differentiating Equation (2) with respect to the vehicle position, using Equation (22).
It can be seen from Equation (16) that the proposed method can provide an integrated navigation output even when the number of landmarks is not sufficient for a vision navigation output.
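The way the measurement vector scales with the number of visible landmarks can be sketched as follows. This is a minimal illustration under stated assumptions: the residual sign is taken as INS-based estimate minus measurement, the camera is placed at the body origin, and the navigation-to-camera rotation is passed in as a single 3 by 3 matrix. All function names and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def to_camera_frame(C_n_to_c: Mat3, landmark_n: Vec3, vehicle_n: Vec3) -> Vec3:
    """Rotate the vehicle-to-landmark vector from the navigation frame
    into the camera frame."""
    d = [landmark_n[i] - vehicle_n[i] for i in range(3)]
    X, Y, Z = (sum(C_n_to_c[r][c] * d[c] for c in range(3)) for r in range(3))
    return X, Y, Z


def predict_uv(C_n_to_c: Mat3, landmark_n: Vec3, ins_pos_n: Vec3,
               f: float) -> Tuple[float, float]:
    """INS-based estimate of the focal-plane coordinates of one landmark."""
    X, Y, Z = to_camera_frame(C_n_to_c, landmark_n, ins_pos_n)
    return f * Y / X, f * Z / X


def residual_vector(C_n_to_c: Mat3, landmarks_n: Sequence[Vec3],
                    measured_uv: Sequence[Tuple[float, float]],
                    ins_pos_n: Vec3, f: float) -> List[float]:
    """Stack [du_1, dv_1, ..., du_n, dv_n]; the length is 2n.

    Sign convention (an assumption): estimate minus measurement.
    """
    dz: List[float] = []
    for lm, (u_meas, v_meas) in zip(landmarks_n, measured_uv):
        u_hat, v_hat = predict_uv(C_n_to_c, lm, ins_pos_n, f)
        dz.extend([u_hat - u_meas, v_hat - v_meas])
    return dz


# Camera axes aligned with the navigation axes for simplicity.
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
landmark = (100.0, 10.0, -5.0)
measured = predict_uv(C, landmark, (0.0, 0.0, 0.0), f=0.05)  # true position
dz = residual_vector(C, [landmark], [measured], (1.0, 0.0, 0.0), f=0.05)
print(len(dz), dz)
```

A single landmark already yields a two-element residual that a Kalman measurement update can use, whereas an empty landmark list yields an empty vector and no update.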

Computer Simulation and Experimental Result
The proposed method is verified through computer simulations and van tests.

Computer Simulation
Computer simulations of the proposed integrated navigation method were carried out for a low medium-grade inertial sensor and a low-cost commercial camera. Figure 3 shows the scheme of the simulations. The reference trajectory and inertial sensor data were generated using MATLAB and the INS Toolbox from GPSoft LLC. True camera measurement data of the landmarks were first generated using the pinhole camera model given in Equations (1) and (2). The camera measurement data on the focal plane were then generated by adding noise to the true camera measurement data. Zero to ten landmarks to be observed in each image were placed by a random generator. The IMU measurement data were likewise generated by adding noise to the true IMU measurement data. Tables 1 and 2 show the specifications of the IMU and the vision sensor for the simulation.

Figure 3. Scheme of simulation.

Fifty Monte Carlo simulations were performed for a figure-eight flight path with constant height, as shown in Figure 4. Fewer than three landmarks were intentionally placed at random in a specific area in order to create a poor vision navigation environment.
Results of the proposed method were compared with those of the integration method in [14]. In the integration method in [14], the outputs of the vision navigation system are position and attitude, and the state vector is given in Equation (27). The measurement vector is given in Equation (28).
As with the loosely coupled GPS/INS integrated navigation method, the method in [14] has redundancy in the navigation output. The vision navigation system can provide a stand-alone navigation output even when the INS and/or the integrated navigation system cannot provide a navigation output. However, as described in Section 2, the vision system cannot give a navigation output when fewer than three landmarks are available on the focal plane. In this case, the performance of the integrated navigation system can deteriorate, since the measurement update cannot be performed in the integration Kalman filter. In the proposed method, as shown in Equation (15), the measurement update can be performed even when only one landmark is visible on the focal plane. Only the time update of the Kalman filter is performed when no landmarks are visible at all.
Figure 5 shows the estimated vision sensor errors of the proposed method in the simulation. It can be seen from the results that the vision sensor errors are well estimated and the performance of the vision navigation system is improved. In Figure 6, navigation results of the proposed method are compared with those of the pure INS and the method in [14]. Figure 7 shows the position errors in the north, east and down directions of the proposed method and the method in [14].
Table 3 shows RMS errors for the pure INS, the method in [14] and the proposed method. It can be observed that the error of the pure INS becomes large as the navigation operation continues. It can also be observed that the method in [14] gives relatively large navigation parameter errors in the area where the number of landmarks is not sufficient for a vision navigation output. In this area, the proposed method gives approximately 50 times better position performance and 10 times better attitude performance than the method in [14].
Figure 8 shows the experimental setup and a reference navigation system. The experimental setup consists of a camera and an IMU and is installed on an optical bench. The reference navigation system, which is a carrier-phase differential GPS (CDGPS)/INS integrated navigation system, is installed together with it. Outputs of the reference navigation system are regarded as true values in the evaluation of the experimental results. A low-cost commercial camera and a micro electro mechanical system (MEMS) IMU, specified in Tables 4 and 5, were used in the experiment. A database of the landmarks was made in advance with the help of large-scale maps and aerial photographs.
Figure 9 shows the vehicle's reference trajectory in the experiment. The experimental results of the proposed method were compared with those of the pure INS and the method in [14]. Figures 10 and 11 show the navigation results and Table 6 shows the errors in the experiment. As with the results of the computer simulations, it can be seen from the experimental results that the proposed method provides reliable solutions, with approximately 5 times better positioning performance than the method in [14], even in poor vision environments.
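The update logic described for the proposed method (a measurement update whenever at least one landmark is visible, and only a time update otherwise) can be illustrated with a toy example. This is a deliberately simplified scalar sketch, not the 18-state filter of the paper: it tracks a single error variance, treats each landmark as two scalar measurements of that state, and uses made-up values for the noise parameters `q` and `r`.

```python
from typing import List, Sequence


def variance_history(landmark_counts: Sequence[int], p0: float = 1.0,
                     q: float = 0.1, r: float = 0.5) -> List[float]:
    """Propagate a scalar error variance through a sequence of epochs.

    landmark_counts[i] is the number of landmarks visible at epoch i.
    """
    p = p0
    history: List[float] = []
    for n in landmark_counts:
        p = p + q  # time update: always performed, uncertainty grows
        for _ in range(2 * n):  # each landmark contributes u and v
            k = p / (p + r)     # scalar Kalman gain
            p = (1.0 - k) * p   # measurement update: uncertainty shrinks
        history.append(p)
    return history


# Two epochs with one landmark, a three-epoch outage, then one landmark again.
history = variance_history([1, 1, 0, 0, 0, 1])
print(history)
```

In this toy model the variance grows steadily during the outage and drops again as soon as a single landmark reappears, which mirrors the qualitative behavior reported for the proposed method.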

Concluding Remarks and Further Studies
This paper proposed an inertial and landmark-based vision integrated navigation method using focal plane measurements of landmarks. An integration model was derived to use the raw measurements on the focal plane in the integration Kalman filter. The proposed method has been verified through computer simulations and van tests. The performance of the proposed method has been compared with that of another integration method that uses the vision navigation output, i.e., the position and attitude output of a vision navigation system. It has been observed from the results that the proposed system gives reliable navigation outputs even when the number of landmarks is not sufficient for vision navigation.
An integration method that uses continuous images to improve navigation performance and an integration model to efficiently detect and recognize landmarks will be studied. As future work, other filtering methods such as the particle filter and the unscented Kalman filter, artificial neural network-based filtering, and the application of a vision/INS integrated navigation system to sea navigation can be considered.
Author Contributions: Youngsun Kim and Dong-Hwan Hwang proposed the idea of this paper; Youngsun Kim designed and performed the experiments; Youngsun Kim and Dong-Hwan Hwang analyzed the experimental data and wrote the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.