On-line Smoothing and Error Modelling for Integration of GNSS and Visual Odometry

Global navigation satellite systems (GNSSs) are commonly used for navigation and mapping applications. However, in GNSS-hostile environments, where the GNSS signal is noisy or blocked, the navigation information provided by a GNSS is inaccurate or unavailable. To overcome these issues, this study proposed a real-time visual odometry (VO)/GNSS integrated navigation system. An on-line smoothing method based on the extended Kalman filter (EKF) and the Rauch-Tung-Striebel (RTS) smoother was proposed. VO error modelling was also proposed to estimate the VO error and compensate the incoming measurements. Field tests were performed in various GNSS-hostile environments, including under a tree canopy and an urban area. An analysis of the test results indicates that with the EKF used for data fusion, the root-mean-square error (RMSE) of the three-dimensional position is about 80 times lower than that of the VO-only solution. The on-line smoothing and error modelling made the results more accurate, allowing seamless on-line navigation information. The efficiency of the proposed methods in terms of cost and accuracy compared to the conventional inertial navigation system (INS)/GNSS integrated system was demonstrated.


Introduction
Global navigation satellite systems (GNSSs) are commonly applied for positioning and navigation. The positioning solution is accurate and continuous if direct signals from more than four satellites are received. However, in GNSS-hostile environments, where signals are reflected and blocked, the availability and accuracy of GNSS-based vehicle navigation systems are degraded significantly [1]. To overcome this issue, the integration of a GNSS and another navigation system has been proposed. The most common integration is a GNSS and an inertial navigation system (INS). Even though an INS/GNSS integrated system can improve navigation capability in GNSS-denied environments, it depends deeply on the cost of used inertial sensors and the outages in GNSS signals [2]. Inertial systems of a tactical-grade or higher can compensate to obtain appropriate positioning, accuracy, and sustainability during long-term GNSS signal unavailability [3]. For instance, in the case of a GNSS outage lasting one minute, systems with high-grade inertial sensors can achieve a real-time positioning accuracy of less than three meters. Nevertheless, the use of these sophisticated inertial sensors is limited for applications such as the primary navigation module for general land vehicles due to their price and government regulation [1,2]. Low-cost microelectromechanical systems' (MEMS) inertial sensors are thus applied as a potential complementary component. However, the positioning error of integrated systems drifts quickly when GNSS signals are blocked due to the poor performance of these inertial sensors.
Visual odometry (VO) using a camera is an alternative or supplemental navigation solution in GNSS-hostile environments [4,5]. VO estimates the ego-motion of an agent given consecutive images captured by one or more cameras. The output of VO is the relative translation and rotation of the carrier platform in the initial camera frame. Compared with wheel odometry, VO is not affected by wheel slip in uneven terrain or other adverse conditions [6]. VO utilizes low-cost sensors and the frames captured by a camera can provide a large amount of information that can be used for different purposes, including navigation [7,8]. However, its performance depends on the illumination of the environment, the texture of the static scene, and the overlap between consecutive frames [6]. The present study integrated VO and GNSS to utilize the advantages and overcome the limitations of each system in stand-alone mode.
According to the literature, the integration of VO and GNSS has been investigated. Dusha and Mejias [9] introduced a loosely coupled global positioning system (GPS)/VO integration. Their method was demonstrated using numerical simulations and was evaluated using real flight data. However, they mainly focused on the observability properties of the GPS/VO filtering instead of optimal estimation or real-time navigation performance. Moreover, in real experiments, they used a downward-looking camera, which is different in the context of the forward-facing camera's popular application in navigation. Schreiber, Königshof [10] presented a method for integrating GNSS measurements from a low-cost receiver with a locally accurate visual odometry obtained from an on-board low-cost stereo camera system. Although their system had robust localization, the achieved accuracy was insufficient for autonomous driving. Chen, Hu [11] investigated the integration of measurements from a low-cost GNSS and monocular camera measurements in a simultaneous localization and mapping (SLAM) system. The proposed system can perform in real time and achieve the absolute position, attitude, and metric scale of the vehicle. However, this system is mainly based on an ORB-SLAM framework with optimization-based algorithms, so it is unable to apply smoother in order to improve the accuracy of navigation performance. The scale is initialized in the beginning instead of being included in the state for estimation. Furthermore, they experimented with an available KITTI dataset, which ORB-SLAM system has been successful working with, instead of real experiments with their own configurations.
In this research, we proposed a real-time VO/GNSS integrated navigation system that utilizes on-line smoothing based on the extended Kalman filter (EKF) and the Rauch-Tung-Striebel (RTS) smoother. VO error modelling was also proposed to estimate the VO error and compensate the incoming measurements. The contributions addressed herein are, firstly, that the result of this work contributes to confirm the advancement of VO/GNSS integration, which can be compared to a conventional INS/GNSS approach. Secondly, on-line smoothing and error modelling were applied to enhance the performance that makes the system capable of robust ground vehicle navigation. Finally, the integrated system was validated through two live sets of data collected in various GNSS-hostile environments (e.g., under a tree canopy, urban area), together with a centimeter-accurate reference system, to demonstrate performance of the proposed system.
The remainder of this paper is organized as follows: Section 2 provides an overview of VO. Section 3 presents the design of the integrated architecture, the system model, and the measurement model. Section 4 describes data fusion strategies. The experimental results and discussion are presented in Section 5. Finally, some concluding remarks and a brief outline for future research are presented in Section 6.

General Concept of VO
VO can be divided into monocular or stereo VO. Monocular VO uses a single camera to derive ego-motion based on feature matching (or feature tracking) between consecutive images, whereas stereo VO uses a pair of cameras. Compared to monocular VO, stereo VO is more accurate but has a higher computational burden [12]. Therefore, monocular VO is the first choice for real-time applications [13]. The flowchart of VO is described in Figure 1. higher computational burden [12]. Therefore, monocular VO is the first choice for real-time applications [13]. The flowchart of VO is described in Figure 1.

Camera Calibration
Camera calibration is used to determine the camera's intrinsic and distortion parameters. The camera's intrinsic parameters are usually presented in the form of a matrix that includes the camera's focal length and principle point. The distortion parameters are usually expressed in terms of the tangential distortion and the radial distortion coefficients of the lenses. The camera calibration can be implemented using commercial software or free tools such as Bouget's Matlab Camera Calibration Toolbox [14] or OpenCV [15]. Several calibration processes have been proposed [16,17]. In this research, OpenCV, with a checkerboard pattern, was used for camera calibration.

Image Acquisition and Undistortion
A sequence of images is obtained from the camera at a certain frame rate. For real-time applications, the frame rate is critical. It should be small enough to implement in real time, but large enough to have an overlap for deriving a VO solution. The frame rate is adjusted depending on the size of the image and the moving speed of the vehicle. For example, for an image size of 808 × 608 pixels and an average vehicle speed of 2 m/s, the frame rate should be 3 frames per second for realtime applications [18].
The camera lens can distort images. Objects in distorted images look different from the way they do in reality (e.g., straight lines become curved). The magnitude of distortion increases from the center to the edges of images and varies with viewpoint. Based on the distortion parameters determined by camera calibration, a distorted image can be corrected to improve the performance of subsequent image processing. Figure 2 compares a distorted image and its correction.

Camera Calibration
Camera calibration is used to determine the camera's intrinsic and distortion parameters. The camera's intrinsic parameters are usually presented in the form of a matrix that includes the camera's focal length and principle point. The distortion parameters are usually expressed in terms of the tangential distortion and the radial distortion coefficients of the lenses. The camera calibration can be implemented using commercial software or free tools such as Bouget's Matlab Camera Calibration Toolbox [14] or OpenCV [15]. Several calibration processes have been proposed [16,17]. In this research, OpenCV, with a checkerboard pattern, was used for camera calibration.

Image Acquisition and Undistortion
A sequence of images is obtained from the camera at a certain frame rate. For real-time applications, the frame rate is critical. It should be small enough to implement in real time, but large enough to have an overlap for deriving a VO solution. The frame rate is adjusted depending on the size of the image and the moving speed of the vehicle. For example, for an image size of 808 × 608 pixels and an average vehicle speed of 2 m/s, the frame rate should be 3 frames per second for real-time applications [18].
The camera lens can distort images. Objects in distorted images look different from the way they do in reality (e.g., straight lines become curved). The magnitude of distortion increases from the center to the edges of images and varies with viewpoint. Based on the distortion parameters determined by camera calibration, a distorted image can be corrected to improve the performance of subsequent image processing. Figure 2 compares a distorted image and its correction.

Feature Matching
To derive a VO solution based on a feature-based method, feature points between consecutive

Feature Matching
To derive a VO solution based on a feature-based method, feature points between consecutive images must be matched or tracked. This process commonly has two stages. In the first stage, feature detection is used to find key points that are the most suitable for matching to features in other images [19]. Various feature detection algorithms have been proposed [20]. According to Fraundorfer and Scaramuzza [20], point-feature detectors can be divided into two groups, namely corner detectors and blob detectors. The representatives for corner detectors are FATS, Harris, Shi-Tomasi, Moravec, and Forstner, whereas SIFT, SURF, and CENSURE are typical algorithms for blob detectors. Each algorithm has its own advantages and disadvantages. In general, blob detectors are more distinctive and better localized in scale, but corner detectors are fast to compute and are better localized in image position. In the second stage, the corresponding feature is looked for in the subsequent images. This process is called feature matching or tracking. In this research, the SIFT [21] feature matching algorithm was applied. Most of the SIFT algorithm's power lies in its robust descriptor, which is stable against changes in illumination, rotation, and scale. Figure 3 illustrates the feature matching of two images.

Feature Matching
To derive a VO solution based on a feature-based method, feature points between consecutive images must be matched or tracked. This process commonly has two stages. In the first stage, feature detection is used to find key points that are the most suitable for matching to features in other images [19]. Various feature detection algorithms have been proposed [20]. According to Fraundorfer and Scaramuzza [20], point-feature detectors can be divided into two groups, namely corner detectors and blob detectors. The representatives for corner detectors are FATS, Harris, Shi-Tomasi, Moravec, and Forstner, whereas SIFT, SURF, and CENSURE are typical algorithms for blob detectors. Each algorithm has its own advantages and disadvantages. In general, blob detectors are more distinctive and better localized in scale, but corner detectors are fast to compute and are better localized in image position. In the second stage, the corresponding feature is looked for in the subsequent images. This process is called feature matching or tracking. In this research, the SIFT [21] feature matching algorithm was applied. Most of the SIFT algorithm's power lies in its robust descriptor, which is stable against changes in illumination, rotation, and scale. Figure 3 illustrates the feature matching of two images.

Motion Estimation
Motion estimation is used to determine camera transformation between the current image and the previous image. The mathematical principle of motion estimation is based on an epipolar constraint [6]. In Figure 4, a camera undergoes motion from C1 to C2 with rotation R and translation t. Given a three-dimensional (3D) point X, the projection of X in the image plane at C1 is and the corresponding image point at C2 is ν. The epipolar constraint equation is formed as Equation (1) can be rewritten in the form: where A represents the components u and v, and e represents the components of E.
To solve Equation (2), eight point correspondences are normally required [22]. Fewer point correspondences are required if additional constrains are used in the motion condition [6,23]. Equation (2) is solved based on the principle of minimizing the projection error to determine R and t.

‖ ‖
Note that Equation (2) is satisfied with static points. However, a real scenario may contain a moving object and erroneous conditions; thus, outlier removal is necessary. The RANSAC algorithm is commonly used for this task [24]. After outlier removal using RANSAC, the number of inliers is determined. A larger number of inliers usually leads to a more reliable solution in motion estimation. Therefore, in this research, a number of inliers were used to build the error model of VO.

Motion Estimation
Motion estimation is used to determine camera transformation between the current image and the previous image. The mathematical principle of motion estimation is based on an epipolar constraint [6]. In Figure 4, a camera undergoes motion from C1 to C2 with rotation R and translation t. Given a three-dimensional (3D) point X, the projection of X in the image plane at C1 is u and the corresponding image point at C2 is ν. The epipolar constraint equation is formed as Equation (1) can be rewritten in the form: where A represents the components u and v, and e represents the components of E.
To solve Equation (2), eight point correspondences are normally required [22]. Fewer point correspondences are required if additional constrains are used in the motion condition [6,23]. Equation (2) is solved based on the principle of minimizing the projection error to determine R and t.
Argmin Ae Note that Equation (2) is satisfied with static points. However, a real scenario may contain a moving object and erroneous conditions; thus, outlier removal is necessary. The RANSAC algorithm is commonly used for this task [24]. After outlier removal using RANSAC, the number of inliers is determined. A larger number of inliers usually leads to a more reliable solution in motion estimation. Therefore, in this research, a number of inliers were used to build the error model of VO. Equation (2), with condition (3), is solved utilizing singular value decomposition (SVD) to determine the essential matrix E. Rotation matrix R and translation vector t are then determined [23].
Let E = UV T be the SVD of E. Then, where [t] × is the skew matrix of the translation vector t [25]. Given the determined translation vector t and the rotation matrix R, the transformation of the camera at time k is formed as Then, the pose of the camera at time k can be determined using a concatenated equation: where Ck and Ck-1 are the poses of the camera at times k and k-1, respectively. The solution of VO is the pose of the camera, expressed in the initial camera frame. Its error will accumulate over time if no external constraint is applied.

General Architecture Design
In this system, a loosely coupled scheme for VO/GNSS integration was designed. The images taken by the camera were processed by a VO mechanization to derive the translation and rotation in camera frame. GNSS provides absolute position as the major measurement update. An EKF was designed for multi-sensor data fusion and an RTS smoother was applied to provide more accurate navigation solutions. The integration scheme is shown in Figure 5. Equation (2), with condition (3), is solved utilizing singular value decomposition (SVD) to determine the essential matrix E. Rotation matrix R and translation vector t are then determined [23].
Let E = U V T be the SVD of E. Then, where [t]× is the skew matrix of the translation vector t [25]. Given the determined translation vector t and the rotation matrix R, the transformation of the camera at time k is formed as Then, the pose of the camera at time k can be determined using a concatenated equation: where C k and C k−1 are the poses of the camera at times k and k − 1, respectively. The solution of VO is the pose of the camera, expressed in the initial camera frame. Its error will accumulate over time if no external constraint is applied.

General Architecture Design
In this system, a loosely coupled scheme for VO/GNSS integration was designed. The images taken by the camera were processed by a VO mechanization to derive the translation and rotation in camera frame. GNSS provides absolute position as the major measurement update. An EKF was designed for multi-sensor data fusion and an RTS smoother was applied to provide more accurate navigation solutions. The integration scheme is shown in Figure 5.

Model Design
System and measurement models are needed for fusing data with an estimation tool such as the EKF [1,3]. In this research, the principle ego-motion of the VO was utilized to create a system model for the EKF. The measurements from GNSS were used to form measurement models.
The system model was created by error analysis utilizing perturbation methods of VO. The details of the derivation can be found in the study of Dusha and Mejias [9]. The time-continuous VO error model is formed as where  , , and  are continuous-time derivatives of length scale factor, position, and attitude in local-level frame (n-frame), respectively. is the estimated rotation matrix from the body frame (bframe) to the n-frame. is an estimated translation expressed in the b-frame. and are translation and rotation errors in the b-frame, respectively.
Equation (9) can be presented in continuous-time system model: Equation (10) can be transformed into a discrete-time form [1,26]: where H is the design matrix or geometry matrix, and are measurement and its noise, respectively.
For positional measurement provided by GNSS, the measurement model for the EKF is formed as

Model Design
System and measurement models are needed for fusing data with an estimation tool such as the EKF [1,3]. In this research, the principle ego-motion of the VO was utilized to create a system model for the EKF. The measurements from GNSS were used to form measurement models.
The system model was created by error analysis utilizing perturbation methods of VO. The details of the derivation can be found in the study of Dusha and Mejias [9]. The time-continuous VO error model is formed as where δ . λ, δ . r n , and . ψ are continuous-time derivatives of length scale factor, position, and attitude in local-level frame (n-frame), respectively. R n b is the estimated rotation matrix from the body frame (b-frame) to the n-frame. T b is an estimated translation expressed in the b-frame. δ T b and δω b nb are translation and rotation errors in the b-frame, respectively.
Equation (9) can be presented in continuous-time system model: Equation (10) can be transformed into a discrete-time form [1,26]: where x k = [δλ δr n ψ] T 7×1 is the state vector at time (epoch) k, Φ k is the discrete-time transition matrix from epoch k to epoch k + 1, and w k is process noise [27,28].
Measurement model for the EKF is expressed as where H is the design matrix or geometry matrix, z k and v k are measurement and its noise, respectively. For positional measurement provided by GNSS, the measurement model for the EKF is formed as z = r e VO − r e GPS = HR n e x k + ε r where r e VO and r e GPS are the positional vectors provided by VO and GNSS in the Earth-centered Earth-fixed frame (e-frame), respectively, and R n e is the rotation matrix from the e-frame to the n-frame.
H is a measurement-mapping matrix describing the relationship between the measurement vector and the state vector. ε r is the position noise of GNSS measurements.

Estimation with Extended Kalman Filter
EKF equations are divided into the following two groups: time prediction and measurement update. The time prediction equations convert state and covariance from the current epoch state (k) to the next epoch (k + 1) [27].
where (ˆ) denotes estimation and (-) and (+) denote the estimated values after prediction and update, respectively. When GNSS measurements are observed, the following measurement update equations are activated: where K k is the Kalman gain, R k is the covariance matrix of GNSS measurements. All noise terms are considered to be white with known covariance and uncorrelated with each other.

On-Line Smoothing
In this research, on-line smoothing was applied. The remaining time in each epoch was utilized to perform smoothing during operation time to increase the capability of the system in terms of accuracy. Following Chiang, Duong [29], on-line smoothing was originated from an RTS smoother algorithm. The principle of this algorithm is introduced below.
According to Rauch, Tung [30], smoothing is targeted to estimate probability density function (PDF) of the states based on all measurements from time k to time N, where k ≤ N: The RTS smoother finds optimal estimation by applying a maximum likelihood of state vectors based on aiding measurements vectors: where L(x k , x k+1 z N ) is the likelihood of x k , x k+1 based on z N . The estimated and covariance of states are achieved by resolving criteria in Equation (21): wherex k|N and P k|N are smoothed states and covariance at time k based on information up to time N (k ≤ N), respectively,x k and P k are estimated states and covariance provided by the EKF at time k, respectively, and C k is cross covariance, calculated in the following equation: With on-line smoothing, the processing scheme can be done with real-time data. Figure 6 indicates the processing principle and error performance of on-line smoothing. The integrated scheme with on-line smoothing of VO/GNSS is described in Figure 7. With on-line smoothing, the processing scheme can be done with real-time data. Figure 6 indicates the processing principle and error performance of on-line smoothing. The integrated scheme with on-line smoothing of VO/GNSS is described in Figure 7.

Error Modelling with On-Line Smoothing
Error modelling in VO/GNSS integration includes VO noise modelling, the length scale factor, and heading drift. It is assumed that the estimates from the smoother are better in quality and provide more output solutions compared to the EKF; therefore, the output from the smoothing solutions in each smoothing window is used for modelling error.
where | is the smoothing solution and is the prediction of the EKF. is the system error model at time k. It is used as the error model for the next estimation step at time k+1.
With this scheme, the system error model is updated every updating step of the EKF whenever the GNSS measurement is available.
In the VO/GNSS integration, the position and the length scale factor of VO are updated With on-line smoothing, the processing scheme can be done with real-time data. Figure 6 indicates the processing principle and error performance of on-line smoothing. The integrated scheme with on-line smoothing of VO/GNSS is described in Figure 7.

Error Modelling with On-Line Smoothing
Error modelling in VO/GNSS integration includes VO noise modelling, the length scale factor, and heading drift. It is assumed that the estimates from the smoother are better in quality and provide more output solutions compared to the EKF; therefore, the output from the smoothing solutions in each smoothing window is used for modelling error.
where | is the smoothing solution and is the prediction of the EKF. is the system error model at time k. It is used as the error model for the next estimation step at time k+1.
With this scheme, the system error model is updated every updating step of the EKF whenever the GNSS measurement is available.
In the VO/GNSS integration, the position and the length scale factor of VO are updated

Error Modelling with On-Line Smoothing
Error modelling in VO/GNSS integration includes VO noise modelling, the length scale factor, and heading drift. It is assumed that the estimates from the smoother are better in quality and provide more output solutions compared to the EKF; therefore, the output from the smoothing solutions in each smoothing window is used for modelling error.
where x k|N is the smoothing solution andx − k is the prediction of the EKF.
Q k is the system error model at time k. It is used as the error model for the next estimation step at time k + 1.
With this scheme, the system error model is updated every updating step of the EKF whenever the GNSS measurement is available.
In the VO/GNSS integration, the position and the length scale factor of VO are updated continuously based on GNSS data. However, it is no longer updated because GNSS does not provide an attitude parameter. Consequently, the position of the system drifts quickly during GNSS outages due to the drift of the heading, especially during turning. Thus, heading error modelling is used to estimate the heading error based on on-line smoothing as where h k|N is the smoothed heading and h − k is the predicted heading by the EKF at time k. The heading at time k + 1 is then estimated. h k+1 = h k + ∆h k /2 (28)

Experiment and Discussion
The testing system comprised a monoband camera (Blackfly, Point Grey) with a resolution of 808 × 608 (0.5 MP). GNSS data were provided by a double-frequency GNSS receiver (ProPak V3, NovAtel). The original GNSS output data rate was 1 Hz (one data record per second); however, for testing, the GNSS data rate was decreased to 0.05 Hz (one data record every 20 s).
The system for generating reference composed a medium tactical-grade inertial measurement unit (IMU) (C-MIGIT) and a dual-frequency geodetic-grade GNSS receiver (ProPak V3, NovAtel). Some additional ground control points in the GNSS-denied environment were included to guarantee that the reference solution was at the centimeter level of accuracy. The system was set up on a platform for testing, as shown in Figure 8. The reference trajectories were generated using tightly coupled integration with a smoothing algorithm using the commercial IMU/GNSS processing software, Inertial Explore. A testing software module was written and designed in C++ programming language to acquire and process data. In the first test, the testing field was carried out at a GNSS-hostile environment at a campus of National Cheng Kung University, Tainan, Taiwan. The testing trajectory is displayed in Figure 9. For performance evaluation, the solutions provided by pure VO, VO/GNSS using an original EKF, and VO/GNSS with on-line smoothing were analyzed. Figure 9 shows the positions of these solutions on the map and Figure 10 indicates a graphical comparison of the positional error between solutions. The In the first test, the testing field was carried out at a GNSS-hostile environment at a campus of National Cheng Kung University, Tainan, Taiwan. The testing trajectory is displayed in Figure 9. For performance evaluation, the solutions provided by pure VO, VO/GNSS using an original EKF, and VO/GNSS with on-line smoothing were analyzed. Figure 9 shows the positions of these solutions on the map and Figure 10 indicates a graphical comparison of the positional error between solutions. The numerical statistics in terms of the positional root-mean-square error (RMSE) are shown in Table 1.    It can be seen from the results that the positional error of the pure VO grows quickly over time. For VO/GNSS fusion using an EKF, the VO position is constrained by GNSS, and thus, its accuracy improves significantly (by 82.3%) compared to that of pure VO. Moreover, the integrated solution can provide seamless navigation even with GNSS outages. With on-line smoothing, the smoother is activated whenever an updating measurement is found. Smoothing is performed backward from current to previous updating time, utilizing data which stores in temporary dynamic arrays. The navigation solution with smoothing is more accurate (by 92.4%) than pure VO.
For the second test, the data set was collected at the Kuei-Jen Campus, National Cheng Kung University, where the GNSS satellite is good for evaluation, as show in Figure 11. In this test, the testing equipment was similar to that of the first test. Two simulated GNSS outages were generated. The performance analysis focuses on the VO/GNSS solution with an EKF, on-line smoothing, and on-line smoothing and error modelling. The comparison between the three solutions in terms of the ground trajectory and a graph are shown in Figures 12 and 13, respectively. The numerical analysis results, in terms of the positional RMSE, are shown in Table 2. It can be seen from the results that the positional error of the pure VO grows quickly over time. For VO/GNSS fusion using an EKF, the VO position is constrained by GNSS, and thus, its accuracy improves significantly (by 82.3%) compared to that of pure VO. Moreover, the integrated solution can provide seamless navigation even with GNSS outages. With on-line smoothing, the smoother is activated whenever an updating measurement is found. Smoothing is performed backward from current to previous updating time, utilizing data which stores in temporary dynamic arrays. The navigation solution with smoothing is more accurate (by 92.4%) than pure VO.
For the second test, the data set was collected at the Kuei-Jen Campus, National Cheng Kung University, where the GNSS satellite is good for evaluation, as show in Figure 11. In this test, the testing equipment was similar to that of the first test. Two simulated GNSS outages were generated. The performance analysis focuses on the VO/GNSS solution with an EKF, on-line smoothing, and online smoothing and error modelling. The comparison between the three solutions in terms of the ground trajectory and a graph are shown in Figures 12 and 13, respectively. The numerical analysis results, in terms of the positional RMSE, are shown in Table 2.      According to the statistics in Table 2, the estimation accuracy in terms of position for smoothing is much better than that of the EKF. With on-line smoothing, the improvement in RMSE is about 50%. The heading error, however, still drifts over time, leading to a large position error, particularly during GNSS outages. For on-line smoothing and error modelling, the heading error was estimated; the accuracy improvement was 96.2% compared to that of the EKF.  According to the statistics in Table 2, the estimation accuracy in terms of position for smoothing is much better than that of the EKF. With on-line smoothing, the improvement in RMSE is about 50%. The heading error, however, still drifts over time, leading to a large position error, particularly during GNSS outages. For on-line smoothing and error modelling, the heading error was estimated; the accuracy improvement was 96.2% compared to that of the EKF.

Conclusions
This study proposed an integrated scheme of VO and GNSS with on-line smoothing and error modelling based on the EKF and the RTS smoother to overcome the issues of GNSS in GNSSchallenging environments and the problem of unbounded error in VO. A system that included a camera, a GNSS receiver, and an IMU was combined for testing and reference generation, and a console program written in C++ based on OpenCV was implemented to evaluate the proposed method.
The testing results indicate that with an EKF used for data fusion, the RMSE of the 3D position is about 80 times lower than that of the VO-only solution. With on-line smoothing and error modelling, the predicted and updated information from the EKF were smoothed and the heading error was estimated. The results are thus more accurate and provide seamless on-line navigation information.

Conclusions
This study proposed an integrated scheme of VO and GNSS with on-line smoothing and error modelling based on the EKF and the RTS smoother to overcome the issues of GNSS in GNSS-challenging environments and the problem of unbounded error in VO. A system that included a camera, a GNSS receiver, and an IMU was combined for testing and reference generation, and a console program written in C++ based on OpenCV was implemented to evaluate the proposed method.
The testing results indicate that with an EKF used for data fusion, the RMSE of the 3D position is about 80 times lower than that of the VO-only solution. With on-line smoothing and error modelling, the predicted and updated information from the EKF were smoothed and the heading error was estimated. The results are thus more accurate and provide seamless on-line navigation information.
In feature-based approach, the static, salient, and repeatable features are tracked across the sequence images. Therefore, in future work, an algorithm that adopts outlier removal and more robust feature tracking to deal with complex environments (e.g., urban roads with many moving vehicles) will be developed. Non-linear, non-Gaussian filtering and a smoothing algorithm will be applied in the VO/GNSS integrated system to overcome the limitation of the EKF in terms of error modelling and highly dynamic movement.