Lane Detection Aided Online Dead Reckoning for GNSS Denied Environments

With the emerging interest of autonomous vehicles (AV), the performance and reliability of the land vehicle navigation are also becoming important. Generally, the navigation system for passenger car has been heavily relied on the existing Global Navigation Satellite System (GNSS) in recent decades. However, there are many cases in real world driving where the satellite signals are challenged; for example, urban streets with buildings, tunnels, or even underpasses. In this paper, we propose a novel method for simultaneous vehicle dead reckoning, based on the lane detection model in GNSS-denied situations. The proposed method fuses the Inertial Navigation System (INS) with learning-based lane detection model to estimate the global position of vehicle, and effectively bounds the error drift compared to standalone INS. The integration of INS and lane model is accomplished by UKF to minimize linearization errors and computing time. The proposed method is evaluated through the real-vehicle experiments on highway driving, and the comparative discussions for other dead-reckoning algorithms with the same system configuration are presented.


Introduction
Precise positioning and localization techniques for modern land vehicles have been widely implemented for the purpose of advanced driving assist system and autonomous driving capability. Global Navigation Satellite System (GNSS) has been adopted as a primary option to obtain the position and velocity of the vehicle. Since land vehicles are designed to be driven on the road, the positioning accuracy of GNSS can be compensated with the road map from Geographic Information System (GIS) [1][2][3][4] for the conventional navigation purpose, and even with the real time kinematics (RTK) techniques [5,6], its positioning performance can be improved up to centimeter-level accuracy.
Despite the outstanding accuracy and wide coverage of RTK GNSS, the satellite signal outage and multipath error in GNSS-denied area, such as densely built city, underpass, or indoor area, significantly threaten the reliability of the GNSS measurement [7,8].
To overcome the environmental limitation of the GNSS measurement, several alternative navigation methods with other types of measurements are introduced to ensure the consistency of positional information and improve the minimum performance under a poor satellite signal condition [9][10][11]. Those methods, well known as dead-reckoning (DR), are based on the cumulative process of relative change in the speed and direction from the latest known position.
Inertial Navigation System (INS) has been commonly adopted to complement GNSS [12][13][14][15][16]. During the period that GNSS signal is unavailable, INS estimate the position, velocity and attitude by integrating the inertial measurements, such as acceleration and angular rate. With the advancement of computing technology, visual sensors have been used as positioning devices [17][18][19][20][21][22][23][24]. Modern silicons allow the real-time processing of highresolution stereo images, which can directly compute the motion of camera set, and uses machine learning to estimate 3-axis motion from a monocular system. Recently, lidar-based localization methods are also introduced to perform precise positioning with point cloud maps in sub-meter accuracy.
However, considering the fact that GNSS is still considered as a primary device for navigation systems, it is obvious that those alternative positioning methods have their own limitations. INS have been widely used in various fields, including military and aerospace technologies, where the performance and reliability are top priorities. Although the nature of INS convinces near-perfect motion estimation theoretically, there occurs an inevitable error in reality without external aiding due to the imperfection of sensor measurements. Visual odometry [25,26] and SLAM [27][28][29] estimate the ego-motion of the sensor, by comparing the positional changes of surrounding environments and reducing error accumulation using the historical measurements. The main drawback of methods based on external sensing is the result easily affected by the condition of surrounding environment. When the surrounding environment is not suitable to perform feature extraction and matching, for instance, foggy or rainy weather, low intensity, or highly homogeneous scenes, DR based on environmental sensing easily fails.
On the other hand, applying those advanced positioning and localization techniques on mass-production vehicles are considered premature for several reasons. Currently, the mainstream environmental sensing equipment for consumer cars consists of monocular vision for lane detection, frontal radar for collision avoidance, and GNSS for navigation system. It is known that monocular vision system has scale ambiguity, which disturbs absolute motion estimation, and radar has highly sparse feature points that can be easily lost. Moreover, global positioning methods based on map-matching approaches require a large amount of digital map data, and there still remain numerous works to implying the high-definition map (HD map)-based localization in public.
In order to mitigate the shortcomings of DR performance of monocular vision and inertial measurement, this research focused on lane detection results from camera. Unlike feature extraction, learning-based lane detection gives a highly consistent result from same images. Recently, as a remarkable evolution in neural-network and artificial intelligence, learning-based lane detection models [30][31][32] shows better robustness than conventional machine vision approaches in challenging situations, such as varying shadows and image occlusions by moving objects. Figure 1 presents the lane detection results from both feature-based and learning-based approaches. For real driving scenes like highway driving, those challenges happen everyday, and therefore, learning-based lane detection is widely adopted in production vehicles. In this paper, we propose a DR method that uses robust lane detection results from the learning-based lane detection model [32]. As explained above, using standalone INS will gradually lead to drifting issues for vehicle kinematic/dynamic state variables, e.g., vehicle roll angle, bank angle of road surface and vehicle heading angle. By using the robust lane detection results, these drifting problems are to be compensated and therefore will be regulated to much smaller magnitudes compared to standalone INS. Moreover, using lane detection results for correction show higher performance and better computational cost than the state-of-the-art vision-based methods in real-world experiments.
We summarize the main contributions of our work as below: • We proposed a novel filter design that combines learning-based lane detection results with IMU mechanization for accurate vehicle localization in GNSS denied environments. • Accurate online vehicle localization was achieved for various road geometry and environment conditions, verifying the robustness of our proposed method.
The rest of the paper is organized as follows. In Section 2, vehicle kinematics model and observer model are introduced. In Section 3, filter selection and implementation process are illustrated. In Section 4, experiment scenarios, vehicle set up and various dataset from the experiment are explained. In Section 5, the result of lane detection-aided DR is presented and is compared with other visual odometry-based localization algorithms. Finally, in Section 6, the conclusion of this research will be illustrated.

System Modeling
In this section, vehicle kinematics and observer model design process will be explained thoroughly. To design a kinematic model that operates inside the filter, we first need to consider the overall framework of our research. From Figure 2, we can see that, using IMU measurement and lane detection results, the system should output reliable vehicle localization data.

Vehicle Kinematics Model
Vehicle kinematics follow the process of INS mechanization and a total of 9 vehicle states are propagated. Vehicle states and inputs are shown below.
φ, θ, ψ represent the Euler angles of the vehicle frame. At the initial step, we initialize all the states and vehicle attitude matrix according to the IMU measurements. Suppose that the vehicle attitude matrix at timestep (k − 1) is (C n b ) k−1 , skew matrix computed from Euler angles is S k−1 and norm of ω 3×1 k−1 T as ||ω 3×1 k−1 T||, then we can first update the vehicle attitude matrix using the angular velocity input and compute the vehicle acceleration in the navigation frame.
T is the timestep interval and is 0.05 s (20 Hz) during the simulation process. Using the updated vehicle attitude matrix and acceleration data, we can propagate the updated Euler angles, velocity vector and position vector.
Arranging the results above, propagated vehicle states can be written as following.

Observer Model
In order to update the vehicle states by using lane detection results, we can first think of using the previous step lane geometry, as shown in Figure 4. Considering filter implementation at Section 3, previous step lane detection results and previous step vehicle position estimates are used to create the previewed lane geometry (previous sample points) at the (k − 1) th step. After the IMU pre-integration introduced at Section 2.1, we can resample points on the previous lane geometry by linear interpolation. This can be compared with the actual measurement made at the k th step (current sample points) for vehicle position error compensation.
The actual implementation starts off with creating the lane geometry information with (k − 1) th step updated vehicle position and (k − 1) th step lane detection results. Suppose that we are obtaining the global coordinates for n th previewed left lane point (x l n ) k−1 , (y l n ) k−1 . The coordinates can be computed as below.
(l n ) k−1 is the lateral distance to the 10 nm (longitudinal) previewed left lane point measured by the lane detection model. These coordinates for all the previewed points are the previous sample points in Figure 4. Then, we convert previous sample points coordinates from global frame to k th vehicle body frame (IMU pre-integrated). Frame transformation of n th previewed left lane point can be done as follows.
(ψ rel n ) k in Equation (19) represents the relative angle of the previewed lane point measured from the vehicle body x axis. (L l n ) k in Equation (20) is the 2D Euclidean distance from the IMU pre-integrated vehicle position and the n th left lane point. The superscript b at Equations (21) and (22) mean that they are measured from the vehicle body frame. Note that the subscript of (x l n ) b k−1 in Equation (21) is (k − 1) because we are simply transforming (x l n ) k−1 , which is the x coordinate of the previous sample point.
For measurement update, we can compare (y l n ) b k−1 with (l n ) k , which is the k th step lane detection result of n th previewed left lane point. IMU pre-integration process error can be compensated through this step. Other than the lane information, we also use vehicle longitudinal velocity for the measurement model.
Combining the lateral distances of previewed points(n points for left and right lanes) and vehicle longitudinal velocity, the measurement prediction matrix can be written as follows.
Having n preview points for each lane, the size of the measurement prediction matrix will be R (2n+1)×1 . For measurement update, we organize the actual measurement matrix as below.
represents the longitudinal velocity actually measured by IMU.Using Z k , Y k , we can update the vehicle states at the measurement update step, introduced at the next section.

Filter Selection and Framework
Nearly every vehicle localization problem is approached by using a filter that fits the proposed prediction/observation model and available data type well. The most popular filters are extended Kalman filter(EKF), unscented Kalman filter(UKF), and particle filter(PF) which show reliable performance for nonlinear or complex models.
Extended Kalman filter solves the nonlinear estimation problem by linearizing state and/or measurement equations and applying the standard Kalman filter formulas to the resulting linear estimation problem. The linearization yields to approximation errors, which the filter does not take into account in the prediction/update steps. Therefore, EKF error estimates tend to underestimate state uncertainties. In comparison, UKF picks so-called sigma point samples from the filtering distribution and propagates/updates them through the nonlinear state and measurement models. The resulting weighted set of sigma points represents how the updated filtering distribution, which is then approximated as a moment matched Gaussian distribution. This state estimation results represent the state uncertainty better than the estimates obtained from the EKF with an increased computational cost. Similar to UKF, the PF method propagates particles, but the main difference is that the particles are selected in a probabilistic manner. Generally, PF shows higher time complexity than EKF and UKF, because a lot of particles are needed to represent the entire nonlinear model.
Since one of our goals in this research is to implement the real-time vehicle localization method, we can see that PF is not an appropriate candidate for filter design. Taking our system into consideration, for GNSS denied situations with no precise map available, the only applicable measurement for update step is lane detection result. However, the output of lane detection model has high uncertainty for far preview distances, which may lead to huge error accumulation for EKF update process. Cancelling out the candidates, we finally have UKF as our filter structure.
From the subsections below, simple implementation of the UKF will be illustrated in the same order as the flowchart shown in Figure 5. Note that the variables used in this section are slightly modified from the ones at Section 2, adopting the Kalman filter notation.

Prediction Step
Before entering the main filtering loop, the initialization of all the state variables is done by using the GNSS/INS and vision data. Assuming that at least the initial conditions are very accurate, the variance values of all the states inside the covariance matrix were initially set as low quantities. Using the state variable format from Section 2, we can rewrite the state propagation equation in the KF notation, where function f is the state propagation function introduced at Section 2. Then, the measurement prediction step can also be rewritten as follows.
For the simplicity of an explanation, extracting sigma points and performing unscented transform were not mentioned in the Equations (27) and (28). Furthermore, the pre-diction step for state covariance matrix was skipped. Detailed information about the implementation process is shown in Figure 5.

Update Step
At the update step, we have to compare the predicted measurement with the actual measurements. Referring to the observer design at Section 2.2, state update can also be described in the KF form.
The remaining filter implementation is done according to the flowchart of Figure 5. As the simulation loop continues, X k|k and P k|k are saved for data analysis at Section 5.

Experiment
As mentioned in Section 1, our goal is to achieve accurate online vehicle localization for GNSS denied situations. Therefore, we have to compare the result of our proposed model with ground truth and other state-of-the-art visual odometry-based methods to prove the performance. The following sections describe the equipment used in the experiment, the geographical information of the test site, lane detection model, and its results in detail.

Experiment Setup and Scenarios
In this research, we focus on the outdoor, especially highway (i.e., challenging feature extraction) situations because urban and indoor (e.g., parking lot) online vehicle localization can be achieved in high accuracy by existing visual odometry(VO) or SLAM methods. The experiment is carried out on the highway located in Daejeon, South Korea, and as shown in Figure 6a, the vehicle traveled approximately 52 km.
The test vehicle used for the research is GENESIS G80 Sedan, as shown in Figure 6b, and the camera used for forward view recording is the FLIR BLACKFLY model. Two monocular cameras are attached to the vehicle in Figure 6c to perform as stereo camera. In order to compare the proposed methods with other VO methods, an industrial grade IMU, Xsens MTi-670g is also fastened to the stereo vision system, and calibrated with the vehicle body coordinate [33,34]. Finally, the CPU used for simulation is Intel Core i5-4690 CPU @ 3.50 GHz, and RAM of 16 GB.
For the performance evaluation of our proposed method in various situations, there is a need to slice the total vehicle trajectory into some specific scenarios. The scenarios are chosen mainly according to the lane geometry and the surrounding environments. The localization performance of our proposed method will be illustrated for each scenario at Section 5. At the beginning of each scenario, we assume that there is GNSS initialization. After the initialization, our proposed method and the other comparison methods are propagated without any GNSS update.

Lane Detection Model
In order to obtain lane fragments from collected images, a CRNN-based lane detection model, named "supercombo", is adopted [35], which is currently implemented in commercial aftermarket ADAS systems. The model takes its input as two successive image frame and latest fully connected layer. The output of the model consists of four lane line candidates, two road boundary for left and right edge, lead vehicle position estimation, and path planning results. In this research, we use only two lane lines, for left and right lanes, since those two lane lines are also presented in other types of lane detection methods as the essential output. It is worth noting that the detected lane lines have their preview length up to 100 m, while the estimated accuracy decreases as preview length increases.

Lane Detection Results
Before proceeding to DR implementation, we perform a pre-test of lane detection to validate the performance and reliability. Since the lane detection model is designed for a single-camera setup, the left camera from the stereo setup is used. The inference results of the lane detection model are presented in Figure 7, which describes the reprojected lane lines in the global coordinate. Ground truth of vehicle trajectory is obtained by OxTS RT3100, a commercial INS system for land vehicle test and survey.  Extending the preview distance up to 100 m and plotting for full simulation time of Scenario 1 (refer to Section 5.2), we can obtain Figure 8. Due to transformation error from image to real world coordinates and image distortion for far previewed distances, it is obvious that lateral distance data of 0 m previewed lane point is much more trustworthy compared to 100 m previewed lane point. As we can see in Figures 7 and 8, further previewed lane points show huge deviations, especially at curvy road segments. However, this does not mean that the previewed lane point data should be discarded due to the high uncertainty. Although further previewed lane points have larger position errors, their existence implies curvature of the previewed lanes and restrains kinematic/dynamic vehicle states from diverging. This is a trade off problem, and will be discussed intensively at Section 5.6.1.
To sum up, the most accurate mapping possible from this dataset would be merging all the 0 m previewed lane points. Ground truth for this research can be thought of as 2 parts. First is the accurate vehicle position measured by RT and the second is 0m previewed lane points transformed into global fixed coordinates.
At Section 5, localization error will be computed by using the ground truth vehicle position obtained above. Other than the Euclidean distance error, heading angle difference will also be considered for analysis.

Comparison Method: VO
In order to evaluate the dead-reckoning performance of the proposed method, stateof-the-art visual odometry methods are also implemented. We chose VINS [36][37][38][39], topranked VO method in KITTI benchmarks, as competitive methods, since VINS have been designed for various types of system configurations, such as monocular vision, stereo vision, visual-inertial fusion, and even vehicle model fusion. It is worth noting that, for the fair comparison, the intrinsic and extrinsic parameters for cameras and IMU have been pre-calibrated with an open-sourced visual-inertial calibration library, kalibr [40]. Figure 9 shows the baseline of the stereo setup. However, unlike the indoor situation or urban driving scenes, the performance of VO is figured out to be degraded in the highway environment. Figure 10 shows the feature matching and calculated optical flow from a given image sequence. Since the background scene is nearly homogeneous, a large portion of features are extracted from surrounding vehicles. Moreover, the feature points on surrounding vehicles are relatively closer; hence, the effect of that points can be emphasized in the pose estimation result, while learningbased lane line detection shows consistent result, with or without surrounding vehicles. In order to improve the performance degrading under the homogeneity of the scenery, the direct approach, specifically direct sparse odometry(DSO) [41], that uses the photometric error rather than the matching of selected set of feature points, has been adopted to competitive methods. The direct method shows more consistent ego-motion tracking performance. The sparse points from DSO also reflect the distinguishable characteristics in the middle of road surface, while the feature points from VINS tend to be biased on the corners on images as shown in Figure 11. However, under rapid changes in illuminance in the surrounding environment, such as direct sunlight toward camera or insufficient intensity in tunnels, the direct method shows the degraded performance, or fails occasionally.
Considering the drawbacks of comparison methods and to evaluate localization performance of our proposed method for specific lane geometry conditions, we extracted 4 scenarios from the highway drive. The result of localization for various scenarios will be presented in the following subsections, and overall analysis will be done at the end of the section. For simplicity, VINS Stereo + IMU is written as VINS1, VINS Stereo as VINS2 and VINS Mono + IMU as VINS3 for the RMSE comparison.

Scenario 1: Initial Stage
The first scenario is the initial stage of the experiment, where a vehicle passes the tollbooth and enters the highway. This scene was chosen for evaluating standardized highway road geometry. As we can see from Figure 12, the ground truth lane does not have any extreme road geometry (high curvature, long straight path). The total travel distance and travel time of scenario 1 is approximately 992 m and 60 s, respectively. Localization comparison of methods is shown in Figures 12 and 13 and Table 1.

Scenario 2: Straight Road
Scenario 2 represents the case for a long straight road. This is to evaluate and analyze the longitudinal error magnitude for our proposed method. The total travel distance and travel time of scenario 2 are approximately 4.6 km and 200 s, respectively. Localization comparison of methods are shown in Figures 14 and 15

Scenario 3: Curved Road
Scenario 3 represents the case for curvy roads. High curvature trajectory was chosen from the ground truth data. The total travel distance and travel time of scenario 3 is approximately 1077 m and 60 s respectively. Localization results are shown in Figures 16 and 17 and

Scenario 4: Tunnels
As illustrated in Section 5.1, VO shows generally degraded performance at highway situations, and this is predicted to be more intensified in tunnels. In order to compare the localization performance of VO and proposed method for challenging feature extraction environments, scenario 4 was tested at Figure 18. Scenario 4 consists of 3 consecutive tunnels at the highway, as shown in Figure 18. The total travel distance and travel time of scenario 4 are approximately 5290 m and 225 s, respectively. Localization comparison of methods is shown in Figures 19 and 20 and Table 4. In this scenario, DSO algorithm has failed.    For 4 scenarios and their localization results from Tables 1-4, we can observe that the localization performance of our proposed method is generally enhanced for further preview distances. As shown in Figure 8, although further previewed points have higher positional uncertainty, vehicle localization is stabilized by introducing forward lane geometry to the model update. Predicting the previewed point positions using the previous step lane detection measurements and vehicle position estimate "push" or "pull" the IMU mechanized vehicle position to the accurate location.
However, naively increasing the preview distance is not the optimal solution to accurate localization. Results from Tables 1 and 4 show degrading localization performance after 40 m and 50 m preview distance, respectively. This is due to the inherent uncertainty of the lane detection results for far preview distances.
Therefore, we can conclude that optimal preview distances are different for various scenarios tested in this research, but localization performance is generally enhanced for longer preview distances.

Longitudinal, Lateral Error and Heading Angle Drift of Proposed Method
It is intuitive that lane detection information helps vehicle localization in the lateral direction, but not for longitudinal direction. Observing the localization results for scenarios 1 to 4, we can see that our method shows accurate enough localization for both vehicle longitudinal and lateral directions. This is because previewed road curvature information "attracts" vehicle to the appropriate longitudinal position by measurement prediction model in Section 2.2, compensating for the accumulated longitudinal error.
If the road has high curvature, as shown in Section 5.4, longitudinal error is bounded with the help of previewed lane geometry. On the other hand, for scenario 2 (Figure15), the error keeps on increasing because there is little feedback on the longitudinal direction for long straight road section (low road curvature). However, considering that the longitudinal error reached only 11 m after 4.6 km drive, this implies that even with small lane curvature feedback, longitudinal diverging tendency is maintained at a slow increasing rate.
Other than 2D Euclidean localization error, vehicle heading angle drift should also be considered for accuracy evaluation. For all the scenarios, we can see that the heading angle drift is regulated below 2 degrees of magnitude, even for long vehicle trajectories. Similar to the longitudinal error, heading angle is bounded by using the previewed lane geometry.

Comparison with Other Methods
As we can see from Figures 12,14,16 and 19 and RMSE comparison table for each scenario, our proposed method shows much better performance in estimating the vehicle position accurately, compared to other VO and standalone INS methods.
Except for scenario 4, at least 1 VO method showed adequate localization performance for each of the scenarios. However, in scenario 4, as mentioned in Section 5.5, the accuracy of VO methods is severely degraded. DSO failed, and VINS Stereo also totally diverged from the ground truth, and the same for the remaining 2 methods. This is due to the moving and homogeneous feature extraction in 3 consecutive tunnels. Our proposed method, however, uses the robust learning-based lane detection model, which means that "features" extracted for implementation(i.e., lane information) are consistent and very stable for analysis. Based on these lane detection results and proposed model, we succeeded in achieving accurate localization performance, even for the tunnel scenario.

Overall Summary
This study proposed a novel lane detection-based online dead-reckoning method in GNSS denied situations. Using IMU measurements and robust learning-based lane detection results as input to the system, vehicle kinematics and observer were designed.
Vehicle position estimation was implemented by using unscented Kalman filter with the model structure at Section 2. For the various highway drive scenarios, the evaluation of localization performance of our proposed method was done by comparing with state-of-theart VO methods and standalone INS results. Although positional shifting was inevitable for long trajectories, the proposed method showed much better results than the comparison sets by successfully restraining the diverging vehicle states with the previewed lane geometry. Moreover, it was verified that using previewed lane information up to certain distances enhanced the vehicle localization accuracy, but showed degrading performance when using too far-previewed lane detection results.

Future Research Direction
In this paper, we have implemented the vehicle localization method by fusing learning based lane detection results with IMU mechanization. However, this method does not take into account the pitching and rolling motion of the vehicle during the highway drive. Underestimation of these additional vehicle states may have caused unwanted localization errors in the proposed model and filter design. For further research, expansion of the vehicle and lane kinematics model to 3D scale, considering the rolling and pitching motion of vehicle, can be done to enhance localization accuracy.
Moreover, together with the loop closure algorithm, the proposed method could be further improved to create an accurate digital lane map along the vehicle trajectories, and is also expected to show enhanced performances when the lane lines are not presented continuously or rapidly changing.

Data Availability Statement:
The data presented in this study are available on request from the authors. The data are not publicly available due to privacy reasons.