RTLIO: Real-Time LiDAR-Inertial Odometry and Mapping for UAVs

Most UAVs rely on GPS for localization in an outdoor environment. However, in GPS-denied environment, other sources of localization are required for UAVs to conduct feedback control and navigation. LiDAR has been used for indoor localization, but the sampling rate is usually too low for feedback control of UAVs. To compensate this drawback, IMU sensors are usually fused to generate high-frequency odometry, with only few extra computation resources. To achieve this goal, a real-time LiDAR inertial odometer system (RTLIO) is developed in this work to generate high-precision and high-frequency odometry for the feedback control of UAVs in an indoor environment, and this is achieved by solving cost functions that consist of the LiDAR and IMU residuals. Compared to the traditional LIO approach, the initialization process of the developed RTLIO can be achieved, even when the device is stationary. To further reduce the accumulated pose errors, loop closure and pose-graph optimization are also developed in RTLIO. To demonstrate the efficacy of the developed RTLIO, experiments with long-range trajectory are conducted, and the results indicate that the RTLIO can outperform LIO with a smaller drift. Experiments with odometry benchmark dataset (i.e., KITTI) are also conducted to compare the performance with other methods, and the results show that the RTLIO can outperform ALOAM and LOAM in terms of exhibiting a smaller time delay and greater position accuracy.


Background
Precise ego-motion estimation and active perception play important roles when performing navigation tasks or exploring unknown environments in robotics applications, and the potential of small unmanned airborne (S-UAS) platforms applied to collect remote sensing data have been analyzed [1]. Unmanned aerial vehicles (UAVs) running simultaneous localization and mapping (SLAM) algorithms can also be used to perform numerous tasks, including surveillance, rescue, and transportation in extreme environments [2][3][4]. In the field of SLAM, the performance of state estimation is highly reliant on sensors, such as cameras, LiDAR, and inertial measurement units (IMUs). However, there are limitations associated with each type of sensor, such as minimum illumination requirements and the presence of noise. To overcome these shortcomings of stand-alone sensors, multiple sensors have been used to increase the reliability of estimation [5][6][7][8][9]. The methods utilizing multiple sensors for state estimation are categorized into two types: loosely coupled (cf. [5,6]) and tightly coupled (cf. [7][8][9]). The tightly coupled approach directly fuses LiDAR and inertial measurements through a joint optimization that minimizes some residuals, whereas the loosely coupled approach deals with the multiple sensors separately. The tightly coupled method is less computationally efficient and more difficult to implement than the loosely coupled approach, but it is more robust in its approach to noise and more accurate [8].
Accurate and real-time localization is crucial to the feedback control of UAVs in practical applications. Acquiring accurate localization information by solving the tightly coupled problem requires a considerable amount of computation, which decreases the frequency at which state estimation can be performed for providing real-time feedback. Moreover, the requirements of robust, precise, and fast localization increase the difficulty of designing algorithms. The visual inertial odometry (VIO) method [7] proposes achieving precise and real-time results based on tightly coupled VIO that fuses camera and IMU measurements for state estimation. However, its performance can be impaired by poor lighting conditions. Since 3D LiDAR sensors are less influenced by lighting conditions and can also provide range measurements of the surrounding environment, they have been successfully used for ego-motion estimation [8][9][10][11]. Most LiDAR systems update at a lower frequency than cameras (usually 10 Hz), which means that the point cloud can be distorted when the LiDAR moves aggressively. In contrast to LiDAR, an IMU is capable of extremely high update rates, and so combining LiDAR and IMU allows their individual deficiencies to be compensated, and the state estimation for UAVs can be solved by tightly coupled optimization.
In [11], the feature points were extracted from the LiDAR point cloud, and the corresponding features from the last LiDAR measurement were matched to estimate the ego-motion. To refine the odometry, feature points were matched and registered to the feature maps. In [9], it was shown that using tightly coupled LiDAR inertial odometry (LIO) with multiple window frames to local map is too time-consuming, while the accuracy is significantly degraded if there are too few windows. The approach taken in [7] cannot be used, either in a dark or highly dynamic illumination environment. Therefore, there is always a trade-off between high accuracy and computation efficiency, and the localization performance using other sensors (e.g., cameras) to reduce the computation loading can be affected by environmental factors.

Related Works
In [11], IMUs play an important role by providing an initial guess before performing estimation. However, this approach mainly relies on LiDAR information to estimate the motion, by matching feature points extracted from the local surface to the corresponding feature, estimating the relative transform, and constructing a global map. In [7], the gyroscope bias, scale, and direction of gravity can be corrected through the initialization step. In [12], the estimator combines IMU data and plane features obtained from LiDAR for joint optimization. Notably, feature planes are compressed into the closest point in each frame, so that the estimator is able to run in real time. In [9], tightly coupled 3D LIO for graph optimization was demonstrated in both indoor and outdoor environments, but the estimation process still took too long. In [13], a feature extraction algorithm for LiDAR systems was proposed with small fields of view, and feature points were used to estimate odometry and mapping. Moreover, linear interpolation was used to suppress the effect of motion blur associated with LiDAR movement, with each point in the same frame being compensated for this movement.
Loop closure is an approach that corrects the drift that occurs during long-term operation. The method starts by identifying previously visited places, and the iterative closest point (ICP) is the most common approach that involves searching the matches between the current laser scan and the existing map. Another method computes feature descriptors from laser scans, and then verifies loop closure based on certain conditions. In [10], features were segmented into many clusters, which enables the method to perform real-time pose estimation, even in a large-scale environment. In the back-end, the k-dimensional (KD)tree method was used to search for the closest keyframe when performing loop closure, with loop closure established once the residual from ICP is sufficiently small. In [14], a real-time mapping approach was proposed that involved inserting laser scans into a probability grid. In that method, a branch-and-bound search runs in the back-end for loop closure, and if a sufficient match is found in the search window, the loop closure constraint is added to the optimization problem. Overall, this system achieves a moderate capability and pixel-level accuracy using 2D LiDAR, but it is still too time-consuming for applying to 3D LiDAR data. The approach in [15] classified laser scans into segments with feature descriptors, and the transformation was obtained by matching these segments with the map. This method saves time compared to matching the entire laser scan, but its performance may be highly dependent on the accuracy of the classifier.
A tightly coupled LIO is developed in this work to obtain high-accuracy and highfrequency localization output for the feedback control of UAVs. Although tightly coupled methods normally require more computation loading, the developed approach can generate more-accurate and more-frequent localization information. The frame-to-map estimation process is robust and stable, and loop closure is applied to further correct the accumulated error when a loop is detected. Here, the performance of this new method is compared with other approaches in the literature using the publicly available KITTI (http://www. cvlibs.net/datasets/kitti/eval_odometry.php, accesed on 5 April 2021) dataset [16,17]. KITTI dataset was chosen, since it is the first dataset that provides accurate ground truth. The data were collected using a Velodyne HDL-64E laser scanner that produces more than one million 3D points per second and a state-of-the-art OXTS RT 3003 localization system which combines GPS, GLONASS, an IMU and RTK correction signals. Additionally, a fair comparison is possible using KITTI dataset due to its large scale nature as well as the proposed novel metrics, which capture different sources of error by evaluating error statistics over all sub-sequences of a given trajectory length or driving speed [16]. Many works (e.g., [10,11]) also used KITTI as benchmarks to evaluate the localization accuracy. The results demonstrate that the algorithm applied in this work can outperform other approaches in terms of both accuracy and frequency. The main contributions of this study are summarized as follows: 1.
IMU excitation is not required for initialization, in contrast to [7].

2.
Online relocalization combined with loop closure and pose-graph optimization methods have been developed for odometry and mapping that are more accurate than in [9]. 3.
In contrast to the odometry and mapping algorithm [11], the developed RTLIO can provide a high-frequency of odometry for the UAV and constructing maps synchronously.

Overview
The architecture of RTLIO is shown in Figure 1. The system starts with measurement preprocessing (Sect. IV), in which point clouds from the LiDAR measurement are classified into corner points and plane points. The distorted clouds are corrected by the integration of IMU measurements between two consecutive LiDAR frames. In the front-end (Sect. V, VI), the initialization processing provides the bias of the gyroscope, direction of gravity, and initial velocity for bootstrapping the subsequent nonlinear optimization-based RTLIO. In the sliding window optimization, the cost function is constructed to include the marginalization, LiDAR, and IMUs information for solving the UAV pose. In the back-end (Sect. VII), the loop closure is used to detect whether the current position has been revisited, and the pose graph optimization module is used to reduce the accumulated drift to increase positioning accuracy. Finally, RTLIO provides two frequency poses. One is the LiDAR-rate pose after preprocessing and optimization at 10 Hz; the other is the IMU-rate pose generated by the IMU propagation in RTLIO at 400 Hz. Let body frame b k be defined on the IMU, where k denotes the frame when the k th LiDAR measurement is acquired. The world frame w is defined on the initial body frame, and the direction of the gravity is aligned with the z axis of the world frame. The LiDAR frame l be defined on the LiDAR. The rotation from frame A to frame B is denoted as q B A or R B A , and the translation transforming from frame A to frame B is denoted, as p A B . ⊗ represents the Hamilton product between two quaternions. All other variables are listed in Table 1.

Index
Note Euler angle R ∈ SO(3) rotation matrix T ∈ R 44 transformation matrix ω ∈ R 3 angular velocity a ∈ R 3 linear acceleration g ∈ R 3 gravity b a , b ω ∈ R 3 acceleration and gyroscope bias n a , n ω ∈ R 3 acceleration and gyroscope noise P point cloud The time stamps of the measurements from the LiDAR and camera are illustrated in Figure 2, where the sliding window includes the latest m LiDAR frames and each frame contains a set of IMU measurements, since the sensing rate of the IMU is much higher (e.g., 200 Hz).

IMU Preintegration
IMU preintegration and the covariance matrix derivation with the continuous-time IMU dynamics of an error-state Kalman filter were proposed in [7,9,18,19]. Based on [20], the IMU states can be divided into true states X, nominal statesX, and error states δX, whose compositions are defined as where α, θ, and β are the position, orientation, and velocity, respectively, and b a and b ω are defined in Table 1. The operation for a state v in the vector space is simply the Euclidean addition, i.e., v =v + δv, and for quaternion, it implies the multiplication of the quaternions, i.e., ⊗. Measurementsâ t andω t at time t ∈ (t k , t k+1 ] are defined aŝ where n a and n ω are defined as random variables with normal distribution (i.e, N ) with zero mean and variances σ 2 a and σ 2 ω . The position, velocity, and orientation states between two body frames b k and b k+1 can be propagated by integrating IMU measurementsâ t andω t during t ∈ (t k , t k+1 ] in the world frame as where Ω is the same as defined in (3) of [7]. Transforming (3) from the world frame to frame b k yields where are the true states of the IMU integration and γ is the The noises in (5) are unknown, and so the nominal states can be expressed aŝ whereb a andb ω are the biases in the accelerometer and gyroscope. The difference between the nominal states and the true states is minimized by correcting the nominal states, as described in Section 2.1.3.

Correction of Preintegration
Based on (1), the error state can be rewritten as where the operation for a state v in the vector space is simply the Euclidean addition, i.e., v =v − δv, and for quaternion, it implies the multiplication of the inverse of the quaternion.
Taking the time derivatives of (5)- (7) yieldṡ where F i and V i are error state dynamics matrices, n a0 and n a1 are the acceleration noises, n ω0 and n ω1 are the angular velocity noises, n b a and n b ω are modeled as random walks applied to the biases, and σ 2 b a and σ 2 b ω are variances of n b a and n b ω , respectively. Based on (8), the relation between error states δX i and δX i+1 can be discretized as which describes the relation of two error states at t i and t i+1 , which can be extended to the two error states at t k+1 and t k by According to [18] of δX b k+1 can be computed recursively using the first-order discrete-time covariance updated with the initial value Q where O contains the diagonal covariance matrices σ 2 a 0 , σ 2 ω 0 , σ 2 a 1 , σ 2 ω 1 , σ 2 b a , and σ 2 b ω . Based on (1), (6), and (10), the corrected preintegrations denoted asᾱ where δb a k and δb ω k are obtained from (7), with b a k and b ω k discussed in Section 2.3, and J α b a is the submatrix in F , whose location corresponds to b ω also follow the same notation.

LiDAR Feature Extraction and Distortion Compensation
LiDAR measurements are not made synchronously due to the rotating mechanism inside the LiDAR sensor, and therefore the point cloud P k in the k th frame suffers from distortion, as shown in Figure 3a. This distortion was compensated for using IMU measurements, as shown in Figure 3b. First, P k is segmented into N subframes by azimuthal angle φ, where P i k is the i th subframe for i ∈ {1, 2, . . . , N}. Second, the transformation matrix from t i k to t N k is defined as T N i , and is calculated from the IMU integration as T N i = T(t N k )T -1 (t i k ). Third, by performing subframe-wise transformation, the distortion-compensated point cloud denoted as P k is obtained as The segmentation and t i k are depicted in detail in Figure 4. After performing distortion compensation, the feature points on the planes or the edges in each sweep are extracted using the feature extraction procedure proposed by [10,11].

LiDAR Odometry
In Section 2.1.4, the feature points in each sweep are used to find the corresponding feature points in the last sweep, so that the transformation between each sweep (i.e., P k and P k+1 , defined in (13)) can be obtained by minimizing the residual. The procedures are described in detail in [10,11].

Estimator Initialization
In the monocular visual-inertial system [7], the metric scale was recovered through the initialization process. However, the developed LiDAR-inertial system in this work does not require the initialization process to recover the metric scale thanks to the range measurement from the LiDAR sensor. To help improve the preintegration accuracy, the gyroscope bias b ω needs to be estimated in Section 2.2.1, and the corrected preintegration can facilitate the estimation of the gravity vector in the first LiDAR frame g l 0 in Section 2.2.2. from (12) is combined to estimate δb ω by minimizing the following cost function: where c is the number of frames used for the initialization. Once the gyroscope bias is solved, preintegration termsα will be repropagated using (12).

Linear Alignment
After computing the gyroscope bias, another important element to consider is the gravity vector. Initialization state χ I is defined as which includes the velocities on the body frame of each moment and the gravity vector, where the magnitude of g l 0 is known.

Remark 1.
When the UAV is moving during the initialization process, velocity v b i defined in (15) can be calculated from p Given two consecutive frames b k and b k+1 in the window, q to form the minimization problem to solve state χ I defined in (15), wherê is the measurement noise. The transformation between each LiDAR measurement, as well as the registered laser scans, will be transformed to the world frame using g l 0 . This is useful when l 0 frame might not be horizontal, where LiDAR-only odometry may result in a tilted map. After b ω and g l 0 are estimated, they can be used as the initial conditions for the tightly coupled LIO in Section 2.3.

Front-End: Tightly Coupled LIO and Mapping
State vector χ, which includes all of the states in the sliding window, is defined as where m is defined in Table 1, x k is the state at the time when the k th LiDAR measurements are acquired, and x b l are the extrinsic parameters between the LiDAR and IMU. To estimate the state defined in (18), the following cost function is optimized to obtain a maximum posteriori estimation: where L is a set of indices that characterizes the LiDAR features in the sliding window, which includes two type of sets, namely edge E and plane F, such that L = {E, F}. f is the feature correspondence of j feature. ρ is a loss function used for outlier rejection, r p and H p are the prior information from marginalization defined in the subsequent , x k , x k+1 is the residual for the IMU measurements, and r L ẑ j f , χ is the residual for the LiDAR measurements defined in the subsequent analysis. The residuals are described in detail in Sections 2.3.1 and 2.3.2. To make different types of measurements unitless and scale-invariant, the Mahalanobis norm is applied to (19) is obtained by propagating the uncertainty using (11). The Ceres solver [21] was used to solve the nonlinear problem defined in (19).

IMU Measurement Model
Replacing true states α in (4) with the result from (12), allows a residual form to be constructed in (20).
where [·] xyz denotes the extraction of the imaginary part of the denoted quaternion.

LiDAR Measurement Model
The LiDAR cost function includes frame-to-frame and frame-to-map matching. The frame-to-map matching provides high precision for each state, while the frame-to-frame matching can suppress the variation of the states in the sliding window.
Consider P l k j to be a feature in the k th LiDAR frame. For frame-to-map matching, P l k j is represented in the world frame w as For frame-to-frame matching, point P l k j is represented in the previous LiDAR frame l k−1 as The residuals for edge E and plane F features are constructed, as shown in Figure 5, and defined as follows.

Residuals for the Edge Features
The point-to-line distance describing the residuals for edge features can be computed as where P t j is P l k j represented in t frame, t can represent either w or l k−1 using (21) or (22), respectively. P t f 1 is the closest edge feature to P t j , and P t f 2 is the second closest point.

Residuals for the Plane Features
The point-to-plane distance known as the Hesse normal form can be computed where n t f is the normal vector of the closest plane to P t j , and d t f is the distance from the closest plane to the origin of frame t. The residuals described in (23) and (24) are applied to construct one of the residuals defined in (19) as where r E ẑ w (j, f ) , χ is the residual of the edge feature for frame-to-map matching, r F ẑ w (j, f ) , χ is the residual of the plane feature for frame-to-map matching, and r F ẑ l k−1 (j, f ) , χ is the residual of the plane feature for frame-to-frame matching. The residual of the edge feature for frame-to-frame is not considered, since it does not help for boosting the accuracy of RTLIO.

Marginalization
In order to reduce the computational complexity and preserve the history information, a marginalization procedure needs to be applied to the sliding window method. This marginalization aims to keep the most-recent frame in the window, and the Schur complement is applied to construct a prior term based on marginalized measurements. The detail can be referred to [22,23]. A factor graph of such a system is shown in Figure 6. The frameto-map constraints do not influence the adjacent states, and so only the frame-to-frame constraints are considered. Figure 6. Illustration of the factor graph and the marginalization strategy. The oldest frame in the sliding window will be marginalized into prior information after optimizing (19).
Combining all of the residuals and solving the cost function defined in (19) yields the best estimation of states. The local map is then obtained based on the current state estimation, by applying an appropriate algorithm [11].

Back-End: Loop Closure and Pose-Graph Optimization
The optimization-based approach provides sufficient accuracy in an indoor environment, but for large-scale cases, it is inevitable that accumulated drift will occur due to various factors, such as extrinsic parameters between the LiDAR and IMU, the asynchronous sampling of measurements, and inaccurate data association during LiDAR matching. One way to correct for such drift is using loop closure. This method starts with identifying the previously visited places. Once a loop is detected by computing feature correspondences, a relocalization process tightly integrates these constraints into a cost function. This procedure minimizes drift and achieves much smoother state estimation. After the loop closure and relocalization are performed, the sliding window shifts and aligns with the past poses. Then, a pose-graph optimization algorithm can match all keyframes, in order to minimize the drift and ensure the global consistency of the system. These processes might not influence the current state estimation, but the optimized pose-graph can facilitate the consistency of global map reconstruction after performing the state estimation.

Loop Closure
The loop closure algorithm is described in Algorithm 1. Once a frame is marginalized from the sliding window, its point cloud in the body frame, P m , and its pose, T m , will be fed into the loop closure algorithm. If the L2-norm between T m and the pose at the lastest keyframe is higher than an Euclidean distance threshold, the marginalized frame is considered as a new keyframe. In this way, the keyframes are kept uniformly distributed in the space. Then, KD-tree search with search radius of r is performed if the keyframe database, (D T , D P ), which is the set of the keyframe pose and point cloud, is not empty.

Tightly Coupled Relocalization
As long as the current pose in the world coordinate is obtained, T ICP and M are fed back to the RTLIO module for state correction. The relocalization scheme is modified from (19) by solving the following cost function: where P w m ∈ M is the closest point to P w j in the global map. By solving the modified cost function, the current states can be used for relocalization in the global map.

Global Pose-Graph Optimization
Due to the LiDAR inertial setup, roll and pitch angles are fully observable once gravity and the bias are estimated. Therefore, the accumulated drift only occurs in the other four degrees of freedom (x, y, z, yaw) and can be reduced by solving keyframe states in the pose-graph. Every keyframe state serves as a vertex in the pose-graph, and two types of edges between the vertices are utilized. The pose-graph is illustrated in Figure 7.

Sequential Edge
A sequential edge represents the relative transformation between each keyframe, which is obtained from the RTLIO results. Considering a keyframe j and its previous keyframe i, the sequential edge is defined as s i j = p i j ,ψ i j , wherep i j andψ i j denote the relative position and the relative yaw angle, respectively: If the current keyframe j has a corresponding keyframe i, the loop closure edge is defined as h i j = p i j ,ψ i j which is obtained in Section 2.4.1. Then loop closure edge will be added to the pose-graph as an additional constraint.

Pose-Graph Optimization for Four Degrees of Freedom
State vector χ p of the pose-graph is defined as where n is the number of vertices in the pose-graph. To find state χ p defined in (28), the cost function is formulated as where the residuals of the sequential edge and the loop closure edge between keyframes i and j are defined as whereφ w i andθ w i are the roll and pitch angles, respectively, converted from q w b k defined in (18). The loss function ρ is applied to penalize wrong connections of the loop closure edges. Once pose-graph optimization is completed, all keyframe states are updated. The global map is then updated by registering the keyframe point cloud according to the states.

Experiment Results and Discussions
A series of experiments (https://github.com/ChadLin9596/ncrl_lio, accessed on 2 June 2021) were conducted to analyze the performance of the developed RTLIO algorithm and compare it with the current state-of-the-art algorithms. To demonstrate the real-time capability of our system, multiple indoor tests are presented in Section 3.1. The KITTI dataset was used to compare the results with real-world benchmarks; the results are discussed in Section 3.2.

Indoor Flight Test
During the experiments, multiple threads were utilized to achieve the desired performance in real time. The first thread performed distortion compensation and feature extraction from LiDAR measurements, as described in Section 2.1.4. The second thread took those features and computed the incremental motion, as described in Section 2.1.5. The third thread (described in Section 2.3) executed the RTLIO algorithm that solves the states based on the initial guess from the second thread. The RTLIO generated two types of odometry defined in Section 1.3: (i) LiDAR-rate pose, and (ii) the IMU-rate pose, which can be obtained with minimal delay. This means that the high-frequency pose can be directly used for real-time feedback control.
The precision and computation time are discussed and compared with other LiDARbased methods in Section 3.1.2, for experiments conducted in the laboratory with the OptiTrack motion capture system as the ground truth. The flight tests with RTLIO and the other methods are presented in Section 3.1.3.

System Setup
The quadcopter setup used in this work is shown in Figure 8. It comprised a 16-beam LiDAR system (Velodyne VLP-16, 10 Hz), IMU (400 Hz), and Intel NUC (NUC8i7BEH) with an i7-8559U CPU running at 2.70 GHz and 20 GB of memory. The RTLIO algorithm is implemented on the board to perform state estimation in real time.

Precision and Time Cost
Data recorded with quadcopter flying in circular trajectories in the laboratory were used as the input to LOAM (cf. [11]), ALOAM (cf. [11] (https://github.com/HKUST-Aerial-Robotics/A-LOAM, accessed on 1 April 2021), and RTLIO to conduct postprocessing. The comparison of performance is shown in Figure 9, where ALOAM is an extension of LOAM produced by HKUST. The relative pose errors (RPEs) and the time costs of the methods are listed in Table 2, which indicates that the odometry from RTLIO had lower RPEs, but a greater time cost. The methods discussed in this section are able to estimate the pose state to a certain precision, and the impact of the time costs is discussed in Section 3.1.3. The IMU-rate poses from RTLIO cannot be compared with poses from ALOAM and LOAM, because ALOAM and LOAM cannot generate high frequency poses.   Figure 10 shows the results of RTLIO along the x, y, and z axes, from take off to landing. These results show that high-performance localization is crucial to the real-time feedback control of the quadrotor and trajectory tracking, and the time delays from RTLIO and LOAM are compared in Figure 11. The computation time for RTLIO is 0.2944 ms, and the delay is small enough for feedback control.

Indoor Flight with an Obstacle
Indoor flight experiments were also conducted with the quadcopter flying along a corridor with the localization obtained by RTLIO. Figure 12a shows the setup of the indoor test environment, Figure 12b shows the top view of the map in the xy plane and the trajectory of the UAV, starting from the "WORLD" to "IMU". These tests and the results presented in Section 3.1.3 demonstrate the capability of the RTLIO algorithm to perform localization for the feedback control of the quadcopter and trajectory tracking, either in a laboratory or corridor environment, and that the generated mapping is reliable.

KITTI Dataset Evaluation
The developed RTLIO was also evaluated using KITTI dataset, which includes measurements from an inertial navigation system (OXTS RT3003), which provides the groundtruth pose and IMU measurements at 100 Hz, a 64-beam LiDAR (Velodyne HDL-64E, 10 Hz), two grayscale (Point Grey Flea 2 FL2-14S3M-C, 10 Hz), and two color cameras (Point Grey Flea 2 FL2-14S3C-C, 10 Hz). In this test, only the IMU and the LiDAR were used to evaluate our algorithm.

Front-End Performance
The RTLIO in the front-end did not include loop closure and pose-graph optimization. The results show that the average errors of the translation and rotation along the given path were 1.8560 % and 0.0043 deg/m, respectively, as reported by the KITTI evaluation. The average translation and rotation errors over different lengths in each sequence are shown in Figure 13.  Figure 13. Average translation and rotation errors of the front-end evaluated over different lengths in the KITTI dataset for sequence from 00 to 10.

Full Closed-Loop Performance
After adding the back-end to the RTLIO, the full pipeline was also evaluated using the KITTI dataset. The overall results show that the average errors of the translation and rotation along the given path are 1.6392% and 0.0035 deg/m, respectively. Figure 14 shows that the RTLIO including the back-end effectively reduces the errors compared with the front-end in Figure 13. The APE (Absolute Pose Error) evaluated with EVO (https://github.com/MichaelGrupp/ evo, accessed on 1 April 2021) is listed in Table 3, in which the sequences that contain a trajectory without loop closure are marked with an *.
The results in Table 3 indicate that the RTLIO with back-end can outperform RTLIO in APE (especially in the sequence with loop closure).

Time Consumption
The time consumption of each module in indoor flight test and KITTI datasets using an Intel i7-7700 CPU with 24~GB of memory is listed in Table 4. Threads 1-3 are used for computing the front-end of the RTLIO, and thread 4 is used for the back-end, which also reconstructs a globally consistent map. However, the RTLIO was unable to run in real time using the KITTI datasets, since scan matching is more difficult in the outdoor environment. Additionally, the time consumption increases with the increase of the number of the channels of the LiDAR. The higher the number of channels, the higher the resolution (i.e., 16 channels for VLP-16, 32 channels for HDL-32E), according to the website from Velodyne: (https://velodynelidar.com/products/hdl-32e, accessed on 2 June 2021).

Conclusions and Future Work
The RTLIO developed in this work can generate accurate and reliable odometry information in real time, and the initialization process is performed when the UAV is already in motion. The developed RTLIO method uses LiDAR and IMU to generate high-frequency odometry with improved performance compared to the methods that only use LiDARs. Moreover, a consistent and accurate global map is constructed using the loop closure and pose-graph optimization method. Experiments were conducted with the quadrotor in indoor environment and using KITTI dataset, and the results demonstrated that the RTLIO can outperform ALOAM and LOAM in terms of exhibiting a smaller time delay and greater flight stability. The RTLIO with back-end algorithm can outperform the RTLIO with only front-end algorithm, since the accumulated drift can be reduced by the developed pose graph.
Future works include designing a more stable initialization method to deal with diverse situations. In addition, detection algorithms can be integrated into the method for removing feature points on moving objects. Finally, integrating vision sensors with the current system to improve the precision of odometry will be conducted to increase the stability of pose estimation along the z axis.