Proceeding Paper

Event/Visual/IMU Integration for UAV-Based Indoor Navigation †

Department of Civil Engineering, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
* Author to whom correspondence should be addressed.
Presented at the 31st International Conference on Geoinformatics, Toronto, ON, Canada, 14–16 August 2024.
Proceedings 2024, 110(1), 2; https://doi.org/10.3390/proceedings2024110002
Published: 2 December 2024
(This article belongs to the Proceedings of The 31st International Conference on Geoinformatics)

Abstract

Unmanned aerial vehicle (UAV) navigation in indoor environments is challenging due to varying light conditions, the dynamic clutter typical of indoor spaces, and the absence of GNSS signals. In response to these complexities, emerging sensors, such as event cameras, demonstrate significant potential for indoor navigation thanks to their low latency and high dynamic range. Unlike traditional RGB cameras, event cameras mitigate motion blur and operate effectively in low-light conditions. Nevertheless, they output little information when motion is limited, in contrast to standard cameras, which capture detailed surroundings regardless of motion. This study proposes a novel event-based visual–inertial odometry approach for precise indoor navigation. In the proposed approach, the standard images are leveraged for feature detection and tracking, while events are aggregated into frames to track features between consecutive standard frames. The fusion of IMU measurements and feature tracks facilitates the continuous estimation of sensor states. The proposed approach is evaluated and validated using a controlled office environment simulation developed in Gazebo, employing a simulated P230 drone equipped with an event camera, an RGB camera, and an IMU. This simulated environment provides a testbed for evaluating and showcasing the proposed approach’s robust performance in realistic indoor navigation scenarios.

1. Introduction

Indoor navigation has attracted significant research attention due to its critical applications, including, but not limited to, inventory management, industrial inspections and monitoring, and disaster management. The absence of global navigation satellite system (GNSS) signals in indoor environments poses a significant challenge for UAV navigation. Various sensors have been explored for indoor navigation. For instance, ultra-wideband (UWB) technology has been used for UAV navigation in indoor environments [1,2,3]. UWB-based systems estimate position using signal parameters, such as time of arrival and received signal strength, measured from fixed anchor nodes [4]. Despite its precision, UWB requires additional infrastructure, which can be costly. Light detection and ranging (LiDAR), traditionally used in either 2D or 3D form, offers real-time mapping capabilities. However, 2D LiDAR cannot capture height information, while 3D LiDAR provides comprehensive mapping at the expense of high cost, large size, and high power consumption [5,6,7,8,9].
Cameras, including monocular, stereo, and RGB-D types, are widely used for visual navigation. Monocular navigation methods include direct approaches, feature point matching, and semi-direct approaches [10]. LSD-SLAM and ORB-SLAM are notable examples of these methods [11,12]. Stereo cameras capture scenes from multiple directions to provide accurate 3D information, while RGB-D cameras offer depth information through an infrared transmitter and receiver [13,14]. Despite their advantages, these cameras struggle with illumination changes, motion blur, and limited dynamic range [15,16].
Event cameras, a novel type of sensor, have emerged due to their high temporal resolution, low latency, low power consumption, and high dynamic range [17]. Unlike traditional cameras, event cameras respond to changes in brightness asynchronously, providing a variable data rate for the sequence of events. This has enabled pose estimation using only event cameras [18,19], though real-time application remains challenging due to optimization costs [20,21,22]. Consequently, several approaches have combined event cameras with standard cameras and IMUs, effectively enhancing feature tracking and ensuring robust navigation [23,24,25,26,27].
This study proposes a novel event-based visual–inertial odometry algorithm for precise indoor navigation. In our approach, standard images are utilized for feature detection and tracking, while events are aggregated into frames to track features between consecutive standard frames. The fusion of IMU measurements with feature tracks enables the continuous estimation of sensor states. The proposed method is evaluated and validated using a controlled office environment simulation. The structure of this paper is as follows: Section 2 details the proposed approach, Section 3 discusses the experiments, and Section 4 presents the conclusions.

2. Event-Based VIO Approach

The overall methodology is summarized in the flowchart shown in Figure 1. The proposed VIO approach consists of two primary threads: the front end and the back end. The front-end thread involves detecting and tracking features in standard images, clustering events into spatiotemporal windows between standard frames, and accumulating these events into intermediate frames to improve feature tracking. These tracks are then filtered and triangulated to determine their 3D locations. IMU measurements are fused with the feature tracks in the back-end thread to continuously estimate sensor states. This methodology is implemented using the open-source library dv-mono-vio-sample by iniVation [28].
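To illustrate the front-end event handling, the following minimal Python sketch shows how events could be grouped into overlapping spatiotemporal windows between two standard frames and accumulated into intermediate frames. The function names and the (t, x, y, polarity) event layout are illustrative assumptions, not the dv-mono-vio-sample API.

```python
import numpy as np

def accumulate_events(events, t_start, t_end, height, width):
    """Accumulate events falling in [t_start, t_end) into a single frame.

    `events` is an (N, 4) array of (t, x, y, polarity) rows, with
    polarity in {-1, +1}. The signed accumulation image can be
    normalized and fed to a standard feature tracker.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    mask = (events[:, 0] >= t_start) & (events[:, 0] < t_end)
    for t, x, y, p in events[mask]:
        frame[int(y), int(x)] += p
    return frame

def event_frames_between(events, t_prev, t_next, n_windows, height, width, overlap=0.5):
    """Split the interval between two standard frames into overlapping
    spatiotemporal windows and accumulate each into an intermediate frame."""
    span = (t_next - t_prev) / n_windows
    frames = []
    t = t_prev
    while t + span <= t_next:
        frames.append(accumulate_events(events, t, t + span, height, width))
        t += span * (1.0 - overlap)  # shift by half a window so windows overlap
    return frames
```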
For each standard frame, events are clustered into overlapping spatiotemporal windows, which are then accumulated into event frames. Feature detection uses the FAST corner detector [29] with a bucketing grid to ensure an even spatial distribution of features, and feature tracking employs Lucas–Kanade tracking. Outliers in the feature tracks are filtered using two-point RANSAC. In the back end, feature tracks and IMU measurements are fused to estimate sensor states using the keyframe-based nonlinear optimization approach of OKVIS [30].
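The detection and tracking steps can be sketched with standard OpenCV primitives, as below. This is only an illustrative approximation of the front end: the grid size and thresholds are arbitrary, and OpenCV's fundamental-matrix RANSAC is used as a stand-in for the two-point RANSAC employed in the actual pipeline.

```python
import cv2
import numpy as np

def detect_bucketed_fast(img, grid=(8, 6), per_cell=5, threshold=20):
    """FAST corners with a bucketing grid so features are spread evenly."""
    fast = cv2.FastFeatureDetector_create(threshold=threshold)
    h, w = img.shape[:2]
    cell_h, cell_w = h // grid[1], w // grid[0]
    points = []
    for gy in range(grid[1]):
        for gx in range(grid[0]):
            cell = img[gy * cell_h:(gy + 1) * cell_h, gx * cell_w:(gx + 1) * cell_w]
            kps = sorted(fast.detect(cell, None), key=lambda k: -k.response)[:per_cell]
            points += [(k.pt[0] + gx * cell_w, k.pt[1] + gy * cell_h) for k in kps]
    return np.array(points, dtype=np.float32).reshape(-1, 1, 2)

def track_lk(prev_img, next_img, prev_pts):
    """Lucas-Kanade tracking followed by RANSAC-based outlier rejection
    (fundamental-matrix RANSAC stands in for the paper's two-point RANSAC)."""
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_img, next_img, prev_pts, None)
    good_prev = prev_pts[status.ravel() == 1]
    good_next = next_pts[status.ravel() == 1]
    if len(good_prev) >= 8:
        _, inliers = cv2.findFundamentalMat(good_prev, good_next, cv2.FM_RANSAC, 1.0, 0.99)
        if inliers is not None:
            keep = inliers.ravel() == 1
            good_prev, good_next = good_prev[keep], good_next[keep]
    return good_prev, good_next
```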
The visual–inertial odometry problem is formulated as the joint optimization of a cost function comprising weighted reprojection errors for both standard and accumulated event frames, together with inertial error terms. The Google Ceres optimizer [31] is used to solve this optimization problem.
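In a generic keyframe-based formulation along the lines of OKVIS [30], the cost being minimized can be written as below. The notation is illustrative rather than taken from the implementation: the first term sums reprojection residuals over standard keyframes k and the landmarks j observed in them, the second sums reprojection residuals over accumulated event frames m, and the third sums inertial error terms between consecutive keyframes, each weighted by its information matrix W.

```latex
J(\mathbf{x}) \;=\;
\sum_{k}\sum_{j \in \mathcal{S}(k)} {\mathbf{e}_{r}^{k,j}}^{\top} \mathbf{W}_{r}^{k,j}\, \mathbf{e}_{r}^{k,j}
\;+\;
\sum_{m}\sum_{j \in \mathcal{E}(m)} {\mathbf{e}_{r}^{m,j}}^{\top} \mathbf{W}_{r}^{m,j}\, \mathbf{e}_{r}^{m,j}
\;+\;
\sum_{k} {\mathbf{e}_{s}^{k}}^{\top} \mathbf{W}_{s}^{k}\, \mathbf{e}_{s}^{k}
```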

3. Experiments

In order to evaluate the proposed approach, a 3D indoor office environment was simulated using the Robot Operating System (ROS) and Gazebo [32]. Gazebo is a robust, open-source simulation tool relied on for a wide range of applications due to its flexibility, its library of common features, and its support for complex, distributed environments. It employs a modular design with multiple physics engines, an extensive library of sensors, various user interfaces, and a graphical interface, making it capable of simulating both indoor and outdoor environments with high fidelity [33]. The simulated environment covers an area of 600 m², as illustrated in Figure 2.
The P230 UAV model was used in our simulation and was equipped with simulated sensors, including an RGB camera, an event camera, and an IMU. The event camera data were simulated using the DVS Gazebo plugin [34]. Both the RGB and event cameras capture images at a resolution of 240 × 180 pixels, with a horizontal field of view (FOV) of 103° and a vertical FOV of 97.7°. Figure 3 shows the simulated P230 UAV. The standard frame data were recorded at a rate of 25 Hz, whereas the IMU data were captured at 50 Hz.
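As a rough sketch of how the simulated streams could be consumed on the ROS side, the node below subscribes to the three sensor feeds. It assumes the DVS Gazebo plugin publishes dvs_msgs/EventArray messages alongside the usual sensor_msgs topics; the topic names are illustrative and depend on the UAV's namespace.

```python
import rospy
from sensor_msgs.msg import Image, Imu
from dvs_msgs.msg import EventArray  # event message type assumed for the DVS Gazebo plugin

def on_image(msg):
    rospy.loginfo_throttle(1.0, "standard frame %dx%d at t=%.3f",
                           msg.width, msg.height, msg.header.stamp.to_sec())

def on_events(msg):
    rospy.loginfo_throttle(1.0, "event packet with %d events", len(msg.events))

def on_imu(msg):
    pass  # buffer IMU samples for the back-end optimization

if __name__ == "__main__":
    rospy.init_node("sensor_listener")
    # Topic names are illustrative; adjust to the simulated UAV's namespace.
    rospy.Subscriber("/camera/image_raw", Image, on_image)
    rospy.Subscriber("/dvs/events", EventArray, on_events)
    rospy.Subscriber("/mavros/imu/data", Imu, on_imu)
    rospy.spin()
```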
The dataset was collected by commanding the UAV to follow a pre-determined trajectory. These commands were sent to the UAV through an offboard control node based on the PX4 SITL framework [35] and MAVROS [36]. The PX4 SITL framework supplies the software stack that facilitates interaction with the simulated UAV frame, while MAVROS is a MAVLink-extendable communication node for ROS that retrieves the current state of the model and updates the setpoint position ROS topic accordingly.
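A minimal offboard-control sketch of this setup is shown below, using the standard MAVROS setpoint topic and arming/mode services against PX4 SITL. The waypoint list, node name, and hold times are illustrative, not the trajectory flown in the experiment.

```python
import rospy
from geometry_msgs.msg import PoseStamped
from mavros_msgs.srv import CommandBool, SetMode

def fly_waypoints(waypoints, rate_hz=20.0):
    """Stream position setpoints to PX4 SITL through MAVROS in OFFBOARD mode."""
    rospy.init_node("offboard_control")
    pub = rospy.Publisher("/mavros/setpoint_position/local", PoseStamped, queue_size=10)
    rospy.wait_for_service("/mavros/cmd/arming")
    rospy.wait_for_service("/mavros/set_mode")
    arm = rospy.ServiceProxy("/mavros/cmd/arming", CommandBool)
    set_mode = rospy.ServiceProxy("/mavros/set_mode", SetMode)
    rate = rospy.Rate(rate_hz)

    sp = PoseStamped()
    # PX4 requires a stream of setpoints before it accepts OFFBOARD mode.
    for _ in range(100):
        pub.publish(sp)
        rate.sleep()
    set_mode(custom_mode="OFFBOARD")
    arm(True)

    for x, y, z in waypoints:
        sp.pose.position.x, sp.pose.position.y, sp.pose.position.z = x, y, z
        for _ in range(int(rate_hz * 5)):  # hold each setpoint for ~5 s
            sp.header.stamp = rospy.Time.now()
            pub.publish(sp)
            rate.sleep()

if __name__ == "__main__":
    # Illustrative square trajectory at 1.5 m altitude.
    fly_waypoints([(0, 0, 1.5), (3, 0, 1.5), (3, 3, 1.5), (0, 3, 1.5)])
```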
The proposed approach fuses events, standard frames, and IMU measurements. Features are detected and tracked on the standard frame images; to enhance tracking accuracy, intermediate frames are generated from the events, and tracking is also executed on these frames. IMU measurements are then combined with the feature tracks to estimate the camera states. An example of the detected and tracked features on both event and standard frames is illustrated in Figure 4. Figure 5 and Figure 6 compare the estimated trajectory with the ground truth in 2D and 3D, respectively. The position error statistics for the proposed solution are detailed in Table 1. Compared to the ground truth, the estimated trajectory shows a mean RMSE of 0.21 m, with maximum deviations of 0.28 m in the X direction, 0.38 m in the Y direction, and 0.27 m in the Z direction.
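For reference, per-axis statistics of the kind reported in Table 1 can be computed from an estimated and a ground-truth trajectory with a few lines of NumPy; the function below is a generic sketch, assuming the two trajectories are already time-aligned.

```python
import numpy as np

def position_error_stats(estimate, ground_truth):
    """Per-axis mean absolute error, RMSE, and max error for time-aligned
    (N, 3) position trajectories, plus the overall 3D RMSE."""
    err = estimate - ground_truth                       # (N, 3) signed errors
    mean_abs = np.mean(np.abs(err), axis=0)             # per-axis mean error
    rmse = np.sqrt(np.mean(err ** 2, axis=0))           # per-axis RMSE
    max_err = np.max(np.abs(err), axis=0)               # per-axis max deviation
    total_rmse = np.sqrt(np.mean(np.sum(err ** 2, axis=1)))
    return mean_abs, rmse, max_err, total_rmse
```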

4. Conclusions

In this study, an event-based visual–inertial odometry algorithm was developed for precise indoor UAV navigation, addressing the complexities of GNSS-denied environments. Our approach combines events, standard frames, and IMU measurements, utilizing standard images for feature detection and tracking and intermediate frames generated from events to enhance tracking accuracy. IMU measurements are fused with the feature tracks to estimate camera states continuously. The proposed approach was evaluated and validated using a controlled office environment simulation developed in Gazebo. A simulated P230 drone, equipped with an event camera, an RGB camera, and an IMU, was used to test the algorithm. The UAV followed a pre-determined trajectory using commands sent via an offboard control node based on the PX4 SITL framework and MAVROS. The results demonstrate accurate pose estimation, with a mean RMSE of 0.21 m relative to the ground truth and maximum deviations of 0.28 m in the X direction, 0.38 m in the Y direction, and 0.27 m in the Z direction. These findings confirm the effectiveness of our algorithm in providing accurate and reliable indoor navigation for UAVs.

Author Contributions

Conceptualization and methodology: A.E. and A.E.-R.; software: A.E.; validation: A.E.; formal analysis: A.E.; writing—original draft preparation: A.E.; writing—review and editing: A.E.-R.; supervision: A.E.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under grant RGPIN-2022-03822, the Government of Ontario, Toronto Metropolitan University, and SOTI Inc.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are not publicly available.

Acknowledgments

The authors would like to thank iniVation for making its DV packages available.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Tiemann, J.; Wietfeld, C. Scalable and precise multi-UAV indoor navigation using TDOA-based UWB localization. In Proceedings of the International Conference on Indoor Positioning and Indoor Navigation (IPIN), Sapporo, Japan, 18–21 September 2017; IEEE: New York, NY, USA, 2017. [Google Scholar]
  2. Tiemann, J.; Ramsey, A.; Wietfeld, C. Enhanced UAV Indoor Navigation through SLAM-Augmented UWB Localization. In Proceedings of the IEEE International Conference on Communications Workshops (ICC Workshops), Kansas City, MO, USA, 20–24 May 2018; IEEE: New York, NY, USA, 2018. [Google Scholar]
  3. Raja, G.; Suresh, S.; Anbalagan, S.; Ganapathisubramaniyan, A.; Kumar, N. PFIN: An Efficient Particle Filter-Based Indoor Navigation Framework for UAVs. IEEE Trans. Veh. Technol. 2021, 70, 4984–4992. [Google Scholar] [CrossRef]
  4. Mazhar, F.; Khan, M.G.; Sällberg, B. Precise Indoor Positioning Using UWB: A Review of Methods, Algorithms, and Implementations. Wirel. Pers. Commun. 2017, 97, 4467–4491. [Google Scholar] [CrossRef]
  5. Bachrach, A.; He, R.; Roy, N. Autonomous Flight in Unknown Indoor Environments. Int. J. Micro Air Veh. 2009, 1, 217–228. [Google Scholar] [CrossRef]
  6. Zhang, J.; Singh, S. LOAM: Lidar odometry and mapping in real-time. In Proceedings of the Robotics: Science and Systems, Berkeley, CA, USA, 14–16 July 2014. [Google Scholar]
  7. Bry, A.; Bachrach, A.; Roy, N. State estimation for aggressive flight in GPS-denied environments using onboard sensing. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, MN, USA, 14–18 May 2012; IEEE: New York, NY, USA, 2012. [Google Scholar]
  8. Kumar, G.A.; Patil, A.K.; Patil, R.; Park, S.S.; Chai, Y.H. A LiDAR and IMU Integrated Indoor Navigation System for UAVs and Its Application in Real-Time Pipeline Classification. Sensors 2017, 17, 1268. [Google Scholar] [CrossRef] [PubMed]
  9. Hui, C.; Yousheng, C.; Shing, W.W. Trajectory tracking and formation flight of autonomous UAVs in GPS-denied environments using onboard sensing. In Proceedings of the IEEE Chinese Guidance, Navigation and Control Conference, Yantai, China, 8–10 August 2014; IEEE: New York, NY, USA, 2014. [Google Scholar]
  10. Qu, Y.; Yang, M.; Zhang, J.; Xie, W.; Qiang, B.; Chen, J. An Outline of Multi-Sensor Fusion Methods for Mobile Agents Indoor Navigation. Sensors 2021, 21, 1605. [Google Scholar] [CrossRef]
  11. Engel, J.; Schöps, T.; Cremers, D. LSD-SLAM: Large-Scale Direct Monocular SLAM, Computer Vision—ECCV 2014; Springer International Publishing: Cham, Switzerland, 2014; pp. 834–849. [Google Scholar]
  12. Mur-Artal, R.; Montiel, J.M.M.; Tardos, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robot. 2015, 31, 1147–1163. [Google Scholar] [CrossRef]
  13. Taketomi, T.; Uchiyama, H.; Ikeda, S. Visual SLAM algorithms: A survey from 2010 to 2016. IPSJ Trans. Comput. Vis. Appl. 2017, 9, 16. [Google Scholar] [CrossRef]
  14. Endres, F.; Hess, J.; Sturm, J.; Cremers, D.; Burgard, W. 3-D Mapping with an RGB-D Camera. IEEE Trans. Robot. 2014, 30, 177–187. [Google Scholar] [CrossRef]
  15. Labbé, M.; Michaud, F. RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation. J. Field Robot. 2019, 36, 416–446. [Google Scholar] [CrossRef]
  16. Kerl, C.; Sturm, J.; Cremers, D. Dense visual SLAM for RGB-D cameras. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; IEEE: New York, NY, USA, 2013. [Google Scholar]
  17. Gallego, G.; Delbrück, T.; Orchard, G.; Bartolozzi, C.; Taba, B.; Censi, A.; Leutenegger, S.; Davison, A.J.; Conradt, J.; Daniilidis, K.; et al. Event-Based Vision: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 154–180. [Google Scholar] [CrossRef]
  18. Rebecq, H.; Horstschaefer, T.; Gallego, G.; Scaramuzza, D. EVO: A Geometric Approach to Event-Based 6-DOF Parallel Tracking and Mapping in Real Time. IEEE Robot. Autom. Lett. 2017, 2, 593–600. [Google Scholar] [CrossRef]
  19. Kim, H.; Leutenegger, S.; Davison, A. Real-Time 3D Reconstruction and 6-DoF Tracking with an Event Camera; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar]
  20. Mueggler, E.; Gallego, G.; Rebecq, H.; Scaramuzza, D. Continuous-Time Visual-Inertial Odometry for Event Cameras. IEEE Trans. Robot. 2018, 34, 1425–1440. [Google Scholar] [CrossRef]
  21. Zhu, A.Z.; Atanasov, N.; Daniilidis, K. Event-Based Visual Inertial Odometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017. [Google Scholar]
  22. Rebecq, H.; Horstschaefer, T.; Scaramuzza, D. Real-time Visual-Inertial Odometry for Event Cameras using Keyframe-based Nonlinear Optimization. In Proceedings of the British Machine Vision Conference, London, UK, 4–7 September 2017. [Google Scholar]
  23. Gehrig, D.; Rebecq, H.; Gallego, G.; Scaramuzza, D. EKLT: Asynchronous Photometric Feature Tracking Using Events and Frames. Int. J. Comput. Vis. 2020, 128, 601–618. [Google Scholar] [CrossRef]
  24. Brandli, C.; Berner, R.; Yang, M.; Liu, S.-C.; Delbruck, T. A 240 × 180 130 dB 3 µs Latency Global Shutter Spatiotemporal Vision Sensor. IEEE J. Solid-State Circuits 2014, 49, 2333–2341. [Google Scholar] [CrossRef]
  25. Tedaldi, D.; Gallego, G.; Mueggler, E.; Scaramuzza, D. Feature detection and tracking with the dynamic and active-pixel vision sensor (DAVIS). In Proceedings of the Second International Conference on Event-based Control, Communication, and Signal Processing (EBCCSP), Krakow, Poland, 13–15 June 2016; IEEE: New York, NY, USA, 2016. [Google Scholar]
  26. Kueng, B.; Mueggler, E.; Gallego, G.; Scaramuzza, D. Low-latency visual odometry using event-based feature tracks. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 9–14 October 2016; IEEE: New York, NY, USA, 2016. [Google Scholar]
  27. Vidal, A.R.; Rebecq, H.; Horstschaefer, T.; Scaramuzza, D. Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High-Speed Scenarios. IEEE Robot. Autom. Lett. 2018, 3, 994–1001. [Google Scholar] [CrossRef]
  28. iniVation. Dv-Mono-Vio-Sample. 2022. Available online: https://gitlab.com/inivation/dv/dv-mono-vio-sample (accessed on 21 March 2022).
  29. Rosten, E.; Drummond, T. Machine Learning for High-Speed Corner Detection. In Computer Vision—ECCV 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 430–443. [Google Scholar]
  30. Leutenegger, S.; Furgale, P.; Rabaud, V.; Chli, M.; Konolige, K.; Siegwart, R. Keyframe-based visual-inertial SLAM using nonlinear optimization. In Proceedings of the Robotics: Science and Systems, Berlin, Germany, 24–28 June 2013. [Google Scholar]
  31. Agarwal, S.; Mierle, K. Ceres Solver. 2022. Available online: http://ceres-solver.org (accessed on 13 May 2022).
  32. Koenig, N.; Hsu, J.; Dolha, M.; Howard, A. Gazebo [Computer Software]. 2012. [Google Scholar]
  33. Koenig, N.; Howard, A. Design and use paradigms for Gazebo, an open-source multi-robot simulator. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, 28 September–2 October 2004; pp. 2149–2154. [Google Scholar]
  34. HBPNeurorobotics. gazebo_dvs_plugin [Computer software]. GitHub. Available online: https://github.com/HBPNeurorobotics/gazebo_dvs_plugin (accessed on 1 May 2024).
  35. PX4 Autopilot. Available online: https://px4.io/ (accessed on 1 May 2024).
  36. MAVROS. Available online: http://wiki.ros.org/mavros (accessed on 1 May 2024).
Figure 1. Event-based VIO workflow.
Figure 2. Indoor office environment simulated in Gazebo 9.
Figure 3. P230 simulated UAV.
Figure 4. Detected and tracked features on both standard and event frames.
Figure 5. Comparison of the estimated and ground-truth trajectories in 2D.
Figure 6. Comparison of the estimated and ground-truth trajectories in 3D.
Table 1. Position (m) error statistics.

        Mean    RMSE    Max
X       0.09    0.18    0.28
Y       0.10    0.19    0.38
Z       0.10    0.26    0.27
Total           0.21
