A Review of Visual-Inertial Simultaneous Localization and Mapping from Filtering-Based and Optimization-Based Perspectives
Abstract
1. Introduction
2. Filtering-Based Methods
2.1. Feature Extraction and Tracking
2.1.1. Feature Extraction
2.1.2. Feature Tracking
2.2. Dynamic and Observational Models
2.3. Filtering-Based VIO and VI-SLAM
3. Optimization-Based Methods
3.1. Loop Closure
3.2. Optimization-Based VI-SLAM Algorithms
4. Comparisons between Filtering-Based and Optimization-Based Methods
4.1. Details
4.2. Experiments
5. Development Trends
5.1. SLAM with Deep Learning
5.2. Hardware Integration and Multi-Sensor Fusion
5.3. Active SLAM on Robots
5.4. Applications on Complex Dynamic Environments
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A
Methods | MSCKF | ROVIO | S-MSCKF | OKVIS | VINS-Mono | VIORB |
---|---|---|---|---|---|---|
Platform | Vehicle | UAV | MAV | Car/Helmet | MAV | MAV |
Image | 640 × 480 @14Hz | 752 × 480 @20Hz | 752 × 480 @20Hz | 752 × 480 @20Hz | 752 × 480 @20Hz | 752 × 480 @20Hz |
Environment | outdoor | indoor | indoor/outdoor | outdoor | indoor/outdoor | indoor |
IMU | @100Hz | @200Hz | @200Hz | @200Hz | @100Hz | @200Hz |
Drift rate | 0.31% | ≈1.8% | <0.5% | <0.1% | 0.88% | ≈0 |
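The drift rates in this table are most naturally read as final position error divided by total distance traveled. A minimal sketch of that convention (an assumption on our part; each paper defines its metric slightly differently):

```python
import numpy as np

def drift_rate(estimated_xyz, ground_truth_xyz):
    """Final-position drift as a fraction of distance traveled.

    Both inputs are (N, 3) position arrays in meters, assumed time-aligned
    and expressed in the same reference frame.
    """
    final_error = np.linalg.norm(estimated_xyz[-1] - ground_truth_xyz[-1])
    path_length = np.sum(np.linalg.norm(np.diff(ground_truth_xyz, axis=0), axis=1))
    return final_error / path_length  # e.g., 0.0031 corresponds to the 0.31% above
```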
Appendix B
Datasets | EuRoC Datasets | PennCOSYVIO | Zurich Urban MAV Dataset | TUM VI Benchmark | Canoe Dataset |
---|---|---|---|---|---|
Carrier | MAV | Handheld | MAV | Handheld | Canoe |
Cameras | 1 stereo gray 2 × 752 × 480 (global shutter) @20Hz | 4 RGB 1920 × 1080 @30Hz (rolling shutter), 1 stereo gray 2 × 752 × 480 @20Hz, 1 fisheye gray 640 × 480 @30Hz | 1 RGB 1920 × 1080 @30Hz (rolling shutter) | 1 stereo gray 2 × 1024 × 1024 (global shutter) @20Hz | 1 stereo RGB 2 × 1600 × 1200 (rectified 2 × 600 × 800) @20Hz |
IMUs | ADIS16488 3-axis acc/gyro @200Hz | ADIS16488 3-axis acc/gyro @200Hz, Tango 3-axis acc @128Hz/3-axis gyro @100Hz | 3-axis acc/gyro @10Hz | BMI160 3-axis acc/gyro @200Hz | ADIS16488 3-axis acc/gyro @200Hz |
Environment | indoor | indoor/outdoor | outdoor | indoor/outdoor | outdoor (Sangamon River) |
Ground truth | Leica Multistation/Vicon system | fiducial markers | Pix4D | motion capture pose | GPS |
Statistics | 11 sequences, 0.9 km | 4 sequences, 0.6 km | 1 sequence, 2 km | 28 sequences, 20 km | 28 sequences, 2.7 km |
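All five datasets ship raw IMU streams alongside the images. As a usage sketch, EuRoC's published ASL folder layout stores the IMU as a CSV of timestamp (nanoseconds), gyroscope, and accelerometer columns; the path and column order below follow that format and should be treated as assumptions when adapting to the other datasets:

```python
import csv
import numpy as np

def load_euroc_imu(path="mav0/imu0/data.csv"):
    """Load an EuRoC-format IMU stream.

    Expected columns: timestamp [ns], w_x, w_y, w_z [rad/s], a_x, a_y, a_z [m/s^2].
    """
    rows = []
    with open(path) as f:
        for row in csv.reader(f):
            if row and not row[0].startswith("#"):  # skip the comment header line
                rows.append([float(v) for v in row])
    data = np.array(rows)
    return data[:, 0] * 1e-9, data[:, 1:4], data[:, 4:7]  # t [s], gyro, accel
```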
References
- Smith, R.C.; Cheeseman, P. On the Representation and Estimation of Spatial Uncertainty. Int. J. Robot. Res. 1986, 5, 56–68.
- Smith, R.; Self, M.; Cheeseman, P. Estimating Uncertain Spatial Relationships in Robotics. Mach. Intell. Pattern Recognit. 1988, 5, 435–461.
- Kleeman, L. Advanced sonar and odometry error modeling for simultaneous localisation and map building. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, NV, USA, 27–31 October 2003; pp. 699–704.
- Kohlbrecher, S.; Stryk, O.V.; Meyer, J.; Klingauf, U. A flexible and scalable SLAM system with full 3D motion estimation. In Proceedings of the IEEE International Symposium on Safety, Security, and Rescue Robotics, Kyoto, Japan, 1–2 November 2011; pp. 155–160.
- Davison, A.J.; Reid, I.D.; Molton, N.D.; Stasse, O. MonoSLAM: Real-time single camera SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1052–1067.
- Durrant-Whyte, H.; Bailey, T. Simultaneous Localization and Mapping: Part I. IEEE Robot. Autom. Mag. 2006, 13, 99–110.
- Bailey, T.; Durrant-Whyte, H. Simultaneous Localisation and Mapping (SLAM): Part II: State of the Art. IEEE Robot. Autom. Mag. 2006, 13, 108–117.
- Klein, G.; Murray, D. Parallel Tracking and Mapping for Small AR Workspaces. In Proceedings of the IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara, Japan, 13–16 November 2007; pp. 1–10.
- Milford, M.J.; Wyeth, G.F.; Prasser, D. RatSLAM: A hippocampal model for simultaneous localization and mapping. In Proceedings of the IEEE International Conference on Robotics and Automation, New Orleans, LA, USA, 26 April–1 May 2004; pp. 403–408.
- Newcombe, R.A.; Lovegrove, S.J.; Davison, A.J. DTAM: Dense tracking and mapping in real-time. In Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2320–2327.
- Newcombe, R.A.; Izadi, S.; Hilliges, O.; Molyneaux, D.; Kim, D.; Davison, A.J.; Kohli, P.; Shotton, J.; Hodges, S.; Fitzgibbon, A. KinectFusion: Real-time dense surface mapping and tracking. In Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, Basel, Switzerland, 26–29 October 2011; pp. 127–136.
- Mur-Artal, R.; Montiel, J.M.M.; Tardos, J.D. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Trans. Robot. 2015, 31, 1147–1163.
- Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.D.; Leonard, J.J. Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age. IEEE Trans. Robot. 2016, 32, 1309–1332.
- Lynen, S.; Sattler, T.; Bosse, M.; Hesch, J.; Pollefeys, M.; Siegwart, R. Get Out of My Lab: Large-scale, Real-Time Visual-Inertial Localization. In Proceedings of the Robotics: Science and Systems, Rome, Italy, 13–17 July 2015.
- Schneider, T.; Dymczyk, M.; Fehr, M.; Egger, K.; Lynen, S.; Gilitschenski, I.; Siegwart, R. maplab: An Open Framework for Research in Visual-inertial Mapping and Localization. IEEE Robot. Autom. Lett. 2018, 3, 1418–1425.
- Qin, T.; Li, P.; Shen, S. VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator. arXiv 2017, arXiv:1708.03852.
- Lin, Y.; Gao, F.; Qin, T.; Gao, W.; Liu, T.; Wu, W.; Yang, Z.; Shen, S. Autonomous aerial navigation using monocular visual-inertial fusion. J. Field Robot. 2017, 35, 23–51.
- Li, P.; Qin, T.; Hu, B.; Zhu, F.; Shen, S. Monocular Visual-Inertial State Estimation for Mobile Augmented Reality. In Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, Nantes, France, 9–13 October 2017; pp. 11–21.
- Scaramuzza, D.; Fraundorfer, F. Visual Odometry [Tutorial]. IEEE Robot. Autom. Mag. 2011, 18, 80–92.
- Fraundorfer, F.; Scaramuzza, D. Visual Odometry: Part II: Matching, Robustness, Optimization, and Applications. IEEE Robot. Autom. Mag. 2012, 19, 78–90.
- Fuentes-Pacheco, J.; Ruiz-Ascencio, J.; Rendón-Mancha, J.M. Visual simultaneous localization and mapping: A survey. Artif. Intell. Rev. 2015, 43, 55–81.
- Yousif, K.; Bab-Hadiashar, A.; Hoseinnezhad, R. An Overview to Visual Odometry and Visual SLAM: Applications to Mobile Robotics. Intell. Ind. Syst. 2015, 1, 289–311.
- Paul, M.K.; Wu, K.; Hesch, J.A.; Nerurkar, E.D.; Roumeliotis, S.I. A comparative analysis of tightly-coupled monocular, binocular, and stereo VINS. In Proceedings of the IEEE International Conference on Robotics and Automation, Marina Bay, Singapore, 29 May–3 June 2017; pp. 165–172.
- Weiss, S.; Siegwart, R. Real-time metric state estimation for modular vision-inertial systems. In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 4531–4537.
- Weiss, S.; Scaramuzza, D.; Siegwart, R. Monocular-SLAM-based navigation for autonomous micro helicopters in GPS-denied environments. J. Field Robot. 2011, 28, 854–874.
- Sun, K.; Mohta, K.; Pfrommer, B.; Watterson, M.; Liu, S.; Mulgaonkar, Y.; Taylor, C.J.; Kumar, V. Robust Stereo Visual Inertial Odometry for Fast Autonomous Flight. IEEE Robot. Autom. Lett. 2018, 3, 965–972.
- Li, M.; Mourikis, A.I. Improving the accuracy of EKF-based visual-inertial odometry. In Proceedings of the IEEE International Conference on Robotics and Automation, St. Paul, MN, USA, 14–18 May 2012; pp. 828–835.
- Mourikis, A.I.; Roumeliotis, S.I. A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation. In Proceedings of the IEEE International Conference on Robotics and Automation, Roma, Italy, 10–14 April 2007; pp. 3565–3572.
- Veth, M.M.; Raquet, J. Fusing Low-Cost Image and Inertial Sensors for Passive Navigation. Navigation 2007, 54, 11–20.
- Tardif, J.P.; George, M.; Laverne, M.; Kelly, A. A new approach to vision-aided inertial navigation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 4161–4168.
- Jones, E.S.; Soatto, S. Visual-inertial navigation, mapping and localization: A scalable real-time causal approach. Int. J. Robot. Res. 2011, 30, 407–430.
- Kelly, J.; Sukhatme, G.S. Visual-Inertial Sensor Fusion: Localization, Mapping and Sensor-to-Sensor Self-calibration. Int. J. Robot. Res. 2011, 30, 56–79.
- Achtelik, M.; Achtelik, M.; Weiss, S.; Siegwart, R. Onboard IMU and monocular vision based control for MAVs in unknown in- and outdoor environments. In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3056–3063.
- Weiss, S.M. Vision Based Navigation for Micro Helicopters. Ph.D. Dissertation, ETH Zurich, Zürich, Switzerland, 2012.
- Lupton, T.; Sukkarieh, S. Visual-Inertial-Aided Navigation for High-Dynamic Motion in Built Environments without Initial Conditions. IEEE Trans. Robot. 2012, 28, 61–76.
- Li, M.; Mourikis, A.I. High-precision, consistent EKF-based visual-inertial odometry. Int. J. Robot. Res. 2013, 32, 690–711.
- Lynen, S.; Achtelik, M.W.; Weiss, S.; Chli, M. A robust and modular multi-sensor fusion approach applied to MAV navigation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; pp. 3923–3929.
- Sa, I.; He, H.; Huynh, V.; Corke, P. Monocular vision based autonomous navigation for a cost-effective MAV in GPS-denied environments. In Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Wollongong, Australia, 9–12 July 2013; pp. 1355–1360.
- Weiss, S.; Achtelik, M.W.; Lynen, S.; Chli, M. Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments. In Proceedings of the IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 6–10 May 2013; pp. 957–964.
- Guo, C.X.; Roumeliotis, S.I. IMU-RGBD camera 3D pose estimation and extrinsic calibration: Observability analysis and consistency improvement. In Proceedings of the IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 6–10 May 2013; pp. 2935–2942.
- Guo, C.; Kottas, D.; Dutoit, R.; Ahmed, A.; Li, R.; Roumeliotis, S. Efficient Visual-Inertial Navigation using a Rolling-Shutter Camera with Inaccurate Timestamps. In Proceedings of the Robotics: Science and Systems, Berkeley, CA, USA, 12–16 July 2014.
- Asadi, E.; Bottasso, C.L. Tightly-coupled stereo vision-aided inertial navigation using feature-based motion sensors. Adv. Robot. 2014, 28, 717–729.
- Leutenegger, S.; Lynen, S.; Bosse, M.; Siegwart, R.; Furgale, P. Keyframe-based visual-inertial odometry using nonlinear optimization. Int. J. Robot. Res. 2015, 34, 314–334.
- Leutenegger, S. Unmanned Solar Airplanes: Design and Algorithms for Efficient and Robust Autonomous Operation. Ph.D. Dissertation, ETH Zurich, Zürich, Switzerland, 2014.
- Leutenegger, S.; Furgale, P.; Rabaud, V.; Chli, M.; Konolige, K.; Siegwart, R. Keyframe-Based Visual-Inertial SLAM using Nonlinear Optimization. In Proceedings of the Robotics: Science and Systems, Berlin, Germany, 24–28 June 2013; pp. 789–795.
- Wu, K.; Ahmed, A.; Georgiou, G.; Roumeliotis, S. A Square Root Inverse Filter for Efficient Vision-aided Inertial Navigation on Mobile Devices. In Proceedings of the Robotics: Science and Systems, Rome, Italy, 13–17 July 2015.
- Forster, C.; Carlone, L.; Dellaert, F.; Scaramuzza, D. IMU Preintegration on Manifold for Efficient Visual-Inertial Maximum-a-Posteriori Estimation. In Proceedings of the Robotics: Science and Systems, Rome, Italy, 13–17 July 2015.
- Burri, M.; Oleynikova, H.; Achtelik, M.W.; Siegwart, R. Real-time visual-inertial mapping, re-localization and planning onboard MAVs in unknown environments. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, 28 September–2 October 2015; pp. 1872–1878.
- Brunetto, N.; Salti, S.; Fioraio, N.; Cavallari, T.; Stefano, L.D. Fusion of Inertial and Visual Measurements for RGB-D SLAM on Mobile Devices. In Proceedings of the IEEE International Conference on Computer Vision Workshop, Santiago, Chile, 13–16 December 2015; pp. 148–156.
- Tanskanen, P.; Naegeli, T.; Pollefeys, M.; Hilliges, O. Semi-direct EKF-based monocular visual-inertial odometry. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, 28 September–2 October 2015; pp. 6073–6078.
- Bloesch, M.; Omari, S.; Hutter, M.; Siegwart, R. Robust visual inertial odometry using a direct EKF-based approach. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, 28 September–2 October 2015; pp. 298–304.
- Keivan, N.; Patron-Perez, A.; Sibley, G. Asynchronous Adaptive Conditioning for Visual-Inertial SLAM. Int. J. Robot. Res. 2015, 34.
- Clement, L.E.; Peretroukhin, V.; Lambert, J.; Kelly, J. The Battle for Filter Supremacy: A Comparative Study of the Multi-State Constraint Kalman Filter and the Sliding Window Filter. In Proceedings of the Computer and Robot Vision, Halifax, NS, Canada, 3–5 June 2015; pp. 23–30.
- Huai, J.; Toth, C.K.; Grejner-Brzezinska, D.A. Stereo-inertial odometry using nonlinear optimization. In Proceedings of the International Technical Meeting of the Satellite Division of the Institute of Navigation, Tampa, FL, USA, 14–18 September 2015.
- Concha, A.; Loianno, G.; Kumar, V.; Civera, J. Visual-inertial direct SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 1331–1338.
- Usenko, V.; Engel, J.; Stückler, J.; Cremers, D. Direct visual-inertial odometry with stereo cameras. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 1885–1892.
- Munguía, R.; Nuño, E.; Aldana, C.I.; Urzua, S. A Visual-aided Inertial Navigation and Mapping System. Int. J. Adv. Robot. Syst. 2016, 13, 94.
- Falquez, J.M.; Kasper, M.; Sibley, G. Inertial aided dense & semi-dense methods for robust direct visual odometry. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Daejeon, Korea, 9–14 October 2016; pp. 3601–3607.
- Palézieux, N.D.; Nägeli, T.; Hilliges, O. Duo-VIO: Fast, light-weight, stereo inertial odometry. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Daejeon, Korea, 9–14 October 2016; pp. 2237–2242.
- Mur-Artal, R.; Tardós, J.D. Visual-Inertial Monocular SLAM with Map Reuse. IEEE Robot. Autom. Lett. 2017, 2, 796–803.
- Laidlow, T.; Bloesch, M.; Li, W.; Leutenegger, S. Dense RGB-D-inertial SLAM with map deformations. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada, 24–28 September 2017; pp. 6741–6748.
- Fang, W.; Zheng, L.; Deng, H.; Zhang, H. Real-Time Motion Tracking for Mobile Augmented/Virtual Reality Using Adaptive Visual-Inertial Fusion. Sensors 2017, 17, 1037.
- Bloesch, M.; Burri, M.; Omari, S.; Hutter, M.; Siegwart, R. Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback. Int. J. Robot. Res. 2017, 36, 1053–1072.
- Sa, I.; Kamel, M.; Burri, M.; Bloesch, M.; Khanna, R.; Popovic, M.; Nieto, J.; Siegwart, R. Build Your Own Visual-Inertial Drone: A Cost-Effective and Open-Source Autonomous Drone. IEEE Robot. Autom. Mag. 2017, 25, 89–103.
- Piao, J.; Kim, S. Adaptive Monocular Visual-Inertial SLAM for Real-Time Augmented Reality Applications in Mobile Devices. Sensors 2017, 17, 2567.
- Liu, Y.; Chen, Z.; Zheng, W.; Wang, H.; Liu, J. Monocular Visual-Inertial SLAM: Continuous Preintegration and Reliable Initialization. Sensors 2017, 17, 2613.
- Hesch, J.A.; Kottas, D.G.; Bowman, S.L.; Roumeliotis, S.I. Consistency Analysis and Improvement of Vision-aided Inertial Navigation. IEEE Trans. Robot. 2014, 30, 158–176.
- Clark, R.; Wang, S.; Wen, H.; Markham, A.; Trigoni, N. VINet: Visual-Inertial Odometry as a Sequence-to-Sequence Learning Problem. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
- Vidal, A.R.; Rebecq, H.; Horstschaefer, T.; Scaramuzza, D. Hybrid, Frame and Event based Visual Inertial Odometry for Robust, Autonomous Navigation of Quadrotors. arXiv 2017, arXiv:1709.06310.
- Yang, Z.; Gao, F.; Shen, S. Real-time monocular dense mapping on aerial robots using visual-inertial fusion. In Proceedings of the IEEE International Conference on Robotics and Automation, Marina Bay, Singapore, 29 May–3 June 2017; pp. 4552–4559.
- Kasyanov, A.; Engelmann, F.; Stückler, J.; Leibe, B. Keyframe-Based Visual-Inertial Online SLAM with Relocalization. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada, 24–28 September 2017.
- Zhang, Z.; Liu, S.; Tsai, G.; Hu, H.; Chu, C.C.; Zheng, F. PIRVS: An Advanced Visual-Inertial SLAM System with Flexible Sensor Fusion and Hardware Co-Design. arXiv 2017, arXiv:1710.00893.
- Chen, C.; Zhu, H. Visual-inertial SLAM method based on optical flow in a GPS-denied environment. Ind. Robot Int. J. 2018, 45, 401–406.
- Liu, H.; Chen, M.; Zhang, G.; Bao, H.; Bao, Y. ICE-BA: Incremental, Consistent and Efficient Bundle Adjustment for Visual-Inertial SLAM. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1974–1982.
- Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. High-Speed Tracking with Kernelized Correlation Filters. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 583–596.
- Yang, Z.; Shen, S. Monocular Visual-Inertial State Estimation with Online Initialization and Camera-IMU Extrinsic Calibration. IEEE Trans. Autom. Sci. Eng. 2017, 14, 39–51.
- Engel, J.; Koltun, V.; Cremers, D. Direct Sparse Odometry. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 611–625.
- Harris, C.; Stephens, M. A combined corner and edge detector. In Proceedings of the Alvey Vision Conference, Manchester, UK, September 1988; pp. 147–151.
- Rosten, E.; Porter, R.; Drummond, T. Faster and better: A machine learning approach to corner detection. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 105–119.
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
- Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Bay, H.; Ess, A.; Tuytelaars, T.; Gool, L.V. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
- Brito, D.N.; Nunes, C.F.G.; Padua, F.L.C.; Lacerda, A. Evaluation of Interest Point Matching Methods for Projective Reconstruction of 3D Scenes. IEEE Lat. Am. Trans. 2016, 14, 1393–1400.
- Gao, X.; Zhang, T. Robust RGB-D simultaneous localization and mapping using planar point features. Robot. Auton. Syst. 2015, 72, 1–14.
- Yang, S.; Song, Y.; Kaess, M.; Scherer, S. Pop-up SLAM: Semantic monocular plane SLAM for low-texture environments. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Daejeon, Korea, 9–14 October 2016; pp. 1222–1229.
- Kong, X.; Wu, W.; Zhang, L.; Wang, Y. Tightly-Coupled Stereo Visual-Inertial Navigation Using Point and Line Features. Sensors 2015, 15, 12816–12833.
- Yang, S.; Scherer, S. Direct monocular odometry using points and lines. In Proceedings of the IEEE International Conference on Robotics and Automation, Marina Bay, Singapore, 29 May–3 June 2017; pp. 3871–3877.
- Zhang, G.; Lee, J.H.; Lim, J.; Suh, I.H. Building a 3-D Line-Based Map Using Stereo SLAM. IEEE Trans. Robot. 2015, 31, 1364–1377.
- Enkelmann, W. Investigation of multigrid algorithms for the estimation of optical flow fields in image sequences. Comput. Vis. Graph. Image Process. 1988, 43, 150–177.
- Hassen, W.; Amiri, H. Block Matching Algorithms for motion estimation. In Proceedings of the IEEE International Conference on E-Learning in Industrial Electronics, Vienna, Austria, 10–13 November 2013; pp. 136–139.
- Weng, J. A theory of image matching. In Proceedings of the International Conference on Computer Vision, Osaka, Japan, 4–7 December 1990; pp. 200–209.
- Holmgren, D.E. An invitation to 3-D vision: From images to geometric models. Photogramm. Rec. 2004, 19, 415–416.
- Sibley, G.; Matthies, L.; Sukhatme, G. Sliding window filter with application to planetary landing. J. Field Robot. 2010, 27, 587–608.
- Baker, S.; Matthews, I. Lucas-Kanade 20 Years On: A Unifying Framework. Int. J. Comput. Vis. 2004, 56, 221–255.
- Leutenegger, S.; Chli, M.; Siegwart, R.Y. BRISK: Binary Robust Invariant Scalable Keypoints. In Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2548–2555.
- Alahi, A.; Ortiz, R.; Vandergheynst, P. FREAK: Fast Retina Keypoint. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 510–517.
- Quan, M.; Piao, S.; Tan, M.; Huang, S.S. Map-Based Visual-Inertial Monocular SLAM using Inertial assisted Kalman Filter. arXiv 2017, arXiv:1706.03648v2.
- Labbé, M.; Michaud, F. Long-term online multi-session graph-based SPLAM with memory management. Auton. Robot. 2018, 42, 1133–1150.
- Kümmerle, R.; Grisetti, G.; Strasdat, H.; Konolige, K.; Burgard, W. g2o: A general framework for graph optimization. In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3607–3613.
- Hess, W.; Kohler, D.; Rapp, H.; Andor, D. Real-time loop closure in 2D LIDAR SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 1271–1278.
- Carlone, L.; Kira, Z.; Beall, C.; Indelman, V. Eliminating conditionally independent sets in factor graphs: A unifying perspective based on smart factors. In Proceedings of the IEEE International Conference on Robotics and Automation, Hong Kong, China, 31 May–7 June 2014; pp. 4290–4297.
- Burri, M.; Nikolic, J.; Gohl, P.; Schneider, T.; Rehder, J.; Omari, S.; Achtelik, M.W.; Siegwart, R. The EuRoC micro aerial vehicle datasets. Int. J. Robot. Res. 2016, 35, 1157–1163.
- Miller, M.; Chung, S.J.; Hutchinson, S. The Visual-Inertial Canoe Dataset. Int. J. Robot. Res. 2018, 37, 13–20.
- Majdik, A.L.; Till, C.; Scaramuzza, D. The Zurich urban micro aerial vehicle dataset. Int. J. Robot. Res. 2017, 36, 269–273.
- Schubert, D.; Goll, T.; Demmel, N.; Usenko, V.; Stückler, J.; Cremers, D. The TUM VI Benchmark for Evaluating Visual-Inertial Odometry. arXiv 2018, arXiv:1804.06120.
- Pfrommer, B.; Sanket, N.; Daniilidis, K.; Cleveland, J. PennCOSYVIO: A challenging Visual Inertial Odometry benchmark. In Proceedings of the IEEE International Conference on Robotics and Automation, Marina Bay, Singapore, 29 May–3 June 2017; pp. 3847–3854.
- Beeson, P.; Modayil, J.; Kuipers, B. Factoring the Mapping Problem: Mobile Robot Map-building in the Hybrid Spatial Semantic Hierarchy. Int. J. Robot. Res. 2010, 29, 428–459.
- Lowry, S.; Sünderhauf, N.; Newman, P.; Leonard, J.J.; Cox, D.; Corke, P.; Milford, M.J. Visual Place Recognition: A Survey. IEEE Trans. Robot. 2016, 32, 1–19.
- Galvez-López, D.; Tardos, J.D. Bags of Binary Words for Fast Place Recognition in Image Sequences. IEEE Trans. Robot. 2012, 28, 1188–1197.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Gao, X.; Zhang, T. Unsupervised learning to detect loops using deep neural networks for visual SLAM system. Auton. Robot. 2017, 41, 1–18.
- Arandjelovic, R.; Gronat, P.; Torii, A.; Pajdla, T.; Sivic, J. NetVLAD: CNN architecture for weakly supervised place recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1437–1451.
- Kim, A.; Eustice, R.M. Active visual SLAM for robotic area coverage: Theory and experiment. Int. J. Robot. Res. 2015, 34, 457–475.
- Thrun, S. Exploration in Active Learning; MIT Press: Cambridge, MA, USA, 1995; pp. 381–384.
- Engel, J.; Stückler, J.; Cremers, D. Large-scale direct SLAM with stereo cameras. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, 28 September–2 October 2015; pp. 1935–1942.
- Tateno, K.; Tombari, F.; Laina, I.; Navab, N. CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6565–6574.
- Rambach, J.R.; Tewari, A.; Pagani, A.; Stricker, D. Learning to Fuse: A Deep Learning Approach to Visual-Inertial Camera Pose Estimation. In Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, Merida, Mexico, 23–26 September 2016; pp. 71–76.
- Shamwell, E.J.; Leung, S.; Nothwang, W.D. Vision-Aided Absolute Trajectory Estimation Using an Unsupervised Deep Network with Online Error Correction. arXiv 2018, arXiv:1803.05850.
- Gregorio, D.D.; Stefano, L.D. SkiMap: An efficient mapping framework for robot navigation. In Proceedings of the IEEE International Conference on Robotics and Automation, Marina Bay, Singapore, 29 May–3 June 2017; pp. 2569–2576.
- Jeong, J.; Cho, Y.; Kim, A. Road-SLAM: Road marking based SLAM with lane-level accuracy. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; pp. 1473–1736.
- Huang, J.; Dai, A.; Guibas, L.; Niessner, M. 3DLite: Towards commodity 3D scanning for content creation. ACM Trans. Graph. 2017, 36, 1–14.
- Abouzahir, M.; Elouardi, A.; Latif, R.; Bouaziz, S.; Tajer, A. Embedding SLAM algorithms: Has it come of age? Robot. Auton. Syst. 2018, 100, 14–26.
- Yousef, K.A.M.; Mohd, B.J.; Al-Widyan, K.; Hayajneh, T. Extrinsic Calibration of Camera and 2D Laser Sensors without Overlap. Sensors 2017, 17, 2346.
- Zhang, J.; Singh, S. Visual-lidar odometry and mapping: Low-drift, robust, and fast. In Proceedings of the IEEE International Conference on Robotics and Automation, Seattle, WA, USA, 26–30 May 2015; pp. 2174–2181.
- Rodríguez-Arévalo, M.L.; Neira, J.; Castellanos, J.A. On the Importance of Uncertainty Representation in Active SLAM. IEEE Trans. Robot. 2018, 34, 829–834.
- Parulkar, A.; Shukla, P.; Krishna, K.M. Fast randomized planner for SLAM automation. In Proceedings of the IEEE International Conference on Automation Science and Engineering, Fort Worth, TX, USA, 21–24 August 2012; pp. 765–770.
- Carlone, L.; Du, J.; Ng, M.K.; Bona, B.; Indri, M. Active SLAM and Exploration with Particle Filters Using Kullback-Leibler Divergence. J. Intell. Robot. Syst. 2014, 75, 291–311.
- Lai, K.; Fox, D. Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation. Int. J. Robot. Res. 2010, 29, 1019–1036.
- Indelman, V.; Carlone, L.; Dellaert, F. Planning in the Continuous Domain: A Generalized Belief Space Approach for Autonomous Navigation in Unknown Environments. Int. J. Robot. Res. 2015, 34, 1021–1029.
- Berg, J.V.D.; Patil, S.; Alterovitz, R. Motion planning under uncertainty using iterative local optimization in belief space. Int. J. Robot. Res. 2012, 31, 1263–1278.
- Saarinen, J.P.; Andreasson, H.; Stoyanov, T.; Lilienthal, A.J. 3D Normal Distributions Transform Occupancy Maps: An Efficient Representation for Mapping in Dynamic Environments. Int. J. Robot. Res. 2013, 32, 1627–1644.
- Zou, D.; Tan, P. CoSLAM: Collaborative Visual SLAM in Dynamic Environments. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 354–366.
Year | Paper | Back-End Approach | Camera Type | Fusion Type | Application |
---|---|---|---|---|---|
2007 | MSCKF [28] | filtering-based | monocular | tightly coupled | |
2007 | [29] | filtering-based | monocular | tightly coupled | |
2010 | [30] | filtering-based | stereo | loosely coupled | |
2011 | [31] | filtering-based | monocular | tightly coupled | vehicle |
2011 | [32] | filtering-based | monocular | tightly coupled | |
2011 | [24,25] | filtering-based | monocular | loosely coupled | |
2011 | [33] | filtering-based | monocular | loosely coupled | MAV |
2012 | [27] | filtering-based | monocular | tightly coupled | vehicle |
2012 | [34] | filtering-based | monocular | loosely coupled | |
2012 | [35] | filtering-based | stereo | tightly coupled | |
2013 | [36] | filtering-based | monocular | tightly coupled | vehicle |
2013 | [37] | filtering-based | monocular | loosely coupled | |
2013 | [38] | filtering-based | monocular | loosely coupled | MAV |
2013 | [39] | filtering-based | monocular | loosely coupled | |
2013 | [40] | filtering-based | rgb-d | tightly coupled | |
2014 | [41] | filtering-based | monocular | tightly coupled | mobile phone |
2014 | [42] | filtering-based | stereo | tightly coupled | |
2015 | OKVIS [43,44,45] | optimization-based | monocular | tightly coupled | |
2015 | SR-ISWF [46] | filtering-based | monocular | tightly coupled | mobile phone |
2015 | [47] | optimization-based | monocular | tightly coupled | |
2015 | [48] | optimization-based | stereo | tightly coupled | MAV |
2015 | [49] | optimization-based | rgb-d | loosely coupled | mobile devices |
2015 | [50] | filtering-based | monocular | tightly coupled | |
2015 | ROVIO [51] | filtering-based | monocular | tightly coupled | UAV |
2015 | [52] | optimization-based | monocular | tightly coupled | autonomous vehicle |
2015 | [53] | filtering-based | stereo | tightly coupled | |
2015 | [54] | optimization-based | stereo | tightly coupled | |
2016 | [55] | optimization-based | monocular | tightly coupled | |
2016 | [56] | optimization-based | stereo | tightly coupled | |
2016 | [57] | filtering-based | monocular | loosely coupled | robot |
2016 | [58] | optimization-based | rgb-d | loosely coupled | |
2016 | [59] | filtering-based | stereo | loosely coupled | |
2016 | VIORB [60] | optimization-based | monocular | tightly coupled | MAV |
2017 | [61] | optimization-based | rgb-d | tightly coupled | |
2017 | [62] | filtering-based | monocular | loosely coupled | AR/VR |
2017 | [63] | filtering-based | multi-camera | tightly coupled | MAV |
2017 | [64] | filtering-based | monocular | tightly coupled | UAV |
2017 | VINS-Mono [16,17,18] | optimization-based | monocular | tightly coupled | MAV, AR |
2017 | [65] | optimization-based | monocular | tightly coupled | AR |
2017 | [66] | optimization-based | monocular | tightly coupled | |
2017 | [67] | filtering-based | monocular | tightly coupled | MAV |
2017 | VINet [68] | end-to-end | monocular | / | deep-learning |
2017 | [69] | optimization-based | event camera | tightly coupled | |
2017 | S-MSCKF [26] | filtering-based | stereo | tightly coupled | MAV |
2017 | [70] | optimization-based | monocular | tightly coupled | MAV |
2017 | [71] | optimization-based | stereo/monocular | tightly coupled | |
2017 | PIRVS [72] | filtering-based | stereo | tightly coupled | robot |
2017 | Maplab [14,15] | filtering-based | monocular | tightly coupled | mobile platform |
2018 | [73] | optimization-based | stereo | tightly coupled | mobile robot |
2018 | [74] | optimization-based | stereo | tightly coupled | |
No. | Method | Strategy | Papers |
---|---|---|---|
1 | feature extraction | descriptor matching | [28,60] |
2 | feature extraction | filter-based tracking | [75] |
3 | feature extraction | optical flow tracking | [26,76] |
4 | direct pixel processing | / | [56,77] |
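Strategy 3, optical flow tracking as used by S-MSCKF [26] and VINS-Mono, typically pairs a corner detector with pyramidal Lucas-Kanade tracking. A minimal OpenCV sketch (parameter values are illustrative, not taken from either system):

```python
import cv2

def track_features(prev_gray, curr_gray, prev_pts):
    """Track corners between consecutive grayscale frames with pyramidal LK."""
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_pts, None, winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1          # keep only successfully tracked points
    return prev_pts[good], curr_pts[good]

# Typical usage: detect Shi-Tomasi corners once, then track frame to frame.
# prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
#                                    qualityLevel=0.01, minDistance=20)
```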
- Propagation: for each IMU measurement received, propagate the filter state and covariance.
- Image registration: every time a new image is recorded, augment the state and covariance matrix with a copy of the current camera pose estimate, and start the image-processing module.
- Update: when the feature measurements of a given image become available, perform an EKF update.
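These three steps reduce to standard EKF algebra plus MSCKF's stochastic cloning. A minimal sketch (notation ours, not the authors' code: F is the linearized state transition, Q the process noise, J a selector matrix that copies the current camera pose out of the state, r the stacked feature residuals with Jacobian H and noise R):

```python
import numpy as np

def propagate(x, P, F, Q):
    """Propagation: push state and covariance through the linearized dynamics."""
    return F @ x, F @ P @ F.T + Q

def augment(x, P, J):
    """Image registration (stochastic cloning): append a copy of the current
    camera pose, J @ x, so later feature updates can constrain past poses."""
    A = np.vstack([np.eye(len(x)), J])
    return A @ x, A @ P @ A.T

def update(x, P, r, H, R):
    """Update: standard EKF correction from stacked feature residuals."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ r, (np.eye(len(x)) - K @ H) @ P
```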
Methods | Optimization Function | Initialization | Optimization Strategies |
---|---|---|---|
OKVIS | reprojection error and IMU temporal error term | uses IMU measurements to obtain a preliminary, uncertain estimate of the states | Gauss-Newton algorithm; Schur complement; sliding window |
Paper [56] | photometric error and IMU non-linear error terms | initializes the depth map with the propagated depth from the previous keyframe | Levenberg-Marquardt algorithm; Schur complement; partial marginalization |
Paper [55] | photometric error and IMU inertial residual | / | Gauss-Newton algorithm; Schur complement |
VIORB | reprojection error of all matched points and IMU error term | uses vision first, then initializes scale, gravity direction, velocity, and accelerometer and gyroscope biases | Gauss-Newton algorithm; local bundle adjustment in local mapping |
VINS-Mono | reprojection error and IMU residual | uses a loosely-coupled sensor fusion method to obtain initial values, then aligns metric IMU pre-integration with the visual-only SfM results to recover scale, gravity, velocity, and bias | Gauss-Newton algorithm; Schur complement; sliding window; two-way marginalization scheme |
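Despite the different residual choices, every method in this table minimizes the same kind of joint cost over a window of states; in generic form (notation ours, not any single paper's):

```latex
J(\mathcal{X}) =
\sum_{i}\sum_{j \in \mathcal{V}(i)}
  \left\| \mathbf{z}_{ij} - \pi(\mathbf{T}_i, \mathbf{p}_j) \right\|^{2}_{\boldsymbol{\Sigma}^{\mathrm{cam}}_{ij}}
+ \sum_{i}
  \left\| \mathbf{r}_{\mathrm{IMU}}(\mathbf{x}_i, \mathbf{x}_{i+1}) \right\|^{2}_{\boldsymbol{\Sigma}^{\mathrm{IMU}}_{i}}
```

where the first term is the visual (reprojection or photometric) error of landmark p_j observed in frame i, and the second is the IMU (pre-integration) residual linking consecutive states.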
Sequence | ROVIO | S-MSCKF | OKVIS | VINS-Mono | VIORB |
---|---|---|---|---|---|
MH_01_easy | 25.32 | 33.19 | 47.32 | 39.12 | 30.82 |
MH_02_easy | 26.06 | 29.01 | 45.14 | 39.80 | 32.34 |
MH_03_medium | 26.53 | 27.51 | 49.01 | 40.48 | 36.52 |
MH_04_difficult | 25.73 | 27.91 | 48.44 | 40.03 | 33.07 |
MH_05_difficult | 26.61 | 29.61 | 45.74 | 39.05 | 36.06 |
V1_01_easy | 27.41 | 29.59 | 40.66 | 41.23 | 27.82 |
V1_02_medium | 27.00 | 30.61 | 44.58 | 35.59 | 32.44
V1_03_difficult | 29.69 | 30.86 | 63.30 | 33.95 | 31.61 |
V2_01_easy | 27.04 | 30.83 | 49.67 | 37.55 | 27.55 |
V2_02_medium | 26.89 | 28.29 | 52.94 | 36.30 | 32.07
V2_03_difficult | 27.29 | 27.18 | 56.74 | 34.56 | 32.23 |
Average | 26.87 | 29.51 | 49.41 | 37.97 | 32.05 |
Sequence | ROVIO | S-MSCKF | OKVIS | VINS-Mono | VIORB |
---|---|---|---|---|---|
MH_01_easy | 0.236 | 0.227 | 0.164 | 0.062 | 0.034 |
MH_02_easy | 0.247 | 0.231 | 0.187 | 0.078 | 0.049 |
MH_03_medium | 0.427 | 0.2011 | 0.274 | 0.045 | 0.040 |
MH_04_difficult | 1.170 | 0.351 | 0.375 | 0.134 | 0.111 |
MH_05_difficult | 0.863 | 0.213 | 0.432 | 0.088 | 0.269 |
V1_01_easy | 0.216 | 0.062 | 0.224 | 0.045 | 0.064 |
V1_02_medium | 0.210 | 0.161 | 0.176 | 0.045 | 0.079
V1_03_difficult | 0.381 | 0.281 | 0.193 | 0.088 | 0.212 |
V2_01_easy | 0.298 | 0.074 | 0.176 | 0.057 | 0.150 |
V2_02_medium | 0.232 | 0.152 | 0.181 | 0.114 | 0.183
V2_03_difficult | 0.263 | 0.366 | 0.316 | 0.109 | / |
Average | 0.413 | 0.211 | 0.245 | 0.079 | 0.119 |
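If, as the magnitudes suggest, this table reports root-mean-square absolute trajectory error in meters (the usual EuRoC convention), the metric reduces to a few lines once the estimate has been time-associated and aligned to ground truth; a sketch under that assumption:

```python
import numpy as np

def rmse_ate(estimated_xyz, ground_truth_xyz):
    """RMSE of the absolute trajectory error over time-associated,
    frame-aligned (N, 3) position arrays, in meters."""
    errors = np.linalg.norm(estimated_xyz - ground_truth_xyz, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))
```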
Sequence | ROVIO | S-MSCKF | OKVIS | VINS-Mono | VIORB |
---|---|---|---|---|---|
MH_01_easy | 14.86 | 14.14 | 11.03 | 17.09 | 12.52 |
MH_02_easy | 15.03 | 14.22 | 11.22 | 16.95 | 12.53 |
MH_03_medium | 15.04 | 14.22 | 11.03 | 17.74 | 12.48 |
MH_04_difficult | 15.04 | 14.24 | 11.10 | 17.05 | 12.77 |
MH_05_difficult | 15.29 | 14.33 | 11.22 | 18.61 | 13.72 |
V1_01_easy | 12.86 | 14.04 | 11.28 | 17.92 | 12.72 |
V1_02_medium | 14.87 | 13.79 | 11.63 | 17.03 | 12.59
V1_03_difficult | 14.30 | 13.82 | 11.67 | 17.80 | 12.65 |
V2_01_easy | 13.53 | 14.06 | 11.69 | 16.96 | 12.46 |
V2_02_medium | 14.37 | 14.08 | 11.81 | 17.54 | 12.32
V2_03_difficult | 14.82 | 14.09 | 12.11 | 17.26 | 12.48 |
Average | 14.55 | 14.04 | 11.43 | 17.45 | 12.65 |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).