Role of Deep Learning in Loop Closure Detection for Visual and Lidar SLAM: A Survey
Abstract
1. Introduction
2. Taxonomy of Loop Closure Detection
2.1. Vision-Based Loop Closure Detection
2.1.1. Image-to-Image Matching
Methods with Offline Vocabulary
Methods with Online Vocabulary
2.1.2. Map-to-Map Matching
2.1.3. Image-to-Map Matching
2.2. Lidar-Based Loop Closure Detection
2.2.1. Histograms
2.2.2. Segmentation
3. Role of Deep Learning in Loop Closure Detection
3.1. Vision-Based Loop Closing
3.2. Lidar-Based Loop Closing
4. Challenges of Loop Closure Detection and Role of Deep Learning
4.1. Perceptual Aliasing
4.2. Variation in Environmental Conditions
4.3. Dynamic Environment
4.4. Real-Time Loop Detection
5. Conclusions and Future Research Directions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.D.; Leonard, J.J. Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age. IEEE Trans. Robot. 2016, 32, 1309–1332.
- Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
- Li, S.; Zhang, T.; Gao, X.; Wang, D.; Xian, Y. Semi-direct monocular visual and visual-inertial SLAM with loop closure detection. Robot. Auton. Syst. 2019, 112, 201–210.
- Mur-Artal, R.; Montiel, J.M.M.; Tardos, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robot. 2015, 31, 1147–1163.
- Zhu, H.; Wang, H.; Chen, W.; Wu, R. Depth estimation for deformable object using a multi-layer neural network. In Proceedings of the 2017 IEEE International Conference on Real-time Computing and Robotics (RCAR); Okinawa, Japan, 14–18 July 2017, Volume 2017, pp. 477–482.
- Stumm, E.; Mei, C.; Lacroix, S.; Nieto, J.; Hutter, M.; Siegwart, R. Robust Visual Place Recognition with Graph Kernels. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Las Vegas, NV, USA, 27–30 June 2016, Volume 2016, pp. 4535–4544.
- Cummins, M.; Newman, P. FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance. Int. J. Robot. Res. 2008, 27, 647–665.
- Gálvez-López, D.; Tardos, J.D. Bags of Binary Words for Fast Place Recognition in Image Sequences. IEEE Trans. Robot. 2012, 28, 1188–1197.
- Garcia-Fidalgo, E.; Ortiz, A. IBoW-LCD: An Appearance-Based Loop-Closure Detection Approach Using Incremental Bags of Binary Words. IEEE Robot. Autom. Lett. 2018, 3, 3051–3057.
- Chen, D.M.; Tsai, S.S.; Chandrasekhar, V.; Takacs, G.; Vedantham, R.; Grzeszczuk, R.; Girod, B. Inverted Index Compression for Scalable Image Matching. In Proceedings of the 2010 Data Compression Conference; Snowbird, UT, USA, 24–26 March 2010, p. 525.
- Naseer, T.; Ruhnke, M.; Stachniss, C.; Spinello, L.; Burgard, W. Robust visual SLAM across seasons. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Hamburg, Germany, 28 September–2 October 2015, Volume 2015, pp. 2529–2535.
- Zhang, X.; Su, Y.; Zhu, X. Loop closure detection for visual SLAM systems using convolutional neural network. In Proceedings of the 23rd International Conference on Automation and Computing (ICAC); Huddersfield, UK, 7–8 September 2017, pp. 1–6.
- Shin, D.-W.; Ho, Y.-S. Loop Closure Detection in Simultaneous Localization and Mapping Using Learning Based Local Patch Descriptor. Electron. Imaging 2018, 2018, 284–291.
- Qin, H.; Huang, M.; Cao, J.; Zhang, X. Loop closure detection in SLAM by combining visual CNN features and submaps. In Proceedings of the 4th International Conference on Control, Automation and Robotics, ICCAR; Auckland, New Zealand, 20–23 April 2018, pp. 426–430.
- Memon, A.R.; Wang, H.; Hussain, A. Loop closure detection using supervised and unsupervised deep neural networks for monocular SLAM systems. Robot. Auton. Syst. 2020, 126, 103470.
- Azzam, R.; Taha, T.; Huang, S.; Zweiri, Y. Feature-based visual simultaneous localization and mapping: A survey. SN Appl. Sci. 2020, 2, 1–24.
- Taketomi, T.; Uchiyama, H.; Ikeda, S. Visual SLAM algorithms: A survey from 2010 to 2016. IPSJ Trans. Comput. Vis. Appl. 2017, 9, 16.
- Sualeh, M.; Kim, G.-W. Simultaneous Localization and Mapping in the Epoch of Semantics: A Survey. Int. J. Control. Autom. Syst. 2019, 17, 729–742.
- Chen, C.; Wang, B.; Lu, C.X.; Trigoni, N.; Markham, A. A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence. arXiv 2020, arXiv:2006.12567.
- Thrun, S. Probabilistic Robotics. Kybernetes 2006, 35, 1299–1300.
- Grisetti, G.; Kummerle, R.; Stachniss, C.; Burgard, W. A Tutorial on Graph-Based SLAM. IEEE Intell. Transp. Syst. Mag. 2010, 2, 31–43.
- Scaramuzza, D.; Fraundorfer, F. Tutorial: Visual odometry. IEEE Robot. Autom. Mag. 2011, 18, 80–92.
- Saputra, M.R.U.; Markham, A.; Trigoni, N. Visual SLAM and structure from motion in dynamic environments: A survey. ACM Comput. Surveys 2018, 51.
- Eade, E.; Drummond, T. Unified Loop Closing and Recovery for Real Time Monocular SLAM. In Proceedings of the British Machine Vision Conference 2008; BMVA Press: London, UK, 2008; Volume 13, p. 6.
- Burgard, W.; Brock, O.; Stachniss, C. Mapping Large Loops with a Single Hand-Held Camera. In Robotics: Science and Systems III; MIT Press: Cambridge, MA, USA, 2008; pp. 297–304.
- Williams, B.; Cummins, M.; Neira, J.; Newman, P.; Reid, I.; Tardos, J. An image-to-map loop closing method for monocular SLAM. In Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems; IEEE: Piscataway, NJ, USA, 2008; pp. 2053–2059.
- Williams, B.; Cummins, M.; Neira, J.; Newman, P.; Reid, I.; Tardós, J. A comparison of loop closing techniques in monocular SLAM. Robot. Auton. Syst. 2009, 57, 1188–1197.
- Nister, D.; Stewenius, H. Scalable Recognition with a Vocabulary Tree. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06); IEEE: Piscataway, NJ, USA, 2006; Volume 2, pp. 2161–2168.
- Likas, A.; Vlassis, N.; Verbeek, J.J. The global k-means clustering algorithm. Pattern Recognit. 2003, 36, 451–461.
- Silveira, G.; Malis, E.; Rives, P. An Efficient Direct Approach to Visual SLAM. IEEE Trans. Robot. 2008, 24, 969–979.
- Chow, C.; Liu, C. Approximating discrete probability distributions with dependence trees. IEEE Trans. Inf. Theory 1968, 14, 462–467.
- Bay, H.; Tuytelaars, T.; Van Gool, L. SURF: Speeded up robust features. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Elsevier: Amsterdam, The Netherlands, 2006; Volume 3951, pp. 404–417.
- Cummins, M.; Newman, P. Appearance-only SLAM at large scale with FAB-MAP 2.0. Int. J. Robot. Res. 2010, 30, 1100–1123.
- Piniés, P.; Paz, L.M.; Gálvez-López, D.; Tardós, J.D. CI-Graph simultaneous localization and mapping for three-dimensional reconstruction of large and complex environments using a multicamera system. J. Field Robot. 2010, 27, 561–586.
- Cadena, C.; Gálvez-López, D.; Ramos, F.; Tardós, J.D.; Neira, J. Robust place recognition with stereo cameras. In IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010—Conference Proceedings; IEEE: Piscataway, NJ, USA, 2010; pp. 5182–5189.
- Angeli, A.; Filliat, D.; Doncieux, S.; Meyer, J.A. Fast and incremental method for loop-closure detection using bags of visual words. IEEE Trans. Robot. 2008, 24, 1027–1037.
- Paul, R.; Newman, P. FAB-MAP 3D: Topological mapping with spatial and visual appearance. In Proceedings—IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2010; pp. 2649–2656.
- Calonder, M.; Lepetit, V.; Strecha, C.; Fua, P. BRIEF: Binary robust independent elementary features. In European Conference on Computer Vision; Elsevier: Amsterdam, The Netherlands, 2010; Volume 6314, pp. 778–792.
- Leutenegger, S.; Chli, M.; Siegwart, R.Y. BRISK: Binary Robust invariant scalable keypoints. In Proceedings of the IEEE International Conference on Computer Vision; Barcelona, Spain, 6–13 November 2011, IEEE: Piscataway, NJ, USA, 2011; pp. 2548–2555.
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision; IEEE: Piscataway, NJ, USA, 2011; pp. 2564–2571.
- Galvez-Lopez, D.; Tardos, J.D. Real-time loop detection with bags of binary words. In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems; IEEE: Piscataway, NJ, USA, 2011; pp. 51–58.
- Gao, X.; Zhang, T. Loop closure detection for visual SLAM systems using deep neural networks. In Proceedings of the 2015 34th Chinese Control Conference (CCC); IEEE: Piscataway, NJ, USA, 2015; pp. 5851–5856.
- Khan, S.; Wollherr, D. IBuILD: Incremental bag of Binary words for appearance based loop closure detection. In Proceedings—IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2015; pp. 5441–5447.
- Kejriwal, N.; Kumar, S.; Shibata, T. High performance loop closure detection using bag of word pairs. Robot. Auton. Syst. 2016, 77, 55–65.
- Tan, W.; Liu, H.; Dong, Z.; Zhang, G.; Bao, H. Robust monocular SLAM in dynamic environments. In Proceedings of the 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR); IEEE: Piscataway, NJ, USA, 2013; pp. 209–218.
- Johannsson, H.; Kaess, M.; Fallon, M.; Leonard, J.J. Temporally scalable visual SLAM using a reduced pose graph. In Proceedings—IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2013; pp. 54–61.
- Xu, H.; Zhang, H.-X.; Yao, E.-L.; Song, H.-T. A Loop Closure Detection Algorithm in Dynamic Scene. DEStech Trans. Comput. Sci. Eng. 2018.
- Li, H.; Nashashibi, F. Multi-vehicle cooperative localization using indirect vehicle-to-vehicle relative pose estimation. In 2012 IEEE International Conference on Vehicular Electronics and Safety, ICVES 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 267–272.
- Torr, P.H.S.; Zisserman, A. MLESAC: A new robust estimator with application to estimating image geometry. Comput. Vis. Image Underst. 2000, 78, 138–156.
- Kneip, L.; Scaramuzza, D.; Siegwart, R. A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2011; pp. 2969–2976.
- Williams, B.; Klein, G.; Reid, I. Automatic Relocalization and Loop Closing for Real-Time Monocular SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1699–1712.
- Lepetit, V.; Fua, P. Keypoint recognition using randomized trees. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1465–1479.
- Gao, X.; Wang, R.; Demmel, N.; Cremers, D. LDSO: Direct Sparse Odometry with Loop Closure. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems; Madrid, Spain, 1–5 October 2018, pp. 2198–2204.
- Engel, J.; Koltun, V.; Cremers, D. Direct Sparse Odometry. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 611–625.
- Zhou, H.; Zhang, T.; Jagadeesan, J. Re-weighting and 1-Point RANSAC-Based PnP Solution to Handle Outliers. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 3022–3033.
- Mur-Artal, R.; Tardós, J.D. Fast relocalisation and loop closing in keyframe-based SLAM. In Proceedings—IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2014; pp. 846–853.
- Rohling, T.; Mack, J.; Schulz, D. A fast histogram-based similarity measure for detecting loop closures in 3-D LIDAR data. In IEEE International Conference on Intelligent Robots and Systems; IEEE: Piscataway, NJ, USA, 2015; Volume 2015, pp. 736–741.
- Granstrom, K.; Schön, T.B.; Nieto, J.I.; Ramos, F.T. Learning to close loops from range data. Int. J. Robot. Res. 2011, 30, 1728–1754.
- Zhou, Q.-Y.; Park, J.; Koltun, V. Fast Global Registration. In Computer Vision–ECCV 2016. Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2016; Volume 9906, pp. 766–782.
- Muhammad, N.; Lacroix, S. Loop closure detection using small-sized signatures from 3D LIDAR data. In 9th IEEE International Symposium on Safety, Security, and Rescue Robotics, SSRR 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 333–338.
- Bosse, M.; Zlot, R. Place recognition using keypoint voting in large 3D lidar datasets. In Proceedings—IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2013; pp. 2677–2684.
- Schmiedel, T.; Einhorn, E.; Gross, H.M. IRON: A fast interest point descriptor for robust NDT-map matching and its application to robot localization. In IEEE International Conference on Intelligent Robots and Systems; IEEE: Piscataway, NJ, USA, 2015; Volume 2015, pp. 3144–3151.
- Rusu, R.B.; Blodow, N.; Beetz, M. Fast Point Feature Histograms (FPFH) for 3D registration. In 2009 IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2009; pp. 3212–3217.
- Biber, P. The Normal Distributions Transform: A New Approach to Laser Scan Matching. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems; Las Vegas, NV, USA, 27–31 October 2003.
- Magnusson, M.; Lilienthal, A.J.; Duckett, T. Scan registration for autonomous mining vehicles using 3D-NDT. J. Field Robot. 2007, 24, 803–827.
- Magnusson, M.; Andreasson, H.; Nüchter, A.; Lilienthal, A.J. Automatic appearance-based loop detection from three-dimensional laser data using the normal distributions transform. J. Field Robot. 2009, 26, 892–914.
- Weinberger, K.Q.; Blitzer, J.; Saul, L.K. Distance Metric Learning for Large Margin Nearest Neighbor Classification. Adv. Neural Inf. Process. Syst. 2005, 18, 1473–1480.
- Magnusson, M.; Andreasson, H.; Nuchter, A.; Lilienthal, A.J. Appearance-based loop detection from 3D laser data using the normal distributions transform. In Proceedings of the 2009 IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2009; pp. 23–28.
- Lin, J.; Zhang, F. A fast, complete, point cloud based loop closure for LiDAR odometry and mapping. arXiv 2019, arXiv:1909.11811.
- Bosse, M.; Zlot, R. Keypoint design and evaluation for place recognition in 2D lidar maps. Robot. Auton. Syst. 2009, 57, 1211–1224.
- Walthelm, A. Enhancing global pose estimation with laser range. In International Conference on Intelligent Autonomous Systems; Elsevier: Amsterdam, The Netherlands, 2004.
- Himstedt, M.; Frost, J.; Hellbach, S.; Bohme, H.-J.; Maehle, E. Large scale place recognition in 2D LIDAR scans using Geometrical Landmark Relations. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems; IEEE: Piscataway, NJ, USA, 2014; pp. 5030–5035.
- Wohlkinger, W.; Vincze, M. Ensemble of shape functions for 3D object classification. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics; IEEE: Piscataway, NJ, USA, 2011; pp. 2987–2992.
- Fernández-Moral, E.; Rives, P.; Arévalo, V.; González-Jiménez, J. Scene structure registration for localization and mapping. Robot. Auton. Syst. 2016, 75, 649–660.
- Fernández-Moral, E.; Mayol-Cuevas, W.; Arevalo, V.; Gonzalez-Jimenez, J. Fast place recognition with plane-based maps. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2013; pp. 2719–2724.
- Nieto, J.; Bailey, T.; Nebot, E. Scan-SLAM: Combining EKF-SLAM and scan correlation. In Springer Tracts in Advanced Robotics; Springer: Berlin, Germany, 2006; Volume 25, pp. 167–178.
- Douillard, B.; Underwood, J.; Vlaskine, V.; Quadros, A.; Singh, S. A pipeline for the segmentation and classification of 3D point clouds. In Springer Tracts in Advanced Robotics; Springer: Berlin, Germany, 2014; Volume 79, pp. 585–600.
- Douillard, B.; Quadros, A.; Morton, P.; Underwood, J.; De Deuge, M.; Hugosson, S.; Hallstrom, M.; Bailey, T. Scan segments matching for pairwise 3D alignment. In Proceedings of the 2012 IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2012; pp. 3033–3040.
- Ye, Q.; Shi, P.; Xu, K.; Gui, P.; Zhang, S. A Novel Loop Closure Detection Approach Using Simplified Structure for Low-Cost LiDAR. Sensors 2020, 20, 2299.
- Douillard, B.; Underwood, J.; Kuntz, N.; Vlaskine, V.; Quadros, A.; Morton, P.; Frenkel, A. On the segmentation of 3D LIDAR point clouds. In Proceedings of the IEEE International Conference on Robotics and Automation; Shanghai, China, 9–13 May 2011.
- Dube, R.; Dugas, D.; Stumm, E.; Nieto, J.; Siegwart, R.; Cadena, C. SegMatch: Segment based place recognition in 3D point clouds. In Proceedings—IEEE International Conference on Robotics and Automation; IEEE: Piscataway, NJ, USA, 2017; pp. 5266–5272.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Fan, Y.; He, Y.; Tan, U.-X. Seed: A Segmentation-Based Egocentric 3D Point Cloud Descriptor for Loop Closure Detection. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2020; pp. 25–29.
- Liu, X.; Zhang, L.; Qin, S.; Tian, D.; Ouyang, S.; Chen, C. Optimized LOAM Using Ground Plane Constraints and SegMatch-Based Loop Detection. Sensors 2019, 19, 5419.
- Schnabel, R.; Wahl, R.; Klein, R. Efficient RANSAC for Point-Cloud Shape Detection. Comput. Graph. Forum 2007, 26, 214–226.
- Tomono, M. Loop detection for 3D LiDAR SLAM using segment-group matching. Adv. Robot. 2020, 34, 1530–1544.
- Gao, X.; Zhang, T. Unsupervised learning to detect loops using deep neural networks for visual SLAM system. Auton. Robots 2017, 41, 1–18.
- Chen, B.; Yuan, D.; Liu, C.; Wu, Q. Loop Closure Detection Based on Multi-Scale Deep Feature Fusion. Appl. Sci. 2019, 9, 1120.
- Cascianelli, S.; Costante, G.; Bellocchio, E.; Valigi, P.; Fravolini, M.L.; Ciarfuglia, T.A. Robust visual semi-semantic loop closure detection by a covisibility graph and CNN features. Robot. Auton. Syst. 2017, 92, 53–65.
- Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA, 21–26 July 2017, pp. 6517–6525.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F.F. Imagenet: A Large-Scale Hierarchical Image Database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition; Miami, FL, USA, 20–25 June 2009, pp. 248–255.
- He, Y.; Chen, J.; Zeng, B. A fast loop closure detection method based on lightweight convolutional neural network. Comput. Eng. 2018, 44, 182–187.
- Hu, M.; Li, S.; Wu, J.; Guo, J.; Li, H.; Kang, X. Loop closure detection for visual SLAM fusing semantic information. In Chinese Control Conference, CCC; IEEE: Piscataway, NJ, USA, 2019; Volume 2019, pp. 4136–4141.
- Wang, Y.; Zell, A. Improving Feature-based Visual SLAM by Semantics. In Proceedings of the 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS); Sophia Antipolis, France, 12–14 December 2018, IEEE: Piscataway, NJ, USA, 2018; pp. 7–12.
- Liao, Y.; Wang, Y.; Liu, Y. Graph Regularized Auto-Encoders for Image Representation. IEEE Trans. Image Process. 2017, 26, 2839–2852.
- Merrill, N.; Huang, G. Lightweight Unsupervised Deep Loop Closure. In Robotics: Science and Systems; Springer: Berlin, Germany, 2018.
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR; San Diego, CA, USA, 20–25 June 2005.
- Zhou, B.; Lapedriza, A.; Khosla, A.; Oliva, A.; Torralba, A. Places: A 10 Million Image Database for Scene Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1452–1464.
- Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.-A. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. J. Mach. Learn. Res. 2010, 11, 3371–3408.
- Zaganidis, A.; Zerntev, A.; Duckett, T.; Cielniak, G. Semantically Assisted Loop Closure in SLAM Using NDT Histograms. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2019; pp. 4562–4568.
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. Adv. Neural Inform. Process. Syst. 2017, 30, 5099–5108.
- Yang, Y.; Song, S.; Toth, C. CNN-Based Place Recognition Technique for Lidar Slam. ISPRS—Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2020.
- GitHub—Mikacuy/Pointnetvlad: PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition, CVPR 2018. Available online: https://github.com/mikacuy/pointnetvlad (accessed on 27 November 2020).
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA, 21–26 July 2017, IEEE: Piscataway, NJ, USA, 2017; pp. 652–660.
- Arandjelović, R.; Gronat, P.; Torii, A.; Pajdla, T.; Sivic, J. NetVLAD: CNN architecture for weakly supervised place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Las Vegas, NV, USA, 27–30 June 2016, IEEE: Piscataway, NJ, USA, 2016; pp. 5297–5307.
- Yin, H.; Tang, L.; Ding, X.; Wang, Y.; Xiong, R. LocNet: Global Localization in 3D Point Clouds for Mobile Vehicles. In IEEE Intelligent Vehicles Symposium, Proceedings; IEEE: Piscataway, NJ, USA, 2018; pp. 728–733.
- Dubé, R.; Cramariuc, A.; Dugas, D.; Nieto, J.; Siegwart, R.; Cadena, C. SegMap: 3D Segment Mapping using Data-Driven Descriptors. In Robotics: Science and Systems XIV; MIT Press: Cambridge, MA, USA, 2018.
- Dubé, R.; Cramariuc, A.; Dugas, D.; Sommer, H.; Dymczyk, M.; Nieto, J.; Siegwart, R.; Cadena, C. SegMap: Segment-based mapping and localization using data-driven descriptors. Int. J. Robot. Res. 2019, 39, 339–355.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Piscataway, NJ, USA, 2016; pp. 779–788.
- Xia, Y.; Li, J.; Qi, L.; Fan, H. Loop closure detection for visual SLAM using PCANet features. In Proceedings of the International Joint Conference on Neural Networks; IEEE: Piscataway, NJ, USA, 2016; pp. 2274–2281.
- Chan, T.-H.; Jia, K.; Gao, S.; Lu, J.; Zeng, Z.; Ma, Y. PCANet: A Simple Deep Learning Baseline for Image Classification? IEEE Trans. Image Process. 2015, 24, 5017–5032.
- Chen, X.; Labe, T.; Milioto, A.; Rohling, T.; Vysotska, O.; Haag, A.; Behley, J.; Stachniss, C. OverlapNet: Loop Closing for LiDAR-based SLAM. In Proceedings of the Robotics: Science and Systems (RSS); Online Proceedings, 14–16 July 2020.
- Milioto, A.; Vizzo, I.; Behley, J.; Stachniss, C. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2019; pp. 4213–4220.
- Wang, S.; Lv, X.; Liu, X.; Ye, D. Compressed Holistic ConvNet Representations for Detecting Loop Closures in Dynamic Environments. IEEE Access 2020, 8, 60552–60574.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Las Vegas, NV, USA, 27–30 June 2016, IEEE: Piscataway, NJ, USA, 2016; pp. 770–778.
- Żywanowski, K.; Banaszczyk, A.; Nowicki, M. Comparison of camera-based and 3D LiDAR-based loop closures across weather conditions. arXiv 2020, arXiv:2009.03705.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015; San Diego, CA, USA, 7–9 May 2015.
- Olid, D.; Fácil, J.M.; Civera, J. Single-View Place Recognition under Seasonal Changes. arXiv 2018, arXiv:1808.06516.
- Facil, J.M.; Olid, D.; Montesano, L.; Civera, J. Condition-Invariant Multi-View Place Recognition. arXiv 2019, arXiv:1902.09516.
- Liu, Y.; Xiang, R.; Zhang, Q.; Ren, Z.; Cheng, J. Loop closure detection based on improved hybrid deep learning architecture. In Proceedings—2019 IEEE International Conferences on Ubiquitous Computing and Communications and Data Science and Computational Intelligence and Smart Computing, Networking and Services, IUCC/DSCI/SmartCNS 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 312–317.
- Sunderhauf, N.; Shirazi, S.; Dayoub, F.; Upcroft, B.; Milford, M. On the performance of ConvNet features for place recognition. In IEEE International Conference on Intelligent Robots and Systems; IEEE: Piscataway, NJ, USA, 2015; pp. 4297–4304.
- Maddern, W.; Pascoe, G.; Linegar, C.; Newman, P. 1 year, 1000 km: The Oxford RobotCar dataset. Int. J. Robot. Res. 2016, 36, 3–15.
- Nordlandsbanen: Minute by Minute, Season by Season. Available online: https://nrkbeta.no/2013/01/15/nordlandsbanen-minute-by-minute-season-by-season/ (accessed on 5 December 2020).
- Suenderhauf, N.; Shirazi, S.; Jacobson, A.; Dayoub, F.; Pepperell, E.; Upcroft, B.; Milford, M. Place recognition with ConvNet landmarks: Viewpoint-robust, condition-robust, training-free. In Proceedings of the Robotics: Science and Systems Conference XI; Rome, Italy, 13–15 July 2015.
- Hou, Y.; Zhang, H.; Zhou, S. Convolutional neural network-based image representation for visual loop closure detection. In 2015 IEEE International Conference on Information and Automation, ICIA 2015—In Conjunction with 2015 IEEE International Conference on Automation and Logistics; IEEE: Piscataway, NJ, USA, 2015; pp. 2238–2245.
- Computer Vision Group—Dataset Download. Available online: https://vision.in.tum.de/data/datasets/rgbd-dataset/download (accessed on 2 December 2020).
- Peng, X.; Wang, L.; Wang, X.; Qiao, Y. Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice. Comput. Vis. Image Underst. 2016, 150, 109–125.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation Inc.: San Diego, CA, USA, 2015; pp. 91–99.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Computer Vision–ECCV 2014. ECCV 2014. Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; pp. 740–755.
- Stückler, J.; Behnke, S. Multi-resolution surfel maps for efficient dense 3D modeling and tracking. J. Vis. Commun. Image Represent. 2014, 25, 137–147.
- Endres, F.; Hess, J.; Sturm, J.; Cremers, D.; Burgard, W. 3-D Mapping With an RGB-D Camera. IEEE Trans. Robot. 2014, 30, 177–187.
- Mur-Artal, R.; Tardos, J.D. ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras. IEEE Trans. Robot. 2017, 33, 1255–1262.
- Debeunne, C.; Vivet, D. A Review of Visual-LiDAR Fusion based Simultaneous Localization and Mapping. Sensors 2020, 20, 2068.
Method | | | Benefits | Limitations |
---|---|---|---|---|
Vision-based | Image-to-Image | Offline Vocabulary | | |
Vision-based | Image-to-Image | Online Vocabulary | | |
Vision-based | Map-to-Map | | | |
Vision-based | Image-to-Map | | | |
Lidar-based | Histograms | | | |
Lidar-based | Segmentation | | | |
Ref. | Year | Sensor | Components | Deep Learning Algorithm | Env | Challenge: Weather | Challenge: Seasons | Challenge: Light | Challenge: Viewpoint | Challenge: Efficiency | Challenge: Dynamic Env | Semantics |
---|---|---|---|---|---|---|---|---|---|---|---|---|
[89] | 2019 | C | CNN feature | AlexNet | | - | - | + | - | - | - | - |
[95] | 2019 | C | SIFT, SURF, ORB | Faster R-CNN | | - | - | - | + | - | + | + |
[96] | 2018 | C | ORB | Yolo [111] | | - | - | + | + | + | + | + |
[98] | 2018 | C | HoG | Autoencoder | | + | + | + | + | - | + | - |
[102] | 2019 | L | Semantic-NDT | PointNet++ [103] | | - | - | + | + | + | + | + |
[108] | 2018 | L | Semi-handcrafted | Siamese | | - | - | + | + | + | + | - |
[109] | 2018 | L | SegMap | CNN | | - | - | - | + | - | - | - |
[110] | 2020 | L | SegMap | CNN | | - | - | - | + | + | - | |
[112] | 2016 | C | SIFT, SURF, ORB | PCANet [113] | | - | - | + | + | - | - | - |
[114] | 2020 | L | Semantic class | RangeNet++ [115] | | - | - | - | + | - | + | + |
[116] | 2020 | C | CNN feature | ResNet18 [117] | | + | + | + | + | + | + | - |
[118] | 2020 | C/L | CNN feature | VGG16 [119] | | + | - | + | - | - | + | - |
[120] | 2018 | C | CNN feature | VGG16 | | + | + | + | + | - | - | - |
[121] | 2019 | C | CNN Multiview descriptor | ResNet-50 [117] | | + | + | + | + | - | + | - |
[122] | 2019 | C | Semantic feature | Hybrid [123] | | + | + | + | + | - | - | + |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Arshad, S.; Kim, G.-W. Role of Deep Learning in Loop Closure Detection for Visual and Lidar SLAM: A Survey. Sensors 2021, 21, 1243. https://doi.org/10.3390/s21041243