SVG-Loop: Semantic–Visual–Geometric Information-Based Loop Closure Detection
Abstract
1. Introduction
- (1) A semantic bag-of-words model is constructed to reduce the interference caused by dynamic objects and to improve the accuracy of image matching (a rough sketch of the idea follows this list).
- (2) A semantic landmark vector is designed that expresses both the semantic and geometric information of an image, improving the robustness of loop closure detection.
- (3) A semantic–visual–geometric information-based loop closure detection algorithm, SVG-Loop, is proposed to improve robustness in complex environments.
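As a rough illustration of contribution (1): per-pixel class labels from panoptic segmentation can be used to discard local features that fall on dynamic objects before the remaining descriptors are quantized into visual words. The sketch below is a minimal example under assumed inputs; the `class_mask` format, the `id_to_name` mapping, and the list of dynamic classes are illustrative assumptions, not SVG-Loop's actual configuration:

```python
import cv2
import numpy as np

# Classes treated as dynamic; the exact set used by SVG-Loop is not
# reproduced here, so this list is an assumption for illustration.
DYNAMIC_CLASSES = {"person", "car", "bus", "bicycle", "motorcycle"}

def extract_static_features(image, class_mask, id_to_name, n_features=1000):
    """Detect ORB features and keep only those on non-dynamic pixels.

    image      : grayscale or BGR image as a NumPy array.
    class_mask : (H, W) integer array of per-pixel class ids, e.g. the
                 semantic output of a panoptic segmentation network.
    id_to_name : dict mapping a class id to its class name.
    """
    orb = cv2.ORB_create(nfeatures=n_features)
    keypoints, descriptors = orb.detectAndCompute(image, None)
    if descriptors is None:  # no features found
        return [], np.empty((0, 32), dtype=np.uint8), []

    kept_kps, kept_desc, labels = [], [], []
    for kp, desc in zip(keypoints, descriptors):
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        name = id_to_name.get(int(class_mask[y, x]), "unknown")
        if name in DYNAMIC_CLASSES:
            continue  # drop features on potentially moving objects
        kept_kps.append(kp)
        kept_desc.append(desc)
        labels.append(name)  # semantic label paired with the visual word
    return kept_kps, np.asarray(kept_desc, dtype=np.uint8), labels
```

The surviving descriptor–label pairs would then be quantized into a semantic vocabulary (Section 3.1.2), so that only static scene elements contribute words; this is what suppresses the dynamic-object interference mentioned above.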
2. Related Work
2.1. Bag-of-Words Model-Based Loop Closure Detection
2.2. Semantic-Information-Based Loop Closure Detection
3. Methodology
3.1. Semantic Bag-of-Words Model
3.1.1. Semantic–Visual Word Extraction
3.1.2. Vocabulary Construction
3.1.3. Visual Loop Closure Candidate Detection
3.2. Semantic Landmark Vector Model
3.2.1. Semantic Descriptor Generation
3.2.2. Semantic Loop Closure Candidate Detection
3.3. Fusion Calculation
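Judging from the structure above, the fusion step combines the visual loop closure candidates of Section 3.1.3 with the semantic candidates of Section 3.2.2 into a single decision. Below is a minimal sketch of one plausible fusion rule; the weighted sum, the weight `alpha`, and the acceptance threshold are illustrative assumptions, not the paper's actual calculation:

```python
def fuse_scores(visual_scores, semantic_scores, alpha=0.5, threshold=0.3):
    """Fuse per-candidate visual and semantic similarities (both in [0, 1]).

    visual_scores / semantic_scores: dicts mapping a candidate frame id to
    its similarity with the query frame. The weighted sum is an assumed
    fusion rule for illustration; SVG-Loop's exact formula may differ.
    """
    shared = visual_scores.keys() & semantic_scores.keys()
    fused = {fid: alpha * visual_scores[fid] + (1.0 - alpha) * semantic_scores[fid]
             for fid in shared}
    if not fused:
        return None  # no candidate proposed by both branches
    best = max(fused, key=fused.get)
    return best if fused[best] >= threshold else None
```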
4. Experimental Results and Analysis
4.1. Dataset Experiments
4.1.1. Indoor Dataset
4.1.2. Outdoor Dataset
4.2. Practical Environmental Experiments
4.2.1. Indoor Experiments
4.2.2. Outdoor Experiments
5. Discussion
5.1. Analysis of Experiments
- ➢ The indoor dataset (TUM RGB-D) consists of images captured under stable conditions: the selected sequences contain no illumination changes and only a few dynamic objects. The results in Figures 6 and 7 show that the SVG-Loop model is sensitive to loop closures, and Figures 8 and 9 indicate that SVG-Loop, combined with a SLAM system, achieves higher localization accuracy in environments where loop closures exist.
- ➢ The outdoor dataset (KITTI odometry) contains various dynamic objects but no dramatic illumination changes. The experiments in this part test the robustness of SVG-Loop to dynamic interference in an outdoor environment. According to Figure 11, the SVG-Loop algorithm overcomes part of this dynamic interference and completes loop closure detection.
- ➢ The practical indoor experiments include illumination changes but no dynamic objects. SVG-Loop is robust to illumination changes of different intensities and angles (Figure 13). Table 2 shows that, compared with other vision-based methods, SVG-Loop is sensitive to loops and captures them quickly and effectively. However, when illumination changes coincide with a lack of semantic landmarks, the recall of SVG-Loop declines sharply (precision and recall here follow the definitions sketched after this list).
- ➢ The practical outdoor dataset represents the most complex scenario of the four experimental settings: it combines drastic illumination changes, varying weather, and frequently moving dynamic objects. According to Table 3, SVG-Loop is robust to outdoor illumination changes, weather changes, and the movement of dynamic objects. Figures 15 and 16 illustrate that the SVG-Loop model has the potential to detect loop closures for a SLAM system in complex environments.
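Precision (P) and recall (R) in Tables 2 and 3 follow the usual loop closure detection convention: precision is the fraction of reported loops that are correct, and recall is the fraction of ground-truth loops that are detected, so 100% precision means no false positives were accepted. A minimal sketch of this computation, assuming loops are represented as (query frame, match frame) pairs and matches must be exact (a real evaluation might allow a small frame tolerance):

```python
def precision_recall(detected, ground_truth):
    """Loop closure precision/recall over sets of (query, match) frame pairs."""
    true_positives = len(detected & ground_truth)
    precision = 100.0 * true_positives / len(detected) if detected else 0.0
    recall = 100.0 * true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical example: 3 of the 4 reported loops are correct, out of 6 true loops.
detected = {(120, 10), (121, 11), (122, 12), (300, 50)}
ground_truth = {(120, 10), (121, 11), (122, 12), (123, 13), (124, 14), (125, 15)}
print(precision_recall(detected, ground_truth))  # (75.0, 50.0)
```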
5.2. Experiment Implementation and Optimization Possibilities
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Smith, R.C.; Cheeseman, P. On the Representation and Estimation of Spatial Uncertainty. Int. J. Robot. Res. 1986, 5, 56–68.
- Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.; Leonard, J.J. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans. Robot. 2016, 32, 1309–1332.
- Ho, K.L.; Newman, P. Detecting loop closure with scene sequences. Int. J. Comput. Vis. 2007, 74, 261–286.
- Williams, B.; Klein, G.; Reid, I. Automatic relocalization and loop closing for real-time monocular SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1699–1712.
- Geavlete, B.; Stanescu, F.; Moldoveanu, C.; Jecu, M.; Adou, L.; Ene, C.; Bulai, C.; Geavlete, P. 227 The test of time for new advances in BPH endoscopic treatment—Prospective, randomized comparisons of bipolar plasma enucleation versus open prostatectomy and continuous versus standard plasma vaporization and monopolar TURP. Eur. Urol. Suppl. 2014, 13, e227.
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
- Cummins, M.; Newman, P. FAB-MAP: Probabilistic localization and mapping in the space of appearance. Int. J. Robot. Res. 2008, 27, 647–665.
- Angeli, A.; Filliat, D.; Doncieux, S.; Meyer, J.A. Fast and incremental method for loop-closure detection using bags of visual words. IEEE Trans. Robot. 2008, 24, 1027–1037.
- Gálvez-López, D.; Tardós, J.D. Bags of binary words for fast place recognition in image sequences. IEEE Trans. Robot. 2012, 28, 1188–1197.
- Mur-Artal, R.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Trans. Robot. 2015, 31, 1147–1163.
- Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras. IEEE Trans. Robot. 2017, 33, 1255–1262.
- Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM. IEEE Trans. Robot. 2020, 1–15.
- Sivic, J.; Zisserman, A. Video Google: A text retrieval approach to object matching in videos. In Proceedings of the IEEE International Conference on Computer Vision, Nice, France, 13–16 October 2003; Volume 2, pp. 1470–1477.
- Milford, M.J.; Wyeth, G.F. SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, MN, USA, 14–18 May 2012; pp. 1643–1649.
- Siam, S.M.; Zhang, H. Fast-SeqSLAM: A fast appearance based place recognition algorithm. In Proceedings of the IEEE International Conference on Robotics and Automation, Marina Bay Sands, Singapore, 29 May–3 June 2017; pp. 5702–5708.
- Tsintotas, K.A.; Bampis, L.; Gasteratos, A. DOSeqSLAM: Dynamic on-line sequence based loop closure detection algorithm for SLAM. In Proceedings of the IEEE International Conference on Imaging Systems and Techniques, Kraków, Poland, 16–18 October 2018; pp. 1–6.
- Tsintotas, K.A.; Bampis, L.; Gasteratos, A. Probabilistic appearance-based place recognition through bag of tracked words. IEEE Robot. Autom. Lett. 2019, 4, 1737–1744.
- Neuland, R.; Rodrigues, F.; Pittol, D.; Jaulin, L.; Maffei, R.; Kolberg, M.; Prestes, E. Interval Inspired Approach Based on Temporal Sequence Constraints to Place Recognition. J. Intell. Robot. Syst. Theory Appl. 2021, 102, 1–24.
- Chen, Z.; Maffra, F.; Sa, I.; Chli, M. Only look once, mining distinctive landmarks from ConvNet for visual place recognition. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada, 24–28 September 2017; pp. 9–16.
- Wang, Y.; Qiu, Y.; Cheng, P.; Duan, X. Robust loop closure detection integrating visual–spatial–semantic information via topological graphs and CNN features. Remote Sens. 2020, 12, 3890.
- Finman, R.; Paull, L.; Leonard, J.J. Toward Object-based Place Recognition in Dense RGB-D Maps. In Proceedings of the IEEE International Conference on Robotics and Automation Workshop, Seattle, WA, USA, 26–30 May 2015.
- Stumm, E.; Mei, C.; Lacroix, S.; Chli, M. Location graphs for visual place recognition. In Proceedings of the IEEE International Conference on Robotics and Automation, Seattle, WA, USA, 26–30 May 2015; pp. 5475–5480.
- Gawel, A.; Del Don, C.; Siegwart, R.; Nieto, J.; Cadena, C. X-View: Graph-Based Semantic Multi-view Localization. IEEE Robot. Autom. Lett. 2018, 3, 1687–1694.
- Cascianelli, S.; Costante, G.; Bellocchio, E.; Valigi, P.; Fravolini, M.L.; Ciarfuglia, T.A. Robust visual semi-semantic loop closure detection by a covisibility graph and CNN features. Robot. Auton. Syst. 2017, 92, 53–65.
- Arandjelovic, R.; Gronat, P.; Torii, A.; Pajdla, T.; Sivic, J. NetVLAD: CNN Architecture for Weakly Supervised Place Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1437–1451.
- Hausler, S.; Garg, S.; Xu, M.; Milford, M.; Fischer, T. Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021.
- Sturm, J.; Engelhard, N.; Endres, F.; Burgard, W.; Cremers, D. A benchmark for the evaluation of RGB-D SLAM systems. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 7–12 October 2012; pp. 573–580.
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The KITTI dataset. Int. J. Robot. Res. 2013, 32, 1231–1237.
- Lowry, S.; Sunderhauf, N.; Newman, P.; Leonard, J.J.; Cox, D.; Corke, P.; Milford, M.J. Visual Place Recognition: A Survey. IEEE Trans. Robot. 2016, 32, 1–19.
- Masone, C.; Caputo, B. A Survey on Deep Visual Place Recognition. IEEE Access 2021, 9, 19516–19547.
- Chen, Y.; Gan, W.; Zhang, L.; Liu, C.; Wang, X. A survey on visual place recognition for mobile robots localization. In Proceedings of the 2017 14th Web Information Systems and Applications Conference, Liuzhou, China, 11–12 November 2017; pp. 187–192.
- Glover, A.; Maddern, W.; Warren, M.; Reid, S.; Milford, M.; Wyeth, G. OpenFABMAP: An open source toolbox for appearance-based loop closure detection. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, MN, USA, 14–18 May 2012; pp. 4730–4735.
- Mur-Artal, R.; Tardós, J.D. Fast relocalisation and loop closing in keyframe-based SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Hong Kong, China, 31 May–7 June 2014; pp. 846–853.
- Khan, S.; Wollherr, D. IBuILD: Incremental bag of Binary words for appearance based loop closure detection. In Proceedings of the IEEE International Conference on Robotics and Automation, Seattle, WA, USA, 26–30 May 2015; pp. 5441–5447.
- Tsintotas, K.A.; Bampis, L.; Gasteratos, A. Assigning visual words to places for loop closure detection. In Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, Australia, 21–25 May 2018; pp. 5979–5985.
- Bampis, L.; Amanatiadis, A.; Gasteratos, A. Fast loop-closure detection using visual-word-vectors from image sequences. Int. J. Robot. Res. 2018, 37, 62–82.
- Tsintotas, K.A.; Bampis, L.; Rallis, S.; Gasteratos, A. SeqSLAM with bag of visual words for appearance based loop closure detection. Mech. Mach. Sci. 2019, 67, 580–587.
- Tsintotas, K.A.; Bampis, L.; Gasteratos, A. Modest-vocabulary loop-closure detection with incremental bag of tracked words. Robot. Auton. Syst. 2021, 141, 103782.
- Yu, C.; Liu, Z.; Liu, X.-J.; Xie, F.; Yang, Y.; Wei, Q.; Fei, Q. DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, Madrid, Spain, 1–5 October 2018; pp. 1168–1174.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Zhang, Z.; Zhang, J.; Tang, Q. Mask R-CNN Based Semantic RGB-D SLAM for Dynamic Scenes. In Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM, Hong Kong, China, 8–12 July 2019; pp. 1151–1156.
- Kirillov, A.; Girshick, R.; He, K.; Dollár, P. Panoptic Feature Pyramid Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 6392–6401.
- Merrill, N.; Huang, G. CALC2.0: Combining Appearance, Semantic and Geometric Information for Robust and Efficient Visual Loop Closure. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Macao, China, 4–8 November 2019; pp. 4554–4561.
- Wang, Z.; Peng, Z.; Guan, Y.; Wu, L. Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder. J. Intell. Robot. Syst. Theory Appl. 2021, 101, 29.
- Wang, H.; Wang, C.; Xie, L. Online visual place recognition via saliency re-identification. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Macao, China, 4–8 November 2019; pp. 5030–5036.
- Oh, J.H.; Jeon, J.D.; Lee, B.H. Place recognition for visual loop-closures using similarities of object graphs. Electron. Lett. 2015, 51, 44–46.
- Chen, H.; Zhang, G.; Ye, Y. Semantic Loop Closure Detection with Instance-Level Inconsistency Removal in Dynamic Industrial Scenes. IEEE Trans. Ind. Inform. 2021, 17, 2030–2040.
- Kirillov, A.; He, K.; Girshick, R.; Rother, C.; Dollar, P. Panoptic segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–18 June 2020; pp. 9396–9405.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
- Muja, M.; Lowe, D.G. Fast approximate nearest neighbors with automatic algorithm configuration. In Proceedings of the VISAPP 4th International Conference on Computer Vision Theory and Applications, Lisboa, Portugal, 5–8 February 2009; Volume 1, pp. 331–340.
- Cadena, C.; Gálvez-López, D.; Tardós, J.D.; Neira, J. Robust place recognition with stereo sequences. IEEE Trans. Robot. 2012, 28, 871–885.
- Paszke, A.; Gross, S.; Chintala, S.; Chanan, G.; Yang, E.; DeVito, Z.; Lin, Z.; Desmaison, A.; Antiga, L.; Lerer, A. Automatic differentiation in PyTorch. In Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 5, pp. 1–4.
- Gao, X.; Wang, R.; Demmel, N.; Cremers, D. LDSO: Direct Sparse Odometry with Loop Closure. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Madrid, Spain, 1–5 October 2018; pp. 2198–2204.
KITTI Sequence | DBoW2 | OpenFABMAP | SRLCD | BoTW-LCD | SVG-Loop |
---|---|---|---|---|---|
00 | 58.83 | 30.04 | 68.32 | 63.23 | 73.51 |
02 | 53.23 | 24.43 | 63.38 | 72.62 | 78.07 |
05 | 44.46 | 39.23 | 43.12 | 42.89 | 47.87 |
06 | 47.71 | 35.33 | 33.26 | 52.85 | 58.11 |
07 | 56.35 | 30.96 | 26.20 | 58.49 | 50.46 |
09 | 57.89 | 41.87 | 20.00 | 74.58 | 46.12 |
Dataset | DBoW2 P (%) | DBoW2 R (%) | OpenFABMAP P (%) | OpenFABMAP R (%) | SRLCD P (%) | SRLCD R (%) | BoTW-LCD P (%) | BoTW-LCD R (%) | SVG-Loop P (%) | SVG-Loop R (%) |
---|---|---|---|---|---|---|---|---|---|---|
Room 1 | 100.00 | 68.33 | 100.00 | 65.00 | 100.00 | 58.33 | 100.00 | 78.33 | 100.00 | 86.67 |
Room 2 | 100.00 | 56.41 | 100.00 | 53.84 | 100.00 | 21.79 | 100.00 | 60.26 | 100.00 | 71.79 |
Room 3 | 23.56 | 12.28 | 15.34 | 7.02 | 100.00 | 38.60 | 100.00 | 54.38 | 100.00 | 64.91 |
Room 4 | 20.00 | 3.26 | 13.71 | 5.43 | 12.51 | 6.52 | 100.00 | 28.26 | 100.00 | 21.73 |
Dataset | DBoW2 P (%) | DBoW2 R (%) | OpenFABMAP P (%) | OpenFABMAP R (%) | SRLCD P (%) | SRLCD R (%) | BoTW-LCD P (%) | BoTW-LCD R (%) | SVG-Loop P (%) | SVG-Loop R (%) |
---|---|---|---|---|---|---|---|---|---|---|
Loop 1 | 100.00 | 30.36 | 100.00 | 23.00 | 100.00 | 30.06 | 100.00 | 24.23 | 100.00 | 40.79 |
Loop 2 | 100.00 | 42.20 | 100.00 | 38.04 | 100.00 | 45.95 | 100.00 | 49.06 | 100.00 | 63.20 |
Loop 3 | 5.12 | 1.03 | 3.11 | 0.86 | 74.96 | 16.23 | 100.00 | 19.17 | 100.00 | 21.41 |
Average time (ms) of each stage:

Stage | KITTI | TUM | Practical Datasets |
---|---|---|---|
Panoptic segmentation | 231.5 | 187.3 | 251.8 |
Semantic bag of words: feature extraction | 13.2 | 11.4 | 14.3 |
Semantic bag of words: vocabulary generation | 3.6 | 3.5 | 3.4 |
Semantic landmark vector: graph construction | 16.9 | 15.6 | 18.0 |
Semantic landmark vector: vector generation | 1.9 | 1.7 | 1.8 |
Loop closure detection | 36.5 | 33.1 | 38.6 |
Total | 303.6 | 252.6 | 327.9 |
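As a consistency check, each Total equals the sum of its per-stage times; for KITTI, 231.5 + 13.2 + 3.6 + 16.9 + 1.9 + 36.5 = 303.6 ms. Panoptic segmentation dominates the runtime budget (231.5 / 303.6 ≈ 76% of the KITTI total), which suggests the segmentation network is the natural target for the optimization possibilities raised in Section 5.2.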