Text-MCL: Autonomous Mobile Robot Localization in Similar Environment Using Text-Level Semantic Information
Abstract
:1. Introduction
- (1)
- We propose a coarse-to-fine localization paradigm for mobile robots in similar environments, especially in the long corridor of a high-rise building. Text-level semantic information is used to give coarse localization, and laser LiDar data together with the MCL method are used for fine localization;
- (2)
- The camera mounted on the mobile robot does not capture images in real-time nor process them continuously. It depends on the analysis result of the laser data. When the robot is near a door or elevator area, it adjusts the orientation of the camera to capture images with text content in the best perspective. Thus, the computing cost reduces and the detection and recognition accuracy improves;
- (3)
- A moving strategy is proposed to avoid random walking. This is beneficial to construct a high-quality map with text information. Furthermore, it can help the robot find the door or elevator area and get coarse localization as soon as possible;
- (4)
- Initialization of only 100 particles is needed to scatter in the doorway area where the nearest text is detected and recognized. Then, the weights of the particles are assigned according to their positions. The traditional MCL method is used in the subsequent moving process. If a new text is detected, the mobile repeats the previous two steps and can recover from robot kidnapping. Experimental results show that the Text-MCL approach can achieve a 96.2% successful localization rate and can relocate from robot kidnapping after a short movement, while the conventional method always fails.
2. Related Work
2.1. Mapping and Localization
2.2. Text Extraction and Application in Mobile Robotics
3. Methods
3.1. Strategy of the Robot Movement
Algorithm 1. Autonomous Moving Strategy. |
Input: Raw data from laser scans |
Output: Moving decisions |
1: Initialization and extract the invalid scan areas |
2: if Mode is Mapping then |
3: Go to mapping stage |
4: else if Mode is Localization then |
5: Go to Navigation stage |
6: Execute Localization procedure |
7: end if |
8: Mapping stage |
9: if only one invalid scan area and then |
10: Move toward the invalid scan area and constructs map |
11: else if two invalid scan areas and then |
12: Move toward one side randomly |
13: Keep moving until only one invalid area left |
14: Go to step 9 |
15: end if |
16: Navigation stage |
17: if only one invalid scan area and then |
18: Use AMCL method to localize |
19: else if two invalid scan areas and then |
20: Move toward one side randomly |
21: Detect and recognize text information |
22: Go to step 18 |
23: end if |
24: return |
3.2. Text Detection and Recognition
3.3. Map-Building Using Text-Level Information
3.4. Localization Stage
Algorithm 2. Improved AMCL Algorithm with Text Information. |
Input: sample set , control , observation (raw laser data, image with text), map . |
Output: particle set with optimal estimated pose |
1: Initialization, |
2: uniformly sample poses in a elliptical area near the door or elevator |
3: |
4: robot moves forward |
5: if no text detected then |
6: for to N do |
7: sample_motion_model |
8: measurement_model |
9: |
10: end for |
11: for to N do |
12: Draw sample from with probability |
13: Add to |
14: end for |
15: else |
16: go to step 2, replaces |
17: end if |
18: return |
4. Experiments and Results
4.1. Experimental Setup and Environment
4.2. Text Detection and Recognition Performance
4.3. Global Localization Tests
4.4. Re-Localization from Kidnapping
5. Discussion and Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Meng, J.; Wang, S.; Xie, Y.; Jiang, L.; Li, G.; Liu, C. Efficient re-localization of mobile robot using strategy of finding a missing person. Measurement 2021, 176, 109212. [Google Scholar] [CrossRef]
- Taheri, H.; Xia, Z.C. SLAM; definition and evolution. Eng. Appl. Artif. Intell. 2021, 97, 104032. [Google Scholar] [CrossRef]
- Campos, C.; Elvira, R.; Rodriguez, J.J.G.; Montiel, J.M.M.; Tardos, J.D. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual–Inertial, and Multimap SLAM. IEEE Trans. Robot. 2021, 37, 1874–1890. [Google Scholar] [CrossRef]
- Xu, L.; Feng, C.; Kamat, V.R.; Menassa, C.C. An occupancy grid mapping enhanced visual SLAM for real-time locating applications in indoor GPS-denied environments. Autom. Constr. 2019, 104, 230–245. [Google Scholar] [CrossRef]
- Fox, D.; Burgard, W.; Dellaert, F.; Thrun, S. Monte carlo localization: Efficient position estimation for mobile robots. AAAI/IAAI 1999, 1999, 2. [Google Scholar]
- Wu, Z.; Yue, Y.; Wen, M.; Zhang, J.; Peng, G.; Wang, D. MSTSL: Multi-sensor based two-step localization in geometrically symmetric environments. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021. [Google Scholar]
- Ge, G.; Zhang, Y.; Jiang, Q.; Wang, W. Visual features assisted robot localization in symmetrical environment using laser SLAM. Sensors 2021, 21, 1772. [Google Scholar] [CrossRef] [PubMed]
- Sarlin, P.-E.; Cadena, C.; Siegwart, R.; Dymczyk, M. From coarse to fine: Robust hierarchical localization at large scale. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 12708–12717. [Google Scholar]
- Qian, R.; Hu, D.; Dinkel, H.; Wu, M.; Xu, N.; Lin, W. Multiple sound sources localization from coarse to fine. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer International Publishing: Cham, Switzerland, 2020; pp. 292–308. [Google Scholar]
- Debeunne, C.; Vivet, D. A review of visual-LiDAR fusion based simultaneous localization and mapping. Sensors 2020, 20, 2068. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Mullane, J.; Vo, B.-N.; Adams, M.D. A Random-finite-set approach to bayesian SLAM. IEEE Trans. Robot. 2011, 27, 268–282. [Google Scholar] [CrossRef]
- Bailey, T.; Nieto, J.; Guivant, J.; Stevens, M.; Nebot, E. Consistency of the EKF-SLAM algorithm. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9–15 October 2006; pp. 3562–3568. [Google Scholar]
- Kokovkina, V.A.; Antipov, V.A.; Kirnos, V.; Priorov, A.L. The algorithm of EKF-SLAM using laser scanning system and fisheye camera. In Proceedings of the 2019 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO), Yaroslavl, Russia, 1–3 July 2019; pp. 1–6. [Google Scholar]
- Bahraini, M.S. On the efficiency of SLAM using adaptive unscented kalman filter. Iran. J. Sci. Technol. Trans. Mech. Eng. 2020, 44, 727–735. [Google Scholar] [CrossRef]
- Murphy, K.; Russell, S. Rao-Blackwellised particle filtering for dynamic Bayesian networks. In Sequential Monte Carlo Methods in Practice; Springer: New York, NY, USA, 2001; pp. 499–515. [Google Scholar]
- Grisetti, G.; Stachniss, C.; Burgard, W. Improved techniques for grid mapping with rao-blackwellized particle filters. IEEE Trans. Robot. 2007, 23, 34–46. [Google Scholar] [CrossRef] [Green Version]
- Doucet, A.; de Freitas, J.F.G.; Murphy, K.; Russel, S. Rao-blackwellized partcile filtering for dynamic bayesian networks. In Proceedings of the Conference on Uncertainty in Artifificial Intelligence (UAI), Stanford, CA, USA, 30 June–3 July 2000; pp. 176–183. [Google Scholar]
- Hess, W.; Kohler, D.; Rapp, H.; Andor, D. Real-time loop closure in 2D rangefinder SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 1271–1278. [Google Scholar]
- Thrun, S. Probabilistic robotics. Commun. ACM 2002, 45, 52–57. [Google Scholar] [CrossRef]
- Alatise, M.B.; Hancke, G.P. Pose Estimation of a mobile robot based on fusion of IMU data and vision data using an extended kalman filter. Sensors 2017, 17, 2164. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Thrun, S.; Fox, D.; Burgard, W.; Dellaert, F. Robust Monte Carlo localization for mobile robots. Artif. Intell. 2001, 128, 99–141. [Google Scholar] [CrossRef] [Green Version]
- Xu, S.; Chou, W.; Dong, H. A Robust indoor localization system integrating visual localization aided by CNN-based image retrieval with Monte Carlo localization. Sensors 2019, 19, 249. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Long, S.; He, X.; Yao, C. Scene text detection and recognition: The deep learning era. Int. J. Comput. Vis. 2021, 129, 161–184. [Google Scholar] [CrossRef]
- Neumann, L.; Matas, J. Real-time scene text localization and recognition. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3538–3545. [Google Scholar]
- Epshtein, B.; Ofek, E.; Wexler, Y. Detecting text in natural scenes with stroke width transform. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 2963–2970. [Google Scholar]
- Lee, J.-J.; Lee, P.-H.; Lee, S.-W.; Yuille, A.; Koch, C. AdaBoost for text detection in natural scene. In Proceedings of the 2011 International Conference on Document Analysis and Recognition(ICDAR), Beijing, China, 18–21 September 2011; pp. 429–434. [Google Scholar]
- Yao, C.; Bai, X.; Shi, B.; Liu, W. Strokelets: A learned multi-scale representation for scene text recognition. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 4042–4049. [Google Scholar]
- Agbemenu, A.S.; Yankey, J.; Addo, E.O. An automatic number plate recognition system using opencv and tesseract ocr engine. Int. J. Comput. Appl. 2018, 180, 1–5. [Google Scholar]
- Tafti, A.P.; Baghaie, A.; Assefi, M.; Arabnia, H.R.; Yu, Z.; Peissig, P. OCR as a service: An experimental evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. In Proceedings of the Springer International Symposium on Visual Computing, Las Vegas, NV, USA, 12–14 December 2016. [Google Scholar]
- Liao, M.; Shi, B.; Bai, X.; Wang, X.; Liu, W. Textboxes: A fast text detector with a single deep neural network. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 4161–4167. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision; Springer: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar]
- Tian, Z.; Huang, W.; He, T.; He, P.; Qiao, Y. Detecting text in natural image with connectionist text proposal network. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 56–72. [Google Scholar] [CrossRef] [Green Version]
- Zhou, X.; Yao, C.; Wen, H.; Wang, Y.; Zhou, S.; He, W.; Liang, J. EAST: An efficient and accurate scene text detector. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Ly, N.T.; Nguyen, C.T.; Nakagawa, M. An attention-based row-column encoder-decoder model for text recognition in Japanese historical documents. Pattern Recognit. Lett. 2020, 136, 134–141. [Google Scholar] [CrossRef]
- Shi, B.; Bai, X.; Yao, C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 2298–2304. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Liu, X.; Liang, D.; Yan, S.; Chen, D.; Qiao, Y.; Yan, J. FOTS: Fast oriented text spotting with a unified network. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5676–5685. [Google Scholar]
- Tomono, M.; Yuta, S. Mobile robot navigation in indoor environments using object and character recognition. In Proceedings of the Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065), San Francisco, CA, USA, 24–28 April 2000; pp. 313–320. [Google Scholar]
- Radwan, N.; Tipaldi, G.D.; Spinello, L.; Burgard, W. Do you see the bakery? Leveraging geo-referenced texts for global localization in public maps. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 4837–4842. [Google Scholar]
Text Category | Room Number | Floor Number |
---|---|---|
One time result | 95.3 | 98.6 |
Two of three times result | 98.4 | 100 |
Placed Location | Rotation Motion | Translation Motion | ||||||
---|---|---|---|---|---|---|---|---|
Average Number of Iterations | Successful Localization Times | Average Number of Iterations | Successful Localization Times | |||||
AMCL | Text-MCL | AMCL | Text-MCL | AMCL | Text-MCL | AMCL | Text-MCL | |
# 1 | 75 | - | 3 | - | 77 | 26 | 6 | 10 |
# 2 | 123 | - | 0 | - | 95 | 19 | 2 | 10 |
# 3 | 129 | - | 1 | - | 117 | 16 | 0 | 10 |
# 4 | 137 | - | 0 | - | 126 | 18 | 2 | 10 |
# 5 | 132 | - | 0 | - | 135 | 19 | 0 | 10 |
# 6 | 153 | - | 0 | - | 131 | 25 | 1 | 9 |
# 7 | 146 | - | 0 | - | 134 | 19 | 1 | 10 |
# 8 | 137 | - | 0 | - | 137 | 27 | 0 | 10 |
# 9 | 131 | - | 0 | - | 129 | 20 | 1 | 8 |
# 10 | 129 | - | 0 | - | 105 | 18 | 0 | 9 |
# 11 | 118 | - | 0 | - | 79 | 19 | 3 | 10 |
# 12 | 74 | - | 8 | - | 53 | 20 | 7 | 10 |
Method | Sampling Particles | Iterations (Minimum) | Iterations (Maximum) | Iterations (Average) | Moving Distance (Average)/m | Success Rate |
---|---|---|---|---|---|---|
AMCL | 500 | 42 | 128 | 74 | 12.53 | 19.6% |
AMCL | 5000 | 47 | 193 | 93 | 17.46 | 22.9% |
Text-MCL | 100 | 16 | 47 | 31 | 2.72 | 96.2% |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ge, G.; Zhang, Y.; Wang, W.; Jiang, Q.; Hu, L.; Wang, Y. Text-MCL: Autonomous Mobile Robot Localization in Similar Environment Using Text-Level Semantic Information. Machines 2022, 10, 169. https://doi.org/10.3390/machines10030169
Ge G, Zhang Y, Wang W, Jiang Q, Hu L, Wang Y. Text-MCL: Autonomous Mobile Robot Localization in Similar Environment Using Text-Level Semantic Information. Machines. 2022; 10(3):169. https://doi.org/10.3390/machines10030169
Chicago/Turabian StyleGe, Gengyu, Yi Zhang, Wei Wang, Qin Jiang, Lihe Hu, and Yang Wang. 2022. "Text-MCL: Autonomous Mobile Robot Localization in Similar Environment Using Text-Level Semantic Information" Machines 10, no. 3: 169. https://doi.org/10.3390/machines10030169
APA StyleGe, G., Zhang, Y., Wang, W., Jiang, Q., Hu, L., & Wang, Y. (2022). Text-MCL: Autonomous Mobile Robot Localization in Similar Environment Using Text-Level Semantic Information. Machines, 10(3), 169. https://doi.org/10.3390/machines10030169