Text Spotting towards Perceptually Aliased Urban Place Recognition
Abstract
1. Introduction
2. Related Work
2.1. Visual Place Recognition
2.2. Scene Text Detection, Recognition, and Spotting
2.3. Datasets
3. Methodology
3.1. Problem Definition
- Ability to identify the place name texts (i.e., filter noise);
- Support varied layouts (i.e., words, lines, or regions);
- Support a variety of output bounding shapes.
- Partial detections caused by occlusion (e.g., Velt gelato vs. Eisewelt Gelato);
- Limitations of the spotter (e.g., candy vs. candy a go go!);
- Inaccurate readings (e.g., allbirds vs. albirols);
- Discrepancies between the display name and listed name (e.g., Noa Coffee vs. Noa Cafe Harajuku);
- Commonly used words leading to incorrect matches (e.g., cafe, salon, Harajuku);
- Mismatching spaces (e.g., Gustyiestudio vs. Gu Style Studio Harajuku);
- Repetitive words among texts (e.g., Body Line, LINE, Body Shop);
- Occurrences of symbols (e.g., -, &, #, emojis);
- Accented characters (e.g., ć, è);
- Separate identification of place name and tag-line.
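
Several of the matching challenges listed above (accented characters, symbols, mismatched spaces, commonly used words, inaccurate readings) can be illustrated with a small normalization and edit-distance sketch. This is not the authors' implementation: the function names and the `COMMON_WORDS` stop list are hypothetical, and the similarity score is assumed to be one minus a length-normalized Levenshtein distance.

```python
import unicodedata

# Hypothetical stop list of commonly used words; assumed for illustration only.
COMMON_WORDS = {"cafe", "salon", "harajuku"}


def normalize(text: str, drop_common: bool = False) -> str:
    """Lowercase, strip accents and symbols, and remove all spaces."""
    # Decompose accented characters and drop the combining marks (e.g., è -> e).
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace symbols and emojis (anything non-alphanumeric) with spaces.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.lower())
    words = cleaned.split()
    if drop_common:
        words = [w for w in words if w not in COMMON_WORDS]
    # Join without spaces so that spacing mismatches do not affect matching.
    return "".join(words)


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def similarity(spotted: str, listed: str) -> float:
    """Score in [0, 1]; 1.0 means identical after normalization."""
    a, b = normalize(spotted), normalize(listed)
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    # Example pairs taken from the challenge list above.
    print(similarity("allbirds", "albirols"))
    print(similarity("Gustyiestudio", "Gu Style Studio Harajuku"))
```

Normalizing both strings before comparison keeps the distance focused on genuine reading errors rather than on case, accents, or spacing. Whether common words should be dropped before or after scoring is a design choice this sketch leaves open through the `drop_common` flag.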
3.2. VPRText Pipeline
3.2.1. Word Spotting
3.2.2. Place Text Identification
3.2.3. Text Similarity Search
3.3. Hierarchical VPR Architecture
4. Dataset
4.1. Tokyo Place Text
5. Experiments and Results
5.1. Evaluation of Text Spotters
5.2. Evaluation of VPRText
5.2.1. Appearance Variance—Day and Night Condition
5.2.2. Perceptual Aliasing
5.3. Evaluation of Hierarchical Architecture
6. Discussion
6.1. Text Spotters
6.2. VPRText
6.3. Hierarchical System
6.4. Limitations
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| MDPI | Multidisciplinary Digital Publishing Institute |
| VPR | Visual Place Recognition |
| POI | Place of Interest |
| SoA | State of the Art |
| GPS | Global Positioning System |
| E2E | End-to-End |
| SLAM | Simultaneous Localization and Mapping |

| Condition | Method | N.E.D. (Word) | N.E.D. (Line) | N.E.D. (Region) |
|---|---|---|---|---|
| Day | Mango | 0.668 | 0.757 | 0.746 |
| Day | ABCNet | 0.679 | 0.778 | 0.684 |
| Day | Text Perceptron | 0.684 | 0.782 | 0.692 |
| Day | Mask RCNN | 0.668 | 0.688 | 0.746 |
| Night | Mango | 0.684 | 0.757 | 0.738 |
| Night | ABCNet | 0.652 | 0.726 | 0.689 |
| Night | Text Perceptron | 0.704 | 0.820 | 0.759 |
| Night | Mask RCNN | 0.684 | 0.718 | 0.738 |

| Detector | Recognizer | N.E.D. (Word) | N.E.D. (Line) | N.E.D. (Region) |
|---|---|---|---|---|
| DRRG | ABINet | 0.633 | 0.630 | 0.533 |
| DRRG | RobustScanner | 0.613 | 0.605 | 0.508 |
| DRRG | SATRN | 0.590 | 0.589 | 0.488 |
| DRRG | NRTR_1/8-1/4 | 0.505 | 0.502 | 0.423 |
| DRRG | SAR | 0.622 | 0.615 | 0.512 |
| DB_r50 | ABINet | 0.621 | 0.645 | 0.586 |
| DB_r50 | RobustScanner | 0.604 | 0.638 | 0.573 |
| DB_r50 | SATRN | 0.605 | 0.639 | 0.571 |
| DB_r50 | NRTR_1/8-1/4 | 0.540 | 0.568 | 0.515 |
| DB_r50 | SAR | 0.602 | 0.637 | 0.569 |

| Criteria | Unit | K = 1 (%) | K = 3 (%) | K = 5 (%) | K = 10 (%) |
|---|---|---|---|---|---|
| Word Spotting | Text Perceptron | 87.28 | 88.44 | 88.44 | 89.02 |
| Word Spotting | ABCNet | 83.24 | 87.28 | 87.28 | 87.28 |
| Word Spotting | MANGO | 82.08 | 86.71 | 87.86 | 87.86 |
| Word Spotting | Mask RCNN | 81.50 | 86.71 | 87.86 | 87.86 |
| Text Unit | Word | 78.95 | 85.53 | 85.53 | 86.84 |
| Text Unit | Line | 82.89 | 86.84 | 86.84 | 86.84 |
| Text Unit | Region | 80.26 | 86.84 | 86.84 | 86.84 |
| Text Unit | All | 84.21 | 90.79 | 90.79 | 90.79 |
| Iterative Scoring | with | 84.21 | 90.79 | 90.79 | 90.79 |
| Iterative Scoring | without | 39.47 | 39.47 | 40.79 | 40.79 |
| Place Text ID | with (N = 3) | 84.21 | 90.79 | 90.79 | 90.79 |
| Place Text ID | without | 76.32 | 82.89 | 88.16 | 88.16 |

| Condition | Method | K = 1 (%) | K = 3 (%) | K = 5 (%) | K = 10 (%) |
|---|---|---|---|---|---|
| Day | VPRText | 84.13 | 84.92 | 84.92 | 85.71 |
| Day | Image Retrieval | 73.44 | 75.78 | 82.03 | 92.19 |
| Night | VPRText | 95.74 | 97.87 | 97.87 | 97.87 |
| Night | Image Retrieval | 71.43 | 81.63 | 85.71 | 93.88 |

| Condition | Method | K = 1 (%) | K = 3 (%) | K = 5 (%) | K = 10 (%) |
|---|---|---|---|---|---|
| Day-General | Image Retrieval | 96.15 | 100.00 | 100.00 | 100.00 |
| Day-General | VPRText | 88.16 | 88.16 | 88.16 | 89.47 |
| Day-General | VPRTextImage | 90.79 | 94.74 | 94.74 | 94.74 |
| Day-PA | Image Retrieval | 38.00 | 38.00 | 54.00 | 80.00 |
| Day-PA | VPRText | 78.00 | 80.00 | 80.00 | 80.00 |
| Day-PA | VPRTextImage | 80.00 | 82.00 | 88.00 | 92.00 |
| Night-General | Image Retrieval | 91.18 | 94.12 | 94.12 | 100.00 |
| Night-General | VPRText | 96.97 | 96.97 | 96.97 | 96.97 |
| Night-General | VPRTextImage | 96.97 | 100.00 | 100.00 | 100.00 |
| Night-PA | Image Retrieval | 28.57 | 57.14 | 71.43 | 85.71 |
| Night-PA | VPRText | 92.86 | 100.00 | 100.00 | 100.00 |
| Night-PA | VPRTextImage | 92.86 | 100.00 | 100.00 | 100.00 |
| Overall | Image Retrieval | 72.88 | 77.40 | 83.05 | 92.66 |
| Overall | VPRText | 87.28 | 88.44 | 88.44 | 89.02 |
| Overall | VPRTextImage | 89.02 | 92.49 | 94.22 | 95.38 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hettiarachchi, D.; Tian, Y.; Yu, H.; Kamijo, S. Text Spotting towards Perceptually Aliased Urban Place Recognition. Multimodal Technol. Interact. 2022, 6, 102. https://doi.org/10.3390/mti6110102