Research and Implementation of Travel Aids for Blind and Visually Impaired People
Abstract
1. Introduction
- We present a portable and practical travel assistance device for visually impaired people. The system uses an NVIDIA Jetson Nano for edge computing, an Intel RealSense D435i depth camera for environment sensing, and an Arduino microcontroller driving SG90 servos to deliver intuitive vibration feedback, forming a robust hardware foundation for blind navigation (a minimal interfacing sketch is given after this list).
- To address the computational limitations of edge devices, we propose a novel, lightweight deep learning model for joint obstacle detection and walkable-path segmentation. The model integrates a multi-scale attention mechanism with an efficient Mamba architecture and adaptive context-aware processing, and it strikes a favorable balance between high accuracy and real-time efficiency on embedded platforms.
- Extensive experimental validation demonstrates the superiority of our solution. Compared with state-of-the-art lightweight models such as YOLOv9c-seg and YOLOv10n, our approach achieves exceptional accuracy on both tasks while maintaining an extremely compact model size (approximately 5 MB) and high frame rates (>90 FPS), both essential for real-time assistance for the visually impaired.
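As a companion to the contribution list above, the following is a minimal, hypothetical sketch of how the three hardware components could be wired together in software: a RealSense depth/RGB stream feeding a lightweight PyTorch model on the Jetson Nano, with a one-byte serial cue sent to the Arduino that drives the SG90 feedback units. Library calls are real (pyrealsense2, pyserial, PyTorch), but the model file name, output signature, distance threshold, and serial protocol are assumptions for illustration, not the authors' exact implementation.

```python
# Hypothetical end-to-end loop: RealSense D435i frame -> lightweight model -> Arduino haptic cue.
import numpy as np
import pyrealsense2 as rs
import serial
import torch

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

arduino = serial.Serial("/dev/ttyUSB0", 9600, timeout=0.1)  # Arduino driving the SG90 feedback units
model = torch.jit.load("model.pt").eval()                   # assumed TorchScript export of the lightweight model

try:
    while True:
        frames = pipeline.wait_for_frames()
        depth = np.asanyarray(frames.get_depth_frame().get_data())
        color = np.asanyarray(frames.get_color_frame().get_data())

        # Joint obstacle detection / walkable-area segmentation on the RGB frame.
        x = torch.from_numpy(color).permute(2, 0, 1).float().unsqueeze(0) / 255.0
        with torch.no_grad():
            obstacles, walkable_mask = model(x)  # illustrative output signature; direction logic omitted

        # Use the depth map to estimate distance to the nearest return (mm -> m).
        nearest_m = depth[depth > 0].min() / 1000.0 if np.any(depth > 0) else float("inf")

        # Send a simple one-byte cue: 'C' = obstacle close ahead, '0' = path clear (illustrative protocol).
        arduino.write(b"C" if nearest_m < 1.5 else b"0")
finally:
    pipeline.stop()
    arduino.close()
```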
2. Related Work
3. Hardware System Build
4. Method
4.1. Multi-Scale Attention Feature Extraction Backbone
4.2. Dual-Stream Feature Fusion Module
4.3. Adaptive Context-Aware Detection and Segmentation Head
5. Experiments and Results
5.1. Data Acquisition and Production
5.2. Image Processing
5.3. Experimental Setup and Environment
5.4. Ablation Experiments
5.5. Comparative Experiments
5.6. Algorithm Results Visualization and Analysis
5.7. Real-World Scenario Test Visualization and Analysis
6. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bourne, R.; Steinmetz, J.D.; Flaxman, S.; Briant, P.S.; Taylor, H.R.; Resnikoff, S.; Casson, R.J.; Abdoli, A.; Abu-Gharbieh, E.; Afshin, A.; et al. Trends in prevalence of blindness and distance and near vision impairment over 30 years: An analysis for the Global Burden of Disease Study. Lancet Glob. Health 2021, 9, e130–e143.
- Xu, P.; Kennedy, G.A.; Zhao, F.-Y.; Zhang, W.-J.; Van Schyndel, R. Wearable obstacle avoidance electronic travel aids for blind and visually impaired individuals: A systematic review. IEEE Access 2023, 11, 66587–66613.
- Wu, Y.; Gao, G.; Wu, M.; Wang, X.; Xue, P.; Gou, B. Survey and analysis of the application situation of urban barrier-free facilities. Chin. J. Tissue Eng. Res. 2020, 24, 271.
- Petsiuk, A.L.; Pearce, J.M. Low-cost open-source ultrasound-sensing based navigational support for the visually impaired. Sensors 2019, 19, 3783.
- Papagianopoulos, I.; De Mey, G.; Kos, A.; Wiecek, B.; Chatziathasiou, V. Obstacle detection in infrared navigation for blind people and mobile robots. Sensors 2023, 23, 7198.
- Wu, Z.H.; Rong, X.W.; Fan, Y. Review of research on guide robots. Comput. Eng. Appl. 2020, 56, 1–13.
- Lu, C.-L.; Liu, Z.-Y.; Huang, J.-T.; Huang, C.-I.; Wang, B.-H.; Chen, Y.; Wu, N.-H.; Wang, H.-C.; Giarré, L.; Kuo, P.-Y. Assistive navigation using deep reinforcement learning guiding robot with UWB/voice beacons and semantic feedbacks for blind and visually impaired people. Front. Robot. AI 2021, 8, 654132.
- Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep reinforcement learning: A brief survey. IEEE Signal Process. Mag. 2023, 34, 26–38.
- Cao, Z.; Xu, X.; Hu, B.; Zhou, M. Rapid detection of blind roads and crosswalks by using a lightweight semantic segmentation network. IEEE Trans. Intell. Transp. Syst. 2020, 22, 6188–6197.
- Dimas, G.; Diamantis, D.E.; Kalozoumis, P.; Iakovidis, D.K. Uncertainty-aware visual perception system for outdoor navigation of the visually challenged. Sensors 2020, 20, 2385–2394.
- Ma, Y.; Xu, Q.; Wang, Y.; Wu, J.; Long, C.; Lin, Y.-B. EOS: An efficient obstacle segmentation for blind guiding. Future Gener. Comput. Syst. 2023, 140, 117–128.
- Hsieh, Y.-Z.; Lin, S.-S.; Xu, F.-X. Development of a wearable guide device based on convolutional neural network for blind or visually impaired persons. Multimed. Tools Appl. 2020, 79, 29473–29491.
- Suman, S.; Mishra, S.; Sahoo, K.S.; Nayyar, A.; Scioscia, F. Vision navigator: A smart and intelligent obstacle recognition model for visually impaired users. Mob. Inf. Syst. 2022, 33, 891–971.
- Mai, C.; Chen, H.; Zeng, L.; Li, Z.; Liu, G.; Qiao, Z.; Qu, Y.; Li, L.; Li, L. A smart cane based on 2D LiDAR and RGB-D camera sensor-realizing navigation and obstacle recognition. Sensors 2024, 24, 870–886.
- Chen, Z.; Liu, X.; Kojima, M.; Huang, Q.; Arai, T. A wearable navigation device for visually impaired people based on the real-time semantic visual SLAM system. Sensors 2021, 21, 1536–1542.
- Zhang, X.; Liang, L.; Zhao, S.; Wang, Z. GRFB-UNet: A new multi-scale attention network with group receptive field block for tactile paving segmentation. Expert Syst. Appl. 2024, 238, 122109.
- Zhang, Y.; Chen, H.; He, Y.; Ye, M.; Cai, X.; Zhang, D. Road segmentation for all-day outdoor robot navigation. Neurocomputing 2018, 314, 316–325.
- Wang, C.Y.; Yeh, I.H.; Mark Liao, H.Y. Yolov9: Learning what you want to learn using programmable gradient information. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer: Cham, Switzerland, 2025; pp. 1–21.
- Khanam, R.; Hussain, M. Yolov11: An overview of the key architectural enhancements. arXiv 2024, arXiv:2410.17725.
- Ravi, N.; Gabeur, V.; Hu, Y.T.; Hu, R.; Ryali, C.; Ma, T.; Khedr, H.; Rädle, R.; Rolland, C.; Gustafson, L.; et al. Sam 2: Segment anything in images and videos. arXiv 2024, arXiv:2408.00714.
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. Detrs beat yolos on real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974.
- Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J. Yolov10: Real-time end-to-end object detection. arXiv 2024, arXiv:2405.14458.
- Liu, S.; Zeng, Z.; Ren, T.; Li, F.; Zhang, H.; Yang, J.; Jiang, Q.; Li, C.; Yang, J.; Su, H.; et al. Grounding dino: Marrying dino with grounded pre-training for open-set object detection. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer: Cham, Switzerland, 2025; pp. 38–55.
| NVIDIA Jetson Nano | |
|---|---|
| GPU | 128-core Maxwell |
| CPU | Quad-core ARM Cortex-A57 @ 1.43 GHz |
| Memory | 4 GB 64-bit LPDDR4, 25.6 GB/s |
| Storage | microSD (not included) |
| Video encoding | 4K @ 30, 4× 1080p @ 30, 9× 720p @ 30 (H.264/H.265) |
| Video decoding | 4K @ 60, 2× 4K @ 30, 8× 1080p @ 30 (H.264/H.265) |
| Camera | 1× MIPI CSI-2 DPHY lanes |
| Connectivity | Gigabit Ethernet, M.2 Key E |
| Display | HDMI 2.0 and eDP 1.4 |
| USB interface | 4× USB 3.0, USB 2.0 Micro-B |
| Other I/O | GPIO, I2C, I2S, SPI, UART |
| Mechanical | 69 mm × 45 mm, 260-pin edge connector |
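For deployment on the Jetson Nano summarized above, one common route is to export the trained PyTorch model to ONNX and build a TensorRT engine on the device. The paper does not state whether this route was used, so the snippet below is purely an assumed example; the checkpoint name, input resolution, and opset are placeholders.

```python
# Hypothetical export step for Jetson deployment: PyTorch -> ONNX (TensorRT can then consume the
# ONNX file on the Nano via trtexec). Names and settings are assumptions, not the authors' values.
import torch

model = torch.load("travel_aid_model.pt", map_location="cpu").eval()  # assumed checkpoint name
dummy = torch.randn(1, 3, 640, 640)                                   # assumed network input resolution
torch.onnx.export(
    model, dummy, "travel_aid_model.onnx",
    input_names=["images"], output_names=["outputs"],
    opset_version=12,
)
# On the Jetson Nano, e.g.: trtexec --onnx=travel_aid_model.onnx --fp16 --saveEngine=travel_aid_model.engine
```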
| Item | Configuration |
|---|---|
| Operating System | Ubuntu 20.04 |
| Graphics Card | GeForce RTX 3060 (12 GB) |
| CUDA Version | 11.8 |
| Python | 3.8.16 |
| Deep Learning Framework | PyTorch 1.13.1 |
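As a quick sanity check of the training environment listed above (a sketch added here, not part of the original paper), the installed PyTorch build can be matched against the expected CUDA version and GPU:

```python
# Verify that the PyTorch 1.13.1 / CUDA 11.8 / RTX 3060 setup from the table is active.
import torch

print("PyTorch:", torch.__version__)          # expected: 1.13.1
print("CUDA (build):", torch.version.cuda)    # expected: 11.8
assert torch.cuda.is_available(), "CUDA device not visible to PyTorch"
print("GPU:", torch.cuda.get_device_name(0))  # expected: GeForce RTX 3060
```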
| Model | Model Size (MB) | Parameters (M) | GFLOPs | mAP (mask) | FPS |
|---|---|---|---|---|---|
| Baseline | 6.8 | 3.26 | 12.1 | 0.952 | 67 |
| +MSAFEB | 5.6 | 2.74 | 10.2 | 0.963 | 94 |
| +DSFFM | 6.4 | 3.18 | 11.8 | 0.966 | 85 |
| +ACADSH | 5.9 | 2.96 | 10.5 | 0.961 | 88 |
| +MSAFEB+DSFFM | 5.9 | 2.84 | 10.9 | 0.971 | 95 |
| +MSAFEB+ACADSH | 5.4 | 2.79 | 10.3 | 0.972 | 94 |
| +DSFFM+ACADSH | 6.1 | 3.09 | 11.2 | 0.971 | 90 |
| +MSAFEB+DSFFM+ACADSH | 5.1 | 2.69 | 9.8 | 0.979 | 98 |
| Model | Model Size (MB) | Parameters (M) | GFLOPs | mAP | FPS |
|---|---|---|---|---|---|
| Baseline | 6.3 | 3.01 | 8.2 | 0.726 | 73 |
| +MSAFEB | 5.7 | 2.86 | 7.9 | 0.741 | 93 |
| +DSFFM | 6.2 | 2.97 | 6.7 | 0.732 | 87 |
| +ACADSH | 5.9 | 2.91 | 7.3 | 0.728 | 85 |
| +MSAFEB+DSFFM | 5.4 | 2.64 | 6.5 | 0.749 | 97 |
| +MSAFEB+ACADSH | 5.8 | 2.63 | 7.1 | 0.743 | 95 |
| +DSFFM+ACADSH | 6.0 | 2.81 | 6.9 | 0.741 | 92 |
| +MSAFEB+DSFFM+ACADSH | 5.2 | 2.47 | 6.1 | 0.757 | 98 |
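The FPS columns in the ablation tables report inference throughput. A minimal timing harness of the kind commonly used to obtain such numbers might look as follows; this is an illustrative sketch assuming a PyTorch model and a 640×640 input, and the authors' exact measurement protocol is not specified here.

```python
# Rough FPS measurement sketch: average single-image forward-pass latency on the GPU.
import time
import torch

def measure_fps(model: torch.nn.Module, iters: int = 200, warmup: int = 20) -> float:
    model.eval().cuda()
    x = torch.randn(1, 3, 640, 640, device="cuda")  # assumed input resolution
    with torch.no_grad():
        for _ in range(warmup):          # warm-up excludes CUDA init / autotuning cost
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()         # wait for all kernels before stopping the clock
    return iters / (time.perf_counter() - start)
```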