YOLO-Extreme: Obstacle Detection for Visually Impaired Navigation Under Foggy Weather
Abstract
1. Introduction
1. We propose YOLO-Extreme, a novel object detection framework designed to address the challenges that foggy weather poses for visually impaired navigation. Our method incorporates the Dual-Branch Bottleneck Block (DBB), the Multi-Dimensional Collaborative Attention Module (MCAM), and the Channel-Selective Fusion Block (CSFB) for enhanced robustness and accuracy.
2. To effectively extract both local spatial and global semantic features, we integrate the Dual-Branch Bottleneck Block (DBB) into the backbone and neck, which significantly improves the discrimination of blurred or low-contrast obstacles caused by fog.
3. To suppress background noise and highlight salient obstacle cues in cluttered environments, we introduce the Multi-Dimensional Collaborative Attention Module (MCAM), which adaptively aggregates spatial attention across multiple dimensions to enhance feature representation under foggy conditions.
4. To enable robust and adaptive multi-scale feature fusion, we incorporate the Channel-Selective Fusion Block (CSFB) into the neck; it dynamically recalibrates channel-wise contributions from high- and low-resolution features, thereby stabilizing detection performance in complex and dynamic scenes.
5. Experimental results on the publicly available RTTS [24] and Foggy Cityscapes datasets demonstrate that YOLO-Extreme achieves superior obstacle detection under foggy weather conditions, validating its potential as a practical navigation aid for visually impaired individuals.
2. Related Work
2.1. Advances in Object Detection
2.2. Detection in Challenging Weather
2.3. Similar Works
3. Methodology
3.1. Architecture of the Proposed Method
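The architecture builds on a YOLOv12-style backbone-neck-head pipeline. For orientation only, the schematic sketch below (not the paper's exact layout) marks where the three proposed modules attach, following the contribution list: DBB blocks inside the backbone and neck stages, MCAM on the extracted feature maps, and CSFB at the neck's cross-scale fusion points. Placeholder modules stand in for the real blocks, which are sketched in Sections 3.2 to 3.4.

```python
# Schematic placement sketch; every nn.Identity is a stand-in, not the
# paper's implementation.
import torch.nn as nn

class YOLOExtremeSkeleton(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1),  # stem
            nn.Identity(),  # <- Dual-Branch Bottleneck Blocks (Section 3.2)
        )
        self.attention = nn.Identity()    # <- MCAM on backbone features (3.3)
        self.neck_fusion = nn.Identity()  # <- CSFB at multi-scale fusion (3.4)
        self.head = nn.Identity()         # standard YOLO detection head

    def forward(self, x):
        feats = self.attention(self.backbone(x))
        return self.head(self.neck_fusion(feats))
```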
3.2. Dual-Branch Bottleneck Block
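The exact DBB internals are not reproduced in this outline. As a reference point, the sketch below is one plausible reading of the description in the introduction: a residual bottleneck whose two branches capture local spatial detail and wider semantic context, fused by a 1×1 projection. The kernel sizes, dilation rate, SiLU activation, and reduction ratio are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvBNAct(nn.Module):
    """Conv + BatchNorm + SiLU, padding chosen to keep spatial size."""
    def __init__(self, c_in, c_out, k=3, d=1):
        super().__init__()
        p = d * (k - 1) // 2
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, k, padding=p, dilation=d, bias=False),
            nn.BatchNorm2d(c_out),
            nn.SiLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class DualBranchBottleneck(nn.Module):
    """Residual bottleneck with a local branch and a wide-context branch."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        hidden = channels // reduction
        self.reduce = ConvBNAct(channels, hidden, k=1)
        self.local_branch = ConvBNAct(hidden, hidden, k=3, d=1)    # fine detail
        self.context_branch = ConvBNAct(hidden, hidden, k=3, d=3)  # wider context
        self.fuse = ConvBNAct(2 * hidden, channels, k=1)

    def forward(self, x):
        h = self.reduce(x)
        y = torch.cat([self.local_branch(h), self.context_branch(h)], dim=1)
        return x + self.fuse(y)  # residual connection
```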
3.3. Multi-Dimensional Collaborative Attention Module
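MCAM aggregates attention from multiple dimensions; the cited MCA module [22] does this by gating the feature tensor along its channel, height, and width axes. The sketch below follows that general recipe with a lightweight ECA-style 1-D convolution gate per axis and an averaged output; the gate design and kernel size are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AxisGate(nn.Module):
    """Squeeze-and-excite-style gate along dim=1 of the (permuted) input."""
    def __init__(self, k=5):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                     # x: (B, D, *, *)
        s = x.mean(dim=(2, 3))                # squeeze: (B, D)
        w = torch.sigmoid(self.conv(s.unsqueeze(1))).squeeze(1)  # (B, D)
        return x * w[:, :, None, None]        # excite

class MultiDimCollabAttention(nn.Module):
    """Gate along C, H, and W by permuting, then average the three results."""
    def __init__(self):
        super().__init__()
        self.gate_c = AxisGate()
        self.gate_h = AxisGate()
        self.gate_w = AxisGate()

    def forward(self, x):                     # x: (B, C, H, W)
        xc = self.gate_c(x)
        xh = self.gate_h(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)
        xw = self.gate_w(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)
        return (xc + xh + xw) / 3.0           # collaborative average
```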
3.4. Channel-Selective Fusion Block
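The sketch below gives one way to realize channel-selective fusion as described in the contributions: the low-resolution (semantic) map is upsampled to the high-resolution map's size, and a small gating network produces per-channel softmax weights that select between the two branches, in the style of selective-kernel attention. The equal-channel assumption and all layer sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSelectiveFusion(nn.Module):
    """Fuse a high-res and a low-res map with per-channel branch selection."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        self.gate = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * channels),   # two scores per channel
        )

    def forward(self, hi_res, lo_res):
        # Align the semantic branch with the high-res spatial size.
        lo_res = F.interpolate(lo_res, size=hi_res.shape[-2:], mode="nearest")
        b, c, _, _ = hi_res.shape
        s = (hi_res + lo_res).mean(dim=(2, 3))   # (B, C) channel summary
        scores = self.gate(s).view(b, 2, c)      # (B, 2, C)
        w = torch.softmax(scores, dim=1)         # select across the 2 branches
        w_hi = w[:, 0][:, :, None, None]
        w_lo = w[:, 1][:, :, None, None]
        return w_hi * hi_res + w_lo * lo_res
```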
4. Experiment
4.1. Datasets and Evaluation Metrics
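The experiment tables report per-class AP and its mean over classes ("All"); the exact evaluation protocol is not spelled out in this outline, but the standard mAP@0.5 computation is assumed. For reference, a minimal sketch of all-point-interpolated AP over a ranked detection list:

```python
import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """AP from detection confidences and per-detection TP flags."""
    order = np.argsort(-np.asarray(scores))            # rank by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    fp = 1.0 - tp
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(fp)
    recall = cum_tp / max(num_gt, 1)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1e-9)
    # All-point interpolation: make precision monotonically non-increasing.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    # Integrate precision over recall steps.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

# mAP = mean of per-class APs, e.g. over {person, bicycle, car, motor, bus}.
```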
4.2. Implementation Details
4.3. Ablation Experiments
4.4. Comparative Experiments
5. Visualization
6. Discussion
1. Evaluation under varying fog densities:
2. Adaptation to other adverse weather conditions:
3. Model complexity and large-scale deployment:
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Zafar, S.; Asif, M.; Ahmad, M.B.; Ghazal, T.M.; Faiz, T.; Ahmad, M.; Khan, M.A. Assistive devices analysis for visually impaired persons: A review on taxonomy. IEEE Access 2022, 10, 13354–13366.
2. Dakopoulos, D.; Bourbakis, N.G. Wearable obstacle avoidance electronic travel aids for blind: A survey. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2009, 40, 25–35.
3. Tapu, R.; Mocanu, B.; Bursuc, A.; Zaharia, T. A smartphone-based obstacle detection and classification system for assisting visually impaired people. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 2–8 December 2013; pp. 444–451.
4. Vaidya, S.; Shah, N.; Shah, N.; Shankarmani, R. Real-time object detection for visually challenged people. In Proceedings of the 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 13–15 May 2020; pp. 311–316.
5. Sakaridis, C.; Dai, D.; Van Gool, L. Semantic foggy scene understanding with synthetic data. Int. J. Comput. Vis. 2018, 126, 973–992.
6. Halder, S.S.; Lalonde, J.F.; Charette, R.d. Physics-based rendering for improving robustness to rain. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 10203–10212.
7. Sakaridis, C.; Dai, D.; Hecker, S.; Van Gool, L. Model adaptation with synthetic and real data for semantic dense foggy scene understanding. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 687–704.
8. Liu, X.; Ma, Y.; Shi, Z.; Chen, J. GridDehazeNet: Attention-based multi-scale network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7314–7323.
9. Saegusa, S.; Yasuda, Y.; Uratani, Y.; Tanaka, E.; Makino, T.; Chang, J.Y.J. Development of a guide-dog robot: Leading and recognizing a visually-handicapped person using a LRF. J. Adv. Mech. Des. Syst. Manuf. 2010, 4, 194–205.
10. dos Santos, A.D.P.; Medola, F.O.; Cinelli, M.J.; Garcia Ramirez, A.R.; Sandnes, F.E. Are electronic white canes better than traditional canes? A comparative study with blind and blindfolded participants. Univers. Access Inf. Soc. 2021, 20, 93–103.
11. Manduchi, R.; Coughlan, J.M. The last meter: Blind visual guidance to a target. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Toronto, ON, Canada, 26 April–1 May 2014; pp. 3113–3122.
12. Chu, Z. D-YOLO: A robust framework for object detection in adverse weather conditions. arXiv 2024, arXiv:2403.09233.
13. Tahir, N.U.A.; Zhang, Z.; Asim, M.; Chen, J.; ELAffendi, M. Object detection in autonomous vehicles under adverse weather: A review of traditional and deep learning approaches. Algorithms 2024, 17, 103.
14. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
15. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37.
16. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99.
17. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
18. Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
19. Meng, L.; Gang, Z.; Yawei, Y.; Jun, S. Survey of object detection methods under adverse weather conditions. J. Comput. Eng. Appl. 2022, 58, 36.
20. Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-centric real-time object detectors. arXiv 2025, arXiv:2502.12524.
21. Zhong, J.; Chen, J.; Mian, A. DualConv: Dual convolutional kernels for lightweight deep neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 9528–9535.
22. Yu, Y.; Zhang, Y.; Cheng, Z.; Song, Z.; Tang, C. MCA: Multidimensional collaborative attention in deep convolutional neural networks for image recognition. Eng. Appl. Artif. Intell. 2023, 126, 107079.
23. Xie, L.; Li, C.; Wang, Z.; Zhang, X.; Chen, B.; Shen, Q.; Wu, Z. SHISRCNet: Super-resolution and classification network for low-resolution breast cancer histopathology image. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Vancouver, BC, Canada, 8–12 October 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 23–32.
24. Li, B.; Ren, W.; Fu, D.; Tao, D.; Feng, D.; Zeng, W.; Wang, Z. Benchmarking single-image dehazing and beyond. IEEE Trans. Image Process. 2018, 28, 492–505.
25. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276.
26. Terven, J.; Córdova-Esparza, D.M.; Romero-González, J.A. A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716.
27. Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J. YOLOv10: Real-time end-to-end object detection. Adv. Neural Inf. Process. Syst. 2024, 37, 107984–108011.
28. Khanam, R.; Hussain, M. YOLOv11: An overview of the key architectural enhancements. arXiv 2024, arXiv:2410.17725.
29. Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 9759–9768.
30. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110.
31. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022.
32. Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752.
33. Gupta, H.; Kotlyar, O.; Andreasson, H.; Lilienthal, A.J. Robust object detection in challenging weather conditions. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2024; pp. 7523–7532.
34. Khatun, A.; Haque, M.R.; Basri, R.; Uddin, M.S. Single image dehazing: An analysis on generative adversarial network. Int. J. Comput. Sci. Netw. Secur. 2024, 24, 136–142.
35. Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. An all-in-one network for dehazing and beyond. arXiv 2017, arXiv:1707.06543.
36. Li, J.; Xu, R.; Ma, J.; Zou, Q.; Ma, J.; Yu, H. Domain adaptive object detection for autonomous driving under foggy weather. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 612–622.
37. Liu, W.; Ren, G.; Yu, R.; Guo, S.; Zhu, J.; Zhang, L. Image-adaptive YOLO for object detection in adverse weather conditions. Proc. AAAI Conf. Artif. Intell. 2022, 36, 1792–1800.
38. Qin, Q.; Chang, K.; Huang, M.; Li, G. DENet: Detection-driven enhancement network for object detection under adverse weather conditions. In Proceedings of the Asian Conference on Computer Vision (ACCV), Macao, China, 4–8 December 2022; pp. 2813–2829.
39. Wang, L.; Qin, H.; Zhou, X.; Lu, X.; Zhang, F. R-YOLO: A robust object detector in adverse weather. IEEE Trans. Instrum. Meas. 2022, 72, 1–11.
40. He, Y.; Liu, Z. A feature fusion method to improve the driving obstacle detection under foggy weather. IEEE Trans. Transp. Electrif. 2021, 7, 2505–2515.
41. Gharatappeh, S.; Neshatfar, S.; Sekeh, S.Y.; Dhiman, V. FogGuard: Guarding YOLO against fog using perceptual loss. arXiv 2024, arXiv:2403.08939.
42. Ding, Q.; Li, P.; Yan, X.; Shi, D.; Liang, L.; Wang, W.; Xie, H.; Li, J.; Wei, M. CF-YOLO: Cross fusion YOLO for object detection in adverse weather with a high-quality real snow dataset. IEEE Trans. Intell. Transp. Syst. 2023, 24, 10749–10759.
43. Sun, M.; Zhu, J.; Yang, B.; Huang, J.; Zhang, X. Pedestrian detection in foggy weather through YOLOv8 based on FEAttention. In Proceedings of the Computer Graphics International Conference, Geneva, Switzerland, 1–5 July 2024; pp. 108–120.
44. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
45. Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
46. Jocher, G.; Chaurasia, A.; Qiu, J. YOLO by Ultralytics. AGPL-3.0 License. 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 15 June 2025).
47. Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4770–4778.
48. Dong, H.; Pan, J.; Xiang, L.; Hu, Z.; Zhang, X.; Wang, F.; Yang, M.-H. Multi-scale boosted dehazing network with dense feature fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2157–2167.
49. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 2341–2353.
50. Huang, S.-C.; Le, T.-H.; Jaw, D.-W. DSNet: Joint semantic learning for object detection in inclement weather conditions. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2623–2633.
51. Hnewa, M.; Radha, H. Multiscale domain adaptive YOLO for cross-domain object detection. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 3323–3327.
52. Lei, M.; Li, S.; Wu, Y.; Hu, H.; Zhou, Y.; Zheng, X.; Ding, G.; Du, S.; Wu, Z.; Gao, Y. YOLOv13: Real-time object detection with hypergraph-enhanced adaptive visual perception. arXiv 2025, arXiv:2506.17733.
Ablation study on the RTTS dataset (per-class AP; "All" denotes mAP):

| Models | DBB | MCAM | CSFB | Person | Bus | Car | Motor | Bicycle | All |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv12-n | | | | 0.507 | 0.445 | 0.581 | 0.446 | 0.434 | 0.482 |
| | ✓ | | | 0.520 | 0.447 | 0.596 | 0.471 | 0.441 | 0.495 |
| | | ✓ | | 0.526 | 0.449 | 0.597 | 0.453 | 0.436 | 0.492 |
| | | | ✓ | 0.523 | 0.467 | 0.603 | 0.459 | 0.433 | 0.497 |
| | ✓ | ✓ | | 0.525 | 0.449 | 0.598 | 0.459 | 0.441 | 0.496 |
| | ✓ | | ✓ | 0.524 | 0.452 | 0.600 | 0.463 | 0.442 | 0.498 |
| | | ✓ | ✓ | 0.527 | 0.454 | 0.602 | 0.465 | 0.445 | 0.499 |
| | ✓ | ✓ | ✓ | 0.530 | 0.457 | 0.605 | 0.468 | 0.447 | 0.501 |
Computational cost of each module combination:

| Models | DBB | MCAM | CSFB | FLOPs (G) | Parameters (M) |
|---|---|---|---|---|---|
| YOLOv12-n | | | | 5.8 | 2.51 |
| | ✓ | | | 6.3 | 2.55 |
| | | ✓ | | 6.3 | 2.56 |
| | | | ✓ | 9.1 | 3.44 |
| | ✓ | ✓ | | 6.4 | 2.56 |
| | ✓ | | ✓ | 9.1 | 3.45 |
| | | ✓ | ✓ | 9.2 | 3.45 |
| | ✓ | ✓ | ✓ | 9.1 | 3.45 |
Comparison with existing methods on the RTTS dataset (per-class AP; "All" denotes mAP):

| Method | Person | Bicycle | Car | Motor | Bus | All |
|---|---|---|---|---|---|---|
| YOLOv8 [46] | 0.623 | 0.387 | 0.465 | 0.273 | 0.161 | 0.381 |
| YOLOv8-C [46] | 0.619 | 0.364 | 0.157 | 0.241 | 0.155 | 0.367 |
| AOD-YOLOv8 [47] | 0.598 | 0.358 | 0.407 | 0.233 | 0.130 | 0.345 |
| MSBDN-YOLOv8 [48] | 0.589 | 0.374 | 0.393 | 0.209 | 0.120 | 0.337 |
| GridDehaze-YOLOv8 [8] | 0.612 | 0.386 | 0.453 | 0.258 | 0.146 | 0.371 |
| DCP-YOLOv8 [49] | 0.621 | 0.393 | 0.417 | 0.237 | 0.139 | 0.361 |
| IA-YOLO [37] | 0.671 | 0.353 | 0.414 | 0.211 | 0.136 | 0.357 |
| DSNet [50] | 0.566 | 0.345 | 0.402 | 0.198 | 0.124 | 0.327 |
| MS-DAYOLOv8 [51] | 0.637 | 0.391 | 0.479 | 0.281 | 0.157 | 0.389 |
| D-YOLO [12] | 0.658 | 0.402 | 0.538 | 0.308 | 0.242 | 0.430 |
| YOLOv12-n | 0.507 | 0.434 | 0.581 | 0.446 | 0.445 | 0.482 |
| YOLOv13-n [52] | 0.511 | 0.429 | 0.576 | 0.437 | 0.437 | 0.478 |
| Ours | 0.530 | 0.447 | 0.605 | 0.468 | 0.457 | 0.501 |
Inference speed and accuracy trade-off on RTTS:

| Method | Speed (s/image) | FPS | mAP |
|---|---|---|---|
| YOLOv8 | 0.025 | 40.0 | 0.381 |
| AOD-YOLOv8 | 0.135 | 7.4 | 0.345 |
| MSBDN-YOLOv8 | 0.104 | 9.6 | 0.337 |
| GridDehaze-YOLOv8 | 0.071 | 14.1 | 0.371 |
| DSNet | 0.049 | 20.4 | 0.327 |
| IA-YOLO | 0.035 | 28.6 | 0.357 |
| D-YOLO | 0.033 | 30.3 | 0.430 |
| YOLOv12-n | 0.0027 | 365 | 0.482 |
| YOLOv13-n | 0.003 | 333 | 0.478 |
| Ours | 0.0043 | 232 | 0.501 |
Comparison on the Foggy Cityscapes dataset (per-class AP; "All" denotes mAP):

| Method | Person | Bicycle | Car | Motor | Bus | All |
|---|---|---|---|---|---|---|
| YOLOv8 | 0.257 | 0.159 | 0.366 | 0.042 | 0.172 | 0.199 |
| YOLOv8-C | 0.261 | 0.153 | 0.359 | 0.037 | 0.158 | 0.194 |
| IA-YOLO | 0.267 | 0.174 | 0.373 | 0.043 | 0.192 | 0.210 |
| DSNet | 0.251 | 0.143 | 0.369 | 0.045 | 0.167 | 0.195 |
| MS-DAYOLO | 0.274 | 0.212 | 0.353 | 0.034 | 0.179 | 0.210 |
| AOD-YOLOv8 | 0.236 | 0.144 | 0.332 | 0.031 | 0.129 | 0.174 |
| MSBDN-YOLOv8 | 0.235 | 0.127 | 0.341 | 0.053 | 0.130 | 0.177 |
| GridDehaze-YOLOv8 | 0.221 | 0.137 | 0.323 | 0.030 | 0.133 | 0.169 |
| DCP-YOLOv8 | 0.243 | 0.141 | 0.339 | 0.032 | 0.161 | 0.183 |
| YOLOv12-n | 0.446 | 0.155 | 0.238 | 0.141 | 0.353 | 0.222 |
| Ours | 0.457 | 0.163 | 0.245 | 0.155 | 0.371 | 0.239 |
Citation: Wang, W.; Jing, B.; Yu, X.; Zhang, W.; Wang, S.; Tang, Z.; Yang, L. YOLO-Extreme: Obstacle Detection for Visually Impaired Navigation Under Foggy Weather. Sensors 2025, 25, 4338. https://doi.org/10.3390/s25144338