Research on Pedestrian Detection Method Based on Dual-Branch YOLOv8 Network of Visible Light and Infrared Images
Abstract
1. Introduction
2. Design of Dual-Modal Fusion Network Structure for Pedestrian Detection
2.1. Dual-Branch Network Infrastructure
2.2. Modal-Channel Interaction Block Design
2.3. Dynamic Alignment Feature Fusion Module
3. Pedestrian Detection Experiment and Result Analysis
3.1. Experimental Environment and Dataset Settings
3.2. Selection of Evaluation Indicators
3.3. Experimental Results and Analysis
4. Closing Remarks
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Cao, Z.; Yang, H.; Zhao, J.; Guo, S.; Li, L. Attention fusion for one-stage multispectral pedestrian detection. Sensors 2021, 21, 4184.
- Fang, Q.; Han, D.; Wang, Z. Cross-modality fusion transformer for multispectral object detection. arXiv 2022, arXiv:2111.00273v4.
- Shen, J.; Chen, Y.; Liu, Y.; Zuo, X.; Fan, H.; Yang, W. ICAFusion: Iterative cross-attention guided feature fusion for multispectral object detection. Pattern Recognit. 2024, 145, 109913.
- Althoupety, A.; Wang, L.Y.; Feng, W.C.; Rekabdar, B. Daff: Dual attentive feature fusion for multispectral pedestrian detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 2997–3006.
- Ma, J.; Tang, L.; Fan, F.; Huang, J.; Mei, X.; Ma, Y. SwinFusion: Cross-domain long-range learning for general image fusion via swin transformer. IEEE/CAA J. Autom. Sin. 2022, 9, 1200–1217.
- Zhang, H.; Fromont, E.; Lefevre, S.; Avignon, B. Multispectral fusion for object detection with cyclic fuse-and-refine blocks. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; IEEE: New York, NY, USA, 2020; pp. 276–280.
- Cui, J.L.; Wang, H.; Zheng, H.; Hu, Z.H. Multimodal Target Detection Algorithm Based on BPP-YOLOv8. Infrared Technol. 2025, 12, 1–9. Available online: https://link.cnki.net/urlid/53.1053.TN.20240830.1105.002 (accessed on 19 March 2026).
- Li, Z.; Li, X.; Niu, Y.; Rong, C.; Wang, Y. Infrared and Visible Light Fusion for Object Detection with Low-light Enhancement. In Proceedings of the 2024 IEEE 7th International Conference on Information Systems and Computer Aided Education (ICISCAE), Dalian, China, 27–29 September 2024; IEEE: New York, NY, USA, 2024; pp. 120–124.
- Tang, L.; Yuan, J.; Zhang, H.; Jiang, X.; Ma, J. PIAFusion: A progressive infrared and visible image fusion network based on illumination aware. Inf. Fusion 2022, 83, 79–92.
- Yan, C.; Zhang, H.; Li, X.; Yang, Y.; Yuan, D. Cross-modality complementary information fusion for multispectral pedestrian detection. Neural Comput. Appl. 2023, 35, 10361–10386.
- Zhang, X.; Zhang, X.; Wang, J.; Ying, J.; Sheng, Z.; Yu, H.; Li, C.; Shen, H.L. TFDet: Target-aware fusion for RGB-T pedestrian detection. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 13276–13290.
- Li, Q.; Zhang, C.; Hu, Q.; Zhu, P.; Fu, H.; Chen, L. Stabilizing multispectral pedestrian detection with evidential hybrid fusion. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 3017–3029.
- Zhao, R.; Zhang, Z.; Xu, Y.; Yao, Y.; Huang, Y.; Zhang, W.; Song, Z.; Chen, X.; Zhao, Y. Peddet: Adaptive spectral optimization for multimodal pedestrian detection. arXiv 2025, arXiv:2502.14063v2.
- Redmon, J.; Divvala, S.; Girshick, R. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- He, X.J.; Song, X.N. Improved YOLOv4-Tiny Lightweight Target Detection Algorithm. J. Front. Comput. Sci. Technol. 2024, 18, 138–150.
- Ma, J.; Tang, L.; Xu, M.; Zhang, H.; Xiao, G. STDFusionNet: An infrared and visible image fusion network based on salient target detection. IEEE Trans. Instrum. Meas. 2021, 70, 1–13.
- Ram Prabhakar, K.; Sai Srikar, V.; Venkatesh Babu, R. Deepfuse: A deep unsupervised approach for exposure fusion with extreme exposure image pairs. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4714–4722.
- Jia, X.; Zhu, C.; Li, M.; Tang, W.; Zhou, W. LLVIP: A visible-infrared paired dataset for low-light vision. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 March 2021; pp. 3496–3504.
- Hwang, S.; Park, J.; Kim, N.; Choi, Y.; So Kweon, I. Multispectral pedestrian detection: Benchmark dataset and baseline. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1037–1045.
- He, M.Z.; Wu, Q.B.; Ngan, K.N.; Jiang, F.; Meng, F.M.; Xu, L.F. Misaligned RGB-infrared object detection via adaptive dual-discrepancy calibration. Remote Sens. 2023, 15, 4887.
- Fang, Q.Y.; Wang, Z.K. Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery. Pattern Recognit. 2022, 130, 108786.
- Gao, Q.; Zhang, C.; Shi, R.; Zhang, Y. Cross-modal Progressive Fusion Method for UAV Target Detection. Unmanned Syst. Technol. 2024, 7, 54–64.
- Cui, J.L.; Wang, H.; Zheng, H.; Hu, Z.H. A fusion image detection method based on LDI-YOLOv8. Microelectron. Comput. 2025, 42, 61–70.
- Bao, C.; Cao, J.; Hao, Q.; Cheng, Y.; Ning, Y.; Zhao, T. Dual-YOLO architecture from infrared and visible images for object detection. Sensors 2023, 23, 2934.

| Model | Input Type | P (LLVIP) | P (KAIST) | R (LLVIP) | R (KAIST) | mAP@0.5 (LLVIP) | mAP@0.5 (KAIST) |
|---|---|---|---|---|---|---|---|
| YOLOv8 | ir | 0.925 | 0.724 | 0.863 | 0.613 | 0.95 | 0.681 |
| YOLOv8 | vis | 0.857 | 0.714 | 0.763 | 0.462 | 0.84 | 0.532 |
| Dual-branch | ir+vis | 0.931 | 0.744 | 0.866 | 0.616 | 0.94 | 0.71 |
| Dual-branch+MCI-Block | ir+vis | 0.933 | 0.776 | 0.871 | 0.661 | 0.941 | 0.734 |
| Dual-branch+DAFF | ir+vis | 0.947 | 0.769 | 0.888 | 0.658 | 0.941 | 0.713 |
| Ours | ir+vis | 0.95 | 0.788 | 0.904 | 0.707 | 0.957 | 0.76 |

| Methods (LLVIP) | P | R | mAP@0.5 |
|---|---|---|---|
| ICAFusion | 0.861 | 0.793 | 0.866 |
| ADCNet | 0.859 | 0.786 | 0.833 |
| CSM | 0.873 | 0.788 | 0.845 |
| PCMFNet | 0.915 | 0.826 | 0.912 |
| Our method | 0.95 | 0.904 | 0.957 |

| Methods (KAIST) | P | R | mAP@0.5 |
|---|---|---|---|
| BPP-YOLOv8 | 0.761 | 0.692 | 0.733 |
| LDI-YOLOv8 | 0.692 | 0.653 | 0.709 |
| Dual-YOLO | 0.751 | 0.667 | 0.732 |
| Our method | 0.788 | 0.707 | 0.76 |
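
The P, R, and mAP@0.5 columns in the tables above are not defined in this extract. Assuming they follow the standard object-detection conventions (precision and recall computed from true positives, false positives, and false negatives, with a detection counted as a true positive when its IoU with a ground-truth box is at least 0.5), a minimal sketch of the two building blocks might look as follows. The `Box` corner format and both function names are illustrative assumptions, not taken from the paper.

```python
from typing import Tuple

# Hypothetical box format for illustration: (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """P = TP / (TP + FP), R = TP / (TP + FN); returns 0.0 where undefined."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    return p, r
```

Under the same assumption, mAP@0.5 would be the area under the precision-recall curve obtained by sweeping the confidence threshold at an IoU cutoff of 0.5, averaged over classes; with a single pedestrian class it reduces to the AP of that class.
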

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Published by MDPI on behalf of the World Electric Vehicle Association. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
He, Z.; Chen, X. Research on Pedestrian Detection Method Based on Dual-Branch YOLOv8 Network of Visible Light and Infrared Images. World Electr. Veh. J. 2026, 17, 177. https://doi.org/10.3390/wevj17040177

