YOLO-HVS: Infrared Small Target Detection Inspired by the Human Visual System
Abstract
1. Introduction
2. Datasets
2.1. Public Dataset-DroneVehicle
2.2. Our Dataset-DroneRoadVehicles
3. Methods
3.1. YOLO-HVS
3.2. MultiSEAM Attention Mechanism
3.3. C2f_DWR
4. Experiments and Results
4.1. Evaluation Metrics
4.1.1. Precision and Recall
4.1.2. mAP50 and mAP50-95
4.1.3. Parameters and GFLOPs
4.2. Experimental Setup
4.3. Comparison Experiments
4.4. Ablation Experiments
4.5. Visualization of Results Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Indicator | Thermal Camera | Visual Camera |
---|---|---|
Spectral Band | 8–14 μm | 0.38–0.7 μm |
Resolution (pixels) | 640 × 512 | 3840 × 2160 / 1920 × 1080 |
Sensor | Uncooled VOx Microbolometer | 1/2 CMOS |
Actual class | Predicted Positive | Predicted Negative |
---|---|---|
Positive | True Positive (TP) | False Negative (FN) |
Negative | False Positive (FP) | True Negative (TN) |
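
The confusion-matrix cells above are the inputs to the P, R, and F1 columns reported in the comparison tables. The following is a minimal sketch, assuming the standard definitions P = TP / (TP + FP), R = TP / (TP + FN), and F1 = 2PR / (P + R). The function names and the example counts are illustrative and are not taken from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Hypothetical detection counts, chosen only for illustration.
    tp, fp, fn = 80, 20, 10
    p = precision(tp, fp)
    r = recall(tp, fn)
    print(f"P={p:.3f} R={r:.3f} F1={f1_score(p, r):.3f}")
```

Applied to reported values, `f1_score(0.804, 0.778)` rounds to 0.791, which agrees with the YOLOv8 row of the first comparison table.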

Comparison of detection methods on the DroneVehicle dataset.

Method | P | R | F1 | mAP50 | mAP50-95 | GFLOPs | Params (M) | FPS |
---|---|---|---|---|---|---|---|---|
Faster R-CNN | 0.73 | 0.59 | 0.653 | 0.674 | – | 207 | 41 | – |
YOLOv8 | 0.804 | 0.778 | 0.791 | 0.823 | 0.616 | 8.1 | 3.1 | 26 |
YOLOv11 | 0.809 | 0.782 | 0.79 | 0.828 | 0.622 | 6.3 | 2.6 | 50 |
Ours | 0.807 | 0.787 | 0.797 | 0.834 | 0.625 | 8.2 | 5.4 | 41 |

Comparison of detection methods on the DroneRoadVehicles dataset.

Method | P | R | F1 | mAP50 | mAP50-95 |
---|---|---|---|---|---|
Faster R-CNN | 0.887 | 0.78 | 0.83 | 0.823 | – |
YOLOv8 | 0.931 | 0.96 | 0.945 | 0.971 | 0.63 |
YOLOv11 | 0.933 | 0.957 | 0.945 | 0.975 | 0.638 |
Ours | 0.926 | 0.962 | 0.944 | 0.978 | 0.64 |

Dataset | MultiSEAM | C2f_DWR | mAP50 | mAP50-95 | GFLOPs | Params (M) |
---|---|---|---|---|---|---|
DroneVehicle | | | 0.823 | 0.616 | 8.1 | 3.1 |
DroneVehicle | ✓ | | 0.827 | 0.619 | 8.6 | 5.6 |
DroneVehicle | | ✓ | 0.825 | 0.618 | 7.8 | 2.8 |
DroneVehicle | ✓ | ✓ | 0.834 | 0.625 | 8.2 | 5.4 |
DroneRoadVehicles | | | 0.971 | 0.630 | 8.1 | 3.1 |
DroneRoadVehicles | ✓ | | 0.974 | 0.639 | 8.6 | 5.6 |
DroneRoadVehicles | | ✓ | 0.972 | 0.642 | 7.8 | 2.8 |
DroneRoadVehicles | ✓ | ✓ | 0.978 | 0.640 | 8.2 | 5.4 |
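
The ablation results are easier to compare as differences from the baseline. The sketch below copies the four DroneVehicle rows from the table above and subtracts the baseline row from each of the others. The dictionary layout, the row labels, and the `deltas` helper are illustrative assumptions, not part of the paper.

```python
# (mAP50, mAP50-95, GFLOPs, Params in M), copied from the DroneVehicle rows
# of the ablation table. Row labels are illustrative names only.
ROWS = {
    "baseline": (0.823, 0.616, 8.1, 3.1),
    "row 2 (one module)": (0.827, 0.619, 8.6, 5.6),
    "row 3 (one module)": (0.825, 0.618, 7.8, 2.8),
    "both modules": (0.834, 0.625, 8.2, 5.4),
}


def deltas(rows, baseline_key="baseline"):
    """Return each row's per-column difference from the baseline row."""
    base = rows[baseline_key]
    return {
        name: tuple(round(v - b, 3) for v, b in zip(vals, base))
        for name, vals in rows.items()
        if name != baseline_key
    }


if __name__ == "__main__":
    for name, (d50, d5095, dflops, dparams) in deltas(ROWS).items():
        print(f"{name}: mAP50 {d50:+.3f}, mAP50-95 {d5095:+.3f}, "
              f"GFLOPs {dflops:+.1f}, Params {dparams:+.1f} M")
```

On these numbers, the configuration with both modules gains 0.011 mAP50 and 0.009 mAP50-95 over the baseline, at a cost of 0.1 GFLOPs and 2.3 M parameters.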
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, X.; Sheng, Y.; Hao, Q.; Hou, H.; Nie, S. YOLO-HVS: Infrared Small Target Detection Inspired by the Human Visual System. Biomimetics 2025, 10, 451. https://doi.org/10.3390/biomimetics10070451