An Improved Model Based on YOLOv8 for Small Object Detection and Recognition
Abstract
1. Introduction
- Network Architecture: An enhanced multi-scale feature fusion mechanism incorporates a modified BiFPN structure, strengthening small object feature aggregation through cross-scale connections and adaptive weight allocation. We further explore dilated convolutions to expand the receptive field without resolution loss, capturing richer small object details.
- Data Processing: Optimized augmentation strategies specifically integrate small object zooming and cropping, improving target diversity and representation in training data. Image preprocessing combined with Super-Resolution (SR) techniques enhances small object clarity and feature discriminability.
- Training Optimization: Loss function adjustments assign greater weight to small object detections, while focal loss variants down-weight easy background samples, concentrating optimization on hard small object instances.
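Three of the mechanisms named in the list above have compact standard forms, sketched below in plain Python. This is an illustrative sketch, not the authors' implementation: the fusion rule is the fast normalized fusion from the original BiFPN (EfficientDet) rather than the modified variant described here, the focal loss is the standard binary form with the commonly used defaults `alpha=0.25` and `gamma=2.0`, and all function names are hypothetical.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels) covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, sum-normalized weights.

    Assumption: this is the EfficientDet fusion rule, not the paper's variant.
    """
    clipped = [max(w, 0.0) for w in weights]  # ReLU keeps every weight non-negative
    total = sum(clipped) + eps                # eps guards against division by zero
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


def binary_focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Focal loss for one prediction p = P(object) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing factor
    p_t = min(max(p_t, 1e-12), 1.0)             # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A 3x3 kernel with dilation 2 spans a wider window with the same 9 weights.
    print(effective_kernel_size(3, 2))
    # Two feature vectors fused with equal learned weights.
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
    # An easy background sample versus a hard (poorly scored) object sample.
    print(binary_focal_loss(0.01, 0), binary_focal_loss(0.10, 1))
```

The modulating factor `(1 - p_t) ** gamma` is what shrinks the contribution of well-classified background samples, so the gradient is dominated by hard instances such as low-confidence small objects.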
2. Related Work
2.1. Foundational Object Detection Architectures Using Pre-YOLO and Early YOLO
2.2. Evolution of YOLO for Enhanced Performance from YOLOv2 to YOLOv7
2.3. YOLOv8 and Contemporary SOD-Specific YOLO Variants
2.4. Techniques Specifically Addressing Small Object Challenges
2.5. Application-Oriented SOD in Remote Sensing and Other Domains
2.6. Summary and Research Gap
3. The Improved YOLOv8
3.1. Motivation
3.2. YOLOv8 Network Structure
3.3. The Improved YOLOv8 Network Structure
3.3.1. Small Object Detection Layer
3.3.2. GSConv Module
3.3.3. SIoU Loss Function
3.3.4. SPPCSPC Module
4. Experimental Results and Analysis
4.1. Dataset and Experimental Environment
4.2. Evaluation Metrics
4.3. Training Results
4.4. F1 Score Curves
4.5. Precision–Recall Curves
4.6. Comparative Experiments
4.7. Visualization of Detection Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Parameter | Value |
|---|---|
| Epochs | 300 |
| Batch size | 32 |
| Image size | 640 × 640 |
| Optimizer | AdamW |
| Automatic mixed precision | True |
| Learning rate | 0.01 |
| Momentum | 0.937 |
| Weight decay | 0.001 |
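
The settings in the table above can be written down as a training configuration. The sketch below assumes the Ultralytics Python package and maps each table row to the corresponding `train()` argument; treating "Learning rate" as the initial learning rate `lr0` is an assumption, and the function name and its `model_yaml` and `data_yaml` path arguments are hypothetical placeholders, not files from the paper.

```python
# Hyperparameters copied from the table; keys follow Ultralytics train() argument names.
TRAIN_CONFIG = {
    "epochs": 300,
    "batch": 32,
    "imgsz": 640,            # images are resized to 640 x 640
    "optimizer": "AdamW",
    "amp": True,             # automatic mixed precision
    "lr0": 0.01,             # assumption: the table's learning rate is the initial rate
    "momentum": 0.937,
    "weight_decay": 0.001,
}


def train_improved_yolov8(model_yaml: str, data_yaml: str):
    """Launch training with the table's settings (paths are hypothetical)."""
    # Imported lazily so the configuration can be inspected without the package installed.
    from ultralytics import YOLO

    model = YOLO(model_yaml)  # build the network from a model definition file
    return model.train(data=data_yaml, **TRAIN_CONFIG)
```

Keeping the values in one dictionary makes it easy to log the exact configuration alongside each run when reproducing the comparative experiments.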
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
He, J.; Luo, S. An Improved Model Based on YOLOv8 for Small Object Detection and Recognition. Information 2026, 17, 173. https://doi.org/10.3390/info17020173
