Research on Small Dataset Object Detection Algorithm Based on Hierarchically Deployed Attention Mechanisms
Abstract
1. Introduction
2. Materials and Methods
2.1. Network Model
2.1.1. YOLOv8 Network Model
2.1.2. Rep-YOLOv8 Network Model
2.1.3. Structural Re-Parameterization
2.2. Algorithm
2.2.1. Attention Mechanism
2.2.2. ECA Module in Shallow Network
- Perform Global Average Pooling (GAP) on the input feature map to compress spatial information, generating a global feature descriptor vector for each channel.
- Apply a 1D convolution (1D Conv) to the channel vector for local cross-channel interaction. The kernel size adapts to the channel count, which sets the local range of cross-channel interaction and avoids the high parameter count and information redundancy of traditional fully connected layers. The adaptive formula for the convolution kernel size k is

  $$k = \left| \frac{\log_2 C}{\gamma} + \frac{b}{\gamma} \right|_{\text{odd}}$$

  where k is the size of the convolution kernel, C is the number of channels, γ and b are hyperparameters (default γ = 2, b = 1), and $|\cdot|_{\text{odd}}$ means taking the nearest odd integer.
- Use the Sigmoid function to map the convolution output to values between 0 and 1, generating weight coefficients for each channel. These weight coefficients are then multiplied channel-wise with the original input feature map to complete feature recalibration, enhancing the response of key channels and suppressing secondary information. The output maintains the same dimensions as the input.
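The three ECA steps above can be sketched in plain Python to make the data flow concrete. This is a minimal illustration, not the authors' implementation: the function names `eca_kernel_size` and `eca_forward` are hypothetical, the feature map is a nested list of shape [C][H][W], and the 1D convolution weights (learned during training in a real network) are passed in as fixed values. The rounding to an odd kernel size follows the convention of the public ECA-Net reference code (truncate, then move an even value up to the next odd one), which is an assumption about how "nearest odd integer" is resolved.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive kernel size k = |log2(C)/gamma + b/gamma|_odd.

    Assumption: truncate to an integer, then bump even values to the next odd.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_forward(x, kernel):
    """Apply ECA to x ([C][H][W] nested lists) with given odd-length 1D weights.

    Returns (recalibrated feature map, per-channel weights).
    """
    k = len(kernel)
    pad = k // 2

    # Step 1: global average pooling, one descriptor value per channel.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

    # Step 2: 1D convolution across neighbouring channels (zero padding).
    padded = [0.0] * pad + desc + [0.0] * pad
    conv = [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(desc))
    ]

    # Step 3: sigmoid gives weights in (0, 1); rescale each channel by its weight.
    weights = [1.0 / (1.0 + math.exp(-v)) for v in conv]
    out = [[[w * v for v in row] for row in ch] for w, ch in zip(weights, x)]
    return out, weights


# Usage: with the default hyperparameters, 256 channels give a kernel size of 5.
k = eca_kernel_size(256)  # 5
feature_map = [[[float(c)] * 4 for _ in range(4)] for c in range(8)]  # C=8, H=W=4
recalibrated, channel_weights = eca_forward(feature_map, [0.2, 0.6, 0.2])
```

The output has the same [C][H][W] shape as the input, and the only parameters are the k convolution weights, compared with the two fully connected layers of a standard SE block.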
2.2.3. CBAM-Spatial Module in Shallow Network

- It integrates multi-granularity spatial information through a dual-pooling strategy, capturing both the overall distribution characteristics of target regions (average pooling) and retaining the salient responses of local details (max pooling), overcoming the information bias problem of single pooling operations.
- It uses a fixed-size 3 × 3 convolutional kernel instead of complex branch structures, reducing parameters by 60% (compared to more complex spatial attention designs).
- Overall, the module follows a lightweight design principle and can be embedded directly into the RepVGG network architecture.
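The dual-pooling spatial attention described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' code: the function name `spatial_attention` is hypothetical, the feature map is a nested list of shape [C][H][W], the single 3 × 3 convolution over the two pooled maps is written as two 3 × 3 weight grids (one per pooled map) with zero padding and no bias, and the weights are supplied by the caller rather than learned.

```python
import math


def spatial_attention(x, kernel_avg, kernel_max):
    """CBAM-style spatial attention on x ([C][H][W] nested lists).

    kernel_avg / kernel_max: 3x3 weight grids applied to the channel-wise
    average-pooled and max-pooled maps (together, one 2-in/1-out 3x3 conv).
    Returns (reweighted feature map, H x W attention map).
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])

    # Dual pooling along the channel axis: overall distribution and salient peaks.
    avg_map = [[sum(x[c][i][j] for c in range(C)) / C for j in range(W)]
               for i in range(H)]
    max_map = [[max(x[c][i][j] for c in range(C)) for j in range(W)]
               for i in range(H)]

    def at(m, i, j):
        # Zero padding outside the map keeps the output at H x W.
        return m[i][j] if 0 <= i < H and 0 <= j < W else 0.0

    attn = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for di in range(3):
                for dj in range(3):
                    s += kernel_avg[di][dj] * at(avg_map, i + di - 1, j + dj - 1)
                    s += kernel_max[di][dj] * at(max_map, i + di - 1, j + dj - 1)
            attn[i][j] = 1.0 / (1.0 + math.exp(-s))  # sigmoid -> (0, 1)

    # Broadcast the spatial weights over every channel.
    out = [[[x[c][i][j] * attn[i][j] for j in range(W)] for i in range(H)]
           for c in range(C)]
    return out, attn


# Usage with illustrative (not learned) weights: a uniform 3x3 kernel per map.
uniform = [[1.0 / 9.0] * 3 for _ in range(3)]
fmap = [[[float(i + j + c) for j in range(4)] for i in range(4)] for c in range(3)]
reweighted, attention = spatial_attention(fmap, uniform, uniform)
```

In this form the convolution carries 2 × 3 × 3 = 18 weights regardless of the channel count, which is what keeps the module cheap to insert into shallow layers.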
2.2.4. eSE Module in Deep Network
3. Results and Discussion
3.1. Algorithm Validation
3.1.1. Ablation Study
3.1.2. Multi-Sample Object Recognition
3.1.3. Small Dataset Object Detection
3.2. Experiment and Analysis
3.2.1. Database Construction
3.2.2. Self-Built Database Experiment
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
















| Model Structure | mAP@0.5 (%) |
|---|---|
| Rep-YOLOv8 | 78.9 |
| Rep-YOLOv8 + ECA | 83.6 |
| Rep-YOLOv8 + ECA + CBAM-spatial | 84.3 |
| Rep-YOLOv8 + ECA + CBAM-spatial + eSE | 89.6 |

| Model | mAP@0.5 (%) | Params (M) | FPS |
|---|---|---|---|
| YOLOv5m | 61.2 | 21.1 | 86.2 |
| YOLOv8s | 77.3 | 13.6 | 65.1 |
| Faster R-CNN | 73.2 | - | 7.5 |
| SSD | 77.2 | 26.2 | 44.6 |
| Cascade R-CNN | 82.76 | - | 12.8 |
| Hierarchical Rep-YOLOv8 | 89.6 | 13.9 | 83.9 |

| Model | mAP@0.5 (%) | Params (M) |
|---|---|---|
| SegNet | 66.7 | 50.8 |
| CCNet | 71.4 | 60.8 |
| DANet | 73.9 | 58.9 |
| UperNet | 72.5 | 86.4 |
| OCRNet | 73.4 | 72.8 |
| DeepLabv3+ | 74.2 | 52.7 |
| Segformer | 75.9 | 101.2 |
| SETR-MLA | 76.1 | 311.8 |
| Hierarchical Rep-YOLOv8 | 75.5 | 13.9 |

| Method | TB | IFM | GDS | AR | OC-A | PP | CMM | OC-B | LMI |
|---|---|---|---|---|---|---|---|---|---|
| Rep-YOLOv8 | 0.882 | 0.933 | 0.995 | 0.854 | 0.704 | 0.995 | 0.947 | 0.504 | 0.871 |
| Hierarchical Rep-YOLOv8 | 0.993 | 0.995 | 0.995 | 0.995 | 0.995 | 0.995 | 0.995 | 0.995 | 0.995 |

| Method | UTM | GDS | LBL | GC | SRG | VLV | GCC | GB | ALL |
|---|---|---|---|---|---|---|---|---|---|
| Rep-YOLOv8 | 0.709 | 0.697 | 0.847 | 0.938 | 0.995 | 0.995 | 0.995 | 0.995 | 0.871 |
| Hierarchical Rep-YOLOv8 | 0.995 | 0.995 | 0.735 | 0.995 | 0.995 | 0.841 | 0.995 | 0.995 | 0.971 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, Y.; Lu, J.; Xu, J.; Miu, J.; Hua, H. Research on Small Dataset Object Detection Algorithm Based on Hierarchically Deployed Attention Mechanisms. Signals 2025, 6, 63. https://doi.org/10.3390/signals6040063

