Research on a Lightweight YOLOv9 Object Detection Algorithm Fused with Adaptive Gated Coordinate Attention
Abstract
1. Introduction
2. Overview of YOLOv9
3. Research Methods
3.1. AGCA-GELAN Deeply Integrated Backbone Network Architecture
3.2. Dual-Path Hybrid Pooling Feature Aggregation Strategy
3.3. Adaptive Gated Weight Fusion Mechanism
4. Experiments
4.1. Dataset
4.2. Experimental Environment
4.3. Evaluation Metrics
4.4. Ablation Experiments
4.5. Comparative Experiments
4.6. Limitations and Future Work
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

| Layer | Module/Component | Input Tensor | Output Tensor | Kernel Size/Stride | Params Inc. | GFLOPs Inc. |
|---|---|---|---|---|---|---|
| 0–3 | Stem & ELAN-1 | Various | | | – | – |
| 4 | RepNCSP_AGCA | | | /1 | +5492 | +0.0063 |
| 5 | AConv (Downsample P4) | | | /2 | – | – |
| 6 | RepNCSP_AGCA | | | /1 | +9748 | +0.0029 |
| 7 | AConv (Downsample P5) | | | /2 | – | – |
| 8 | RepNCSP_AGCA | | | /1 | +15,028 | +0.0013 |
| 9–29 | Neck & Detect Head | Various | Various | | – | – |
| Total | AGCA-YOLOv9 (Ours) | – | – | – | +30,268 | +0.0105 |
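As a quick consistency check on the table above, the overheads of the three RepNCSP_AGCA blocks can be summed and compared with the reported totals. The following is a minimal sketch, not code from the original work: the name `agca_overhead` is illustrative, and the 9.74 M baseline parameter count is taken from the YOLOv9s row of the comparison table.

```python
# Per-layer overhead of the RepNCSP_AGCA blocks, copied from the table above.
# Keys are backbone layer indices; values are (extra parameters, extra GFLOPs).
agca_overhead = {
    4: (5_492, 0.0063),
    6: (9_748, 0.0029),
    8: (15_028, 0.0013),
}

total_params = sum(p for p, _ in agca_overhead.values())
total_gflops = round(sum(g for _, g in agca_overhead.values()), 4)

# Baseline YOLOv9s parameter count in millions (comparison table, rounded value).
baseline_params_m = 9.74
relative_increase = total_params / (baseline_params_m * 1e6) * 100

print(f"extra parameters: {total_params:,}")
print(f"extra GFLOPs: {total_gflops}")
print(f"relative parameter increase: {relative_increase:.2f}%")
```

The sums reproduce the Total row (+30,268 parameters, +0.0105 GFLOPs), which corresponds to roughly 0.31% of the rounded baseline parameter count.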
| Attribute | Dataset 1 | Dataset 2 |
|---|---|---|
| Dataset Name | Safety Helmet and Reflective Jacket | Helmet-Vest-Belt |
| Total Images | 10,500 | 9270 |
| Classes | Safety Helmet, Reflective Jacket | Helmet, Vest, Belt |
| Train Set | 7350 (70%) | 7075 (76%) |
| Valid Set | 1575 (15%) | 1537 (17%) |
| Test Set | 1575 (15%) | 658 (7%) |
| Source (Ver.) | Kaggle (v1) | Roboflow (v1) |
| Dataset ID | niravnaik/safety-helmet-and-reflective-jacket | safety-detection-ftkxk/helmet-vest-belt |
| URL | https://www.kaggle.com/datasets/niravnaik/safety-helmet-and-reflective-jacket (accessed on 14 May 2026) | https://universe.roboflow.com/safety-detection-ftkxk/helmet-vest-belt/dataset/1 (accessed on 14 May 2026) |
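The split percentages in the table above follow directly from the image counts. The sketch below recomputes them; the helper name `split_shares` is hypothetical and the counts are copied from the table.

```python
# Image counts per split, copied from the dataset table above.
splits = {
    "Safety Helmet and Reflective Jacket": {"train": 7350, "valid": 1575, "test": 1575},
    "Helmet-Vest-Belt": {"train": 7075, "valid": 1537, "test": 658},
}


def split_shares(counts):
    """Return (total images, {split: share of the total in whole percent})."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total) for name, n in counts.items()}
    return total, shares


for name, counts in splits.items():
    total, shares = split_shares(counts)
    print(name, total, shares)
```

Both totals (10,500 and 9270 images) and the rounded shares (70/15/15 and 76/17/7) agree with the table.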
| Parameter | Value |
|---|---|
| epochs | 150 |
| batch_size | 16 |
| imgsz | 640 |
| optimizer | SGD |
| initial_lr | 0.01 |
| momentum | 0.937 |
| weight_decay | 0.00075 |
| warmup_epochs | 3.0 |
| warmup_momentum | 0.8 |
| box | 7.5 |
| cls | 0.5 |
| dfl | 1.2 |
| close_mosaic | 15 |
| mixup | 0.15 |
| copy_paste | 0.3 |
| conf_thres | 0.001 |
| iou_thres | 0.7 |
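For reproducibility, the settings above can be kept in a single configuration object and checked before training. This is only a sketch: the key names mirror the table, the helpers `validate` and `mosaic_epochs` are hypothetical, and the reading of `close_mosaic` as "mosaic augmentation disabled for the final N epochs" is an assumption based on its usual Ultralytics-style meaning, since the training framework is not named here.

```python
# Training settings copied from the table above; key names follow the table.
train_cfg = {
    "epochs": 150,
    "batch_size": 16,
    "imgsz": 640,
    "optimizer": "SGD",
    "initial_lr": 0.01,
    "momentum": 0.937,
    "weight_decay": 0.00075,
    "warmup_epochs": 3.0,
    "warmup_momentum": 0.8,
    "box": 7.5,
    "cls": 0.5,
    "dfl": 1.2,
    "close_mosaic": 15,
    "mixup": 0.15,
    "copy_paste": 0.3,
    "conf_thres": 0.001,
    "iou_thres": 0.7,
}


def validate(cfg):
    """Hypothetical sanity checks; not part of any training framework."""
    assert 0 < cfg["close_mosaic"] < cfg["epochs"]
    assert cfg["warmup_epochs"] < cfg["epochs"]
    for key in ("mixup", "copy_paste", "conf_thres", "iou_thres", "momentum"):
        assert 0.0 <= cfg[key] <= 1.0, key
    return True


def mosaic_epochs(cfg):
    # ASSUMPTION: mosaic is switched off for the last `close_mosaic` epochs.
    return cfg["epochs"] - cfg["close_mosaic"]


validate(train_cfg)
print("epochs trained with mosaic (under the stated assumption):", mosaic_epochs(train_cfg))
```

Under that assumption, 135 of the 150 epochs would use mosaic augmentation.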

| Algorithm | A | B | C | P | R | mAP@0.5 | mAP@0.5:0.95 |
|---|---|---|---|---|---|---|---|
| 0 | × | × | × | 0.938 | 0.915 | 0.969 | 0.803 |
| 1 | ✓ | × | ✓ | 0.934 | 0.922 | 0.968 | 0.807 |
| 2 | ✓ | ✓ | × | 0.926 | 0.929 | 0.968 | 0.807 |
| 3 | ✓ | ✓ | ✓ | 0.937 | 0.919 | 0.967 | 0.809 |

| Algorithm Model | Params/M | GFLOPs | Latency/ms | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% |
|---|---|---|---|---|---|---|---|
| Dataset 1: Safety Helmet and Reflective Jacket | | | | | | | |
| YOLOv5s | 7.03 | 16.0 | 1.6 | 91.8 | 92.0 | 95.6 | 78.6 |
| YOLOv8s | 11.14 | 28.6 | 2.3 | 92.8 | 91.1 | 95.5 | 79.7 |
| YOLOv9s | 9.74 | 39.6 | 4.1 | 93.8 | 91.5 | 96.9 | 80.3 |
| YOLOv10s | 8.06 | 24.8 | 2.1 | 92.5 | 91.0 | 95.7 | 80.2 |
| YOLOv11s | 9.42 | 21.6 | 1.8 | 92.8 | 91.9 | 96.1 | 80.2 |
| AGCA-YOLOv9 | 9.77 | 39.6 | 4.6 | 93.7 | 91.9 | 96.7 | 80.9 |
| Dataset 2: Helmet-Vest-Belt | | | | | | | |
| YOLOv5s | 7.03 | 16.0 | 1.4 | 90.3 | 78.8 | 86.2 | 55.5 |
| YOLOv8s | 11.14 | 28.6 | 1.8 | 86.5 | 82.5 | 86.6 | 58.2 |
| YOLOv9s | 9.74 | 39.6 | 4.2 | 89.5 | 81.6 | 86.9 | 59.0 |
| YOLOv10s | 8.06 | 24.8 | 2.2 | 88.3 | 79.7 | 85.1 | 58.0 |
| YOLOv11s | 9.42 | 21.6 | 1.9 | 89.1 | 80.5 | 85.8 | 58.6 |
| AGCA-YOLOv9 | 9.77 | 39.6 | 4.6 | 89.4 | 81.1 | 87.6 | 60.5 |
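The cost and accuracy trade-off of AGCA-YOLOv9 relative to the YOLOv9s baseline can be read off the table above by subtracting the corresponding rows. The sketch below does this for both datasets; the field names and the `delta` helper are illustrative and not taken from the original work.

```python
# Rows copied from the comparison table above.
# Order: params (M), GFLOPs, latency (ms), P, R, mAP@0.5, mAP@0.5:0.95 (metrics in %).
FIELDS = ("params", "gflops", "latency", "P", "R", "mAP50", "mAP50_95")
results = {
    "dataset1": {
        "YOLOv9s": (9.74, 39.6, 4.1, 93.8, 91.5, 96.9, 80.3),
        "AGCA-YOLOv9": (9.77, 39.6, 4.6, 93.7, 91.9, 96.7, 80.9),
    },
    "dataset2": {
        "YOLOv9s": (9.74, 39.6, 4.2, 89.5, 81.6, 86.9, 59.0),
        "AGCA-YOLOv9": (9.77, 39.6, 4.6, 89.4, 81.1, 87.6, 60.5),
    },
}


def delta(dataset, model="AGCA-YOLOv9", baseline="YOLOv9s"):
    """Difference (model - baseline) for every field, rounded to two decimals."""
    m, b = results[dataset][model], results[dataset][baseline]
    return {f: round(x - y, 2) for f, x, y in zip(FIELDS, m, b)}


for ds in results:
    print(ds, delta(ds))
```

On Dataset 1, mAP@0.5:0.95 rises by 0.6 points while mAP@0.5 drops by 0.2; on Dataset 2 the gains are 0.7 and 1.5 points, respectively. In both cases the parameter count grows by 0.03 M and latency by 0.4 to 0.5 ms.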
| Dataset | Model | mAP@0.5 (%) | mAP@0.5:0.95 (%) |
|---|---|---|---|
| Safety Helmet and Reflective Jacket | YOLOv9s (Baseline) | 96.93 ± 0.06 | 80.37 ± 0.06 |
| Safety Helmet and Reflective Jacket | AGCA-YOLOv9 (Ours) | 96.77 ± 0.12 | 80.80 ± 0.10 |
| Helmet-Vest-Belt | YOLOv9s (Baseline) | 87.07 ± 0.29 | 59.13 ± 0.15 |
| Helmet-Vest-Belt | AGCA-YOLOv9 (Ours) | 87.53 ± 0.31 | 60.37 ± 0.12 |
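One rough way to read the table above is to compare the gap between the two means with the combined spread of the runs. The sketch below assumes that each "±" value is a standard deviation over repeated runs; because the number of runs is not stated here, this is a heuristic comparison and not a significance test. The helper `gap_vs_spread` is hypothetical.

```python
import math

# (mean, spread) pairs in percent, copied from the table above.
stats = {
    "Safety Helmet and Reflective Jacket": {
        "YOLOv9s": {"mAP50": (96.93, 0.06), "mAP50_95": (80.37, 0.06)},
        "AGCA-YOLOv9": {"mAP50": (96.77, 0.12), "mAP50_95": (80.80, 0.10)},
    },
    "Helmet-Vest-Belt": {
        "YOLOv9s": {"mAP50": (87.07, 0.29), "mAP50_95": (59.13, 0.15)},
        "AGCA-YOLOv9": {"mAP50": (87.53, 0.31), "mAP50_95": (60.37, 0.12)},
    },
}


def gap_vs_spread(dataset, metric):
    """Return (difference of means, root-sum-square of the two spreads)."""
    m_new, s_new = stats[dataset]["AGCA-YOLOv9"][metric]
    m_old, s_old = stats[dataset]["YOLOv9s"][metric]
    return round(m_new - m_old, 2), round(math.hypot(s_new, s_old), 2)


for ds in stats:
    for metric in ("mAP50", "mAP50_95"):
        print(ds, metric, gap_vs_spread(ds, metric))
```

For mAP@0.5:0.95 the gaps (0.43 and 1.24 points) are several times the combined spread (0.12 and 0.19), whereas for mAP@0.5 the differences (-0.16 and +0.46 points) are of the same order as the spread (0.13 and 0.42).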

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Lv, C.; Zhou, W.; Li, Y.; Song, Y.; Zhang, X. Research on a Lightweight YOLOv9 Object Detection Algorithm Fused with Adaptive Gated Coordinate Attention. Mathematics 2026, 14, 1738. https://doi.org/10.3390/math14101738
Chicago/Turabian Style: Lv, Condong, Wenjie Zhou, Yi Li, Yupeng Song, and Xiaodong Zhang. 2026. "Research on a Lightweight YOLOv9 Object Detection Algorithm Fused with Adaptive Gated Coordinate Attention" Mathematics 14, no. 10: 1738. https://doi.org/10.3390/math14101738

