An Improved YOLOv11 Recognition Algorithm for Heavy-Duty Trucks on Highways
Abstract
1. Introduction
- (1) To meet the dual requirements of real-time performance and accuracy in expressway truck tarpaulin detection under constrained hardware resources, we replace the convolutional layers in YOLOv11’s backbone feature extraction network with the lightweight SCConv module. This reduces the parameter count and computational load while preserving feature extraction capability.
- (2) To sharpen the model’s focus on critical features, we integrate the CSMAM attention mechanism after the channel concatenation layer of the C3k2 module. This improves detection accuracy and strengthens adaptability and generalization across diverse scenarios.
- (3) To address the suboptimal performance of the original detection head in complex scenarios, we adopt the DEC-Head detection head, which improves detection accuracy, adaptability to challenging environments, and robustness.
- (4) To promote stable convergence during training, we replace the bounding box loss function with Scylla-IoU (SIoU) [19]. This improves the regression accuracy of predicted bounding boxes and yields better-behaved gradients during optimization.
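As an illustration of contribution (4), the following is a minimal sketch of an SIoU-style loss for a single pair of axis-aligned boxes. It follows the angle, distance, and shape cost terms of the SIoU formulation in [19] as commonly stated; the function name `siou_loss`, the `(x1, y1, x2, y2)` box layout, the shape exponent `theta=4.0`, and the `eps` guard are assumptions made for this example and are not taken from the paper's implementation.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """Sketch of an SIoU-style loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Offset between the box centers and its length.
    dx = (tx1 + tx2) / 2 - (px1 + px2) / 2
    dy = (ty1 + ty2) / 2 - (py1 + py2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: sin(2 * alpha), where alpha is the angle between the
    # center offset and the x-axis. It peaks when alpha = 45 degrees.
    sin_alpha = min(1.0, abs(dy) / sigma)
    angle_cost = math.sin(2.0 * math.asin(sin_alpha))

    # Distance cost, modulated by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes relative width and height mismatch.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0


identical = siou_loss((0, 0, 10, 10), (0, 0, 10, 10))
shifted = siou_loss((0, 0, 10, 10), (5, 0, 15, 10))
disjoint = siou_loss((0, 0, 10, 10), (20, 20, 30, 30))
```

In this sketch the loss is close to zero for identical boxes and grows as the boxes drift apart or differ in shape. A training implementation would operate on batched tensors rather than Python tuples.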
2. Materials and Methods
2.1. YOLOv11 Network
- (1) Backbone Feature Extraction Network (Backbone): The backbone uses C3k2 modules (efficient convolutional blocks combining standard convolutions and residual connections) as its core component. Each module splits the feature maps and applies small 3 × 3 convolutional kernels to optimize information flow, reducing the parameter count while enhancing feature representation. The C2PSA module (a channel-to-space transformation module) with Partial Spatial Attention (PSA) strengthens spatial information awareness and dynamically focuses on critical regions. Residual connections and bottleneck structures further mitigate vanishing gradients and improve training efficiency. On the COCO dataset, the backbone achieves higher mAP with significantly fewer parameters (approximately 22% fewer than YOLOv8m), balancing accuracy and lightweight design.
- (2) Feature Fusion Network (Neck): The neck adopts a Path Aggregation Network (PAN) structure and integrates the C3k2 module with the Spatial Pyramid Pooling Fast (SPPF) module (a fast variant of spatial pyramid pooling that generates fixed-size representations from feature maps of any scale), fusing shallow details with deep semantics through cross-layer connections. Adaptive Spatial Feature Modulation (ASFM) dynamically adjusts feature weights across spatial dimensions, improving localization accuracy in complex scenarios. In addition, the High-level Screening Feature Pyramid Network (HS-FPN), which uses channel attention to filter and refine low-level features, mitigates detection errors caused by scale variation and improves small-object detection.
- (3) Detection Head (Head): The head retains the multi-scale prediction strategy, outputting bounding boxes from the P3, P4, and P5 feature maps for small, medium, and large targets, respectively. To reduce computation, the traditional convolutions in the classification branches are replaced by Depthwise Separable Convolution (DW-Conv), which factorizes a standard convolution into a depthwise convolution followed by a pointwise convolution and thereby significantly reduces computational cost and parameter count. Through dynamic weight allocation, the head improves detection stability for dense scenes and for targets with extreme aspect ratios in complex traffic environments.
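The parameter saving from the depthwise separable factorization described for the detection head can be checked with a short calculation. The sketch below counts weights only (biases are ignored), and the channel sizes `c_in = 64`, `c_out = 128` and kernel size `k = 3` are illustrative values chosen for this example, not settings reported in the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed for this example only).
c_in, c_out, k = 64, 128, 3

standard = standard_conv_params(c_in, c_out, k)          # 73728
separable = depthwise_separable_params(c_in, c_out, k)   # 8768
ratio = separable / standard

# The ratio equals 1 / c_out + 1 / k**2 regardless of c_in.
closed_form = 1 / c_out + 1 / (k * k)

print(f"standard={standard}, separable={separable}, ratio={ratio:.4f}")
```

For these values the separable layer needs roughly 12% of the weights of the standard layer, and the closed form shows that the saving is dominated by the 1/k² term once the output channel count is large.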
2.2. Enhanced YOLOv11 Architecture
2.2.1. Enhanced Backbone Feature Extraction Network
2.2.2. Attention Mechanism Integration
2.2.3. Detection Head Replacement
2.2.4. Loss Function Optimization
3. Results
3.1. Experimental Environment and Parameters
3.2. Dataset
3.3. Evaluation Metrics
3.4. Visualization Analysis
3.5. Comparative Experimental Setup
3.6. Ablation Study
3.7. Comparative Experiments
3.8. Visual Comparative Results of Recognition
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yang, S.; Yang, Y.; Li, X.; Yang, Z.; Guo, Y.; Yang, C. Exploring the State-of-the-Practices in Smart Expressway: Multi-Country Comparison. In Proceedings of the 2023 IEEE 8th International Conference on Intelligent Transportation Engineering (ICITE), Beijing, China, 28–30 October 2023; pp. 315–320.
- Yang, Z.; Hao, L.; Liu, Y.; Duan, L.; Jia, C. Research on New Framework Based on Existing Smart Expressway Construction Guides. J. Highw. Transp. Res. Dev. 2024, 18, 54–62.
- Xiong, L.Y.; Tu, S.C.; Huang, X.H.; Yu, J.Y.; Xie, Y.C. Vehicle detection method based on the MobileVit lightweight network. Appl. Res. Comput. 2022, 2545–2549.
- Tian, Z.; Chen, F.; Ma, S.; Guo, M. Analysis of the Severity of Heavy Truck Traffic Accidents Under Different Road Conditions. Appl. Sci. 2024, 14, 10751.
- Mordia, R.; Verma, A.K. Nondestructive testing methods for rail defects detection. High-Speed Railw. 2025, 3, 163–173.
- Wen, H.Y.; Huang, K.H.; Zhao, S. Prediction of rear-end collision risk for heavy trucks on expressway based on machine Learning. China Saf. Sci. J. 2023, 173–180.
- Tang, X.; Zhang, Z.Q.; Qin, Y.H. On-road object detection and tracking based on radar and vision fusion: A review. IEEE Intell. Transp. Syst. Mag. 2021, 14, 103–128.
- Wei, W.; Liu, M.X.; Chen, X.B. Research status and development trend of underground intelligent load-haul-dump vehicle—A comprehensive review. Appl. Sci. 2022, 12, 9290.
- Negri, P.; Clady, X.; Milgram, M.; Poulenard, R. An Oriented-Contour Point-Based Voting Algorithm for Vehicle Type Classification. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 20–24 August 2006; pp. 1051–4651.
- Sun, W.; Chen, X.; Zhang, X.; Dai, G.; Chang, P.; He, X. A multi-feature learning model with enhanced local attention for vehicle re-identification. Comput. Mater. Contin. 2021, 69, 3549–3561.
- Wang, J.; Zheng, H.; Huang, Y.; Ding, X. Vehicle type recognition in surveillance images from labeled web-nature data using deep transfer learning. IEEE Trans. Intell. Transp. Syst. 2017, 19, 2913–2922.
- Suhao, L.; Jinzhao, L.; Guoquan, L.; Tong, B.; Huiqian, W.; Yu, P. Vehicle type detection based on deep learning in traffic scenes. Procedia Comput. Sci. 2018, 131, 564–572.
- Sun, W.; Dai, L.; Zhang, X.; Chang, P.; He, X. RSOD: Real-time small object detection algorithm in UAV-based traffic monitoring. Appl. Intell. 2022, 131, 8448–8463.
- Li, L. Improved Faster R-CNN-Based Anomaly Target Detection for Truck Driving Environment. Acad. J. Comput. Inf. Sci. 2023, 2, 61–67.
- Trivedi, M.M.; Gandhi, T.; McCall, J. Looking-in and looking-out of a vehicle: Computer-vision-based enhanced vehicle safety. IEEE Trans. Intell. Transp. Syst. 2007, 8, 108–120.
- Li, G.; Hu, Z.; Zhang, H. HR-YOLO: Segmentation and detection of emergency escape ramp scenes using an integrated HR-net and improved YOLOv12 model. Traffic Inj. Prev. 2025, 1–8.
- Yaseen. Real-Time Face Gesture-Based Robot Control Using GhostNet in a Unity Simulation Environment. Sensors 2025, 25, 6090.
- Yoon, M.; Seo, D.; Kim, S.; Kim, K. V2X Network-Based Enhanced Cooperative Autonomous Driving for Urban Clusters in Real Time: A Model for Control, Optimization and Security. Electronics 2025, 14, 1629.
- Gevorgyan, Z. SIoU loss: More powerful learning for bounding box regression. arXiv 2022, arXiv:2205.12740.
- Li, J.; Wen, Y.; He, L. Scconv: Spatial and channel reconstruction convolution for feature redundancy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 6153–6162.
- Yu, Z.; Huang, H.; Chen, W.; Su, Y.; Liu, Y.; Wang, X. Yolo-facev2: A scale and occlusion aware face detector. Pattern Recognit. 2024, 155, 110714.
- Chen, Y.; Chen, S.; Liu, H.; Xiong, H.; Zhang, Y. Low-light image enhancement network based on central difference convolution. Eng. Appl. Artif. Intell. 2025, 158, 111492.













| Configuration Item | Specification |
|---|---|
| Operating System | Windows 10 |
| GPU | NVIDIA GeForce RTX 3080 Ti, CUDA 11.6 |
| CPU | Intel(R) Core(TM) i9-11900K @ 3.50 GHz |
| Framework | PyTorch 2.2.2 |
| Language | Python 3.8.19 |

| Axle Type | Tarpaulin Material | Train Set | Val Set | Test Set |
|---|---|---|---|---|
| Four-Axle | Canvas | 432 | 114 | 46 |
| Four-Axle | PE | 414 | 121 | 52 |
| Four-Axle | PVC | 501 | 143 | 63 |
| Six-Axle | Canvas | 528 | 143 | 66 |
| Six-Axle | PE | 693 | 209 | 83 |
| Six-Axle | PVC | 627 | 211 | 72 |
| Total Targets | | 4381 | 1238 | 622 |
| Total Images | | 4017 | 1148 | 574 |

| Exp. ID | SCConv | C3k2_CSMAM | DEC-Head | SIoU | P/% | R/% | mAP@0.5:0.95/% | GFLOPs | FPS | Model Size/MB |
|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv11n | | | | | 80.8 | 83.9 | 84.6 | 9.2 | 128 | 6.3 |
| A | √ | | | | 78.5 | 84.3 | 86.8 | 7.0 | 180 | 5.1 |
| B | √ | √ | | | 82.0 | 87.7 | 89.2 | 7.8 | 146 | 5.5 |
| C | √ | √ | √ | | 84.6 | 89.4 | 91.4 | 7.1 | 162 | 5.9 |
| D | √ | √ | √ | √ | 85.2 | 89.1 | 91.8 | 7.2 | 169 | 6.1 |

| Model | P/% | R/% | mAP/% | Model Size/MB |
|---|---|---|---|---|
| Faster RCNN | 64.8 | 59.1 | 71.9 | 26.1 |
| SSD | 71.1 | 64.5 | 74.5 | 28.2 |
| YOLOv5 | 70.2 | 69.8 | 72.8 | 9.1 |
| YOLOv8 | 77.4 | 76.8 | 82.7 | 6.7 |
| YOLOv10 | 77.9 | 79.4 | 81.4 | 6.6 |
| YOLOv11 | 80.8 | 83.9 | 84.6 | 6.3 |
| Prop. Alg. | 85.2 | 89.1 | 91.8 | 6.1 |
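The gap between the YOLOv11n baseline and the full model (experiment D) in the ablation results can be summarized with a few lines of arithmetic. The sketch below copies the two rows from the ablation table; the dictionary keys and helper names are chosen for this example only.

```python
# Rows copied from the ablation table: baseline YOLOv11n and experiment D.
baseline = {"P": 80.8, "R": 83.9, "mAP": 84.6, "GFLOPs": 9.2, "FPS": 128, "SizeMB": 6.1 + 0.2}
proposed = {"P": 85.2, "R": 89.1, "mAP": 91.8, "GFLOPs": 7.2, "FPS": 169, "SizeMB": 6.1}


def point_gain(key):
    """Absolute change in percentage points (for P, R, and mAP)."""
    return round(proposed[key] - baseline[key], 1)


def relative_change(key):
    """Relative change in percent with respect to the baseline."""
    return round(100.0 * (proposed[key] - baseline[key]) / baseline[key], 1)


for key in ("P", "R", "mAP"):
    print(f"{key}: {point_gain(key):+.1f} points")
for key in ("GFLOPs", "FPS", "SizeMB"):
    print(f"{key}: {relative_change(key):+.1f}%")
```

Under these table values, precision, recall, and mAP@0.5:0.95 rise by 4.4, 5.2, and 7.2 percentage points, while GFLOPs fall by about 21.7%, FPS rises by about 32.0%, and model size shrinks by about 3.2%.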
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Guo, J.; Zhang, M. An Improved YOLOv11 Recognition Algorithm for Heavy-Duty Trucks on Highways. Electronics 2025, 14, 4621. https://doi.org/10.3390/electronics14234621

