Subgrade Distress Detection in GPR Radargrams Using an Improved YOLOv11 Model
Abstract
1. Introduction
2. Object Detection Algorithms
2.1. Single Shot MultiBox Detector
- (1) Multi-scale Feature Maps: SSD converts the fully connected layers of the original VGG16 (FC6, FC7) into convolutional layers and appends auxiliary stages (Conv8–Conv11). The result is a hierarchical feature pyramid: shallow layers retain the high-resolution spatial cues needed to identify subtle distress details, while deeper layers capture the macro-scale signatures of extensive subgrade anomalies.
- (2) Default boxes: At every location on each feature map, a fixed set of default boxes is assigned. The scales are determined by the following expression:
- (3) Detection Head: For each default box, the head produces two primary outputs: class confidence scores and localization offsets.
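As a hedged illustration of how default-box scales can be assigned, the sketch below uses the linear rule from the original SSD formulation, $s_k = s_{\min} + \frac{s_{\max} - s_{\min}}{m - 1}(k - 1)$ for $k = 1, \dots, m$. The values $s_{\min} = 0.2$, $s_{\max} = 0.9$ and $m = 6$ feature maps are the SSD paper's defaults and are assumptions here, not settings reported for this study. The function names are hypothetical.

```python
import math


def ssd_default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced scales s_k for k = 1..num_maps (assumed SSD defaults)."""
    if num_maps < 2:
        raise ValueError("need at least two feature maps")
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * (k - 1) for k in range(1, num_maps + 1)]


def default_box_shape(scale, aspect_ratio):
    """Width and height (relative to the image side) of one default box."""
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


scales = ssd_default_box_scales(6)
for k, s in enumerate(scales, start=1):
    w, h = default_box_shape(s, 2.0)
    print(f"feature map {k}: scale={s:.2f}, box (aspect 2:1) = {w:.3f} x {h:.3f}")
```

Shallow feature maps receive the small scales and deep feature maps the large ones, which matches the division of labour described in item (1).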
2.2. Faster Region-Based Convolutional Neural Network
- (1) Feature Extraction Backbone: A Convolutional Neural Network (CNN) serves as the backbone. Through a succession of convolutional layers, it processes input GPR radargrams into multi-level feature maps that are shared by the later stages and represent complex subgrade characteristics from the raw radar data.
- (2) Region Proposal Network (RPN): This module presets multiple anchors at each spatial location of the feature map. It is trained with a multi-task objective, predicting an objectness score and bounding-box regression offsets for each anchor, and then selects the Top-N proposals by confidence.
- (3) Region of Interest (RoI) Pooling: The candidate regions generated by the RPN are first mapped onto the feature maps. RoI Pooling then converts these variable-sized regions into uniform, fixed-dimension feature blocks, which resolves the differences in candidate box scale while preserving the spatial layout of the target signatures.
- (4) Classification and Regression Heads: The fixed-size feature blocks pass through fully connected (FC) layers for high-level feature integration. The integrated representation feeds two parallel prediction branches: a Softmax layer determines the distress category, while a bounding-box regression branch refines the spatial coordinates. Together, the two branches localize and identify the subgrade anomalies in the GPR radargrams.
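The RoI Pooling step in item (3) can be sketched in plain Python. This is a minimal illustration, assuming a single-channel feature map stored as a list of rows, an RoI given in feature-map cells with end-exclusive bounds, and a 2 × 2 output grid. The function name and these conventions are hypothetical, and a real implementation would also handle channels, batches, and the image-to-feature-map scaling.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one RoI of a 2-D feature map into an out_h x out_w grid.

    roi = (row0, col0, row1, col1), end-exclusive, in feature-map cells.
    """
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    if h < out_h or w < out_w:
        raise ValueError("RoI is smaller than the output grid")
    pooled = []
    for i in range(out_h):
        # Integer bin edges: the bins tile the RoI without gaps or overlap.
        rs = r0 + (i * h) // out_h
        re = r0 + ((i + 1) * h) // out_h
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + ((j + 1) * w) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# A 6 x 6 toy feature map whose cell value is row * 6 + col.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
small = roi_max_pool(fmap, (0, 0, 2, 2))   # 2 x 2 region
large = roi_max_pool(fmap, (1, 1, 6, 5))   # 5 x 4 region
print(small, large)
```

Both regions come out as 2 × 2 blocks despite their different sizes, which is what allows the fully connected layers in item (4) to accept them.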
2.3. You Only Look Once v11
- (1) Backbone network: The backbone extracts discriminative features from input GPR radargrams. A Conv module first builds initial features via 2-D convolution, normalization, and SiLU activation. C3k2 modules then refine these representations with multi-scale feature bottlenecks to enhance the network’s expressive capacity. Finally, an SPPF layer applies multi-scale pooling to enlarge the receptive field for subgrade anomalies situated at varying depths.
- (2) Neck network: The neck uses a Path Aggregation Network (PAN) architecture, in which Upsample, Concat, and C3k2 units fuse multi-scale features. This fusion passes hierarchical representations from the backbone in both directions, improving the capture of multi-scale distress targets within the subgrade.
- (3) Head network: The “Detect” module makes the final predictions on feature maps of varying resolutions. It adopts depthwise separable convolution, which splits a convolution into a channel-wise (depthwise) step and a point-wise step, and outputs the distress categories and bounding box coordinates.
- (4) Specialized modules:
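Returning to the head network in item (3), the effect of depthwise separable convolution can be illustrated with a simple weight count. The layer sizes below (64 input channels, 128 output channels, 3 × 3 kernel) are hypothetical and are not taken from the YOLOv11 configuration. Biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k (one filter per input channel) plus pointwise 1 x 1."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


c_in, c_out, k = 64, 128, 3  # hypothetical layer sizes
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

The ratio equals 1/c_out + 1/k², so for a 3 × 3 kernel the separable form needs roughly one ninth of the weights once the channel count is large.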
3. Evaluation of Object Detection Algorithms
3.1. Dataset Processing
3.2. Experimental Evaluation Metrics
- (1) Precision (P)
- (2) Recall (R)
- (3) F1-Score (F1)
- (4) mean Average Precision (mAP)
- (5) Frames Per Second (FPS)
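The five metrics can be sketched with their standard definitions. The detection counts below are hypothetical. The per-class AP values passed to `mean_average_precision` are the YOLOv11 values from the per-distress comparison table, and AP itself (the area under the precision-recall curve at a given IoU threshold) is taken as given rather than computed. The timing figures are likewise hypothetical.

```python
def precision(tp, fp):
    """Share of predicted boxes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of ground-truth targets that are found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_average_precision(ap_per_class):
    """Mean of per-class AP values at one IoU threshold."""
    return sum(ap_per_class) / len(ap_per_class)


def frames_per_second(num_images, total_seconds):
    """Inference throughput."""
    return num_images / total_seconds


tp, fp, fn = 90, 10, 30  # hypothetical counts
p, r = precision(tp, fp), recall(tp, fn)
print(f"P={p:.3f} R={r:.3f} F1={f1_score(p, r):.3f}")

# F1 recomputed from the reported YOLOv11 Precision and Recall.
print(f"F1 from reported P and R: {f1_score(0.986, 0.952):.3f}")
print(f"mAP@0.5: {mean_average_precision([0.995, 0.990, 0.967, 0.993]):.3f}")
print(f"FPS: {frames_per_second(1000, 4.0):.1f}")  # hypothetical timing
```

mAP@0.5 uses a single IoU threshold of 0.5, while mAP@0.5:0.95 averages the same quantity over IoU thresholds from 0.5 to 0.95 in steps of 0.05.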
3.3. Analysis of Tunnel Burial Depth Effect
4. Improvements to the YOLOv11 Algorithm
4.1. Multi-Scale Edge Enhancement Module
- (1) Multi-scale Response Path: Within GPR radargrams, reflected energy from different subgrade distresses and burial depths varies substantially in spatial scale. This challenges traditional 3 × 3 convolutional kernels, whose limited receptive fields struggle to balance macroscopic structural outlines against localized details. To address this, the MEEM uses a parallel architecture that combines multiple sets of 1 × 1 convolutional (Conv) and 3 × 3 average pooling (AP) layers. This configuration simulates diverse receptive fields, so large-scale distress boundaries and minor anomaly features are captured simultaneously, which mitigates missed detections caused by the wide size range of subgrade targets.
- (2) Edge Enhancer (EE) Branch: Subgrade distress appears in radargrams mainly as amplitude fluctuations or phase reversals at dielectric interfaces. These interfacial reflections typically form hyperbolic diffraction arcs, but soil attenuation and clutter interference often blur their edges and degrade the signal-to-noise ratio. To counter this, the module includes an EE branch that extracts image gradient information. By amplifying energy jumps at the reflection interfaces while suppressing random background clutter, the branch turns blurry defect boundaries into sharper feature-level representations.
- (3) Feature Integration and Optimization: Distress features must be extracted precisely from chaotic reflection signals in complex subgrade environments. To this end, the enhanced multi-scale edge features are concatenated (C) with the branches of the Detail Enhancement Module (DEM), and the result is fused by a 1 × 1 convolution (Conv) layer that concentrates the model’s response on the core distress area. This improves positioning accuracy and robustness within complex subgrade environments.
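The edge-enhancement idea in item (2) can be sketched on a toy single-channel image. This is a minimal sketch under a stated assumption: the edge signal is taken as the difference between the input and its 3 × 3 average-pooled version, which is then added back to the input. The learned convolutions and activations of the actual EE branch are omitted, and the function names and the `gain` parameter are hypothetical.

```python
def avg_pool3x3(img):
    """3 x 3 average pooling, stride 1, with edge-replicated borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp to the image
                    cc = min(max(c + dc, 0), w - 1)
                    total += img[rr][cc]
            out[r][c] = total / 9.0
    return out


def edge_enhance(img, gain=1.0):
    """Add the high-frequency residual (input minus smoothed input) back."""
    smooth = avg_pool3x3(img)
    return [[img[r][c] + gain * (img[r][c] - smooth[r][c])
             for c in range(len(img[0]))]
            for r in range(len(img))]


flat = [[5.0] * 4 for _ in range(4)]                # no interface
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]     # one vertical interface
enhanced = edge_enhance(step)
print([round(v, 3) for v in enhanced[1]])
```

A uniform region passes through unchanged, while the amplitude jump across the interface in `step` becomes steeper, which is the behaviour the EE branch is described as providing at reflection interfaces.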
4.2. Multi-Feature Multi-Scale Attention
- (1) Multi-Scale Decomposition: Signal responses in GPR radargrams are strongly multi-scale because detection targets differ in burial depth and geometric dimensions. The module first extracts fundamental features with a ResNeSt backbone, and a Split-Attention mechanism then directs the feature flow into parallel branches. In the resulting Multi-Scale Decomposition stage, each branch has a different receptive field, so macroscopic structures and subtle distress details are extracted simultaneously.
- (2) Multi-Frequency Channel Attention (MFCA): To suppress the background clutter and random noise common in measured radar data, the MFCA module uses the 2D Discrete Cosine Transform (DCT) to map features into the frequency domain [37]. Analyzing global frequency components lets the module focus on physically meaningful reflection interfaces. Compared with traditional spatial-domain pooling, this frequency-domain analysis improves signal discrimination in complex environments.
- (3) Multi-Scale Spatial Attention (MSSA): Building on the frequency-domain calibration, the MSSA module uses adaptive weight parameters to modulate the information transfer ratio between distress targets and background strata. This dynamic modulation helps capture irregular scattering edges and strengthens the model’s positioning robustness within complex subgrade environments.
- (4) Feature Fusion and Calibration: Features from the individual branches are combined in a Multi-scale Feature Fusion layer and then refined through depth calibration with a 1 × 1 convolution (Conv) layer, yielding robust and discriminative features.
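The frequency-domain mapping in item (2) can be illustrated with an unnormalised 2D DCT-II written in plain Python. This is a sketch of the transform only. How the MFCA module selects and weights frequency components is not specified here and is not shown. The function name and the 4 × 4 block size are hypothetical.

```python
import math


def dct2(block):
    """Unnormalised 2-D DCT-II of a rectangular block (list of rows)."""
    h, w = len(block), len(block[0])
    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            s = 0.0
            for i in range(h):
                for j in range(w):
                    s += (block[i][j]
                          * math.cos(math.pi * (i + 0.5) * u / h)
                          * math.cos(math.pi * (j + 0.5) * v / w))
            out[u][v] = s
    return out


flat = [[2.0] * 4 for _ in range(4)]   # constant feature map
coeffs = dct2(flat)

# The (0, 0) coefficient is the plain sum, so dividing by the number of
# cells recovers global average pooling.
gap = coeffs[0][0] / (len(flat) * len(flat[0]))
print(gap)
```

Because the lowest-frequency coefficient is equivalent to global average pooling up to a constant, spatial-domain pooling can be read as keeping one DCT component, and a multi-frequency scheme as keeping several.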
4.3. Analysis of Ablation Experiment
5. Conclusions
- (1) All three architectures converged stably on the GPR dataset without signs of overfitting and learned the distress characteristics effectively. YOLOv11 shows the best overall performance balance among the evaluated models. Faster R-CNN achieves the highest mAP@0.5:0.95 (0.905), which indicates potential for high-precision localization, but its low frame rate (77.9 FPS) restricts practical engineering deployment. SSD converges fastest, but its lower core metrics make it unsuitable for the demands of complex GPR scenarios.
- (2) Relative to the YOLOv11 baseline, YOLOv11_MEEM increases Precision by 0.2 percentage points (0.986 to 0.988) and mAP@0.5:0.95 by 0.3 percentage points (0.898 to 0.901), at the cost of a marginal 0.3-point reduction in Recall (0.952 to 0.949), making it well suited to multi-scale distress detection in noisy environments. YOLOv11_MFMSA reaches the same Precision, but its lower Recall (0.936) points to a need for more efficient multi-frequency fusion. YOLOv11_MEEM+MFMSA has the lowest Precision, mAP, and FPS of the four variants, indicating that the two modules are functionally incompatible.
- (3) Through its dynamic edge enhancement mechanism, the MEEM improves false detection control under complex road conditions while balancing detection accuracy and computational efficiency (Precision 0.988 at 294.1 FPS). The YOLOv11_MEEM model is therefore well suited to subgrade image recognition tasks: it can shorten the detection and diagnosis cycle for subgrade defects and provide technical support for rapid infrastructure repair and resilience enhancement.
- (4) The advances in intelligent interpretation of GPR images in this study are valuable for building sustainable transportation systems. By enabling early identification and real-time monitoring of subgrade distresses, the proposed model can help reduce long-term maintenance costs and prevent traffic accidents caused by subgrade collapse. It thereby supports the operational sustainability of urban infrastructure from both economic and social perspectives, as well as the digital management of smart city assets.
- (5) Although effective, the model’s detection of small-scale distresses and the efficiency of feature fusion require further optimization. Given that data augmentation currently relies on manual screening, future research will introduce adaptive data augmentation techniques to enhance automation and systematically expand dataset diversity.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Qi, Y.L.; Bai, M.Z.; Li, Z.L.; Zhang, Z.L.; Wang, Q.H.; Tian, G. Study on intelligent recognition of urban road subgrade defect based on deep learning. Sci. Rep. 2024, 14, 28119.
- Wang, D.Y.; Qi, Y.L.; Bai, M.Z.; Li, Z.L.; Song, L.L.; Tian, G. Research on Radar Forward Modeling for Detecting Urban Road Subgrade Disease Based on Radar. In Engineering Geology for a Habitable Earth: IAEG XIV Congress 2023 Proceedings, Chengdu, China; Springer: Berlin/Heidelberg, Germany, 2024; pp. 143–159.
- Cheng, Z.H.; Song, X.G.; Wang, J.Z.; Du, C.; Wu, J.Q. Intelligent identification for subgrade disease based on multi-source data. Measurement 2025, 251, 117200.
- Wubuli, A.; Li, F.F.; Zhou, C.Z.; Zhang, L.L.; Jiang, J.R. Knowledge Graph- and Bayesian Network-Based Intelligent Diagnosis of Highway Diseases: A Case Study on Maintenance in Xinjiang. Sustainability 2025, 17, 1450.
- Li, Y.N.; Liu, H.B.; Wang, S.L.; Jiang, B.; Fischer, S. Method of Railway Subgrade Diseases (defects) Inspection, based on Ground Penetrating Radar. Acta Polytech. Hung. 2022, 19, 199–211.
- Xiong, H.Q.; Li, J.; Li, Z.L.; Zhang, Z.Y. GPR-GAN: A Ground-Penetrating Radar Data Generative Adversarial Network. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5200114.
- Jin, G.L.; Liu, Q.L.; Cai, W.L.; Li, M.J.; Lu, C.D. Performance Evaluation of Convolutional Neural Network Models for Classification of Highway Hidden Distresses with GPR B-Scan Images. Appl. Sci. 2024, 14, 4226.
- Liu, Z.H.; Xiao, J.P.; Shen, R.J.; Liu, J.X.; Guo, Z.W. Deep Learning-Based Suppression of Strong Noise in GPR Data for Railway Subgrade Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5915709.
- Yang, Y.; Huang, L.M.; Zhang, Z.H.; Zhang, J.; Zhao, G.M. CycleGAN-Based Data Augmentation for Subgrade Disease Detection in GPR Images with YOLOv5. Electronics 2024, 13, 830.
- Huang, Z.Y.; Xu, G.Y.; Tang, J.M.; Yu, H.Y.; Wang, D.Y. Research on Void Signal Recognition Algorithm of 3D Ground-Penetrating Radar Based on the Digital Image. Front. Mater. 2022, 9, 850694.
- Liu, H.; Wang, S.L.; Jing, G.Q.; Yu, Z.Y.; Yang, J.; Zhang, Y.; Guo, Y.L. Combined CNN and RNN Neural Networks for GPR Detection of Railway Subgrade Diseases. Sensors 2023, 23, 5383.
- Zhao, K.; Ren, X.X.; Kong, Z.Z.; Liu, M. Object detection on remote sensing images using deep learning: An improved single shot multibox detector method. J. Electron. Imaging 2019, 28, 033026.
- Lenatti, M.; Narteni, S.; Paglialonga, A.; Rampa, V.; Mongelli, M. Dual-View Single-Shot Multibox Detector at Urban Intersections: Settings and Performance Evaluation. Sensors 2023, 23, 3195.
- Xi, R.; Hou, J.; Lou, W. Potato Bud Detection with Improved Faster R-CNN. Trans. ASABE 2020, 63, 557–569.
- Xu, X.Y.; Zhao, M.; Shi, P.X.; Ren, R.Q.; He, X.H.; Wei, X.J.; Yang, H. Crack Detection and Comparison Study Based on Faster R-CNN and Mask R-CNN. Sensors 2022, 22, 1215.
- Tian, Z.; Yang, F.; Yang, L.; Wu, Y.J.; Chen, J.Y.; Qian, P. An Optimized YOLOv11 Framework for the Efficient Multi-Category Defect Detection of Concrete Surface. Sensors 2025, 25, 1291.
- Zou, C.; Yu, S.Q.; Yu, Y.K.; Gu, H.T.; Xu, X.L. Side-Scan Sonar Small Objects Detection Based on Improved YOLOv11. J. Mar. Sci. Eng. 2025, 13, 162.
- Zha, X.; Peng, H.; Qin, X.; Li, G.; Yang, S.H. A Deep Learning Framework for Signal Detection and Modulation Classification. Sensors 2019, 19, 4042.
- Chen, Y.C.; Yu, K.M.; Kao, T.H.; Hsieh, H.L. Deep learning based real-time tourist spots detection and recognition mechanism. Sci. Prog. 2021, 104, 00368504211044228.
- Wang, Y.Z.; Niu, P.H.; Guo, X.Y.; Yang, G.E.; Chen, J. Single Shot Multibox Detector with Deconvolutional Region Magnification Procedure. IEEE Access 2021, 9, 47767–47776.
- Zhang, L.; Xing, B.W.; Wang, W.G.; Xu, J.X. Sea Cucumber Detection Algorithm Based on Deep Learning. Sensors 2022, 22, 5717.
- Li, H.L.; Huang, Y.Q.; Zhang, Z.J. An Improved Faster R-CNN for Same Object Retrieval. IEEE Access 2017, 5, 13665–13676.
- Qi, L.; Li, B.Y.; Chen, L.K.; Wang, W.; Dong, L.; Jia, X.; Huang, J.; Ge, C.W.; Xue, G.M.; Wang, D. Ship Target Detection Algorithm Based on Improved Faster R-CNN. Electronics 2019, 8, 959.
- Ding, X.T.; Li, Q.D.; Cheng, Y.Q.; Wang, J.B.; Bian, W.X.; Jie, B. Local keypoint-based Faster R-CNN. Appl. Intell. 2020, 50, 3007–3022.
- Huang, H.; Wang, C.; Liu, S.C.; Sun, Z.H.; Zhang, D.J.; Liu, C.C.; Jiang, Y.; Zhan, S.Y.; Zhang, H.F.; Xu, R. Single spectral imagery and faster R-CNN to identify hazardous and noxious substances spills. Environ. Pollut. 2020, 258, 113688.
- Cheng, C.; Cheng, X.Y.; Li, D.B.; Zhang, J.W. Drill pipe detection and counting based on improved YOLOv11 and Savitzky-Golay. Sci. Rep. 2025, 15, 167779.
- He, L.H.; Zhou, Y.Z.; Liu, L.; Cao, W.; Ma, J.H. Research on object detection and recognition in remote sensing images based on YOLOv11. Sci. Rep. 2025, 15, 14032.
- Gao, Y.L.; Xin, Y.B.; Yang, H.; Wang, Y.J. A Lightweight Anti-Unmanned Aerial Vehicle Detection Method Based on Improved YOLOv11. Drones 2025, 9, 11.
- Sapkota, R.; Flores-Calero, M.; Qureshi, R.; Badgujar, C.; Nepal, U.; Poulose, A.; Zeno, P.; Vaddevolu, U.; Khan, S.; Shoman, M.; et al. YOLO advances to its genesis: A decadal and comprehensive review of the You Only Look Once (YOLO) series. Artif. Intell. Rev. 2025, 58, 274.
- He, L.H.; Zhou, Y.Z.; Liu, L.; Ma, J.H. Research and Application of YOLOv11-Based Object Segmentation in Intelligent Recognition at Construction Sites. Buildings 2024, 14, 3777.
- Cheng, S.; Han, Y.; Wang, Z.Q.; Liu, S.J.; Yang, B.; Li, J.R. An Underwater Object Recognition System Based on Improved YOLOv11. Electronics 2025, 14, 201.
- Zhang, L.; Zheng, A.; Sun, X.Y.; Sun, Z.P. Enhanced YOLOv11-Based River Aerial Image Detection Research. IEEE Geosci. Remote Sens. Lett. 2025, 22, 8002405.
- Liu, J.; Zhao, J.Y.; Cao, Y.Y.; Wang, Y.; Dong, C.Y.; Guo, C.P. Road manhole cover defect detection via multi-scale edge enhancement and feature aggregation pyramid. Sci. Rep. 2025, 15, 10346.
- Jia, L.; He, X.; Huang, A.; Jia, B.B.; Wang, X.F. Highly efficient encoder-decoder network based on multi-scale edge enhancement and dilated convolution for LDCT image denoising. Signal Image Video Process. 2024, 18, 6081–6091.
- Chen, J.W.; Yue, J.H.; Zhou, H.; Hu, Z.Q. NAF-MEEF: A Nonlinear Activation-Free Network Based on Multi-Scale Edge Enhancement and Fusion for Railway Freight Car Image Denoising. Sensors 2025, 25, 2672.
- Nam, J.H.; Syazwany, N.S.; Kim, S.J.; Lee, S.C. Modality-Agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 11480–11491.
- Mukherjee, D. Parallel implementation of discrete cosine transform and its inverse for image compression applications. J. Supercomput. 2024, 80, 23712–23735.










| Category | Parameter Name | Setting Value |
|---|---|---|
| Basic configuration | Model size | 1.5 m × 1.7 m × 1.5 m |
| Basic configuration | Excitation source/frequency | Ricker wavelet/400 MHz |
| Layer thickness, permittivity, conductivity | Air layer | 0.2 m |
| Layer thickness, permittivity, conductivity | Asphalt layer | 0.1 m, ε = 4, σ = 0.001 S/m |
| Layer thickness, permittivity, conductivity | Concrete layer | 0.2 m, ε = 9, σ = 0.05 S/m |
| Layer thickness, permittivity, conductivity | Soil layer | 1.2 m, ε = 23, σ = 0.1 S/m |
| Disease noise | Boundary random noise | 0.02 m |
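
The layer parameters above determine how fast the radar pulse travels through each layer. The sketch below applies the standard low-loss approximation v = c / √ε to the listed relative permittivities and derives the two-way travel time per layer and the wavelength at the 400 MHz centre frequency. Ignoring conductivity is an assumption made for illustration, and the function names are hypothetical. The forward model itself is not reproduced.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, m/ns


def wave_velocity(rel_permittivity):
    """Low-loss approximation v = c / sqrt(eps_r), in m/ns."""
    return C_M_PER_NS / math.sqrt(rel_permittivity)


def two_way_time_ns(thickness_m, rel_permittivity):
    """Time for the pulse to cross a layer and return, in ns."""
    return 2.0 * thickness_m / wave_velocity(rel_permittivity)


def wavelength_m(rel_permittivity, freq_mhz):
    """Wavelength in the layer; m/ns divided by GHz gives metres."""
    return wave_velocity(rel_permittivity) / (freq_mhz / 1000.0)


# (name, thickness in m, relative permittivity) from the table above
layers = [("asphalt", 0.1, 4.0), ("concrete", 0.2, 9.0), ("soil", 1.2, 23.0)]
for name, d, eps in layers:
    print(f"{name}: v={wave_velocity(eps):.4f} m/ns, "
          f"two-way time={two_way_time_ns(d, eps):.2f} ns, "
          f"wavelength@400MHz={wavelength_m(eps, 400.0):.3f} m")
```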

| Defect Types | Original | Augmented | Excluded | Total | Train | Val |
|---|---|---|---|---|---|---|
| loosening; voids; cavities; underground structures | 1923 | 13,461 | 294 | 15,090 | 12,071 | 3019 |

| Model | Precision | Recall | F1 | mAP@0.5 | mAP@0.5:0.95 | FPS |
|---|---|---|---|---|---|---|
| SSD | 0.971 | 0.881 | 0.924 | 0.946 | 0.886 | 126.9 |
| Faster R-CNN | 0.976 | 0.891 | 0.931 | 0.947 | 0.905 | 77.9 |
| YOLOv11 | 0.986 | 0.952 | 0.969 | 0.986 | 0.898 | 286.5 |

| Model | Precision | Recall | F1 | mAP@0.5 | mAP@0.5:0.95 | FPS |
|---|---|---|---|---|---|---|
| YOLOv11 | 0.986 | 0.952 | 0.969 | 0.986 | 0.898 | 286.5 |
| YOLOv11_MEEM | 0.988 | 0.949 | 0.968 | 0.982 | 0.901 | 294.1 |
| YOLOv11_MFMSA | 0.988 | 0.936 | 0.961 | 0.984 | 0.893 | 279.8 |
| YOLOv11_MEEM+MFMSA | 0.983 | 0.939 | 0.961 | 0.978 | 0.866 | 243.9 |
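
The differences from the YOLOv11 baseline can be recomputed directly from the rows above. The sketch below copies the tabulated values into a dictionary and subtracts the baseline. The dictionary keys and the `deltas` helper are hypothetical names introduced for illustration.

```python
BASELINE = "YOLOv11"
RESULTS = {
    "YOLOv11":            {"P": 0.986, "R": 0.952, "F1": 0.969,
                           "mAP50": 0.986, "mAP50_95": 0.898, "FPS": 286.5},
    "YOLOv11_MEEM":       {"P": 0.988, "R": 0.949, "F1": 0.968,
                           "mAP50": 0.982, "mAP50_95": 0.901, "FPS": 294.1},
    "YOLOv11_MFMSA":      {"P": 0.988, "R": 0.936, "F1": 0.961,
                           "mAP50": 0.984, "mAP50_95": 0.893, "FPS": 279.8},
    "YOLOv11_MEEM+MFMSA": {"P": 0.983, "R": 0.939, "F1": 0.961,
                           "mAP50": 0.978, "mAP50_95": 0.866, "FPS": 243.9},
}


def deltas(model):
    """Metric-by-metric difference between a variant and the baseline."""
    base = RESULTS[BASELINE]
    return {k: round(RESULTS[model][k] - base[k], 3) for k in base}


for name in RESULTS:
    if name != BASELINE:
        print(name, deltas(name))
```

Differences in Precision, Recall, F1, and mAP are absolute differences on a 0 to 1 scale, so 0.002 corresponds to 0.2 percentage points.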

| Distress Type | YOLOv11 | YOLOv11_MEEM | Change |
|---|---|---|---|
| Loose | 0.995 | 0.995 | 0 |
| Void | 0.990 | 0.983 | −0.007 |
| Cavity | 0.967 | 0.965 | −0.002 |
| Underground structure | 0.993 | 0.986 | −0.007 |
| mAP@0.5 | 0.986 | 0.982 | −0.004 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Bai, M.; Ma, Q.; Liu, H.; Zhang, Z. Subgrade Distress Detection in GPR Radargrams Using an Improved YOLOv11 Model. Sustainability 2026, 18, 1273. https://doi.org/10.3390/su18031273

