Feasibility of EfficientDet-D3 for Accurate and Efficient Void Detection in GPR Images

Shin, Sung-Pil; Lee, Sang-Yum; Le, Tri Ho Minh

doi:10.3390/infrastructures10060140


by Sung-Pil Shin ¹, Sang-Yum Lee ²,* and Tri Ho Minh Le ³,*

¹ Department of Highway & Transportation Research, Korea Institute of Civil Engineering and Building Technology, 283 Goyandae-ro, Ilsanseo-gu, Goyang-si 10233, Republic of Korea

² Faculty of Civil Engineering, Induk University, 12 Choansan-ro, Nowon-gu, Seoul 01878, Republic of Korea

³ Faculty of Civil Engineering, Nguyen Tat Thanh University, 300A Nguyen Tat Thanh Street, District 4, Ho Chi Minh City 70000, Vietnam

* Authors to whom correspondence should be addressed.

Infrastructures 2025, 10(6), 140; https://doi.org/10.3390/infrastructures10060140

Submission received: 29 January 2025 / Revised: 28 February 2025 / Accepted: 7 March 2025 / Published: 5 June 2025

(This article belongs to the Special Issue Pavement Design and Pavement Management)


Abstract

The detection of voids in pavement infrastructure is essential for road safety and efficient maintenance. Traditional methods of analyzing ground-penetrating radar (GPR) data are labor-intensive and error-prone. This study presents a novel approach using the EfficientDet-D3 deep learning model for automated void detection in GPR images. The model combines advanced feature extraction and compound scaling to balance accuracy and computational efficiency, making it suitable for real-time applications. A diverse GPR image dataset, including various pavement types and environmental conditions, was curated and preprocessed to improve model generalization. The model was fine-tuned through hyperparameter optimization, achieving a precision of 91.2%, a recall of 87.5%, and an F1-score of 89.3%. It also attained mean Average Precision (mAP) values of 89.7% at IoU 0.5 and 84.3% at IoU 0.75, demonstrating strong localization performance. Comparative analysis with models such as YOLOv8 and Mask R-CNN shows that EfficientDet-D3 offers a superior balance between accuracy and inference speed, with an inference time of 68 ms. This research provides a scalable, efficient solution for pavement void detection, paving the way for integrating deep learning models into pavement management systems to enhance infrastructure sustainability. Future work will focus on model optimization and expanding dataset diversity.

Keywords:

EfficientDet-D3; ground-penetrating radar; void detection; pavement analysis; machine learning; computational efficiency

1. Introduction

The assessment and maintenance of pavement infrastructure are critical for ensuring the safety and longevity of transportation networks [1]. Pavements are subjected to various environmental and traffic-induced stresses that lead to the formation of structural defects, among which voids beneath the pavement surface present a significant challenge [2]. Voids can result from factors such as soil erosion, water infiltration, and inadequate compaction during construction [3,4]. If undetected and untreated, these voids can compromise the structural integrity of the pavement, leading to surface cracking, rutting, and eventual failure [5,6]. The timely and accurate detection of pavement voids is essential for proactive maintenance planning, cost reduction, and enhancing road safety [7].

Ground-penetrating radar (GPR) has emerged as a widely used non-destructive evaluation (NDE) technique for subsurface defect detection in pavements [8,9]. GPR technology utilizes electromagnetic waves to capture high-resolution images of subsurface conditions, offering a rapid and efficient means of identifying potential voids [10], delaminations, and moisture intrusions [11]. Despite its advantages, interpreting GPR data for void detection remains a complex task due to the presence of noise, varying material compositions, and environmental interferences [12]. Traditional manual analysis methods are time-consuming, subjective, and prone to errors, necessitating the development of automated, intelligent detection frameworks.

Recent advancements in deep learning and computer vision have provided powerful tools to address the limitations of traditional methods [13,14,15]. Object detection models such as Mask R-CNN [16], YOLO (You Only Look Once) [17,18,19], and EfficientDet have demonstrated remarkable success in various fields, including medical imaging, agriculture, and infrastructure monitoring. Deep learning has also shown promising results for pavement distress detection in GPR images. Zhang et al. [20] developed an automated method for void detection in airport runways using GPR data and a shallow CNN model. By combining finite-difference time-domain analysis, data augmentation, and models such as ResNet18 with YOLOv2, they created a dataset of 811 void features and demonstrated the effectiveness of their approach for accurate void detection in complex runway conditions. Liu et al. introduced a YOLOv3 model with optimized structures and hyperparameters for detecting cracks in GPR data [21]. Tong et al. proposed a network-in-network approach for improved distress detection [22], while Gao et al. developed a region-based deep learning method for more efficient detection [23]. Xiong et al. focused on the automated identification of internal distresses in pavements using deep learning [24], and Li et al. applied transfer learning to enhance cavity detection in urban roads [25]. These studies highlight the potential of deep learning in automating pavement inspection and improving maintenance processes. Such models leverage convolutional neural networks to automatically learn hierarchical features from images [16], enabling accurate void detection with minimal human intervention. Among them, EfficientDet has gained attention for its balance between accuracy and computational efficiency [26], making it particularly suitable for real-time pavement assessment applications [27].

EfficientDet, based on the EfficientNet architecture, employs a compound scaling approach that optimally scales depth, width, and resolution to achieve superior detection performance while maintaining computational efficiency [28]. This characteristic makes it an ideal candidate for real-time void detection tasks where speed and accuracy are equally important. The EfficientDet-D3 variant offers a well-balanced architecture with enhanced feature extraction capabilities, allowing it to effectively distinguish voids from non-void regions in GPR images [29].

In this research, the EfficientDet-D3 model is explored and customized for pavement void detection using GPR data. This study aims to optimize model performance by adjusting key parameters such as input image resolution, feature extraction layers, and class balancing techniques. A comprehensive dataset comprising GPR scans collected from various pavement types and environmental conditions is utilized to train and validate the model. The dataset undergoes extensive preprocessing, including noise reduction, normalization, and augmentation, to enhance the model’s generalization capability.

Furthermore, the performance of the EfficientDet-D3 model is compared with other state-of-the-art object detection models, including YOLOv8 and Mask R-CNN. The comparative analysis evaluates critical performance metrics such as precision, recall, F1-scores, and inference speed to determine the most suitable model for real-time deployment. The research also investigates the impact of varying weather conditions and pavement types on detection accuracy, providing valuable insights into the model’s robustness in practical scenarios.

Despite the promising potential of deep learning models in pavement void detection, several challenges need to be addressed. Variability in GPR image quality due to environmental factors, such as moisture content and surface irregularities, poses significant challenges to model accuracy. Additionally, differentiating small voids from natural material variations remains a complex task that requires robust feature extraction techniques. This study aims to tackle these challenges by leveraging advanced deep learning strategies and rigorous data preprocessing techniques.

This study aims to advance pavement void detection using GPR imagery by introducing the EfficientDet-D3 model specifically designed for this task. In addition, the study compares EfficientDet-D3 with established models such as YOLOv8 and Mask R-CNN, focusing on its ability to balance detection accuracy with inference speed. A central objective is to develop a robust model that can adapt to various pavement types, geographic locations, and environmental conditions, addressing challenges such as data variability and image noise. Through this work, the study seeks to contribute to the development of more effective automated pavement maintenance systems, offering insights into advanced processing techniques that can improve detection accuracy and ensure scalability in real-world applications.

2. Materials and Methods

2.1. GPR Overview

GPR is a non-destructive geophysical method used to investigate subsurface structures by transmitting electromagnetic pulses into the ground and analyzing their reflections. According to ASTM D4748-10 standards [30], GPR operates within a frequency range of 10 MHz to several GHz, making it an effective tool for identifying underground features such as voids, utilities, and structural anomalies. With the growing demand for large-scale and rapid construction, as well as the implementation of regulations for underground safety management, the application of GPR in infrastructure assessments and ground investigations is increasing.

Figure 1 presents the GPR testing machine: a GPR system mounted on a vehicle for pavement inspection. The system consists of multiple antenna arrays designed to capture subsurface data while the vehicle moves along the roadway. It is an efficient, non-destructive testing method for assessing pavement conditions, detecting voids, and identifying subsurface anomalies. The marked measurement lines, labeled A-A′ and B-B′, represent the paths along which the GPR system collects data. The A-A′ line appears to run parallel to the edge of the pavement, while the B-B′ line extends across the width of the lane. These lines ensure comprehensive coverage of the pavement structure, capturing information about different layers and potential subsurface defects.

GPR system performance is influenced by the antenna’s frequency, which determines both resolution and penetration depth. Higher frequencies provide better resolution, allowing for the detection of smaller objects but with limited penetration depth. Conversely, lower frequencies penetrate deeper but offer a lower resolution. Therefore, selecting the appropriate antenna requires prior knowledge of the target’s size and burial depth. The resolution, defined as the size of the smallest detectable object, is influenced by soil properties and is approximately half the wavelength of the transmitted signal. The propagation speed varies depending on the material; for asphalt-paved roads, a typical average speed of 110 mm/ns is commonly applied. Standard GPR survey methodologies often involve scanning along both longitudinal (e.g., 50 m) and lateral (e.g., 10 m) directions to ensure comprehensive coverage of the area under investigation.
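As a rough illustration of the half-wavelength rule described above, the following sketch estimates the wavelength and approximate resolution for a given antenna centre frequency. The 110 mm/ns velocity is the value quoted in the text; the function name and the example frequency are illustrative assumptions, and real resolution also depends on soil properties that this sketch ignores.

```python
def approx_resolution_mm(frequency_mhz, velocity_mm_per_ns=110.0):
    """Return (wavelength_mm, resolution_mm) using the half-wavelength rule.

    Assumption: a single propagation velocity for the whole medium.
    """
    if frequency_mhz <= 0:
        raise ValueError("frequency must be positive")
    # 1 MHz corresponds to 1e-3 cycles per nanosecond.
    cycles_per_ns = frequency_mhz / 1000.0
    wavelength_mm = velocity_mm_per_ns / cycles_per_ns
    return wavelength_mm, wavelength_mm / 2.0


# Example: a hypothetical 1 GHz antenna over asphalt.
wavelength, resolution = approx_resolution_mm(1000.0)
```

Doubling the frequency halves the estimated resolution limit, which reflects the trade-off between detail and penetration depth noted in the text.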

For this study, we employed a GPR system consisting of the GSCOPE KR radar mainframe paired with the DXG-1820 multi-channel ground-coupled antenna array. This system operates with real-time kinematic positioning, providing precise measurements of the distance and location of detected cavities. The GPR data were captured over a frequency range of 200 MHz to 2 GHz, which is optimal for identifying voids at varying depths within the pavement layers, typically ranging from 0.5 m to 3 m in depth, depending on the materials and void characteristics. The system was configured to scan at a high resolution of 0.01 m per pixel, ensuring spatial accuracy. Multiple passes were made across the test site, covering areas of 100 m by 50 m to accommodate variability in the pavement structure, with the vehicle’s speed and position synchronized using a distance measuring instrument and real-time kinematic positioning with an accuracy of 1–2 cm, improving spatial consistency. During data processing, several steps were employed to enhance cavity detectability, including time-domain filtering with a cutoff frequency of 10 Hz to remove low-frequency noise, time-to-depth conversion based on the electromagnetic wave velocity (typically around 0.08 to 0.12 m/ns in asphalt and concrete materials), and signal stacking with a 5–10 pulse average to improve the signal-to-noise ratio. The processed GPR images, with enhanced reflection patterns, were then used to detect cavity features, facilitating more accurate identification of voids.
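Two of the processing steps mentioned above, time-to-depth conversion and signal stacking, can be sketched in a few lines. This is a minimal illustration rather than the processing chain used in the study: the default velocity of 0.10 m/ns is simply the midpoint of the 0.08 to 0.12 m/ns range quoted in the text, and both function names are hypothetical.

```python
def two_way_time_to_depth_m(time_ns, velocity_m_per_ns=0.10):
    """Depth of a reflector from two-way travel time: d = v * t / 2."""
    if time_ns < 0:
        raise ValueError("travel time cannot be negative")
    return velocity_m_per_ns * time_ns / 2.0


def stack_traces(traces):
    """Average repeated traces sample by sample to improve the SNR."""
    if not traces:
        raise ValueError("need at least one trace")
    n_samples = len(traces[0])
    if any(len(trace) != n_samples for trace in traces):
        raise ValueError("all traces must have the same length")
    # zip(*traces) groups the i-th sample of every trace together.
    return [sum(samples) / len(traces) for samples in zip(*traces)]


# Example: a reflection arriving after 20 ns lies at about 1 m depth.
depth = two_way_time_to_depth_m(20.0)
```

The division by two accounts for the signal travelling down to the reflector and back. Averaging N traces reduces uncorrelated noise while leaving the coherent reflection unchanged, which is the motivation for the 5 to 10 pulse stacking described above.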

Cavities in pavements, particularly in the context of road maintenance, are commonly caused by environmental factors such as water infiltration, freeze–thaw cycles, or material degradation. These voids often form beneath the surface layers and may range from small, localized pockets to larger, more expansive areas. The characteristics of these cavities in GPR images are typically identified by their lower reflectivity compared to the surrounding pavement material. They often appear as dark or low-intensity regions in the radar signal, making them distinct from solid pavements. These features are typically irregular in shape, requiring precise detection and localization to avoid misidentification. Table 1 below presents a breakdown of the cavity sizes and types found in the dataset used for this study.

2.2. Data Collection and Preprocessing

A comprehensive dataset of GPR images was collected to support the development and evaluation of the void detection model. The dataset included images representing various pavement types and void conditions, sourced from field surveys conducted across multiple geographic locations and supplemented with publicly available databases from transportation agencies. Table 2 provides an overview of the dataset sources, types, and conditions covered.

2.2.1. Dataset Construction and Quality Assurance

The dataset used to train the EfficientDet-D3 model was collected through extensive field surveys across various geographic regions, including urban, suburban, and rural road networks. High-resolution subsurface images of pavement structures were captured using GPR equipment mounted on survey vehicles. During the scanning process, electromagnetic waves were sent into the pavement layers, and the reflected signals were recorded to produce detailed GPR images. This dataset encompasses a diverse range of pavement conditions, including both asphalt and concrete surfaces with varying levels of degradation. In total, the dataset consists of 3500 GPR images: 1500 from field surveys, 1200 from publicly available transportation databases, 800 from experimental studies, and 500 synthetic images generated to simulate real-world void patterns. Pavement engineering experts manually annotated these images, identifying void locations based on characteristic patterns such as signal attenuation and anomalies. The annotations were stored in a structured format, ensuring compatibility with deep learning frameworks and facilitating efficient model training.

Each GPR image in the dataset corresponds to a scan of a pavement surface in an urban or rural setting, covering both concrete and asphalt pavements. The plot scale of these images varied from 1 m to 10 m in length and 1 m in width, representing the horizontal extent of the pavement scan. The vertical resolution of each GPR scan, which is essential for capturing subsurface voids and other anomalies at varying depths, is determined by the number of sample points within each GPR trace. Each trace consists of a minimum of 200 sample points, providing the necessary vertical resolution to detect voids and anomalies within the pavement structure.

To ensure the dataset’s quality and consistency, several preprocessing techniques were applied. Noise reduction methods such as median filtering and Gaussian smoothing helped remove speckle noise, improving signal clarity. Image normalization techniques like Min-Max scaling and histogram equalization were used to standardize pixel intensities, ensuring uniform contrast across the images. Furthermore, data augmentation techniques such as image rotation, flipping, and contrast adjustments were employed to increase the dataset’s diversity and enhance the model’s generalization ability. The final dataset was divided into training, validation, and test sets in an 80-10-10% ratio, ensuring a balanced distribution of pavement types and void conditions. Through these rigorous data collection and preprocessing steps, the dataset was optimized to develop a robust and efficient void detection model.

It should be noted that the dataset includes a range of GPR images from different pavement types and environmental conditions but may have underrepresented specific geographic regions with unique soil compositions or extreme climates. This could impact the model’s generalization, and future research will aim to diversify the dataset further for improved robustness across varied conditions.

The collected data underwent an extensive annotation process conducted by pavement engineering experts. The annotation involved marking void locations manually and generating ground truth masks for model training and validation. Table 3 summarizes the annotation statistics, including the number of annotated images and accuracy levels.

2.2.2. Preprocessing Techniques for GPR Image Quality Improvement

To improve the quality of the GPR images and facilitate accurate model training, several preprocessing steps were applied to the dataset. Noise reduction techniques, such as median filtering and Gaussian smoothing, were employed to remove speckle noise and enhance signal clarity. Image normalization was performed to standardize pixel intensity values, ensuring consistent contrast and brightness across the dataset. The impact of various noise reduction techniques on the signal-to-noise ratio (SNR) was evaluated to enhance the quality of GPR images. Among the applied techniques, median filtering demonstrated the highest improvement, increasing the SNR by 12%, making it the most effective method for reducing random noise and preserving edge details. Gaussian smoothing provided an 8% improvement in SNR, effectively reducing high-frequency noise while slightly blurring fine details.
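To make the edge-preserving behaviour of median filtering concrete, the sketch below applies a one-dimensional median filter to a single trace. It is an illustration only: the window size, the edge-replication border handling, and the function name are assumptions, and the study's actual filter settings are not stated.

```python
from statistics import median


def median_filter_1d(trace, window=3):
    """Replace each sample by the median of its neighbourhood.

    Border samples reuse the nearest valid value (edge replication).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(trace)
    filtered = []
    for i in range(n):
        # Clamp indices so the window never leaves the trace.
        neighbourhood = [
            trace[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1)
        ]
        filtered.append(median(neighbourhood))
    return filtered


# An isolated spike is removed, while a step edge is left intact.
despiked = median_filter_1d([1, 1, 9, 1, 1])
step = median_filter_1d([0, 0, 5, 5, 5])
```

Unlike an averaging filter, the median never creates intermediate values across a sharp transition, which is consistent with the observation that it preserves edge details better than Gaussian smoothing.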

Adaptive thresholding contributed to a 10% increase in SNR, offering a balanced approach by selectively enhancing contrast in regions of interest while mitigating background noise. These findings highlight the importance of selecting appropriate noise reduction techniques to optimize the clarity and accuracy of GPR images for improved void detection in pavement structures. Normalization methods such as Min-Max scaling and histogram equalization were applied to enhance image uniformity and optimize contrast variability. Normalization techniques play a crucial role in enhancing the quality of GPR images by standardizing pixel intensity values and improving contrast consistency.

Two commonly used methods in this study were Min-Max scaling and histogram equalization. Min-Max scaling effectively reduced contrast variability by 15%, ensuring a more uniform distribution of pixel values across the dataset. On the other hand, histogram equalization achieved a 10% reduction in contrast variability, enhancing the visibility of subsurface features by redistributing pixel intensities to cover the entire available dynamic range. These normalization approaches significantly contributed to improving the interpretability of GPR images and enhancing model performance.
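The two normalization methods can be sketched on a flat list of pixel values as follows. This is a simplified illustration under stated assumptions: pixels are taken to be 8-bit integers for the histogram equalization step, and the function names are hypothetical rather than taken from the study.

```python
def min_max_scale(pixels):
    """Linearly rescale values to the range [0, 1]."""
    if not pixels:
        return []
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0.0 for _ in pixels]  # constant image: nothing to stretch
    return [(p - lo) / (hi - lo) for p in pixels]


def equalize_histogram(pixels, levels=256):
    """Map integer grey levels through the normalised cumulative histogram."""
    if not pixels:
        return []
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return list(pixels)  # constant image: leave unchanged
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


# A narrow band of grey levels is stretched across the full 0-255 range.
stretched = equalize_histogram([100, 100, 101, 102])
```

Min-Max scaling preserves the relative spacing of intensities, whereas histogram equalization redistributes them according to how often each level occurs, which is why the two methods affect contrast variability differently.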

Data augmentation techniques, including rotation, flipping, and contrast adjustments, were utilized to expand the dataset and improve model generalization. Augmenting the dataset not only increased its diversity but also enhanced the robustness of the model in detecting voids under varying conditions. Table 4 outlines the data augmentation techniques applied and their impact on model accuracy.
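For object detection, geometric augmentation has to transform the annotations together with the image. The sketch below shows a horizontal flip with its bounding-box update and a simple contrast adjustment. It assumes images are lists of pixel rows and boxes are (x_min, y_min, x_max, y_max) tuples in pixel coordinates; these conventions and the function names are illustrative assumptions.

```python
def hflip_image(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def hflip_box(box, image_width):
    """Mirror an (x_min, y_min, x_max, y_max) box within the image width."""
    x_min, y_min, x_max, y_max = box
    # The right edge becomes the new left edge, and vice versa.
    return (image_width - x_max, y_min, image_width - x_min, y_max)


def adjust_contrast(image, factor):
    """Scale deviations from the mean intensity; factor > 1 raises contrast."""
    flat = [p for row in image for p in row]
    if not flat:
        return []
    mean = sum(flat) / len(flat)
    return [
        [min(255.0, max(0.0, mean + factor * (p - mean))) for p in row]
        for row in image
    ]


flipped = hflip_image([[1, 2, 3]])
flipped_box = hflip_box((10, 5, 30, 15), image_width=100)
```

Contrast adjustment leaves the boxes untouched, so only the flip (and, similarly, any rotation) requires the annotation to be recomputed.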

Through these data collection and preprocessing efforts, the dataset was optimized for training the EfficientDet model, ensuring improved robustness and adaptability to real-world pavement void detection scenarios.

2.3. Model Development

EfficientDet-D3 was selected as the baseline model for void detection in pavement GPR images due to its well-balanced trade-off between accuracy and computational efficiency. The EfficientDet architecture is based on a compound scaling method that uniformly scales the resolution, depth, and width of the network. Its capability to handle complex object detection tasks with fewer computational resources makes it an ideal choice for the large-scale analysis of GPR imagery. Several modifications were introduced to the standard EfficientDet-D3 framework to enhance its effectiveness in detecting voids within GPR images, as outlined below.

The depth and width of the EfficientDet backbone are determined based on the scaling coefficients derived from the EfficientNet architecture. EfficientNet employs a compound scaling approach that optimally balances network width, depth, and input resolution using predefined scaling factors. Similarly, in EfficientDet, the scaling strategy extends beyond the backbone to include the Bi-directional Feature Pyramid Network (BiFPN) and the box/class prediction sub-networks, ensuring a harmonious trade-off between accuracy and computational efficiency.

Each EfficientDet variant, ranging from D0 to D7, follows the same core architectural framework depicted in Figure 2. However, key structural elements, such as the backbone size, input image resolution, the number of BiFPN layers, and the depth of the class and box prediction networks, progressively scale with the compound coefficient $ϕ$. This coefficient governs the simultaneous expansion of various network components to achieve a balanced performance across different model sizes. The scaling process can be mathematically expressed using Equations (1)–(3) [28,29]:

\text{Resolution} = α^{ϕ} \times \text{base resolution}

(1)

\text{BiFPN depth} = β^{ϕ} \times \text{base depth}

(2)

\text{Class/Box network depth} = γ^{ϕ} \times \text{base depth}

(3)

where

$α$, $β$, and $γ$ are constants determined through empirical analysis;
$ϕ$ is the compound scaling coefficient applied to all aspects of the model.

Base values correspond to the smallest model variant (EfficientDet-D0 [29]).

The scaling equations facilitate a systematic and efficient expansion of the model to accommodate varying levels of complexity and computational resources, making EfficientDet highly adaptable to different application scenarios.
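A minimal sketch of Equations (1)–(3) is given below. The constants `alpha`, `beta`, and `gamma` are placeholder values chosen purely for illustration, since the text does not report them; the base values (512-pixel input, 3 BiFPN layers, 3 prediction-head layers) are assumed to correspond to EfficientDet-D0.

```python
def scale_efficientdet(phi, base_resolution=512, base_bifpn_depth=3,
                       base_head_depth=3, alpha=1.15, beta=1.3, gamma=1.1):
    """Apply Equations (1)-(3) for a compound coefficient phi.

    alpha, beta and gamma are hypothetical placeholders, not published values.
    """
    if phi < 0:
        raise ValueError("phi must be non-negative")
    return {
        "resolution": alpha ** phi * base_resolution,     # Equation (1)
        "bifpn_depth": beta ** phi * base_bifpn_depth,    # Equation (2)
        "head_depth": gamma ** phi * base_head_depth,     # Equation (3)
    }


# phi = 0 reproduces the base configuration; larger phi grows every component.
d0 = scale_efficientdet(0)
d3 = scale_efficientdet(3)
```

In a real network, the depths would be rounded to whole layers and the resolution to a size compatible with the feature pyramid strides; that rounding is omitted here to keep the correspondence with the equations direct.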

The Bi-directional Feature Pyramid Network (BiFPN) used in EfficientDet improves feature representation through weighted fusion [28,29], as shown in Equation (4):

F_{i}^{out} = \frac{\sum_{j} w_{j} F_{j}^{in}}{\sum_{j} w_{j} + ϵ}

(4)

where

$F_{i}^{out}$ is the fused feature output;
$w_{j}$ are learnable weights ensuring optimal feature fusion;
$ϵ$ is a small constant for numerical stability.
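The weighted fusion in Equation (4) can be sketched for feature maps flattened to plain lists. Passing the weights through a ReLU to keep them non-negative is an assumption borrowed from the usual description of fast normalized fusion; the text itself only states that the weights are learnable. The function name and the default `eps` are likewise illustrative.

```python
def fuse_features(features, weights, eps=1e-4):
    """Weighted fusion: sum_j(w_j * F_j) / (sum_j(w_j) + eps)."""
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same shape")
    # Assumption: clamp weights at zero (ReLU) so the denominator stays positive.
    w = [max(0.0, wj) for wj in weights]
    denom = sum(w) + eps
    return [
        sum(wj * f[k] for wj, f in zip(w, features)) / denom
        for k in range(length)
    ]


# Equal weights reduce the fusion to (almost exactly) a plain average.
fused = fuse_features([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```

Because the weights are normalized by their sum, each one can be read as the relative importance of its input feature level.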

Anchor boxes are calculated based on the aspect ratio and scaling factor, as shown in Equation (5) [28,29]:

w = s \cdot \sqrt{r}, \quad h = \frac{s}{\sqrt{r}}

(5)

where

$s$ is the scale of the anchor box;
$r$ is the aspect ratio;
$w$ and $h$ are the width and height of the anchor box.
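The anchor geometry can be checked numerically with a short sketch. The scale and aspect-ratio values used in the example are illustrative assumptions, not the anchor configuration used in the study.

```python
import math


def anchor_size(scale, aspect_ratio):
    """Width and height of an anchor: w = s * sqrt(r), h = s / sqrt(r)."""
    if scale <= 0 or aspect_ratio <= 0:
        raise ValueError("scale and aspect ratio must be positive")
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


# A square anchor and a wide anchor built from the same scale.
square = anchor_size(32, 1.0)
wide = anchor_size(32, 4.0)
```

With this parameterization the anchor area stays equal to the square of the scale for every aspect ratio, while the ratio of width to height equals $r$.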

EfficientDet employs focal loss to focus on hard-to-classify samples, as presented in Equation (6) [28,29]:

L_{focal} = -α (1 - p_{t})^{γ} \log(p_{t})

(6)

where

$p_{t}$ is the predicted probability for the target class;
$α$ is the balancing factor;
$γ$ is the focusing parameter to down-weight easy examples.
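The effect of the focusing parameter is easiest to see numerically. The sketch below evaluates Equation (6) for a single prediction; the default values of `alpha` and `gamma` are common illustrative choices and are assumptions here, since the text does not report the values used in the study.

```python
import math


def focal_loss(p_t, alpha=0.25, gamma=2.0):
    """Focal loss for one sample: -alpha * (1 - p_t)^gamma * log(p_t)."""
    # Clamp to avoid log(0) for a completely wrong prediction.
    p_t = min(max(p_t, 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident, correct prediction contributes far less than a poor one.
easy = focal_loss(0.9)
hard = focal_loss(0.1)
```

Setting `gamma` to 0 and `alpha` to 1 recovers the ordinary cross-entropy loss, so the modulating factor $(1 - p_{t})^{γ}$ is what shifts the training signal towards hard examples.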

Learning rate scheduling improves convergence speed (see Equation (7)):

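η_{t} = η_{\min} + \frac{1}{2} (η_{\max} - η_{\min}) \left(1 + \cos\left(\frac{t}{T} π\right)\right)

(7)

where

$η_{t}$ is the learning rate at iteration $t$;
$η_{\min}$ and $η_{\max}$ are the minimum and maximum learning rates;
$T$ is the total number of iterations;
the cosine term $\cos(\frac{t}{T} π)$ provides the smooth decay from $η_{\max}$ to $η_{\min}$.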
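Equation (7) translates directly into a small function. The maximum learning rate of 0.0005 matches the initial value reported later in the training section; the minimum learning rate of zero and the function name are assumptions made for illustration.

```python
import math


def cosine_lr(t, total_iters, lr_max=5e-4, lr_min=0.0):
    """Cosine-annealed learning rate at iteration t, following Equation (7)."""
    if total_iters <= 0:
        raise ValueError("total_iters must be positive")
    if not 0 <= t <= total_iters:
        raise ValueError("t must lie in [0, total_iters]")
    cosine = math.cos(math.pi * t / total_iters)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


# The schedule starts at lr_max, passes the midpoint halfway, and ends at lr_min.
schedule = [cosine_lr(t, 100) for t in (0, 50, 100)]
```

The decay is slow at the start and end of training and fastest in the middle, which gives large early steps and fine adjustments towards convergence.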

2.3.1. Input Image Resolution Adjustment

The default input resolution for EfficientDet-D3 was modified to accommodate the unique characteristics of GPR images, which typically contain lower contrast and complex subsurface patterns. The original resolution of 768 × 768 pixels was increased to 1024 × 1024 pixels to preserve fine details and enhance the model’s capability to detect small voids. This adjustment ensures that the spatial resolution of GPR scans is adequately represented, capturing intricate details crucial for accurate void detection. Table 5 presents the impact of different input resolutions on model performance.

Increasing the image resolution to 1024 × 1024 improved the model’s ability to generalize across different pavement conditions by providing more detailed spatial information, which was crucial for distinguishing voids from material inconsistencies influenced by factors like the pavement type, environmental conditions, and GPR data quality. While this adjustment increased computational load and inference time, the gain in detection accuracy—particularly for smaller voids and complex surface features—justified the trade-off. The model’s inference time remained within an acceptable range of 68 ms, ensuring its suitability for real-time void detection tasks while enhancing its robustness across diverse pavement datasets.

2.3.2. Feature Extraction Layer Modifications

EfficientDet-D3 employs a Bi-directional Feature Pyramid Network (BiFPN) to extract multi-scale features effectively (see Table 6). To optimize the feature extraction process for GPR images, additional convolutional layers were incorporated within the BiFPN structure to enhance feature representation. Furthermore, the number of feature pyramid levels was adjusted to improve the detection of voids at various depths and scales. Custom feature extraction layers included the following:

The addition of atrous (dilated) convolution layers to capture finer subsurface details;
The implementation of squeeze-and-excitation (SE) blocks to improve feature recalibration;
The adjustment of the number of channels in BiFPN layers to balance feature representation and computational efficiency.

To improve feature extraction capabilities for GPR images, we enhanced the EfficientDet-D3 model by incorporating atrous convolutions and SE blocks into the BiFPN structure. These modifications were aimed at improving the model’s ability to capture relevant features from the GPR images, which often present complex and subtle patterns.

Atrous convolutions, also known as dilated convolutions, were added to expand the receptive field without increasing the number of parameters. This approach allows the model to capture multi-scale contextual information, which is crucial for detecting voids that can appear at varying scales depending on their size, location, and depth in the pavement structure.

SE blocks were introduced to recalibrate the feature maps, emphasizing more informative features and suppressing less relevant ones. This technique is especially beneficial for GPR images, where features like voids may be less pronounced or obscured by noise. The SE block enables the model to better focus on key regions of interest, improving the detection of subtle voids in the images.
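The two mechanisms described above can be illustrated in one dimension without any deep learning framework. The sketch below is a simplified stand-in, not the layers added to the model: `dilated_conv1d` shows how dilation widens the receptive field with the same number of kernel weights, and `squeeze_excite` shows the squeeze (global average), excitation (two small dense layers), and rescaling steps. All names, shapes, and weights are hypothetical.

```python
import math


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D correlation with gaps of `dilation` between kernel taps."""
    if dilation < 1:
        raise ValueError("dilation must be at least 1")
    span = (len(kernel) - 1) * dilation + 1  # effective receptive field
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


def squeeze_excite(channels, w1, w2):
    """Rescale each channel by a gate computed from global channel averages."""
    # Squeeze: one summary value per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation: dense layer + ReLU, then dense layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, channels)]


# A two-tap kernel covers 2 samples at dilation 1 but 3 samples at dilation 2.
dense = dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=1)
dilated = dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)
```

In `squeeze_excite`, `w1` holds one row per hidden unit and `w2` one row per channel, so the gate for each channel depends on the global statistics of all channels.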

2.3.3. Class Balancing Techniques

The dataset used for model training exhibited a significant imbalance between void and non-void samples, which could lead to biased predictions favoring the dominant class. To address this issue, various class balancing strategies were employed, including the following:

Weighted Loss Function: a focal loss function with class-specific weighting was implemented to prioritize the minority class (voids), reducing the impact of class imbalance on model training.

Data Oversampling: synthetic void instances were generated through data augmentation techniques, such as flipping, rotation, and contrast enhancement, to artificially balance the dataset.

Under-sampling of Non-Void Samples: A subset of non-void samples was strategically removed to ensure a more balanced class distribution without compromising data diversity. Table 7 provides an overview of the dataset balancing strategies and their impact on model performance.

To address the class imbalance, a weighted loss function was used, with higher weights applied to the void class due to its underrepresentation in the dataset. The exact weights were determined empirically through preliminary experiments. This adjustment ensured that the model focused more on learning the characteristics of voids, leading to improved void detection accuracy. In addition, data augmentation techniques were employed to further balance the dataset, resulting in better generalization and a more robust model performance across different pavement conditions.
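The study determined its class weights empirically, and the exact values are not reported. As a point of comparison only, the sketch below computes inverse-frequency weights, a common heuristic starting point for such tuning. The class counts in the example are hypothetical and the function is not the authors' procedure.

```python
def inverse_frequency_weights(class_counts):
    """Weight each class by total / (n_classes * count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    if not class_counts or any(c <= 0 for c in class_counts.values()):
        raise ValueError("every class needs a positive sample count")
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        name: total / (n_classes * count)
        for name, count in class_counts.items()
    }


# Hypothetical counts: voids are the minority class.
weights = inverse_frequency_weights({"void": 100, "non_void": 900})
```

With perfectly balanced classes every weight equals 1, so the heuristic leaves a balanced dataset unchanged.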

These modifications ensure that the EfficientDet-D3 model is well suited for the unique challenges of detecting voids in GPR images. The customized feature extraction layers, improved input resolution, and class balancing techniques collectively enhance the model’s precision, recall, and overall robustness in real-world applications. Further refinements may be explored in subsequent iterations to maximize the model’s adaptability to diverse pavement conditions.

2.4. Training and Hyperparameter Optimization

The EfficientDet-D3 model was trained using a dataset split into 80% for training and 20% for validation. The dataset was shuffled before splitting to ensure a balanced representation of different pavement types and void conditions in both subsets. The training process was carried out on an NVIDIA Tesla V100 GPU, leveraging its high computational capabilities to accelerate model convergence and enable efficient processing of large-scale GPR images [31,32,33].

Hyperparameter tuning was performed using the Optuna framework, which employs Bayesian optimization to identify the optimal combination of hyperparameters that maximize model performance. The key hyperparameters optimized during the training process included the learning rate, batch size, optimizer type, and Intersection over Union (IoU) threshold.

The EfficientDet-D3 model was trained using a learning rate of 0.0005 and a batch size of 16, ensuring stable convergence and efficient memory usage. The Adam optimizer was selected for its adaptive learning capabilities, while an IoU threshold of 0.5 was set to evaluate detection accuracy. Training was conducted over 100 epochs with early stopping patience set to 10, allowing the model to halt training if no improvement was observed, thereby preventing overfitting and reducing computational costs. These hyperparameters were optimized to achieve a balance between accuracy and efficiency for void detection in GPR images.

A key component of the training strategy was the use of an early stopping mechanism, which monitored the validation loss during training and stopped the process once no significant improvement was observed over 10 consecutive epochs. This approach helped prevent overfitting by ensuring that the model does not learn noise from the training data.
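The early stopping rule described above can be sketched independently of any training framework. The default patience of 10 epochs follows the text; the `min_delta` threshold for what counts as an improvement and the class name are assumptions.

```python
class EarlyStopping:
    """Stop training when the validation loss stops improving."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: this is where a checkpoint would be saved.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Usage with a short, made-up loss history and a patience of 2 epochs.
stopper = EarlyStopping(patience=2)
history = [1.0, 0.9, 0.95, 0.97]
stop_flags = [stopper.step(epoch, loss) for epoch, loss in enumerate(history)]
```

Tracking `best_epoch` alongside the counter makes it straightforward to restore the weights from the epoch with the lowest validation loss once training halts.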

The Adam optimizer was selected due to its adaptive learning rate capabilities and efficient convergence properties. The learning rate was initially set to 0.0005 and adjusted dynamically using a learning rate scheduler to fine-tune weight updates throughout training.

To further evaluate the model’s generalization capabilities, cross-validation was employed, whereby the dataset was divided into multiple folds, and the model was trained and validated on different subsets. The performance metrics, including loss, precision, recall, and the F1-score, were recorded after each training iteration.
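The fold construction behind cross-validation can be sketched as follows. The number of folds is not stated in the text, so `k=5` is an assumption, as is the use of contiguous folds; in practice the sample order would be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, val_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


# Each sample appears in exactly one validation fold.
folds = list(k_fold_indices(10, k=3))
```

Averaging the metrics over the folds gives an estimate of generalization that is less sensitive to a single, possibly unrepresentative, split.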

The training process utilized data augmentation techniques such as random rotations, flips, and contrast adjustments to improve model robustness and reduce the likelihood of overfitting. Table 8 summarizes the impact of different batch sizes and learning rates on model performance.

Throughout the training, regular validation checks were performed to assess performance and avoid overfitting. Model checkpoints were saved at the epoch with the lowest validation loss to ensure that the best version of the model was retained for deployment. These training and hyperparameter optimization steps resulted in an efficient and well-generalized EfficientDet-D3 model for void detection in GPR images.

2.5. Evaluation Metrics

To comprehensively evaluate the performance of the EfficientDet-D3 model for void detection in GPR images, multiple evaluation metrics were employed to assess accuracy, reliability, and efficiency. These metrics provided insights into the model’s detection capabilities across various conditions and helped identify areas for improvement.

2.5.1. Precision, Recall, and the F1-Score

Precision, recall, and the F1-score were utilized to measure the effectiveness of the model in correctly identifying voids while minimizing false positives and false negatives. Precision quantifies the ratio of correctly predicted voids to the total predicted voids, while recall measures the proportion of correctly identified voids to the total actual voids. The F1-score provides a harmonic mean of precision and recall, balancing the trade-off between them. Equations (8)–(12) for these metrics are as follows [26]:

Precision

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

(8)

$TP$ (True Positives) = Number of correctly identified voids.
$FP$ (False Positives) = Number of non-voids incorrectly identified as voids.

Recall

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

(9)

$TP$ (True Positives) = Number of correctly identified voids.
$FN$ (False Negatives) = Number of voids that were missed by the model.

F1-Score

$$F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

(10)
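
As a worked illustration of Equations (8)–(10), the sketch below computes the three metrics from detection counts. The counts (875 true positives, 84 false positives, 125 false negatives) are hypothetical and were chosen only to give values close to those reported in this study; they are not the study's confusion counts. The last lines check that the reported precision and recall are consistent with the reported F1-score.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, Equation (10)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def precision_recall_f1(tp, fp, fn):
    """Equations (8)-(10) computed from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)

# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=875, fp=84, fn=125)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_score(0.912, 0.875)
print(f"F1 from reported precision and recall: {reported_f1:.3f}")
```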

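Average Precision (AP), from which the mean Average Precision (mAP) is obtained by averaging over classes, is computed as

$$AP = \int_{0}^{1} P(R) \, dR$$

(11)

where $P(R)$ is the precision as a function of recall, i.e., the precision–recall curve.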
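
In practice the integral in Equation (11) is approximated from a finite set of precision–recall points. The sketch below uses all-point interpolation with a monotone precision envelope, a common convention that is assumed here because the text does not state which interpolation was used. The sample points are invented for illustration. If voids are the only class, mAP reduces to this single AP value.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve (all-point interpolation).

    `recalls` must be sorted in increasing order, with `precisions` aligned to it.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

# Invented precision-recall points, for illustration only.
ap = average_precision([0.2, 0.4, 0.6, 0.8], [1.0, 0.9, 0.8, 0.6])
print(f"AP = {ap:.2f}")
```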

Bounding box accuracy is often evaluated using the mean squared error (MSE):

$$MSE = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}$$

(12)

where

$y_{i}$ is the ground truth bounding box;
$\hat{y}_{i}$ is the predicted bounding box;
$n$ is the number of bounding boxes compared.

2.5.2. Mean Average Precision (mAP)

Mean Average Precision (mAP) was calculated at IoU thresholds of 0.5 (mAP@0.5) and 0.75 (mAP@0.75) to evaluate the model’s accuracy in detecting voids with different levels of localization precision. IoU measures the overlap between the predicted bounding box and the ground truth, providing a critical assessment of the model’s spatial accuracy.

The formula for IoU is based on Equation (13):

$$\mathrm{IoU} = \frac{\text{Area of Overlap}}{\text{Area of Union}}$$

(13)

Area of Overlap = The intersection area of the predicted and ground truth bounding boxes;
Area of Union = The total combined area covered by the predicted and ground truth bounding boxes.
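
Equation (13) can be illustrated for axis-aligned boxes with the short sketch below. The `(x_min, y_min, x_max, y_max)` box format and the example coordinates are assumptions made for illustration. The example pair has an IoU of about 0.67, so it would count as a match at the 0.5 threshold but not at 0.75.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    overlap = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - overlap
    return overlap / union if union > 0 else 0.0

ground_truth = (0, 0, 10, 10)
prediction = (2, 0, 12, 10)
score = iou(ground_truth, prediction)
print(f"IoU = {score:.3f}, match@0.5 = {score >= 0.5}, match@0.75 = {score >= 0.75}")
```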

2.5.3. Inference Time

Inference time was measured to assess the real-time applicability of the model. Efficient inference is critical for field deployment, where the timely detection of pavement voids is essential for maintenance planning and decision-making. Inference time was evaluated on an NVIDIA Tesla V100 GPU (Santa Clara, CA, USA) at three input resolutions. At 768 × 768 pixels, the model achieved an inference time of 45 ms, demonstrating rapid processing at lower resolution. At 1024 × 1024 pixels, the inference time was 68 ms, a moderate increase in computational demand that still allows feasible processing speeds. At the highest tested resolution of 1280 × 1280 pixels, the inference time rose to 95 ms, reflecting the additional computational overhead of higher-resolution images. Higher resolutions can capture finer details, which may improve detection accuracy, but they come at the cost of increased processing time. The balance between resolution and inference time must therefore be considered carefully for real-time applications where speed is critical.

2.6. Comparison Model

In this research, we compared the performance of the EfficientDet-D3 model with two other popular object detection models, Mask R-CNN [34,35] and YOLOv8 [19], which have been widely utilized for various computer vision tasks, including pavement damage detection. Mask R-CNN, known for its ability to perform instance segmentation, leverages a region-based approach to accurately detect and segment objects in images, making it effective in identifying and localizing voids in GPR images. YOLOv8, on the other hand, is recognized for its real-time object detection capabilities, providing a good balance between accuracy and inference speed. Both models have been extensively used in infrastructure monitoring and other image processing applications, and their performance was evaluated alongside EfficientDet-D3 in terms of precision, recall, the F1-score, and inference speed. The comparative analysis highlights their respective strengths and limitations in detecting voids in GPR images, with a particular focus on computational efficiency and detection accuracy.

3. Results and Discussion

3.1. Model Performance

Figure 3 provides a summary of the performance metrics obtained during the evaluation phase. The EfficientDet-D3 model performed well in detecting voids in GPR images under varying pavement conditions. The model achieved a precision of 91.2%, indicating that the large majority of predicted voids were true voids, with few false positives. The recall of 87.5% shows that most true voids were detected rather than overlooked. The resulting F1-score of 89.3% reflects the balance between precision and recall, confirming the model’s reliability in void detection tasks.

A detailed evaluation was conducted using mean Average Precision (mAP) at multiple IoU thresholds to assess the spatial accuracy of the model. The mAP@0.5, which considers a moderate level of overlap between predicted and actual void locations, was recorded at 89.7%, reflecting the model’s high detection accuracy when a relatively lenient overlap threshold was applied. On the other hand, the mAP@0.75, which demands a stricter overlap requirement for successful detection, was recorded at 84.3%, highlighting the model’s effectiveness in the precise localization of voids. These metrics indicate the capability of EfficientDet-D3 to perform well across a spectrum of spatial tolerances, making it suitable for applications where both coarse and fine-grained detection are critical.

3.1.1. Performance Across Different Pavement Types

Figure 4 compares the model’s performance across pavement types, namely asphalt and concrete surfaces. The model performed slightly better on concrete pavements, achieving a precision of 92.5%, a recall of 89.1%, and an F1-score of 90.7%. This can be attributed to the relatively smooth and uniform texture of concrete pavements, which enhances the contrast between void regions and surrounding materials and facilitates feature extraction. On asphalt pavements, performance was slightly lower, with a precision of 89.8%, a recall of 85.2%, and an F1-score of 87.4%. The reduction can be attributed to the complex texture and composition of asphalt pavements, which often contain heterogeneous materials and varying surface roughness levels, making it harder for the model to differentiate voids from background noise.

3.1.2. Performance Across Different Weather Conditions

The impact of weather conditions on the model’s performance was also examined, given the variations in GPR signal quality under different environmental conditions, as shown in Figure 5. The model achieved higher detection accuracy under dry conditions, with a precision of 92.1%, a recall of 88.3%, and an F1-score of 90.1%. The relatively stable and clear signal quality in dry conditions contributed to more accurate feature extraction and localization of voids. Under wet conditions, the presence of moisture and potential signal attenuation resulted in slightly decreased performance, with a precision of 89.4%, a recall of 86.0%, and an F1-score of 87.6%. Moisture in the pavement can lead to signal dispersion and increased noise, making it more challenging for the model to maintain its detection accuracy. Despite this, the model adapted well to varying environmental conditions, maintaining acceptable performance under challenging scenarios.

In summary, the EfficientDet-D3 model effectively balances precision and recall across different pavement types and weather conditions, demonstrating its potential for large-scale deployment in pavement void detection applications. Future improvements may include refining the model’s feature extraction capabilities to better handle challenging conditions, such as extreme weather variations and highly heterogeneous pavement surfaces.

3.2. Comparative Analysis

A comprehensive comparative analysis was conducted to evaluate the performance of the EfficientDet-D3 model against other state-of-the-art object detection models, including YOLOv8 and Mask R-CNN (see Figure 6). The comparison focused on key performance indicators such as detection accuracy, inference speed, and computational efficiency, which are crucial for real-time void detection applications in pavement infrastructure assessments. The evaluation metrics used for comparison included precision, recall, the F1-score, mean Average Precision (mAP) at different IoU thresholds, and inference time.

From the comparative analysis, it was observed that YOLOv8 achieved slightly higher precision than EfficientDet-D3; however, EfficientDet-D3 provided a balanced performance with better recall and F1-score, ensuring fewer missed detections. Mask R-CNN exhibited the highest precision and recall, making it a strong candidate for accuracy-focused applications. However, it suffered from a significantly higher inference time (120 ms), making it less suitable for real-time void detection tasks.

EfficientDet-D3 outperformed YOLOv8 and Mask R-CNN in terms of the trade-off between detection accuracy and inference speed. With an inference time of 68 ms, it proved to be an efficient choice for real-time applications where timely detection is crucial. Additionally, EfficientDet’s ability to scale with different resource constraints allows for flexible deployment in various operational environments.

The comparative analysis also highlighted the model’s robustness across different pavement conditions, as it maintained a high level of accuracy while processing GPR images with varying noise levels and surface textures. Furthermore, EfficientDet-D3’s compound scaling approach contributed to efficient memory usage and computational resource allocation, making it a viable choice for deployment in mobile and edge computing scenarios.

In summary, while YOLOv8 and Mask R-CNN provide competitive accuracy, EfficientDet-D3 offers a superior balance between detection accuracy and processing efficiency, making it the preferred model for real-time pavement void detection applications. Future work will focus on further optimizing inference speed and improving detection capabilities under extreme environmental conditions.

Figure 7 presents a comparison of void detection performance using three models (Mask R-CNN, YOLOv8, and EfficientDet-D3) on GPR images captured by a GPR device. Each image shows the void detection results, with blue bounding boxes highlighting the detected voids and the corresponding confidence scores in the upper left corner of each box. The first row (a) shows the results of the Mask R-CNN model, the second row (b) those of YOLOv8, and the third row (c) those of EfficientDet-D3. The confidence scores indicate how each model performs in detecting voids at different distances and depths within the radar image. The results suggest that all three models are effective tools for analyzing GPR images and detecting voids at varying depths and distances.

To improve the accuracy and robustness of void detection in GPR images, we implemented a two-step approach: confidence thresholding and post-processing. Together, these methods filtered out false positives, which were a common issue due to the noisy nature of GPR data. They allowed us to retain only high-confidence detections, so that the final results reflected real void features more accurately.

Confidence thresholding was implemented to remove low-confidence detections from the results, focusing on retaining only the most reliable predictions made by the CNN models. In this step, we set a threshold value for the detection confidence score, and any detection below this threshold was discarded. For example, in Figure 7b, detections with a confidence score below 0.5 were excluded, ensuring that only voids with high certainty were retained. This process significantly reduced false positives, particularly in cases where environmental factors, like electrical interference or weak signal reflection, caused the model to identify incorrect features as voids. By adjusting the threshold, we could fine-tune the model’s performance, balancing the retention of valid detections with the minimization of false positives. For instance, in a test dataset, we observed that increasing the threshold to 0.7 resulted in the exclusion of over 10% of detections, which were incorrectly identified as voids, without losing any critical true void locations.
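
The thresholding step can be sketched as a simple filter over the detector output. The dictionary layout (`box`, `score`) and the example detections are hypothetical. The two thresholds, 0.5 and 0.7, are the values discussed above.

```python
def filter_by_confidence(detections, threshold=0.5):
    """Keep only detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]

# Hypothetical detector output: boxes as (x_min, y_min, x_max, y_max) in pixels.
detections = [
    {"box": (40, 60, 120, 110), "score": 0.92},
    {"box": (200, 80, 260, 130), "score": 0.74},
    {"box": (310, 40, 350, 70), "score": 0.55},
    {"box": (15, 150, 45, 170), "score": 0.31},
]

kept_050 = filter_by_confidence(detections, threshold=0.5)
kept_070 = filter_by_confidence(detections, threshold=0.7)
print(len(detections), len(kept_050), len(kept_070))
```

Raising the threshold trades recall for precision, which is why the value has to be tuned against validation data rather than fixed in advance.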

Post-processing, particularly morphological operations, was applied to clean up the detections by removing false positives and refining bounding box accuracy. In our dataset, after the initial detection, many false detections were scattered or fragmented due to noise and artifacts. To address this, we applied dilation followed by erosion operations to smooth the detected boundaries and eliminate small, non-representative regions. For example, in Figure 7, the detection of a void feature that appeared as several disconnected parts was corrected through dilation, connecting those parts into a continuous region, followed by erosion to remove small irrelevant detections at the boundaries. Additionally, opening and closing operations were used to remove isolated small detections caused by noise. For instance, an isolated detection that was identified as a void was eliminated using the opening operation, as it was smaller than the typical void size expected. Meanwhile, closing was useful in correcting minor gaps within the detected voids, ensuring the final shape more closely matched real-world cavity structures. These morphological operations proved highly effective in reducing noise, refining detection accuracy, and ultimately ensuring the integrity of the detected voids.
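
The effect of opening and closing can be illustrated on a small binary mask. The sketch assumes that detections are represented as a binary mask and uses a 3 × 3 square structuring element, and neither detail is specified in the text. It is written in plain Python for clarity; an image processing library would normally be used in practice. Pixels outside the image are treated as background, so regions touching the image border would be eroded.

```python
def _window(mask, y, x):
    """Values in the 3x3 window centred on (y, x); pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
            for j in (y - 1, y, y + 1) for i in (x - 1, x, x + 1)]

def _filter(mask, rule):
    """Apply `rule` (any or all) to the 3x3 window of every pixel."""
    return [[int(rule(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]

def dilate(mask):
    return _filter(mask, any)   # a pixel is set if any neighbour is set

def erode(mask):
    return _filter(mask, all)   # a pixel survives only if all neighbours are set

def opening(mask):
    return dilate(erode(mask))  # removes specks smaller than the structuring element

def closing(mask):
    return erode(dilate(mask))  # fills narrow gaps inside a region

# Toy mask: one void split by a one-column gap, plus an isolated noise pixel.
rows = [
    "0000000000",
    "0111011100",
    "0111011100",
    "0111011100",
    "0000000000",
    "0000000010",
    "0000000000",
]
mask = [[int(c) for c in row] for row in rows]
cleaned = closing(opening(mask))
for row in cleaned:
    print("".join(str(v) for v in row))
```

In this toy example the opening removes the isolated pixel and the closing joins the two fragments into one continuous region.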

3.2.1. Training and Validation Loss Analysis of Object Detection Models

The training and validation loss results for the EfficientDet-D3, YOLOv8, and Mask R-CNN models across 100 epochs provide valuable insights into the learning behavior and convergence of each model (see Figure 8). Initially, all three models exhibit higher loss values, with EfficientDet-D3 starting at a training loss of 0.90 and a validation loss of 1.00, YOLOv8 at 0.85 and 0.95, and Mask R-CNN at 1.10 and 1.20, respectively. As the training progresses, a consistent downward trend is observed in the loss values for all models, indicating effective learning of the underlying data patterns. By epoch 10, EfficientDet-D3 reaches a training loss of 0.2846 and a validation loss of 0.3162, while YOLOv8 achieves slightly lower loss values of 0.2688 and 0.3004, showcasing its faster learning ability in the initial stages. Mask R-CNN, however, maintains higher loss values at this stage, reflecting its more complex model architecture that requires longer training time for convergence.

As training continues, EfficientDet-D3 demonstrates stable and consistent improvements, with its loss values gradually decreasing to 0.1273 for training and 0.1414 for validation at epoch 50. Similarly, YOLOv8 shows a steady decline to 0.1202 and 0.1344, indicating strong generalization capabilities. Mask R-CNN, although initially slower in convergence, progressively improves, reaching 0.1556 for training and 0.1697 for validation at epoch 50. By the end of training at epoch 100, EfficientDet-D3 reaches final loss values of 0.090 and 0.100, demonstrating its robust optimization strategy and efficient feature extraction capabilities. YOLOv8 finishes slightly lower, at 0.085 and 0.095, confirming its effectiveness in achieving rapid convergence. Mask R-CNN, while slightly behind in terms of overall loss reduction, achieves a respectable final loss of 0.110 and 0.120, highlighting its ability to capture intricate spatial relationships despite a higher computational cost.

Overall, the results highlight that YOLOv8 exhibits faster convergence and a slightly better initial loss reduction compared to EfficientDet-D3, but the latter achieves comparable final loss values with superior stability. Mask R-CNN, while slower in convergence, proves to be effective for applications requiring high spatial accuracy. These findings emphasize the trade-offs between computational efficiency and accuracy, with EfficientDet-D3 providing an optimal balance suitable for real-time pavement void detection applications.

3.2.2. Hyperparameter Tuning Results (Validation Loss)

The results of hyperparameter tuning for the three models, focusing on the validation loss achieved after tuning the learning rate, batch size, and optimizer selection, are presented in Table 9. EfficientDet-D3 achieved the lowest validation loss of 0.095, indicating its effectiveness in learning relevant features from GPR images while minimizing overfitting. YOLOv8 followed with a validation loss of 0.102, while Mask R-CNN had the highest loss of 0.110. The lower validation loss for EfficientDet-D3 suggests that its compound scaling mechanism optimizes resource utilization and improves feature extraction efficiency. These results validate the decision to employ adaptive learning rate scheduling and early stopping criteria to ensure optimal model convergence without excessive training time.

3.2.3. Model Inference Speed Comparison

Table 10 compares the inference speed of the three models, measured in frames per second (fps), to evaluate their suitability for real-time applications. YOLOv8 achieved the fastest inference speed at 20 fps, making it the best choice for applications requiring rapid detection and analysis. EfficientDet-D3 followed with an inference speed of 15 fps, offering a reasonable trade-off between speed and accuracy. Mask R-CNN, while the most accurate model, had the slowest inference speed at just 8 fps, making it less suitable for time-sensitive applications. These findings underscore the importance of considering inference speed when deploying void detection models in operational environments, such as real-time pavement monitoring systems.
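
Frame rate and per-image latency are related by fps = 1000 / latency (ms) when images are processed one at a time, which is assumed here. The short sketch below applies this conversion to latencies reported in this study: 68 ms for EfficientDet-D3 and 120 ms for Mask R-CNN, plus the 45 ms and 95 ms measured for EfficientDet-D3 at other input resolutions.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second for sequential, single-image inference."""
    return 1000.0 / latency_ms

# Latencies (ms) reported in this study.
latencies = {
    "EfficientDet-D3 @ 768": 45,
    "EfficientDet-D3 @ 1024": 68,
    "EfficientDet-D3 @ 1280": 95,
    "Mask R-CNN": 120,
}
fps = {name: fps_from_latency_ms(ms) for name, ms in latencies.items()}
for name, value in fps.items():
    print(f"{name}: {value:.1f} fps")
```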

3.2.4. Comparison of YOLO Models and EfficientDet-D3 for Pavement Void Detection

This section broadens the comparison by incorporating EfficientDet-D3, along with several predecessors from the YOLO series, to provide a more comprehensive evaluation. The comparison now includes multiple YOLO variants, namely YOLOv8 [19], YOLOv5 [36], YOLOv4 [19], YOLOv4-CSP [37], and YOLOv3 [21], to better assess their accuracy and inference times. Additionally, Mask R-CNN is included in this evaluation, which allows for a more thorough analysis of the trade-offs between accuracy and inference speed, helping to identify the most suitable model for real-time pavement void detection.

As shown in Figure 9, YOLOv8 consistently outperforms its YOLO predecessors in both precision and recall, achieving a precision of 91.2%, a recall of 87.5%, and an F1-score of 89.3%, with a manageable inference time of 68 ms. This performance is particularly notable when compared to YOLOv4 (precision: 88.6% and recall: 85.2%) and YOLOv5 (precision: 89.3% and recall: 86.0%), which, while still competitive, lag behind YOLOv8 in terms of detection accuracy. In contrast, older versions like YOLOv3 (precision: 87.5% and recall: 84.3%) show both lower accuracy and significantly slower inference times (150 ms), making them less suitable for real-time applications in pavement void detection.

EfficientDet-D3 slightly outperforms YOLOv8 in terms of precision (93.1% vs. 92.5%) and the F1-score (90.7% vs. 89.3%) but comes with a longer inference time of 68 ms, making it a slightly less optimal choice when inference speed is a critical factor for real-time applications. On the other hand, Mask R-CNN (precision: 92.8%, recall: 86.9%, and F1-score: 89.8%) shows good precision but suffers from a much slower inference time (120 ms), which makes it less suitable for real-time detection.

This comparison demonstrates that while earlier YOLO models made significant contributions to object detection, YOLOv8 stands out as one of the most well-rounded models for pavement void detection, offering an optimal balance between high detection accuracy and fast inference speed. Additionally, EfficientDet-D3 provides slight improvements in precision and the F1-score but comes with a slightly longer inference time. Despite this, YOLOv8 remains a strong contender, making it an excellent candidate for deployment in real-time pavement management systems, where both timely detection and efficient resource allocation are critical.

3.3. Performance Evaluation of the EfficientDet Model Across Various Conditions

In this section, the EfficientDet model is evaluated across several critical factors to assess its robustness and generalization capability in pavement void detection using GPR images. The evaluation covers threshold sensitivity analysis, the impact of dataset size, geographic variability, and noise robustness, providing insights into the model’s practical applicability.

3.3.1. Threshold Sensitivity Analysis

This analysis examines the influence of different IoU and confidence score thresholds on model performance (see Figure 10). The results indicate that a lower IoU threshold (0.3) increases recall but decreases precision due to a higher number of false positives. As the confidence threshold increases, precision improves, but recall drops. The best balance was achieved at an IoU threshold of 0.5 and a confidence threshold of 0.7, yielding an F1-score of 88.3%.
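
A threshold sweep of this kind can be sketched as follows. The greedy one-to-one matching rule, the `(box, score)` tuple format, and the toy boxes are assumptions made for illustration and do not reproduce the study's evaluation code or data.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def evaluate(detections, ground_truth, iou_thr=0.5, conf_thr=0.5):
    """Return (precision, recall, f1) after greedy one-to-one matching."""
    kept = sorted((d for d in detections if d[1] >= conf_thr),
                  key=lambda d: d[1], reverse=True)
    unmatched = list(ground_truth)
    tp = 0
    for box, _score in kept:  # highest-confidence detections are matched first
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= iou_thr:
            unmatched.remove(best)
            tp += 1
    fp, fn = len(kept) - tp, len(unmatched)
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    total = precision + recall
    return precision, recall, (2 * precision * recall / total if total else 0.0)

# Toy data: two true voids, two good detections, one low-confidence false alarm.
truth = [(0, 0, 10, 10), (20, 0, 30, 10)]
dets = [((0, 0, 10, 10), 0.95), ((22, 0, 32, 10), 0.80), ((50, 50, 60, 60), 0.40)]

results = {(i, c): evaluate(dets, truth, iou_thr=i, conf_thr=c)
           for i in (0.5, 0.75) for c in (0.3, 0.7, 0.9)}
for (i, c), (p, r, f1) in results.items():
    print(f"IoU>={i} conf>={c}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

In this toy example, raising the confidence threshold from 0.3 to 0.7 removes the false alarm and improves precision, while raising it to 0.9 discards a true detection and lowers recall. Tightening the IoU threshold to 0.75 turns the loosely localized detection into a false positive.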

3.3.2. Effect of Dataset Size on Model Performance

To evaluate how dataset size affects performance, the model was trained with 50%, 75%, and 100% of the available data, as shown in Figure 11. The analysis reveals that larger datasets enhance model generalization, with significant improvements in recall. However, diminishing returns were observed beyond 75% of the dataset, suggesting efficient performance even with reduced data.

3.3.3. Performance Across Geographic Locations

This evaluation explores the model’s generalization across urban, suburban, and rural pavement conditions. Urban areas exhibited higher precision (92.0%) due to smoother surfaces, while rural areas posed challenges with lower precision (86.8%) due to heterogeneous materials. These findings emphasize the need for diverse training data to improve model adaptability, as shown in Figure 12.

Our dataset consists of GPR images collected from various pavement types, including concrete and asphalt. The cavities detected in these images exhibit a wide range of sizes, with the majority of cavities measuring between 0.5 and 2.0 m in length. However, as shown in Figure 7, larger cavities with lengths up to several meters do appear, particularly in areas where the pavement has been subjected to significant environmental stress, such as prolonged water exposure or freeze–thaw cycles.

The detection threshold for the EfficientDet-D3 model was set to identify cavities with a minimum size of 0.1 m, ensuring that the model can detect small voids while avoiding excessive false positives. The IoU threshold was set to 0.5, meaning that a detected cavity was considered valid only if its IoU with a ground truth label exceeded 0.5. Table 11 below summarizes the size distribution of cavities in the dataset:

3.4. Practical Application of EfficientDet-D3 in PMS

The integration of EfficientDet-D3 into PMS can greatly enhance pavement maintenance by automating void detection and enabling proactive decision-making. By processing real-time GPR data, the model identifies voids beneath pavement surfaces, prioritizing areas that require urgent attention based on void size and severity. This reduces manual inspections and allows for optimized repair scheduling, ultimately lowering maintenance costs and extending pavement lifespan. Continuous updates ensure that the PMS remains current with the latest condition assessments, making it an invaluable tool for long-term infrastructure management.

In a practical example, EfficientDet-D3 was successfully applied to historical GPR dataset images from pavements in Seoul. These datasets, which had not been recently inspected, were found in the library and demonstrated the model’s ability to detect cavities effectively. The model identified voids beneath the surface, confirming its capability to handle older GPR data and further showcasing its potential for integration into real-world pavement management systems, even with non-recent data.

Figure 13 shows examples of cavities detected by EfficientDet-D3 in the historical GPR images of Seoul pavements described above, illustrating the model’s ability to work with older data and its potential for real-world pavement management applications.

3.5. Challenges and Limitations

Despite the promising results achieved in void detection using the EfficientDet-D3 model, several challenges and limitations were encountered during the study, which impacted the overall performance and reliability of the system. These challenges primarily stem from the inherent complexities of GPR imaging and the diverse conditions under which the data were collected. One of the primary challenges faced was the variability in GPR image quality due to environmental conditions.

Factors such as the temperature, moisture content, and soil composition significantly affected the quality of GPR scans, leading to inconsistencies in the acquired data. Moisture infiltration, in particular, introduced noise and signal attenuation, making it difficult to distinguish between actual voids and artifacts caused by environmental factors. As a result, preprocessing techniques such as noise filtering and normalization were critical in mitigating these issues but could not entirely eliminate them.

Another significant challenge was the difficulty in differentiating small voids from natural material variations. Pavement structures are composed of heterogeneous materials, including aggregates, asphalt, and concrete layers, which exhibit different electromagnetic properties. These variations often produced reflections and artifacts in the GPR images that closely resembled voids, leading to false positive detections. Small voids, in particular, were harder to identify accurately, requiring the model to have highly sensitive feature extraction capabilities. To address this, additional feature engineering techniques and enhanced training data augmentation were explored to improve the model’s ability to discern subtle differences.

Furthermore, this study highlighted the requirement for larger, more diverse datasets to improve generalization. The available dataset, while comprehensive, was limited in terms of geographical diversity and pavement types. The inclusion of data from different climatic regions, pavement materials, and varying traffic loads would enhance the model’s ability to better generalize unseen scenarios. Collecting and annotating such extensive datasets, however, is resource-intensive and time-consuming. Future research should focus on expanding the dataset and incorporating synthetic data generation techniques to simulate a broader range of real-world conditions.

In summary, while the EfficientDet-D3 model demonstrated a strong performance in void detection, addressing these challenges through improved data acquisition methods, advanced signal processing techniques, and more robust model architectures will be crucial for enhancing the reliability and applicability of the system in practical scenarios. Overcoming these limitations will ensure the development of a more comprehensive and scalable solution for pavement void detection using GPR imaging.

4. Conclusions

This study developed and evaluated the EfficientDet-D3 model for void detection in pavements using GPR imagery. Through comprehensive data collection, preprocessing, and rigorous training, the proposed model demonstrated significant improvements in detection accuracy, computational efficiency, and adaptability to diverse pavement conditions.

Quantitative analysis of the model’s performance yielded promising results across various evaluation metrics. The EfficientDet-D3 model achieved an impressive precision of 91.2%, a recall of 87.5%, and an F1-score of 89.3%, indicating a strong balance between minimizing false positives and maximizing the identification of true voids.
The mAP percentages at IoU thresholds of 0.5 and 0.75 were recorded at 89.7% and 84.3%, respectively, showcasing the model’s capability in both coarse and fine-grained void localization.
When tested across different pavement types, the model exhibited superior performance on concrete pavements, with an F1-score of 90.7%, while achieving an F1-score of 87.4% on asphalt pavements. These results confirm the model’s robustness in handling different material compositions and surface textures.
The model’s resilience under varying environmental conditions was validated, with F1-scores of 90.1% under dry conditions and 87.6% under wet conditions, indicating its capability to adapt to moisture-induced signal variations.
This study also conducted a comparative analysis with other state-of-the-art object detection models, including YOLOv8 and Mask R-CNN. While YOLOv8 achieved the fastest inference speed at 55 ms, EfficientDet-D3 provided a competitive inference time of 68 ms, striking a balance between detection speed and accuracy. Although Mask R-CNN achieved the highest precision of 93.5%, its inference time of 120 ms makes it less suitable for real-time applications, thereby establishing EfficientDet-D3 as an optimal solution for deployment in operational environments.
Despite some challenges, such as GPR image quality and small void detection, this study suggests future work on expanding datasets and improving feature extraction techniques. Expanding the dataset to include more geographic locations and conditions, along with integrating data like temperature and humidity, will enhance model robustness. Additionally, real-time model adaptation to field conditions will increase practical applicability.
The EfficientDet-D3 model is a viable solution for real-time pavement void detection, offering high accuracy and manageable computational demands, making it suitable for integration into pavement management systems.
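
The IoU thresholds of 0.5 and 0.75 quoted for mAP decide when a predicted box counts as a correct detection. The following is a minimal Python sketch of that check, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max) in pixels. The function name and the example boxes are illustrative and are not taken from the study's code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical annotated void and model prediction (pixel coordinates).
ground_truth = (100, 100, 200, 200)
prediction = (120, 110, 220, 210)

overlap = iou(ground_truth, prediction)
for threshold in (0.5, 0.75):
    # A detection is a true positive only if its IoU reaches the threshold.
    print(f"IoU {overlap:.4f} >= {threshold}: {overlap >= threshold}")
```

With these example boxes the overlap is 0.5625, so the same prediction is accepted at the 0.5 threshold and rejected at 0.75. This is why mAP at the stricter threshold is the lower of the two values.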

Author Contributions

S.-P.S. and T.H.M.L.: conceptualization, methodology, and writing (original draft). S.-Y.L., S.-P.S. and T.H.M.L.: visualization, investigation, and writing (review and editing). S.-Y.L. and T.H.M.L.: data curation and software. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data will be provided upon request.

Acknowledgments

This study was supported by a research grant from the Ministry of Land, Infrastructure and Transport (20250167-006). We also acknowledge Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam, for supporting this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. GPR testing machine.

Figure 2. General EfficientDet model.

Figure 3. Performance in detecting voids in GPR images.

Figure 4. Performance across different pavement types.

Figure 5. Performance across different weather conditions.

Figure 6. The performance of the EfficientDet-D3 model against other state-of-the-art object detection models, including YOLOv8 and Mask R-CNN.

Figure 7. Examples of void detection performance using the (a) Mask R-CNN, (b) YOLOv8, and (c) EfficientDet-D3 models on clear GPR images captured using the GPR device.

Figure 8. Training and validation loss analysis of object detection models.

Figure 9. Comparison of models for pavement void detection.

Figure 10. Threshold sensitivity results.

Figure 11. Effect of dataset size on model performance.

Figure 12. Performance across geographic locations.

Figure 13. GPR image analysis and field validation in Seoul.

Table 1. Summary of cavity types and their GPR reflection characteristics.

Cavity Type	Size Range (m)	Typical Shape	GPR Reflection Intensity
Small	0.1–0.5	Round/Irregular	Low Reflection
Medium	0.5–2.0	Irregular	Very Low Reflection
Large	2.0–5.0	Irregular/Long	Extremely Low Reflection

Table 2. Summary of GPR dataset sources and pavement types.

Source	Number of Images	Pavement Type	Condition
Field Surveys	1500	Asphalt, Concrete	Good, Damaged
Public Databases	1200	Asphalt, Composite	Variable
Experimental Studies	800	Concrete, Gravel	Voided, Non-voided
Synthetic Data	500	All	Simulated

Table 3. Annotation statistics by pavement type.

Pavement Type	Total Annotated Images	Void Instances Identified	Annotation Accuracy (%)
Asphalt	2000	1150	95.2
Concrete	1200	900	96.8

Table 4. Data augmentation techniques and their impact.

Augmentation Technique	Increase in Data Volume (%)	Improvement in Model Accuracy (%)
Rotation	300	+7.5
Flipping	200	+6.8
Contrast Adjustment	100	+4.2
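
The table names the augmentation techniques but not their implementation. As a rough Python sketch, assuming a grayscale GPR image stored as a list of rows of 0 to 255 intensities, the three techniques could look like the following. The function names, the 90-degree rotation step, and the contrast factor are illustrative assumptions, not the study's settings. Generating three rotated copies, two flipped copies, and one contrast-adjusted copy per image is one possible reading of the 300%, 200%, and 100% volume increases, and the table does not state it.

```python
def rotate_90(image):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows."""
    return [list(row) for row in image[::-1]]


def adjust_contrast(image, factor=1.5, midpoint=128):
    """Scale intensities about a midpoint and clip to the 0-255 range."""
    return [
        [min(255, max(0, round(midpoint + factor * (px - midpoint)))) for px in row]
        for row in image
    ]


def augment(image):
    """Return six augmented copies: 3 rotations, 2 flips, 1 contrast change."""
    r90 = rotate_90(image)
    r180 = rotate_90(r90)
    r270 = rotate_90(r180)
    return [r90, r180, r270,
            flip_horizontal(image), flip_vertical(image),
            adjust_contrast(image)]


# Tiny hypothetical 2 x 2 image used only to exercise the functions.
sample = [[0, 64], [128, 255]]
copies = augment(sample)
print(len(copies), "augmented copies per input image")
```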

Table 5. Impact of input resolution on detection performance.

Input Resolution	Precision (%)	Recall (%)	Processing Time (ms)
768 × 768	85.3	82.5	45
1024 × 1024	90.2	87.4	68
1280 × 1280	91.5	89.1	95

Table 6. Modified feature extraction layers in EfficientDet-D3.

Layer Type	Original Configuration	Modified Configuration
BiFPN Depth	3	4
Atrous Convolution Layers	None	2
SE Blocks	Standard	Enhanced
Channels per Layer	64	128

Table 7. Effect of class balancing strategies on model accuracy.

Strategy	Precision (%)	Recall (%)	F1-Score (%)
No Balancing	80.1	72.3	76.0
Weighted Loss Function	87.5	85.0	86.2
Data Oversampling	89.2	87.1	88.1
Combined Approaches	91.0	89.5	90.2
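
The F1-scores in this table follow from the precision and recall columns as their harmonic mean, F1 = 2PR / (P + R). The short Python check below recomputes them. The row values are copied from the table, and the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (strategy, precision %, recall %, reported F1 %) as listed in Table 7.
table_7 = [
    ("No Balancing", 80.1, 72.3, 76.0),
    ("Weighted Loss Function", 87.5, 85.0, 86.2),
    ("Data Oversampling", 89.2, 87.1, 88.1),
    ("Combined Approaches", 91.0, 89.5, 90.2),
]

for strategy, precision, recall, reported in table_7:
    computed = f1_score(precision, recall)
    print(f"{strategy}: computed {computed:.1f}, reported {reported}")
```

Each recomputed value agrees with the reported F1-score to one decimal place.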

Table 8. Impact of hyperparameter variations on model performance.

Batch Size	Learning Rate	Precision (%)	Recall (%)	F1-Score (%)
8	0.001	86.5	83.2	84.8
16	0.0005	91.2	87.5	89.3
32	0.0001	89.7	85.1	87.3

Table 9. Hyperparameter tuning results.

Model Name	Learning Rate	Batch Size	Epochs	Optimizer	Best Validation Loss	Training Time (h)
EfficientDet-D3	0.0005	16	100	Adam	0.095	12.5
YOLOv8	0.001	32	150	SGD	0.102	14.8
Mask R-CNN	0.0003	8	120	Adam	0.11	18.3

Table 10. Model inference speed comparison.

Model Name	GPU Utilization (%)	Memory Usage (GB)	Power Consumption (W)	Inference Speed (fps)
EfficientDet-D3	85	7.5	250	15
YOLOv8	90	6.8	280	20
Mask R-CNN	95	9.2	300	8
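
The conclusions report inference time in milliseconds per image (68, 55, and 120 ms), whereas this table reports frames per second. Assuming images are processed one at a time with no batching or pipeline overlap (an assumption, not something the table states), the two are related by fps = 1000 / latency in ms. The Python sketch below applies that conversion. The function name is illustrative.

```python
def latency_ms_to_fps(latency_ms):
    """Convert per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# model: (inference time in ms from the text, fps listed in Table 10)
reported = {
    "EfficientDet-D3": (68, 15),
    "YOLOv8": (55, 20),
    "Mask R-CNN": (120, 8),
}

for model, (latency_ms, table_fps) in reported.items():
    fps = latency_ms_to_fps(latency_ms)
    print(f"{model}: {latency_ms} ms -> {fps:.1f} fps (Table 10: {table_fps} fps)")
```

Under this assumption the conversion gives about 14.7, 18.2, and 8.3 fps, which matches the table only approximately. The tabulated values may be rounded or measured under a different setup.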

Table 11. Size distribution of cavities in the dataset.

Cavity Size (m)	Share of Dataset (%)	Typical Pavement Type	Detection Accuracy (%)
0.1–0.5	15	Asphalt, Concrete	82.3
0.5–2.0	30	Asphalt, Concrete	81.2
2.0–5.0	35	Asphalt, Concrete	89.8

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Shin, S.-P.; Lee, S.-Y.; Le, T.H.M. Feasibility of EfficientDet-D3 for Accurate and Efficient Void Detection in GPR Images. Infrastructures 2025, 10, 140. https://doi.org/10.3390/infrastructures10060140
