Application of Combining YOLO Models and 3D GPR Images in Road Detection and Maintenance

Abstract: Improving the detection efficiency and maintenance benefits is one of the greatest challenges in road testing and maintenance. To address this problem, this paper presents a method for combining the you only look once (YOLO) series with 3D ground-penetrating radar (GPR) images to recognize the internal defects in asphalt pavement and compares the effectiveness of traditional detection and GPR detection by evaluating the maintenance benefits. First, traditional detection is conducted to survey and summarize the surface conditions of the tested roads, which lacks internal information. Therefore, GPR detection is implemented to acquire images of concealed defects. Then, the YOLOv5 model with the most even performance among the six selected models is applied to achieve rapid identification of road defects. Finally, the maintenance programs based on these two detection methods are evaluated from economic and environmental perspectives. The results demonstrate that, based on GPR detection, the economic scores are improved and the maintenance cost is reduced by $49,398/km; the energy consumption and carbon emissions are reduced by 792,106 MJ/km (16.94%) and 56,289 kg/km (16.91%), respectively, all of which indicates the effectiveness of 3D GPR in pavement detection and maintenance.


Introduction
In core sample detection, the quality parameters of the pavement structural layers are obtained through reasonable point setting, on-site core drilling, and laboratory testing. However, the inspection results cannot reflect the true condition of the road because the sampling points are random and incidental [1,2]. In addition, the defect conditions of the road surface are acquired by manual patrol and judgement, which cannot detect internal defects. These methods are inefficient, poorly representative, and destructive, which has led to a considerable increase in the cost of road maintenance. Thus, the traditional testing methods fail to meet the growing demands of road maintenance.
With the development of science and technology, new nondestructive testing (NDT) devices, such as ground-penetrating radar (GPR), the nuclear-free densitometer, the laser detector, and the ultrasonic depth finder, have been used for fast, precise, nondestructive testing. GPR is already well recognized for its efficiency, safety, and resistance to interference [3,4]. Radar-collected data can provide the basis for recognizing hidden defects and can guide the later maintenance and management of roads [5]. The development of 3D GPR further reinforces these advantages [6]. Nevertheless, this technology has limitations, such as tedious data post-processing and a lack of evaluation criteria, which have prevented automatic detection and quantitative evaluation in road testing and maintenance [7].
Recently, several efforts have been made on the data processing of GPR inspection, which includes signal processing and image recognition. Zhao et al. [8] proposed a nonlinear optimization method based on gradient descent to analyze the GPR signals collected for thickness detection of asphalt pavement, which requires prior knowledge of the road structure. Liu et al. [9] used the frequency-domain focusing technology of synthetic aperture radar (SAR) to aggregate scattered GPR signals into testing images. The noise in the raw signals was removed with a designed low-pass filter, and the profiles of the detected objects were extracted via an edge detection technique using the background information. Moreover, Mezgeen et al. [10] presented a formula relating the hidden crack width to the relative amplitude measured at the vertex of the hyperbola. However, a major drawback is that this research only considered regular single cracks.
As for image recognition in GPR detection, many researchers have tried to replace complex manual processes with automatic inspection of internal road defects, but this goal has been difficult to realize [11,12]. It was not until the appearance of deep learning (DL) that truly efficient, automatic detection of concealed defects in asphalt pavement became possible [13,14]. As a result, the combination of deep convolutional neural network (CNN) models and GPR images has become a mainstream research direction. Tong et al. [15][16][17] used a CNN algorithm to achieve the automatic localization of internal cracking based on GPR testing images, with the GPR signals serving as the input to the CNNs.
However, although region-proposal-based CNN models have the advantage of high accuracy, their limited detection speed has been reported. This limitation has promoted the development of more advanced DL models. The regression-based approach (also known as the one-stage method) substantially enhances the speed of defect detection by streamlining the workflow. This approach primarily includes YOLO [18][19][20], RetinaNet [21], the single shot multibox detector (SSD) [22,23], and CenterNet [24]. Among these, YOLO version 3 (YOLOv3) is a mainstream method that has been widely used in remote sensing [25,26], agriculture [27], and energy [28]. It has also been successfully applied to transportation infrastructure, e.g., for the detection of pavement potholes and cracking [28][29][30]. More recently, YOLO version 4 (YOLOv4) [31] and YOLO version 5 (YOLOv5) [32] have become more effective for object detection by integrating the most advanced methods.
On the other hand, researchers have performed many studies on the standardization of road testing and maintenance [33,34]. The group standard Technical Guidelines for Ground-Penetrating Radar Detection of the Internal Condition of Highway Asphalt Pavement has been published by the China Highway and Transportation Society (CHTS) [35], which provides a scientific reference for future exploration. However, such efforts remain few. Therefore, in the present study, we developed a method for evaluating the maintenance benefits by comparing traditional detection and GPR detection of asphalt pavement.
This work proposes a method for combining the YOLO series with GPR images to recognize internal defects in asphalt pavement and compares the effectiveness of traditional detection and GPR detection by evaluating the maintenance benefits. The technical roadmap is shown in Figure 1. Section 2 introduces the tested roads and the traditional detection method. Section 3 elaborates the GPR detection process, including the testing equipment, testing scheme, data processing, and testing results; the YOLOv3 and YOLOv5 models are applied to defect detection for better accuracy and efficiency. Section 4 discusses the maintenance programs and maintenance benefits based on the two detection methods. Section 5 concludes the research.
Figure 2 shows the tested provincial road section, the Tonglu-Yiwu (TY) line (S210), located in Zhejiang Province, China. Traditional and GPR inspections were carried out on this asphalt pavement from K46+000 to K51+000, a total length of 5 km. The structure layers, materials, and position of the tested road are indicated below.

Testing Process and Results
As shown in Figure 3, testing personnel used visual surveying and measurement to determine the damage condition of the pavement, and an inspection van was used to survey the surface roughness and skid resistance of the pavement. Moreover, a coring survey was conducted to obtain an accurate thickness of the asphalt pavement according to the highway performance assessment standards (JTG 5210-2018) and the specifications for maintenance design of highway asphalt pavement (JTG 5421-2019), both of which were enacted by the Ministry of Transport of the People's Republic of China. The results of the pavement defect investigation obtained from these detections and observations are shown in Table 1.
Note to Table 1: l, left side; r, right side; N, number; L, length (m); D, density (N/m); A, area (m²).

Nondestructive Testing of Pavement Based on GPR
3D GPR is a new type of nondestructive testing equipment, and its testing work will not damage the pavement. 3D GPR emits penetrating high-frequency electromagnetic waves to the pavement structure through the fixed distance transmitting antenna and receives the directional reflection signals by the paired receiving antenna. Then, through data processing and analysis of the radar host, the 3D detection information of the pavement structure is reconstructed in the computer.

Testing Equipment
The 3D GPR system (3d-Radar Company, Trondheim, Norway) was used to inspect the internal damage of the road, which substantially reduced the misjudgment rate of interior conditions associated with 2D imaging. The system comprised the GeoScope™ MKIV radar host (Figure 4a), multi-channel DXG™ 1820 ground-coupled antenna arrays (Figure 4b), Examiner™ 3 data analysis software, and GPS-RTK equipment (Figure 4c). The GeoScope™ MKIV enables high-density, high-speed data acquisition while combining deep detection capability with high resolution. By optimizing the signal bandwidth for the best possible resolution, high-speed surveying and a large scan width can be realized without losing image detail at different depth layers underground. The multi-channel DXG™ 1820 ground-coupled antenna arrays offer high resolution and can collect 3D GPR data from up to 41 survey lines in a single pass over a continuous frequency range of 200 MHz to 3 GHz. In addition, this DXG™ antenna detects road conditions from the surface to a depth of 3 m, which is well suited to the detection requirements of highway subgrade and pavement.

Combined with the unique ability of the stepped-frequency GeoScope™ MKIV radar host and VX series antennas to collect 3D radar data at a given scan line density, true 3D radar data processing is realized. As shown in Figure 4d, these antenna arrays combine different transmitting/receiving antenna pairs, allowing the user to collect multiple channels of data at once. With appropriate settings, the user can collect data on a 7.5 cm × 7.5 cm grid (covering a 1.5 m scanning width) to obtain a true 3D image. The remaining technical parameters are shown in Table 2.

Testing Scheme
3D GPR was adopted in this research to realize a full scan covering the road cross-section of the TY line (S210). Based on the stake number, horizon, area, volume, and width of the characteristic signals of internal road defects, the details of several typical defects were detected, including subsidence of the internal road structure (position, maximum height difference, and area), poor interlayer bonding (position and area), general transverse cracking (position and length), general longitudinal cracking (position and length), penetrating cracking (position and length), water-rich zones (position and area), void zones (position and volume), and loose zones (position and degree). The information on the tested road section is shown in Table 3. As shown in Figure 5, the 3D GPR detection was conducted lane by lane and covered all lanes. The proprietor arranged for vehicles to follow the inspection van during the detection process, which ensured the safety of the detection personnel and equipment. Where conditions permit, the road should be closed to traffic during detection in accordance with the Safety Work Rules for Highway Maintenance, JTG H30-2015 (Ministry of Transport of the People's Republic of China).

Data Processing
The construction of a deep-learning-based model for identifying internal road defects requires a 3D GPR image dataset to provide the training, validation, and test sets needed for model construction. This process followed the steps in Figure 6.

Filtering for GPR Data
After GPR data acquisition, augmentation and filtering of the images were performed. Using the GPR system's data processing software (Examiner™ 3), the inverse discrete Fourier transform (ISDFT), data autoscale, and background removal (BGR) (high pass) were applied. The specific settings of the filtering parameters are shown in Figure 6.
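The two filtering operations named above can be illustrated with a minimal NumPy sketch. It is not the Examiner implementation, whose internals are not described here. It assumes that background removal is done by mean-trace subtraction (one common form of it) and that the frequency-to-time conversion is a plain inverse DFT along the sample axis. The function names `background_removal` and `to_time_domain` and the synthetic data are hypothetical.

```python
import numpy as np

def background_removal(bscan: np.ndarray) -> np.ndarray:
    """Subtract the mean trace from every trace of a B-scan.

    bscan has shape (n_samples, n_traces): rows are depth/time samples,
    columns are traces along the survey line. Features that are identical
    in all traces (flat layer interfaces, antenna ringing) are suppressed,
    while local reflections remain.
    """
    mean_trace = bscan.mean(axis=1, keepdims=True)
    return bscan - mean_trace

def to_time_domain(freq_traces: np.ndarray) -> np.ndarray:
    """Inverse DFT along axis 0 (ASSUMPTION: a plain inverse DFT)."""
    return np.fft.ifft(freq_traces, axis=0)

# Synthetic example: a flat interface in row 2 and a local target at (5, 3).
bscan = np.zeros((8, 5))
bscan[2, :] = 5.0
bscan[5, 3] = 2.0
cleaned = background_removal(bscan)
```

In this toy example the flat interface is removed entirely, whereas the local target stays clearly above its surroundings.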

Recognizing for GPR Data
Cracking, voids, and settlement are the three main defects to be classified and identified in this research. However, settlement was not included in the identification model because its scale is much larger than that of the other two defects and its characteristics are distinctive. Therefore, according to the Technical Guideline for Ground Penetrating Radar Detection for Internal Conditions of Highway Asphalt Pavement promulgated by the China Highway and Transportation Society, the criteria for judging cracking and voids were determined by summarizing the B-scan and C-scan features of these two defects in Table 4.

Capturing for GPR Data
The B-scan images were chosen as the input images because they reflect the most basic features of internal defects and their exact location as detected by GPR. In addition, these images are highly distinguishable, which makes the defects easier to recognize. The resolution of the captured images was 320 × 320 pixels, and the real size covered by each B-scan image was 0.5 m × 13.2 m.
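The image size and physical extent given above fix the scale of one pixel, which is what allows a detected box to be mapped back to a position in the road. The sketch below works this out under an assumption that is not stated in the text: that the 13.2 m extent runs along the horizontal image axis (distance along the road) and the 0.5 m extent along the vertical axis (depth). The names are hypothetical.

```python
IMAGE_SIZE_PX = (320, 320)      # (width, height), as stated in the text
BSCAN_EXTENT_M = (13.2, 0.5)    # ASSUMPTION: (along-road distance, depth)

def pixel_to_metres(x_px: float, y_px: float) -> tuple[float, float]:
    """Convert a pixel coordinate to (distance along road, depth) in metres."""
    metres_per_px_x = BSCAN_EXTENT_M[0] / IMAGE_SIZE_PX[0]   # 0.04125 m
    metres_per_px_y = BSCAN_EXTENT_M[1] / IMAGE_SIZE_PX[1]   # about 0.0016 m
    return x_px * metres_per_px_x, y_px * metres_per_px_y

# The centre of the image corresponds to the middle of the captured window.
distance_m, depth_m = pixel_to_metres(160, 160)
```

Under this assumption, one pixel spans about 4.1 cm along the road and about 1.6 mm in depth.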

Labeling for GPR Data
LabelImg labeling software [37] was used to mark hidden cracking in the captured images. Based on the identification method of Section 3.3.2 (the void defect was identified manually because the number of samples was too small), the hyperbolic reflection wave in the B-scan and the long strip in the C-scan were used to mark the hidden cracking with rectangular boxes.
Then, the corresponding annotation information for each box was stored in an XML-formatted file, as shown at the bottom of Figure 6. The annotation includes the coordinates of two points on the diagonal of the rectangular box, which reflect the location and size of the selected cracking.
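As an illustration of how such annotation files can be read back, the following sketch parses a Pascal VOC style XML document, the format LabelImg writes by default, and recovers the location and size of each box from its two diagonal corners. The sample file name, the class label `crack`, and the box coordinates are made up for the example and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout produced by LabelImg.
SAMPLE_XML = """<annotation>
  <filename>bscan_0001.png</filename>
  <size><width>320</width><height>320</height><depth>3</depth></size>
  <object>
    <name>crack</name>
    <bndbox><xmin>40</xmin><ymin>60</ymin><xmax>90</xmax><ymax>140</ymax></bndbox>
  </object>
</annotation>"""

def read_boxes(xml_text: str) -> list[dict]:
    """Return one dict per labeled box with its corners, width, and height."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append({
            "label": obj.findtext("name"),
            "xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax,
            "width": xmax - xmin,     # box size follows from the two corners
            "height": ymax - ymin,
        })
    return boxes

boxes = read_boxes(SAMPLE_XML)
```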
Next, based on the number of images captured in our earlier research, 350 sample images were labeled, containing a total of 1400 concealed cracks. These samples were then randomly assigned to three groups in a ratio of roughly 6:1:1: the training set (263 images and 1134 cracks), the validation set (44 images and 135 cracks), and the test set (43 images and 131 cracks).
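A random split with these subset sizes can be sketched as follows. The shuffling procedure, the seed, and the image identifiers are assumptions for illustration; the text states only that the assignment was random and gives the resulting counts.

```python
import random

def split_dataset(items, n_train=263, n_val=44, seed=0):
    """Shuffle the items and cut them into training, validation, and test sets.

    The test set receives whatever remains after the first two subsets.
    The seed is arbitrary and only makes the split reproducible.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical identifiers standing in for the 350 labeled images.
image_ids = [f"img_{i:03d}" for i in range(350)]
train, val, test = split_dataset(image_ids)
```

Splitting by image rather than by crack keeps all boxes of one image in the same subset, which is why the crack counts per subset are not exactly proportional to the image counts.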

Testing Results
The workflow of the YOLO-based detection method is shown in Figure 7. YOLOv3 is well known as an advanced one-stage detection network. Although the updated YOLOv5 introduces new features to increase detection efficiency, YOLOv5 and YOLOv3 still share a similar detection principle and network architecture. In brief, YOLOv5 applies more recent techniques to update the Backbone and Neck of YOLOv3, and additional training techniques are also introduced. Detailed information is shown in Table 5.
This study compared six models from two YOLO versions, namely, YOLOv3, YOLOv3-tiny, YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x [38]. YOLOv3-tiny is a lightweight variant of YOLOv3 that substantially increases the detection speed at the cost of accuracy. Note that the suffixes s, m, l, and x appended to YOLOv5 denote increasing model depth.
YOLOv3 predicts an objectness score for each bounding box using logistic regression. For the bounding box regression loss, intersection over union (IoU) [39] is the most popular metric. YOLOv5 uses the same backbone as YOLOv3 and utilizes generalized IoU (GIoU) to estimate the bounding box loss. In addition, it uses auto-learning bounding box anchors to adjust and optimize the choice of anchors.
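To make the difference between the two box metrics concrete, the sketch below computes IoU and GIoU for a pair of axis-aligned boxes using their standard definitions. It is an illustration only and is not taken from the YOLOv3 or YOLOv5 code; the function name and the example boxes are hypothetical.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2, so both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap rectangle (zero width or height if the boxes do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # GIoU subtracts the share of the enclosing box not covered by the union.
    giou = iou - (hull - union) / hull
    return iou, giou

overlap_iou, overlap_giou = iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3))
apart_iou, apart_giou = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))
```

For the two disjoint boxes, IoU is zero regardless of how far apart they are, whereas GIoU is negative and decreases with distance. A loss of the form 1 - GIoU therefore still provides a useful signal when a predicted box does not yet overlap its target.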
The network has a relatively large number of parameters and the dataset is small, which could result in overfitting. Therefore, transfer learning was adopted to train the models and mitigate this risk [40]. The COCO dataset includes over 500,000 image data points belonging to 80 different categories. Consequently, weights pretrained on the COCO dataset were used to initialize the models. The other hyperparameters were set as follows: the initial learning rate was 0.001; the batch and mini-batch sizes were 16 and 4, respectively; the momentum and weight decay were 0.9 and 0.0005, respectively; the number of epochs was 300; and the remaining parameters were kept at their default values.
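The stated hyperparameters imply a rough training schedule, which the following sketch works out for the 263 training images reported earlier. Two points are assumptions rather than statements from the text: that the mini-batch size means each batch of 16 is processed as sub-batches of 4 before one weight update, and that the last, incomplete batch of an epoch is kept. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameter values as reported in the text."""
    initial_lr: float = 0.001
    batch_size: int = 16
    mini_batch_size: int = 4
    momentum: float = 0.9
    weight_decay: float = 0.0005
    epochs: int = 300

def schedule(cfg: TrainConfig, n_train_images: int) -> dict:
    """Derive update counts (ASSUMPTION: the final partial batch is kept)."""
    updates_per_epoch = math.ceil(n_train_images / cfg.batch_size)
    return {
        "mini_batches_per_update": cfg.batch_size // cfg.mini_batch_size,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": updates_per_epoch * cfg.epochs,
    }

plan = schedule(TrainConfig(), n_train_images=263)
```

Under these assumptions, each epoch consists of 17 weight updates, giving 5100 updates over the 300 epochs.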
As shown in Figure 8, the loss and mAP curves of the YOLO models were compared. The loss value represents the difference between the predicted value and the true value: the smaller the loss, the better the training effect. Likewise, a high mAP denotes good performance of the trained model.
According to Figure 8a,b, the final converged loss of YOLOv3 was approximately 2, whereas that of YOLOv5 was lower than 0.2, which suggests that the YOLOv5 models were trained substantially better than YOLOv3. Moreover, all mAP values of YOLOv5 were higher than those of YOLOv3. Taken together, we concluded that the performance of the YOLOv5 models was superior.
The specific training results of the YOLOv3 and YOLOv5 models are summarized in Table 6. All mAP values of the YOLOv5m, YOLOv5l, and YOLOv5x models were higher than 90% (the highest value was 94.45%), which is commendable for the small training set.
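Because only one class (cracking) is detected by the model, mAP here reduces to the average precision (AP) of that class. The sketch below shows one common way of computing AP, as the area under the precision-recall curve after making precision monotonically non-increasing. It is an illustration under assumptions: the exact evaluation protocol, including the IoU threshold used to decide whether a detection counts as a true positive, is not given in the text, and the example detections are invented.

```python
def average_precision(is_true_positive, n_ground_truth):
    """Area under the precision-recall curve for one class.

    is_true_positive: one boolean per detection, sorted by descending
    confidence. n_ground_truth: number of labeled objects of the class.
    """
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_ground_truth)

    # Replace each precision by the best precision at equal or higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles between consecutive recall values.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap

# Four detections, the third a false alarm, against four labeled cracks.
example_ap = average_precision([True, True, False, True], n_ground_truth=4)
```

In the example, one crack is never detected, so recall stops at 0.75 and the AP cannot exceed that value even though most detections are correct.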
Another finding is that the higher the number of weights, the higher the model's mAP, which suggests that an appropriate increase in model depth improves training performance. However, as the weights increased, the frames per second (FPS) decreased and the inference time was prolonged. The FPS values of the YOLOv3 and YOLOv5 models differed little when their weights were of similar size. Because rapid detection was required, models with faster inference were preferred. Ultimately, the YOLOv5m model, which offered the most even performance in terms of mAP and FPS, was used to detect internal defects in the road. Based on the results obtained with YOLOv5m, the statistical information of the defects is listed in Tables 7 and 8 (raveling and settlement were recognized manually).
A schematic of the position and size of the defects is plotted in Figure 9 based on the recognition results (a detailed analysis is given in Section 4.1).

Traditional Detection
Traditional detection showed that the dominant defect types of the tested road section were cracking and settlement on the road surface. Because this method gave no clear information on the internal defects, the maintenance measures were applied to both the surface and the basement of the tested road.

GPR Detection
The main defect types of the tested road section were cracking (more than 90%), void zones, and raveling. Therefore, the analysis focused on the characteristics of the cracking. First, the overall cracking density of the roads proposed for maintenance was low. Specifically, on the left side of the tested road, the cracking density was 1.5 m/m² at the surface and 1.0 m/m² at the basement, and there were 9 void defects (total area 27 m²). On the right side, the cracking density was 0.8 m/m² at the surface and 0.6 m/m² at the basement, and there were 13 void defects (total area 52 m²).
As for the development horizon of the cracking, there were three types, as shown in Figure 10.

1. The up and down cracking (the pumping defect had emerged, Figure 10a).
2. The top-down developing cracking (the cracking had emerged on the surface but not at the basement, Figure 10b).
3. The bottom-up developing cracking (the cracking had emerged on the basement but not at the surface, Figure 10c).

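The three development types listed above can be expressed as a small decision rule. This is a sketch rather than the authors' procedure: the names `CrackType` and `classify_crack` are hypothetical, and it assumes that the up and down cracking (type 1) is the case in which the crack is present at both the surface and the basement.

```python
from enum import Enum


class CrackType(Enum):
    UP_AND_DOWN = 1  # type 1, Figure 10a (assumed: present in both layers)
    TOP_DOWN = 2     # type 2, Figure 10b (surface only)
    BOTTOM_UP = 3    # type 3, Figure 10c (basement only)


def classify_crack(on_surface: bool, on_basement: bool) -> CrackType:
    """Map the layers in which a crack is observed to its development type."""
    if on_surface and on_basement:
        return CrackType.UP_AND_DOWN
    if on_surface:
        return CrackType.TOP_DOWN
    if on_basement:
        return CrackType.BOTTOM_UP
    raise ValueError("crack must be observed in at least one layer")


print(classify_crack(on_surface=False, on_basement=True))
```

Under this reading, a bottom-up crack that later reaches the surface would be reclassified as up and down cracking, which matches the transition from the third type to the first type described in the text.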
From the perspective of the regional distribution of defects, the tested roads showed distinct local concentrations of defects. In general road sections, the basement had fewer defects than the other structural layers. In road sections with severe defects, on the other hand, the defects were concentrated at the surface and basement.
Finally, the prediction of the development of defects was conducted based on the results above. Notably, the third type of developing cracking would gradually undergo a transition to the first type with the arrival of freeze-thawing during rainy and winter seasons, which would lead to the appearance of more pumping mud.

Maintenance Program
As shown in Figure 11, two maintenance programs were determined according to defect severity. First, for the general road sections with low defect severity, the original surface (5 cm AC-13 and 7 cm AC-16) is milled, and the new surface (12 cm AC-13; a previous study demonstrated that the maintenance measure of AC-13 has the highest comprehensive benefit [41]) is spread on the basement (milling and resurfacing, MR). Second, for the road sections with severe defects, the defects are treated on the basement (overlay paving for reinforcement, OPR) after the surface milling and before the resurfacing.
Finally, the maintenance programs for the two detection methods were established from the analysis of defect characteristics and maintenance measures based on conventional detection and GPR detection. As presented in Figure 12, the MR measure was applied for surface maintenance under both detection methods. The basement, however, was treated differently. With conventional detection, the OPR measure was adopted for the full range of the basement, whereas with GPR detection it was adopted only for the 1450 m of basement with severe defects.

Benefits Analysis
The service life and pavement performance of maintenance measures have been used to evaluate the long-term benefits in many studies [42,43]. The present work drew on previous studies and used the economic and environmental benefits as evaluation criteria for comparison of traditional detection and GPR detection.
The fundamental assumptions for the calculation of benefits were as follows. The material haul lengths of asphalt, gravel, and asphalt mixture are 100, 60, and 50 km, respectively. The density of hot-mix asphalt mixture is 2.45 t/m³. The thickness of treatment is typically 4 cm. The unit maintenance area is 375 m² (100 m × 3.75 m, single lane).
In this study, the tested road was a two-way four-lane road of 5 km. According to Figure 12, the total area of the first and second maintenance measures was 75,000 m² based on traditional detection. As for GPR detection, the area of the first maintenance measure was 53,250 m² and that of the second was 21,750 m².
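The areas quoted above can be checked with a short calculation. This is a sketch under the assumption that the 1450 m treated with OPR under GPR detection spans all four lanes; the paper does not state this explicitly, but it reproduces the reported 21,750 m² exactly. The function name `treated_area_m2` is hypothetical.

```python
LANE_WIDTH_M = 3.75       # single-lane width from the unit-area assumption
LANES = 4                 # two-way four-lane road
ROAD_LENGTH_M = 5000      # 5 km tested road
OPR_LENGTH_GPR_M = 1450   # basement length with severe defects (GPR detection)
UNIT_AREA_M2 = 100 * LANE_WIDTH_M  # 375 m^2 per maintenance unit


def treated_area_m2(length_m, lanes=LANES, lane_width_m=LANE_WIDTH_M):
    """Area covered by a measure applied over `length_m` of road."""
    return length_m * lanes * lane_width_m


total_area = treated_area_m2(ROAD_LENGTH_M)       # traditional detection
second_area = treated_area_m2(OPR_LENGTH_GPR_M)   # assumed: all four lanes
first_area = total_area - second_area             # remaining area

print(total_area, first_area, second_area)
print(total_area / UNIT_AREA_M2)  # number of 375 m^2 maintenance units
```

The three areas come out as 75,000 m², 53,250 m², and 21,750 m², in agreement with the values read from Figure 12.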

Economic Benefits
The average cost and economic effectiveness (the evaluation index for the economic benefits of maintenance, obtained from pavement performance indexes) [44] of the MR and OPR measures are listed in Table 9. The actual thickness of treatment in the road surface was 0.12 m, which is three times the typical 4 cm thickness. Therefore, the final results in Figure 13 were obtained by multiplying by 3.

Figure 13. The contrast for maintenance cost (a) and economic effectiveness (b) of the two detecting methods.

Figure 13 shows that the maintenance cost based on GPR detection was lower than that of traditional detection; specifically, the cost was reduced by $49,398/km. In addition, the economic scores based on GPR detection were higher than those based on traditional detection in both low-traffic and high-traffic road sections.

Environmental Benefits
Table 10 lists the energy consumption and carbon emissions of the MR and OPR measures, including the milling, raw materials production, mixture, transport, spreading, and compaction stages. The contrast between the energy consumption and carbon emissions of the two detecting methods is shown in Figure 14. The energy consumption and carbon emissions based on GPR detection were lower than those based on traditional detection, reduced by 792,106 MJ/km (16.94%) and 56,289 kg/km (16.91%), respectively.
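To put the per-kilometre figures in context, the sketch below scales the reported reductions to the 5 km tested road and back-calculates the baseline implied by each percentage. Both steps are illustrative assumptions: the paper reports only the per-kilometre reductions and percentages, so the totals and baselines here are derived estimates, not reported results. The names `scale_to_road` and `implied_baseline` are hypothetical.

```python
ROAD_LENGTH_KM = 5  # length of the tested road

# Reductions per km reported for GPR detection versus traditional detection.
reduction_per_km = {"cost_usd": 49398, "energy_mj": 792106, "carbon_kg": 56289}

# Reported relative reductions (only given for energy and carbon).
relative_reduction = {"energy_mj": 0.1694, "carbon_kg": 0.1691}


def scale_to_road(per_km, length_km):
    """Total reduction over the road, assuming the per-km value is uniform."""
    return {key: value * length_km for key, value in per_km.items()}


def implied_baseline(reduction, fraction):
    """Baseline per km implied by an absolute and a relative reduction."""
    return reduction / fraction


totals = scale_to_road(reduction_per_km, ROAD_LENGTH_KM)
baselines = {
    key: implied_baseline(reduction_per_km[key], fraction)
    for key, fraction in relative_reduction.items()
}

for key, value in totals.items():
    print(f"total {key}: {value:,}")
for key, value in baselines.items():
    print(f"implied traditional-detection {key} per km: {value:,.0f}")
```

The implied baselines are sensitive to the rounding of the published percentages, so they should be read as order-of-magnitude checks only.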

Conclusions
This paper aims to improve the detection efficiency and increase the maintenance benefits by combining YOLO models and 3D GPR images of an asphalt road. The YOLOv5m model was selected for the rapid identification of road defects based on a comparison of six YOLO series models. The analysis of the economic and environmental benefits of maintaining the tested road shows the advantage of GPR detection. The conclusions can be summarized as follows:

1. The internal defects in asphalt pavement, including cracking, void zones, raveling, and settlement, were detected by 3D GPR, whereas the conventional method detected only the surface conditions. Furthermore, unlike coring validation, 3D GPR detection is nondestructive.
2. The final converged loss value of YOLOv3 was approximately 2, whereas that of YOLOv5 was lower than 0.2. Thus, the YOLOv5 models are suitable for the detection of internal defects in asphalt roads, and they provide good training results even with a small dataset. The mAP values of the YOLOv5m, YOLOv5l, and YOLOv5x models were higher than 90%, with a maximum of 94.45% for YOLOv5x. It was also found that the larger a model's weights are, the higher its mAP is, which suggests that an appropriate increase in model depth favors the training performance. Most importantly, of the six YOLO series models, YOLOv5m is the most balanced in terms of detection speed and actual performance.
3. In the evaluation of the economic benefits of the maintenance programs, the maintenance cost based on GPR detection was reduced by $49,398/km compared to that of traditional detection, and the economic scores based on GPR detection were higher than those of traditional detection in both low-traffic and high-traffic road sections. As for environmental benefits, the energy consumption and carbon emissions of the maintenance program based on GPR detection were lower than those of traditional detection by 792,106 MJ/km (16.94%) and 56,289 kg/km (16.91%), respectively.