To evaluate the effectiveness of the defect recognition framework—combining simulated images, style transfer, and deep detection models—we constructed a representative radar-image dataset featuring diverse defect characteristics. Given the scarcity of real radar images and the high cost of their annotation, we leveraged the flexibility of simulation to generate images with specific defect structures, despite the resulting domain gap relative to real data. We therefore assembled two separate datasets: one by extracting appropriate sections from field-collected radar scans, and the other by selecting images from simulation outputs. Both datasets satisfied the basic requirements for model training and ensured the smooth execution of our experiments.
3.2. Construction of the Simulated Dataset
Compared to real images, simulated images allowed flexible control over defect type, shape, and location, offering greater freedom in sample construction and effectively addressing the problem of limited real data. However, due to significant differences in imaging mechanisms and visual styles between simulated and real images, directly using simulated data for deep learning training often led to reduced recognition accuracy or even overfitting. To address this, structurally complete and parameter-defined defect simulations were first generated using ground penetrating radar Maxwell (GprMax 3.0). Then, Cycle-GAN was used to perform unsupervised style transfer from the simulated domain to the real-image domain. As a result, a controllable training dataset with high visual similarity to real images was constructed.
GprMax is an electromagnetic wave simulator based on the finite-difference time-domain (FDTD) method. Initially developed in 1996 [25], GprMax has been one of the most widely used simulation tools in the field of GPR over the past two decades [26,27]. GprMax supports several advanced features, such as modeling material anisotropy, expressing dielectric dispersion using multi-pole Debye functions, and simulating pavement structures with semi-empirical dielectric formulas and fractal geometric characteristics.
- (1) Simulation Based on GprMax 3.0
The simulated road was designed as a 5 m × 1 m rectangle. Various parameters for the media were configured. Following the pavement structure, three structural layers were defined: surface layer, base layer, and sub-base layer, with relative permittivity values of 4, 9, and 12, respectively [28]. The pavement structure, from top to bottom, consisted of a 0.15 m surface layer, a 0.6 m base layer, and a 0.25 m sub-base layer. During the GprMax simulation, a time window of 25 ns and an electromagnetic wave excitation frequency of 2 GHz were used. The transmitting–receiving pair moved along the simulated surface with a step size of 0.02 m. The dielectric constant of asphalt concrete was set to 6.5, while the dielectric constant of air was defined as 1. The simulation parameters are listed in Table 3.
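As a quick plausibility check on these settings, the sketch below estimates the two-way travel time through each layer and compares the total with the 25 ns time window. It assumes lossless, non-magnetic layers with the stated relative permittivities (4, 9, 12) and assumes the antenna pair traverses the full 5 m model length; neither assumption is stated in the text, and the constant and function names are illustrative.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, m/ns

# Layer thickness (m) and relative permittivity, top to bottom, as stated in the text.
LAYERS = [
    ("surface", 0.15, 4.0),
    ("base", 0.60, 9.0),
    ("sub-base", 0.25, 12.0),
]
MODEL_LENGTH_M = 5.0
TRACE_STEP_M = 0.02
TIME_WINDOW_NS = 25.0


def two_way_time_ns(thickness_m, eps_r):
    """Two-way travel time through one lossless, non-magnetic layer (assumption)."""
    velocity = C_M_PER_NS / math.sqrt(eps_r)
    return 2.0 * thickness_m / velocity


layer_times = {name: two_way_time_ns(d, eps) for name, d, eps in LAYERS}
total_time_ns = sum(layer_times.values())
# Upper bound on the trace count: assumes the antenna pair covers the full 5 m.
max_traces = round(MODEL_LENGTH_M / TRACE_STEP_M)

for name, t in layer_times.items():
    print(f"{name:9s} {t:6.2f} ns")
print(f"total     {total_time_ns:6.2f} ns (window {TIME_WINDOW_NS} ns)")
print(f"traces    <= {max_traces}")
```

Under these assumptions the reflection from the bottom of the sub-base arrives at roughly 19.8 ns, so the 25 ns window covers the full 1 m structure, and one B-scan contains at most 250 traces.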
- (2) Analysis of Simulation Results
The simulated results of various types of pavement defects are shown in Figure 7. Crack defects (Figure 7a) were simulated with varying depths and shapes. In the radar images, they appeared as high-amplitude vertical reflections with a characteristic inverted “U” shape, indicating enhanced signal reflections from the sub-base to the surface layer. To further validate the authenticity of these radar features, field coring was performed at the corresponding locations of the real GPR images, and the extracted cores confirmed the presence of structural cracking consistent with the radar response. Loose defects (Figure 7b) were modeled by introducing water- and air-filled voids within the structure, resulting in significant signal scattering. The radar images of loose materials exhibited noticeable variations in amplitude and signal delay, producing rough, mottled and discontinuous textures. Coring verification at these positions revealed loosened aggregate and voided areas, which aligned with the scattering characteristics observed in the radar images. Interlayer debonding defects (Figure 7c) were simulated by inserting thin air layers between structural layers. These generated enhanced interlayer reflections, typically appearing in radar images as distinct gap-like signals with high detectability. Corresponding field cores clearly showed separation between pavement layers, further confirming that the radar-identified features accurately represented interlayer debonding in the actual pavement structure.
To strengthen the empirical validation of the simulated radar signatures, field coring was conducted at radar-anomaly locations identified from the real GPR survey. The coring positions were determined according to the spatial coordinates/chainage of the anomalous reflections in the radar images, using the acquisition positioning information to ensure correspondence between radar responses and physical sampling locations. In total, 20 cores were extracted, including 7 for crack-like anomalies, 6 for loose materials anomalies, and 7 for interlayer debonding anomalies. Among these, 7/7, 5/6, and 6/7 anomalies were physically confirmed by the extracted cores, respectively. These results provide direct field evidence that the characteristic radar responses described in Figure 7 are associated with actual pavement structural defects.
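The confirmation counts above can be restated as per-class and overall rates. The short sketch below only re-expresses the numbers reported in the text; the variable names are illustrative.

```python
# (anomaly type, cores extracted, cores confirming the defect), as reported in the text
CORES = [
    ("crack", 7, 7),
    ("loose material", 6, 5),
    ("interlayer debonding", 7, 6),
]

total_cores = sum(n for _, n, _ in CORES)
total_confirmed = sum(k for _, _, k in CORES)
overall_rate = total_confirmed / total_cores

for name, n, k in CORES:
    print(f"{name:22s} {k}/{n} = {k / n:.1%}")
print(f"{'overall':22s} {total_confirmed}/{total_cores} = {overall_rate:.1%}")
```

With 18 of 20 cores confirming the radar interpretation, the overall confirmation rate is 90%, although the per-class samples (6 to 7 cores each) are small.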
Simulated radar data were generated using the GprMax 3.0 software. To ensure consistency across the simulations, radar images of various pavement defects (cracks at different depths, interlayer debonding, and loose materials) were created based on the predefined pavement structure. A batch simulation process initially produced 100 radar images. These images were subsequently expanded through data augmentation, resulting in a total of 10,000 simulated radar images for training and analysis. Before data augmentation and Cycle-GAN translation, the initial 100 GprMax-simulated radar images already contained basic structural diversity rather than repeated instances of a single defect template. Specifically, the simulated defect set included variations in defect size, burial depth, geometric shape, and spatial position within the pavement structure. Representative examples of these initial simulated defect configurations are shown in Figure 8.
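The text does not name the augmentation operations used to expand 100 images to 10,000. The sketch below is therefore purely illustrative: it assumes an even split of 100 variants per base image and uses three hypothetical operations (horizontal flip, amplitude gain, additive Gaussian noise) on B-scans stored as lists of rows.

```python
import random

VARIANTS_PER_IMAGE = 100  # 100 base images x 100 variants = 10,000 (assumed split)


def augment(image, rng):
    """Return one randomly perturbed copy of a B-scan given as a list of rows.

    The three operations (horizontal flip, amplitude gain, additive noise)
    are illustrative assumptions; the text does not name the ones used.
    """
    flip = rng.random() < 0.5
    gain = rng.uniform(0.8, 1.2)
    out = []
    for row in image:
        new_row = row[::-1] if flip else list(row)
        out.append([gain * v + rng.gauss(0.0, 0.01) for v in new_row])
    return out


def build_augmented_set(base_images, seed=0):
    """Generate VARIANTS_PER_IMAGE perturbed copies of every base image."""
    rng = random.Random(seed)
    return [augment(img, rng) for img in base_images for _ in range(VARIANTS_PER_IMAGE)]


# Tiny stand-in data: 100 "images" of 4 x 6 samples each.
base = [[[float(r + c) for c in range(6)] for r in range(4)] for _ in range(100)]
augmented = build_augmented_set(base)
print(len(augmented))
```

In a detection setting, any geometric operation such as the flip would also have to be applied to the defect annotations so that labels stay aligned with the images.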
- (3) Radar Image Generation Based on Adversarial Neural Networks
The Cycle-GAN model was used to generate radar images, aiming to achieve mutual conversion between simulated and real radar images [29]. This approach enables effective mapping between image domains without the need for paired image data.
(1) Generators and Discriminators
Cycle-GAN consists of two generators: G1, which transforms simulated radar images into real-style ones, and G2, which converts real radar images into simulated-style ones. Each generator comprises three main components: an encoder, a transformer, and a decoder. The encoder progressively extracts low-dimensional representations from the input images, capturing their semantic features. The transformer performs feature mapping and spatial transformations to map features from one domain to the other. The decoder reconstructs high-resolution images in the target domain from the transformed feature representations [30,31]. The structure of the generator is shown in Figure 9.
The Cycle-GAN architecture included two discriminators, each designed to distinguish between real images and those generated by the corresponding generators. Discriminator D1 evaluated the differences between the input image and real radar images, and output the probability that the input belonged to the real radar image domain, as illustrated in Figure 10. The training objective of the discriminator was to classify real images as real and generated images as fake, thereby enabling adversarial training against the generator.
(2) Loss Functions
The loss functions of Cycle-GAN consist of adversarial loss, cycle consistency loss, and identity loss. The adversarial loss encourages the generator to produce realistic transformed images capable of deceiving the discriminator. The cycle consistency loss enforces consistency in domain transformations by minimizing the difference between the original input and its reconstruction after two transformations. The identity loss ensures that the generator preserves the identity features of the input image. The weighted sum of these loss functions constitutes the total loss function. The generator’s objective is to minimize this total loss, while each discriminator is trained to minimize its own classification error on real versus generated images. By optimizing these loss functions, Cycle-GAN can train high-quality image transformation generators.
Adversarial loss in Cycle-GAN is the loss function that governs the adversarial interaction between the generator and the discriminator, aiming to ensure that the generator produces realistic samples. In adversarial learning, the generator’s objective is to create synthetic samples that are indistinguishable from real ones, while the discriminator’s goal is to accurately differentiate between real and generated samples. The adversarial loss drives the generator to produce realistic samples by minimizing the negative logarithm of the probability that the discriminator classifies the generated samples as real. Specifically, for a generator G and discriminator D, the adversarial loss can be defined in terms of the probabilities that D assigns to real images and to the images generated by G, as follows:
where G1 and G2 represent the forward and backward generators, while D1 and D2 represent the discriminators that distinguish between real radar images and simulated radar images. X and Y denote the domains of simulated radar images and real radar images, with x ∈ X and y ∈ Y representing simulated radar images and real radar images, respectively.
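The adversarial-loss equation itself does not appear in the text above. For reference, the standard log-likelihood form from the general Cycle-GAN literature, written with the symbols defined here, would read as follows. The pairing of D1 with the real-image domain Y follows the description given with Figure 10; the pairing of D2 with the simulated domain X is an assumption.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{GAN}}(G_1, D_1, X, Y) &= \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\log D_1(y)\big]
  + \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log\big(1 - D_1(G_1(x))\big)\big], \\
\mathcal{L}_{\mathrm{GAN}}(G_2, D_2, Y, X) &= \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D_2(x)\big]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\log\big(1 - D_2(G_2(y))\big)\big].
\end{aligned}
```

In this form each discriminator seeks to maximize its term while the paired generator seeks to minimize it. When a least-squares adversarial mode is used, the logarithmic terms are commonly replaced by squared errors against the targets 1 (real) and 0 (generated).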
Cycle Consistency Loss is a key loss function in Cycle-GAN. Its objective is to ensure that an image can return to its original domain after passing through both generators, achieving bidirectional consistency. During training, the cycle consistency loss forces the generators to learn a mapping such that an image, after undergoing a forward and reverse transformation, is as close as possible to the original image, as shown in Figure 11. To address the inconsistency problems commonly encountered in GANs, Cycle-GAN incorporates least squares loss as part of the cycle consistency loss. By using least squares loss to measure the sum of squared differences between the original image and the image after undergoing two generator transformations back to its original domain, the model achieves better consistency in transformations. This loss module allows the generators to learn image transformation mappings more accurately, producing more stable and precise transformation results. The specific formula for the least squares loss is as follows:
where Ladv1 and Ladv2 represent the adversarial losses between G1 and D2, and G2 and D1, respectively. Ladv3 represents the final adversarial loss.
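The equation referred to above is not present in the text. As an illustration only, a cycle consistency term in the squared-difference form described in the paragraph could be written as below; this is a reconstruction from the verbal description, not the authors' formula, and the original Cycle-GAN formulation uses the L1 norm in place of the squared L2 norm.

```latex
\mathcal{L}_{\mathrm{cyc}}(G_1, G_2) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert G_2(G_1(x)) - x \rVert_2^2\big]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G_1(G_2(y)) - y \rVert_2^2\big]
```

The first term penalizes the simulated → real → simulated round trip and the second the real → simulated → real round trip; both vanish only when G2 and G1 invert each other on the training images.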
Identity Loss is a loss function used in Cycle-GAN. Its objective is to ensure that the input image retains its identity after transformation. During adversarial network training, the generator G is used to generate images in a specific style, y. To demonstrate that G can generate images in the style of y, feeding y into G should result in y itself. In other words, G(y) and y should be as close as possible. Without a corresponding loss function, the generator might alter the image’s tone, causing overall color shifts. Therefore, introducing identity loss helps maintain image consistency and original features, ensuring that the generator accurately produces images in the desired style. The calculation of identity loss is as follows:
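The identity-loss equation is likewise missing from the text. Assuming the standard Cycle-GAN form and the symbols defined earlier (G1: X → Y, G2: Y → X), it would read:

```latex
\mathcal{L}_{\mathrm{idt}}(G_1, G_2) =
  \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G_1(y) - y \rVert_1\big]
  + \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert G_2(x) - x \rVert_1\big]
```

Each generator is fed an image that already belongs to its target domain and is penalized for changing it.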
The total loss function is a weighted sum of these losses, as shown in the equation. The generators’ objective is to minimize the adversarial loss, cycle consistency loss, and identity loss, while each discriminator is trained to minimize its own classification error on real versus generated images. By optimizing these loss functions, Cycle-GAN can train generators capable of bidirectional image translation between domains.
where LGAN(G1, DY, X, Y) represents the adversarial loss for generating domain Y from domain X, LGAN(G2, DX, Y, X) represents the adversarial loss for generating domain X from domain Y, and λLcyc(G1, G2) denotes the weighted cycle consistency loss for generators G1 and G2. The coefficient λ is a hyperparameter used to scale the cycle consistency loss.
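The total-loss equation is not reproduced in the text. A form consistent with the terms listed here, and with the identity loss mentioned in the preceding paragraph, would be the following. The symbol λ_idt for the identity weight and the use of D_X for the discriminator on the simulated domain are assumptions; D_Y and D_X play the roles assigned earlier to the two discriminators.

```latex
\mathcal{L}(G_1, G_2, D_X, D_Y) =
  \mathcal{L}_{\mathrm{GAN}}(G_1, D_Y, X, Y)
  + \mathcal{L}_{\mathrm{GAN}}(G_2, D_X, Y, X)
  + \lambda\,\mathcal{L}_{\mathrm{cyc}}(G_1, G_2)
  + \lambda_{\mathrm{idt}}\,\mathcal{L}_{\mathrm{idt}}(G_1, G_2)
```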
(3) Model Architecture and Principles
The core of Cycle-GAN lies in achieving bidirectional image translation between two domains using two generators (Generator A → B and Generator B → A). Additionally, it incorporates Cycle Consistency Loss to ensure that the mapped image, after reverse mapping, remains consistent with the original image. The detailed steps are illustrated in Figure 12.
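The cycle principle can be illustrated with a deliberately trivial toy in which the two "generators" are affine maps on short signals instead of neural networks. Everything in this sketch (the functions g1, g2, l1 and the sample values) is hypothetical and only shows how the round-trip error behaves.

```python
def g1(x, a=2.0, b=0.5):
    """Toy forward generator (simulated -> real style): an affine map."""
    return [a * v + b for v in x]


def g2(y, a=2.0, b=0.5):
    """Toy backward generator (real -> simulated style): the exact inverse of g1."""
    return [(v - b) / a for v in y]


def l1(u, v):
    """Mean absolute difference between two equally sized signals."""
    return sum(abs(p - q) for p, q in zip(u, v)) / len(u)


x = [0.0, 0.25, 0.5, 1.0]  # stand-in for a simulated trace
y = [0.5, 1.0, 1.5, 2.5]   # stand-in for a real trace

forward_cycle = l1(g2(g1(x)), x)        # x -> G1 -> G2 -> x'
backward_cycle = l1(g1(g2(y)), y)       # y -> G2 -> G1 -> y'
broken_cycle = l1(g2(g1(x), b=0.0), x)  # a G2 that is not the inverse of G1

print(forward_cycle, backward_cycle, broken_cycle)
```

When the two maps invert each other, both round-trip errors are zero; when the backward map is mismatched, the error becomes positive, which is the signal the cycle consistency loss uses to couple the two generators during training.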
(4) Environment Configuration and Training Parameter Settings
The Cycle-GAN model was implemented using the PyTorch (v1.10.0) framework. The training parameters were set as follows: the number of epochs was 200; the identity loss weight (lambda_identity) was 0.5; the adversarial mode (gan_mode) was set to “lsgan”; the learning-rate decay policy (lr_policy) was linear; lr_decay_iters was 50; and the batch size was 1.
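For readers unfamiliar with a linear learning-rate policy, the sketch below shows one common variant: the rate is held constant for an initial phase and then decays linearly toward zero. The 100 + 100 epoch split and the base rate of 0.0005 (the value selected later in the text) are assumptions for illustration; the text states only the total of 200 epochs and the policy name.

```python
def linear_lr(epoch, base_lr=0.0005, n_constant=100, n_decay=100):
    """Learning rate for a 0-indexed epoch: flat, then linear decay toward zero.

    The n_constant / n_decay split is an assumption; only their sum (200) is
    taken from the text.
    """
    factor = 1.0 - max(0, epoch + 1 - n_constant) / float(n_decay + 1)
    return base_lr * factor


schedule = [linear_lr(e) for e in range(200)]
print(schedule[0], schedule[99], schedule[150], schedule[199])
```

Under this assumed split the rate stays at its base value through epoch 99 and reaches about 1% of it in the final epoch.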
To determine suitable hyperparameters for the Cycle-GAN model, a two-step analysis was performed. First, the learning rate was evaluated using three candidate values: 0.0002, 0.0005, and 0.001, while the cycle consistency loss weights were fixed at the default setting (lambda_A = lambda_B = 10). Second, after selecting the learning rate, the cycle consistency loss weights were further tested using three configurations: (5, 5), (10, 10), and (20, 20).
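The two-step procedure can be written down as a list of training configurations. The sketch below uses the parameter names quoted in the text; the key "n_epochs_total", the helper two_step_runs, and the dictionary layout are hypothetical, and the selected learning rate of 0.0005 is the value reported later in the hyperparameter testing.

```python
# Fixed settings reported in the text (names follow the text's own identifiers).
BASE_CONFIG = {
    "n_epochs_total": 200,  # hypothetical key name for "number of epochs"
    "lambda_identity": 0.5,
    "gan_mode": "lsgan",
    "lr_policy": "linear",
    "lr_decay_iters": 50,
    "batch_size": 1,
}

LEARNING_RATES = [0.0002, 0.0005, 0.001]
CYCLE_WEIGHTS = [(5, 5), (10, 10), (20, 20)]
DEFAULT_CYCLE_WEIGHT = (10, 10)


def two_step_runs(selected_lr):
    """List the training runs of the two-step search as config dicts."""
    step1 = [
        dict(BASE_CONFIG, lr=lr,
             lambda_A=DEFAULT_CYCLE_WEIGHT[0], lambda_B=DEFAULT_CYCLE_WEIGHT[1])
        for lr in LEARNING_RATES
    ]
    step2 = [
        dict(BASE_CONFIG, lr=selected_lr, lambda_A=a, lambda_B=b)
        for a, b in CYCLE_WEIGHTS
    ]
    return step1, step2


step1, step2 = two_step_runs(selected_lr=0.0005)
distinct = {(r["lr"], r["lambda_A"], r["lambda_B"]) for r in step1 + step2}
print(len(step1), len(step2), len(distinct))
```

Because the default setting appears in both steps, the sequential search covers five distinct configurations, compared with nine for a full 3 × 3 grid; the trade-off is that interactions between learning rate and cycle weight are not explored.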
(5) Practical Implementation Details of Cycle-GAN Training
For practical implementation, the Cycle-GAN was trained using unpaired simulated and real GPR image sets from the two domains. The simulated domain was constructed from the GprMax-generated defect images and their augmented versions, whereas the real domain consisted of the field-collected GPR images prepared through the preprocessing, segmentation, and annotation workflow described above. Before training, images from both domains were normalized and resized to a consistent input resolution required by the network. The final checkpoint was selected based on the combined consideration of convergence behavior during training, the visual quality of the translated radar images, and the quantitative similarity evaluation between translated and real images.
(6) Hyperparameter Testing
The initial learning rates were set to 0.0002, 0.0005, and 0.001, with the cycle consistency loss weights fixed at lambda_A = lambda_B = 10. The model was trained for 200 epochs under each setting. As shown in Figure 13, the case with a learning rate of 0.001 did not reach a sufficiently stable convergence state within 200 epochs, whereas both 0.0002 and 0.0005 showed more stable convergence behavior. Considering both convergence stability and training efficiency, 0.0005 was selected as the learning rate for the subsequent experiments.
After fixing the learning rate at lr = 0.0005, the cycle consistency loss weights were further tested using three combinations: lambda_A = lambda_B = 5, lambda_A = lambda_B = 10, and lambda_A = lambda_B = 20. As shown in Figure 14, all three settings achieved stable convergence within 200 epochs. Based on the overall convergence behavior and generated-image quality, the model trained with lr = 0.0005 showed satisfactory robustness and stable optimization performance.
- (4) Quantitative Evaluation of Simulation Fidelity and Cycle-GAN Translation Quality
For the quantitative evaluation of image similarity, two different comparison protocols were adopted according to the nature of the metrics. PSNR, SSIM, and MSE were computed on image pairs formed between the sampled simulated, translated, and real radar images after consistent preprocessing and intensity normalization. Since the Cycle-GAN training in this study was unpaired, these image pairs were used only for relative similarity comparison rather than for strict pixel-wise correspondence validation. By contrast, FID, KID, and LPIPS were evaluated at the distribution level between image sets from different domains, which is more suitable for unpaired image translation tasks. Before evaluation, all radar images were converted to the same image format and normalized to a consistent intensity range to ensure comparability across domains.
To quantitatively assess the similarity between simulated and real radar images, 100 simulated images and 100 real images were randomly selected for comparison using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM) and Mean Squared Error (MSE). As shown in Table 4, the simulated images achieved an average SSIM of 0.74 and a PSNR of 21.8 dB relative to real images, indicating moderate structural similarity. After Cycle-GAN translation, the SSIM increased to 0.82, and the PSNR reached 24.9 dB, while the MSE decreased from 0.0126 to 0.0084. These results indicate that the Cycle-GAN narrows the domain gap between simulated and real radar images, producing images that more closely resemble real GPR observations.
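For reference, MSE and PSNR on intensity-normalized images can be computed as in the minimal sketch below. It assumes images are flattened to lists of values in [0, 1], so the peak value is 1; the toy data and function names are illustrative, and SSIM is omitted because it requires windowed local statistics.

```python
import math


def mse(a, b):
    """Mean squared error between two images given as flat lists in [0, 1]."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def psnr(a, b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(max_value ** 2 / err)


# Toy 2 x 2 "images", flattened; real use would pass full normalized B-scans.
reference = [0.0, 0.5, 0.5, 1.0]
candidate = [0.1, 0.5, 0.4, 1.0]

print(f"MSE  = {mse(reference, candidate):.4f}")
print(f"PSNR = {psnr(reference, candidate):.2f} dB")
```

Note that an average of per-pair PSNR values is generally not equal to the PSNR computed from the average MSE, so the averaged figures in a table of this kind need not satisfy the single-pair formula exactly.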
To further quantify the performance of the Cycle-GAN, the Fréchet Inception Distance (FID), Kernel Inception Distance (KID), and Learned Perceptual Image Patch Similarity (LPIPS) metrics were calculated between simulated, translated, and real images. As summarized in Table 5, the original simulated images exhibited a high FID of 85.7, while the Cycle-GAN-translated images achieved a significantly lower FID of 46.3. The KID score decreased from 12.6 × 10⁻³ to 5.8 × 10⁻³, and the LPIPS value decreased from 0.412 to 0.276. These improvements indicate that Cycle-GAN substantially reduces the domain discrepancy, producing synthetic GPR images that are closer to real radar images in both perceptual structure and statistical characteristics.
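The reported changes can be expressed as relative reductions, which makes the three metrics easier to compare despite their different scales. The sketch below uses only the values quoted in the text; the names are illustrative.

```python
# (metric, simulated vs. real, translated vs. real), values as reported in the text.
TABLE5 = [
    ("FID", 85.7, 46.3),
    ("KID (x1e-3)", 12.6, 5.8),
    ("LPIPS", 0.412, 0.276),
]


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after` (lower is better for all three)."""
    return (before - after) / before


reductions = {name: relative_reduction(b, a) for name, b, a in TABLE5}
for name, r in reductions.items():
    print(f"{name:12s} {r:.1%}")
```

This gives reductions of roughly 46% for FID, 54% for KID, and 33% for LPIPS.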