Article
Peer-Review Record

Analysis of Failure Characteristics and Mechanisms of Asphalt Pavements for Municipal Landscape Roads

by Lei Zhang 1,*, Xinxin Cao 2, Xuefeng Mei 3, Xinhui Fu 1 and Huanhuan Zhang 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 27 November 2025 / Revised: 15 December 2025 / Accepted: 17 December 2025 / Published: 26 December 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors
  1. In the abstract, mention the size of the evaluation dataset and whether the automated detection results are validated against independent human annotations.
  2. You should clearly identify the novel contributions of this paper near the end of the introduction. 
  3. The relationship between shadow removal and distress detection accuracy should be made precise.
  4. You state 2,542 valid distress images over ~10 km, resolution 1920×1080. Please provide: (a) number of distinct road segments, (b) date/time distribution (seasons/time-of-day), (c) camera mounting height and angle, and (d) vehicle speed during image capture.
  5. Were images collected on multiple days and different lighting conditions (sunny/cloudy) to ensure variability?
  6. How were “valid distress images” defined and filtered? Any exclusion criteria?
  7. Provide precise training details: training dataset for shadow removal (which images used as ground truth? synthetic shadows? paired shadow/clean images?), batch size, optimizer, learning rate schedule, epochs, hardware used, and loss weight coefficients (L1 / VGG / FFT).
  8. How were shadow-free ground truth images obtained? If synthetic data or augmented pairs were used, describe the generation procedure. How is “shadow removal rate” computed? Define the metric, and provide confusion measures (false positive shadow removal that alters real distress features).
  9. For YOLOv8-seg training, provide model variant used (n/s/m/l/x), input size during training, augmentation strategies (flip, brightness, contrast, synthetic shadows), batch size, learning rate, optimizer, number of epochs, weight initialization (pretrained on COCO?), and early-stopping criteria.
  10. How were annotations created? Labelme was used — provide annotation protocol, number of annotators, inter-annotator agreement (e.g., IoU).
  11. You state the dataset split 2,123 train / 419 test. Was there a validation set?
  12. Binder is PG 70 penetration grade. Provide supplier, lot number, basic characterization (penetration, softening point) before aging.
  13. What is the film thickness/amount used in RTFO and PAV? The oxidative extent depends on film thickness and temperature.
  14. Missing details that should be mentioned: DSR testing temperatures, frequency sweep or temperature sweep parameters, specimen geometry, strain level, and whether binder tests used appropriate linear viscoelastic limits.
  15. In BBR test, what is the specimen size, test temperatures (BBR stated -12°C earlier, but BBR standard temperatures may vary), soak time, and number of replicates?
  16. Freeze–thaw conditioning: frozen at -18°C 16 h and thaw in 60°C water for 8 h. This thaw temperature is very high!! Are you sure?  I do not recall any standard in the world mentions this temperature!
  17. For IDT and TSR, how many replicates for each condition (0,3,5,10 cycles)? Provide standard deviations.
  18. You claim detection of fine cracks (<2 mm). At the stated image resolution and capture geometry, what is the pixel length corresponding to 2 mm?
  19. How many replicates were run per condition? Add error bars and significance tests.
  20. At which temperature were |G*| and G*/sinδ measured?
  21. The linear annual growth wording is unsupported. Aging kinetics often are non-linear!!
  22. The claim that IDT strength for 8-year aged asphalt “plummets to 0 MPa after 3 cycles” is not reasonable (0 MPa indicates total brittle fragmentation). Please check data and provide raw numbers and photographs of failed specimens.

Author Response

We express our heartfelt gratitude for your meticulous review and highly constructive comments. The 22 specific suggestions you provided, covering aspects from the clarity of the abstract and the completeness of methodological descriptions to the rigor of experimental design and the accuracy of data presentation, have comprehensively addressed various dimensions of this study. These invaluable insights have significantly helped us refine the details of the manuscript, enhancing the reproducibility and scientific rigor of the research. We have made detailed supplements, corrections, and clarifications in the original text based on each of your points. Below are our point-by-point responses and clarifications regarding each comment.

Comment 1: In the abstract, mention the size of the evaluation dataset and whether the automated detection results are validated against independent human annotations.

Response: The size of the evaluation dataset has been added to the abstract, along with a clarification of whether the automated detection results are validated against independent human annotations.

The following content has also been incorporated into the abstract: This study divides 2542 images into three mutually exclusive subsets: a training set of 2123 images, a validation set of 209 images, and a test set of 210 images.

Comment 2: You should clearly identify the novel contributions of this paper near the end of the introduction.

Response: The following content has been added to the introduction: The innovative contributions of this study are mainly reflected in two aspects: (1) proposing the SpA-Former end-to-end shadow removal network specifically designed to address the interference of complex tree shadows on landscape roads, significantly improving the accuracy of distress identification; (2) through controlled experiments, systematically revealing for the first time the intrinsic coupling mechanism between aging and environmental factors in asphalt mixtures under traffic-free conditions.

Comment 3: The relationship between shadow removal and distress detection accuracy should be made precise.
Response: We agree that quantifying this relationship is crucial. In the revised Section 3.1, we have added a comparative ablation study (Table 1) that evaluates the model’s detection performance on the original dataset (with shadows) and the processed dataset (shadow-removed). Results show that the proposed shadow removal method increases the mAP@0.5 from 86.5% to 96.2%, a net gain of 9.7 percentage points. This confirms that eliminating shadow interference significantly reduces missed detections and improves detection recall.

 

Comment 4: You state 2,542 valid distress images over ~10 km, resolution 1920×1080. Please provide: (a) number of distinct road segments, (b) date/time distribution (seasons/time-of-day), (c) camera mounting height and angle, and (d) vehicle speed during image capture.

Response: We have significantly expanded Section 2.1 (Pavement Distress Data Acquisition) to include these specific parameters:

①Road Segments & Timing: Data were collected across multiple independent road segments during spring and early summer, with acquisition times concentrated between 9:00–11:00 AM and 2:00–4:00 PM. This ensures illumination consistency while capturing the diversity of shadow patterns.

②Equipment Setup: An industrial global shutter camera (Daheng Imaging MER2-1220-32U3M) was used, mounted at a height of 1.5 meters with a pitch angle of 15 degrees.

③Acquisition Conditions: The vehicle maintained a low speed, with an effective field of view covering 5–7 meters ahead to minimize motion blur.

Comment 5: Were images collected on multiple days and different lighting conditions (sunny/cloudy) to ensure variability?

Response: Yes. As detailed in the revised Section 2.1, data acquisition was conducted over multiple days. While we selected time periods with relatively consistent illumination to reduce motion blur, the core challenge addressed in this study, tree shadows, intrinsically implies that most data were collected under sunny conditions with strong direct sunlight. The dataset includes varying shadow intensities and complex texture backgrounds, ensuring the model’s robustness to uneven illumination scenarios.

Comment 6: How were “valid distress images” defined and filtered? Any exclusion criteria?

Response: In Section 2.1, we have clarified the preprocessing workflow. A "valid" image is defined as one that retains clear pavement features after standardized preprocessing (histogram equalization, brightness normalization, and noise filtering). During initial screening, images with severe motion blur or complete occlusions—where distress features were indiscernible even to human annotators—were excluded.

Comment 7: Provide precise training details: training dataset for shadow removal (which images used as ground truth? synthetic shadows? paired shadow/clean images?), batch size, optimizer, learning rate schedule, epochs, hardware used, and loss weight coefficients (L1 / VGG / FFT).

Response: We have added a dedicated subsection "Model Implementation and Training Details" in Section 2.1:

①Dataset: A synthetic dataset was constructed using 200 high-quality shadow-free pavement images, combined with shadow masks generated via Perlin noise.

②Hyperparameters: Training was performed on an NVIDIA RTX 4060 GPU, using the AdamW optimizer with a batch size of 4, an initial learning rate of 0.0001, and a total of 100 epochs.

③Loss Weights: The loss coefficients are explicitly set as λ₁ = 1.0, λ_per = 0.1, and λ_fft = 0.05.
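The response lists the three loss coefficients but not how the terms are combined. A weighted sum is the usual form, so the following is a sketch under that assumption; the term names (an L1 reconstruction term, a VGG-based perceptual term, and an FFT frequency-domain term) are inferred from the reviewer's wording "L1 / VGG / FFT" and are not confirmed by the response.

```latex
\mathcal{L}_{\text{total}}
  = \lambda_{1}\,\mathcal{L}_{1}
  + \lambda_{\text{per}}\,\mathcal{L}_{\text{per}}
  + \lambda_{\text{fft}}\,\mathcal{L}_{\text{fft}},
\qquad
\lambda_{1} = 1.0,\quad \lambda_{\text{per}} = 0.1,\quad \lambda_{\text{fft}} = 0.05
```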

Comment 8: How were shadow-free ground truth images obtained? If synthetic data or augmented pairs were used, describe the generation procedure. How is “shadow removal rate” computed? Define the metric, and provide confusion measures (false positive shadow removal that alters real distress features).
Response: In the revised Section 2.1, we explain the following:

①Ground Truth: Shadow-free ground truth images were generated synthetically using a linear illumination attenuation model combined with Perlin noise.

②Metric Definition: The "shadow removal rate" (η) is defined based on the reduction in root mean square error (RMSE) within shadow regions.

③Confusion Metrics: To address concerns about erroneous removal of real distress features, we calculated the structural similarity index (SSIM) within distress regions. High SSIM scores (>0.92) confirm that the topological structure and texture details of cracks are effectively preserved during shadow removal.
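The response says η is "based on the reduction in RMSE within shadow regions" but gives no formula. One plausible reading is the relative RMSE reduction inside the shadow mask, sketched below. The function names, the flat pixel-list representation, and the normalization by the pre-removal RMSE are all assumptions made for illustration, not the authors' stated definition.

```python
import math


def rmse(pred, ref, mask):
    """RMSE between pred and ref over the pixels where mask is truthy."""
    squared = [(p - r) ** 2 for p, r, m in zip(pred, ref, mask) if m]
    if not squared:
        raise ValueError("mask selects no pixels")
    return math.sqrt(sum(squared) / len(squared))


def shadow_removal_rate(shadowed, restored, clean, shadow_mask):
    """Relative RMSE reduction inside the shadow mask (assumed definition).

    All images are flat sequences of pixel intensities of equal length.
    Returns 1.0 for perfect restoration and 0.0 for no improvement.
    """
    before = rmse(shadowed, clean, shadow_mask)
    after = rmse(restored, clean, shadow_mask)
    if before == 0:
        raise ValueError("shadowed image already matches the clean image")
    return (before - after) / before


# Hypothetical 4-pixel example: the first two pixels are in shadow.
clean = [100, 100, 100, 100]
shadowed = [60, 60, 100, 100]
restored = [90, 90, 100, 100]
mask = [1, 1, 0, 0]
eta = shadow_removal_rate(shadowed, restored, clean, mask)
```

Restricting the same RMSE (or SSIM) computation to a distress mask instead of the shadow mask would give the kind of "false positive removal" check the reviewer asked about.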

Comment 9: For YOLOv8-seg training, provide model variant used (n/s/m/l/x), input size during training, augmentation strategies (flip, brightness, contrast, synthetic shadows), batch size, learning rate, optimizer, number of epochs, weight initialization (pretrained on COCO?), and early-stopping criteria.
Response: We have updated Section 2.2 to include a detailed subsection "Model Implementation and Training Details":

Model Variant: The YOLOv8s-seg (small-scale) variant was used, initialized with COCO pre-trained weights.

Training Parameters: The model was trained using the SGD optimizer (momentum=0.937) with a batch size of 16, an initial learning rate of 0.01 coupled with a cosine annealing schedule, and a total of 300 epochs.

Augmentation & Hardware: Input images were resized to 640×640 pixels, with standard augmentation strategies applied (Mosaic, HSV adjustment, flipping). Training hardware (RTX 4060) and early stopping criteria (patience=50) are also explicitly stated.
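For readers unfamiliar with the schedule named above, the following is a minimal sketch of cosine annealing using the stated initial learning rate (0.01) and epoch count (300). The floor value `lr_min` and the absence of warm-up are assumptions; the response does not specify either.

```python
import math


def cosine_lr(epoch, total_epochs=300, lr0=0.01, lr_min=0.0):
    """Cosine-annealed learning rate at a 0-based epoch index.

    lr0 and total_epochs follow the response; lr_min is an assumed floor.
    """
    cos_term = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr0 - lr_min) * cos_term


# Learning rate at the start, midpoint, and end of training.
schedule = [cosine_lr(e) for e in (0, 150, 300)]
```

The rate decays slowly at first, fastest around the midpoint, and flattens out toward the end of training.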

Comment 10: How were annotations created? Labelme was used — provide annotation protocol, number of annotators, inter-annotator agreement (e.g., IoU).
Response: We have added a detailed "Annotation Protocol" description in Section 2.1:

①Protocol: Three professionally trained annotators performed pixel-level annotation using Labelme.

②Consistency: To ensure quality, 10% of the dataset was cross-annotated. The mean intersection over union (mIoU) exceeded 0.85, verifying the reliability of our ground truth labels.
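The agreement figure above can be illustrated with a minimal sketch of mask IoU averaged over cross-annotated images. The nested-list mask format and the convention of returning 1.0 when both masks are empty are assumptions; the response does not say how the mIoU was aggregated across annotator pairs or classes.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two binary masks given as equal-shaped nested lists of 0/1."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks agree completely (assumed convention).
    return inter / union if union else 1.0


def mean_iou(mask_pairs):
    """Average IoU over (annotator_1_mask, annotator_2_mask) pairs."""
    return sum(mask_iou(a, b) for a, b in mask_pairs) / len(mask_pairs)


# Hypothetical 2x3 masks from two annotators.
ann_1 = [[1, 1, 0], [0, 1, 0]]
ann_2 = [[1, 0, 0], [0, 1, 1]]
agreement = mean_iou([(ann_1, ann_2), (ann_1, ann_1)])
```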

Comment 11: You state the dataset split 2,123 train / 419 test. Was there a validation set?
Response: In Section 2.1, we have clarified the data partitioning strategy. A stratified split was adopted: the 419 test images were further divided into an independent validation set (209 images) for hyperparameter tuning and a final test set (210 images) for performance evaluation. This ensures that the reported metrics reflect the model’s generalization ability on completely unseen data.

Comment 12: Binder is PG 70 penetration grade. Provide supplier, lot number, basic characterization (penetration, softening point) before aging.

Response: The base asphalt was sourced from Shandong Chambroad Petrochemicals Co., Ltd., with a penetration of 63.6 (0.1 mm) and a softening point of 48.5℃.

Comment 13: What is the film thickness/amount used in RTFO and PAV? The oxidative extent depends on film thickness and temperature.

Response: The asphalt film thickness in the RTFO test is approximately 1.25 mm, and in the PAV test, it is approximately 3.2 mm.

Comment 14: Missing details that should be mentioned: DSR testing temperatures, frequency sweep or temperature sweep parameters, specimen geometry, strain level, and whether binder tests used appropriate linear viscoelastic limits.

Response: In the DSR test, the measurement was conducted at a temperature of 60℃ with a scanning frequency of 10 rad/s. A 25 mm diameter parallel plate geometry with a 1 mm gap was used, and a scanning strain of 0.1% was applied, which falls within the linear viscoelastic range of the asphalt.

Comment 15: In BBR test, what is the specimen size, test temperatures (BBR stated -12℃ earlier, but BBR standard temperatures may vary), soak time, and number of replicates?

Response: In the Bending Beam Rheometer (BBR) test, the specimen used was a small beam with dimensions of 127 mm in length, 12.7 mm in width, and 6.35 mm in height. The test was conducted at a temperature of -12℃, with a standard temperature conditioning time of 60±5 minutes. The test was repeated three times for reproducibility.

Comment 16: Freeze–thaw conditioning: frozen at -18℃ 16 h and thaw in 60℃ water for 8 h. This thaw temperature is very high!! Are you sure? I do not recall any standard in the world that mentions this temperature!

Response: According to T 0729-2000 in the Chinese specification Standard Test Methods of Bitumen and Bituminous Mixtures for Highway Engineering (JTG E20-2011), for the “freeze-thaw splitting test of asphalt mixtures”, the specimens shall first be frozen at -18℃, then incubated at 60℃. Finally, the specimens shall be placed in a constant temperature water bath at 25℃ for not less than 2 hours for the splitting test.

Comment 17: For IDT and TSR, how many replicates for each condition (0,3,5,10 cycles)? Provide standard deviations.

Response: The number of parallel specimens is three, with data presented in Figures 14 and 15 in the form of error bars.

Comment 18: You claim detection of fine cracks (<2 mm). At the stated image resolution and capture geometry, what is the pixel length corresponding to 2 mm?
Response: Based on the camera setup described in the revised Section 2.1 (1.5-meter height, 15-degree pitch angle), the effective horizontal field of view on the pavement is approximately 3.5–5 meters. With a horizontal image resolution of 1920 pixels, one pixel corresponds to a physical length of ~1.8–2.6 mm. As clarified in Section 3.1, "fine cracks" refer to those with a physical width of ~2 mm, which typically occupy 1–2 pixels in the images. Benefiting from the contrast enhancement by the SpA-Former module and the multi-scale feature fusion capability of YOLOv8-seg, the model can effectively detect these fine features.
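The pixel-scale figures in this response can be checked with a short calculation. The sketch below assumes a uniform ground scale across the image width, ignoring the perspective foreshortening that a 15-degree pitch would introduce; the function names are illustrative only.

```python
def mm_per_pixel(fov_m, pixels=1920):
    """Ground length covered by one pixel, in mm, for a given field of view."""
    return fov_m * 1000.0 / pixels


def crack_width_px(width_mm, fov_m, pixels=1920):
    """Approximate number of pixels spanned by a crack of the given width."""
    return width_mm / mm_per_pixel(fov_m, pixels)


# Stated field-of-view range (3.5 m to 5 m) at 1920 px horizontal resolution.
scale_near = mm_per_pixel(3.5)   # about 1.82 mm per pixel
scale_far = mm_per_pixel(5.0)    # about 2.60 mm per pixel
px_near = crack_width_px(2.0, 3.5)
px_far = crack_width_px(2.0, 5.0)
```

Under this simplified geometry a 2 mm crack spans roughly 0.8 to 1.1 pixels, so it sits at about the one-pixel limit of the imaging setup.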

Comment 19: How many replicates were run per condition? Add error bars and significance tests.

Response: The number of parallel specimens is three, with data presented in Figures 14 and 15 in the form of error bars.

Comment 20: At which temperature were |G*| and G*/sinδ measured?

Response: |G*| and G*/sinδ are measured at 60℃.

Comment 21: The linear annual growth wording is unsupported. Aging kinetics often are non-linear!

Response: The research simulated asphalt aging durations of 2, 4, 6, and 8 years. The current sample is insufficient to accurately reflect the aging trend. The conclusion in the paper suggesting a linear growth pattern in aging will be deleted. Subsequent analysis will require an increased sample size to study the aging trend.

Comment 22: The claim that IDT strength for 8-year aged asphalt “plummets to 0 MPa after 3 cycles” is not reasonable (0 MPa indicates total brittle fragmentation). Please check data and provide raw numbers and photographs of failed specimens.

Response: The asphalt mixture aged for 8 years shows a splitting strength of 0.35 MPa after 5 cycles and 0.31 MPa after 10 cycles. While there is a significant reduction relative to the initial splitting strength, it has not completely failed. Therefore, stating the splitting strength as 0 MPa in the paper is incorrect and has been corrected.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

 

The paper offers useful new ideas and technical support for maintaining and managing landscape roads, especially in detecting pavement problems and understanding how materials age.

 

That said, given the current climate emergency, it’s surprising that the authors did not open the paper by acknowledging this reality. They also seem unaware of a blunt and widely cited remark by Oxford physicist Raymond Pierrehumbert, who wrote in 2019: “Let’s get this on the table right away, without mincing words. With regard to the climate crisis, yes, it’s time to panic.” If respected experts are stressing urgency this strongly, the introduction should clearly connect the study to environmental degradation, resource efficiency, and durability.

 

The dataset description is thin and does not explain geographic or seasonal diversity.

 

The results are described as robust, but the presentation makes that hard to see.

 

The paper also doesn’t justify or discuss how traffic levels, climate conditions, or binder chemistry might affect the outcomes.

 

Mechanical testing looks only at asphalt mixtures and ignores the base and subbase layers, interlayer bonding, and structural restraint—all of which play a major role in real-world transverse cracking.

 

Finally, although the authors identify aging as the main problem, they do not go the extra step of translating their findings into practical guidance: when maintenance should occur or what material design changes they would recommend.

 

Author Response

We sincerely thank you for your recognition of the research value and practical significance of this paper, as well as for the profound insights you provided from a broader perspective. Your comments concerning the research background, data representativeness, presentation of results, and translation into engineering applications prompted us to engage in deep reflection, thereby improving the completeness and impact of the manuscript. Accordingly, we have supplemented the discussion on the study's limitations, strengthened the robustness of the conclusions, and clarified directions for future research. Below are our reflections on each of your comments and the corresponding revisions we have made.

Comment 1: The paper offers useful new ideas and technical support for maintaining and managing landscape roads, especially in detecting pavement problems and understanding how materials age.

Response: Thank you for your positive feedback and for recognizing the practical value of this research. We are very pleased that you found the proposed SpA-Former shadow removal network and the analysis of the aging-environment coupling mechanism to offer useful insights and technical support for the maintenance of landscape roads. This acknowledgment strongly motivates our ongoing efforts to bridge the gap between material-level understanding and field management practices.

Comment 2: That said, given the current climate emergency, it’s surprising that the authors did not open the paper by acknowledging this reality. They also seem unaware of a blunt and widely cited remark by Oxford physicist Raymond Pierrehumbert, who wrote in 2019: “Let’s get this on the table right away, without mincing words. With regard to the climate crisis, yes, it’s time to panic.” If respected experts are stressing urgency this strongly, the introduction should clearly connect the study to environmental degradation, resource efficiency, and durability.

Response: The research is helpful for clarifying the causes of cracking in non-load-bearing pavements and guiding efforts to extend pavement service life, which can conserve construction resources and reduce carbon emissions.

Comment 3: The dataset description is thin and does not explain geographic or seasonal diversity.

Response: The paper has been supplemented to clarify that all data were collected from Weifang, Shandong, China (mean annual temperature approximately 12.5°C, annual precipitation 600-800 mm), which belongs to a temperate monsoon climate. The dataset is based on plain topography but currently includes only autumn data (characterized by significant temperature variations and high crack incidence). We acknowledge the insufficient seasonal coverage and have explicitly stated this as a limitation in the paper.

Comment 4: The results are described as robust, but the presentation makes that hard to see.

Response: Thank you for your constructive feedback regarding the presentation of the results' robustness. In response, we have added error bars to all data points in Figures 14 and 15. These error bars are based on the standard deviation calculated from the test results of three parallel specimens. This addition visually illustrates the dispersion of the data, making the trend of "strength decreasing with intensified aging and freeze-thaw cycles" as well as the differences between groups more visible and statistically rigorous. Consequently, it provides direct support for the reliability of the conclusions drawn in the paper.
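The error bars described here, together with the TSR requested in Round 1, can be sketched as follows. The strength values are hypothetical placeholders, not data from the paper, and the use of the sample (n-1) standard deviation is an assumption, since the response does not say which estimator was used.

```python
import statistics


def summarize(strengths_mpa):
    """Mean and sample standard deviation of replicate splitting strengths."""
    return statistics.mean(strengths_mpa), statistics.stdev(strengths_mpa)


def tsr_percent(conditioned_mean, unconditioned_mean):
    """Tensile strength ratio: conditioned over unconditioned strength, in %."""
    return 100.0 * conditioned_mean / unconditioned_mean


# Hypothetical triplicate results in MPa (placeholders, not the paper's data).
unconditioned = [1.00, 1.10, 0.90]
freeze_thaw = [0.80, 0.85, 0.75]

mean_u, sd_u = summarize(unconditioned)
mean_c, sd_c = summarize(freeze_thaw)
tsr = tsr_percent(mean_c, mean_u)
```

With only three specimens per condition the standard deviation is itself a rough estimate, which is worth keeping in mind when comparing groups by overlapping error bars alone.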

Comment 5: The paper also doesn’t justify or discuss how traffic levels, climate conditions, or binder chemistry might affect the outcomes.

Response: Thank you for your comments. This study focuses on landscape roads that are free from traffic loads, where the distress mechanism originates from the coupling of material aging and the natural environment. The paper has been supplemented with the climatic background of the data collection site (Weifang: mean annual temperature 12.5°C, annual precipitation 600–800 mm, autumn sampling) and clarifies the use of 70-penetration grade base asphalt as the benchmark material. This investigation aims to establish a research baseline for the aging behavior of standard asphalt under typical temperate climate conditions. We will note in the discussion that the extended impacts of traffic loads, extreme climates, and asphalt chemistry represent important directions for future research. This clarification not only makes the conclusions of this study more precise but also lays a foundation for subsequent comparative investigations.

Comment 6: Mechanical testing looks only at asphalt mixtures and ignores the base and subbase layers, interlayer bonding, and structural restraint, all of which play a major role in real-world transverse cracking.

Response: Thank you for your valuable comments. We fully agree that factors such as the base and subbase layers, interlayer bonding, and structural constraints are crucial for real-world transverse cracking. As the starting point for a series of investigations, this study focuses first on clarifying the micromechanisms of aging and cracking in asphalt mixtures under purely environmental actions. This fundamental understanding is a prerequisite for constructing comprehensive structural models in subsequent work. We will explicitly state the limitations of this study at the structural scale in the "Discussion and Future Work" section of the paper. Follow-up research will systematically conduct mechanical testing on composite specimens incorporating different base types, considering interlayer bonding conditions and structural constraint effects, and perform numerical simulations to advance the material-level findings toward validation within structural systems that better reflect real-world engineering conditions.

Comment 7: Finally, although the authors identify aging as the main problem, they do not go the extra step of translating their findings into practical guidance: when maintenance should occur or what material design changes they would recommend.

Response: Thank you for this important feedback. In the "Conclusions and Future Work" section of the paper, we have proposed recommendations to conduct systematic inspection and preventive maintenance during the 5th–6th year of service and to consider using softer asphalt (e.g., penetration grade 90) in binder selection to improve low-temperature crack resistance. At the same time, we clearly stated that these recommendations require further validation and optimization through dedicated follow-up studies.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors made all the required changes and addressed my concerns. 
