Review Reports - On the Application of DiffusionDet to Automatic Car Damage Detection and Classification via High-Performance Computing

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript introduces an enhanced iteration of the car damage claim management system. The overarching framework of the system has been previously disclosed in earlier publications. The principal advancement highlighted in this study is the implementation of a more extensive and intricate deep learning model, facilitated by high-performance computing (HPC) resources. This model is constructed upon a diffusion architecture and a Swin transformer backbone. While there is a noticeable improvement in performance attributable to the increased model size, the originality of the model does not meet the standards required by this journal. It would be more appropriately published as a technical report.

Author Response

Please find the authors' reply to the reviewer in the attached letter.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper presents an enhanced version of Insoore AI for automatic car damage detection and classification in the insurance claim management process. Leveraging the DiffusionDet architecture with a Swin Transformer backbone and the high-performance computing (HPC) resources of the Leonardo HPC system's Booster module, the study aims to overcome the limitations of previous methods. The authors first recap the previous Insoore AI pipeline, which used the Faster R-CNN architecture and faced challenges in handling complex damages. Then, they introduce DiffusionDet, a generative AI-based object detection framework. It formulates object detection as a denoising process, with forward and reverse diffusion steps, and has components like an image encoder and a detection decoder. The enhanced Insoore AI pipeline works by gathering vehicle images, detecting damage, segmenting car parts, mapping damage to parts, calculating the relative damage area, classifying severity, and deciding on repair or replacement. HPC resources are crucial for training GenAI-based deep learning models. Benchmarking shows that the Leonardo Booster setup offers significant training speed improvements compared to a standard configuration. The experiments use the same dataset as the previous work, with annotations for four damage classes. Performance metrics such as AP, AP50, and AP75 are used for evaluation. The results demonstrate a substantial improvement in performance, especially in AP50, which increased from 30.45 to 38.87. Future work will focus on further architectural optimizations and alternative feature extraction techniques.

Questions

What are the main differences in performance between the DiffusionDet-based Insoore AI and the previous Faster R-CNN-based version in handling different types of car damages?
How does the Swin Transformer backbone specifically contribute to the fine-grained damage identification in car components within the DiffusionDet architecture?
Considering the privacy-restricted data, what are the potential challenges and solutions for further validating and improving the model's performance in real-world scenarios?
In the DiffusionDet framework, how can the balance between the complexity of the forward and reverse diffusion processes be optimized to enhance the overall detection accuracy?
Given that the study focuses on car damage detection for insurance claims in the Italian market, how can this model be adapted for use in different markets with potentially different vehicle types and damage patterns?

Author Response

Please find the authors' reply to the reviewer in the attached letter.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript describes an improvement to the authors' previously published automatic car damage recognition and localization tool. The improvement is due to the use of the LEONARDO HPC system, and the results obtained are reasonable. However, the reviewer would like to suggest the following points for improving the manuscript.

1. Abstract
Reference numbers ([1], [2], etc.) are usually not included in the abstract, because the abstract is often reproduced on its own. Please remove the reference numbers.
2. Figure 1:
This figure is not referred to in the manuscript. Please describe it using the symbols that appear in the figure, even though it shows only the basics of the diffusion model.
3. Figure 2:
Please explain this figure in much more detail; for example, state whether in each pair the left image is the original and the right image is the recognition result. What do the shapes and colors of the recognized areas mean? Are the recognized results satisfactory or not?
4. Figure 3:
The explanation of this figure is insufficient. The details may have been described in the previous paper dealing with Insoore AI, but the manuscript should be self-contained.
5. p.8, line 4 from the bottom:
The reviewer does not understand "Approximately 120,000 images were collected." Where are these images used? Are they used to obtain the results shown in Figure 4? Please describe the relationship between "the test set includes 540 annotations extracted from 326 images" and these 120,000 images.
6. line 3, in Appendix I:
Possibly a typographical error: "segmentation mapping, All experiments".
7. Absolute evaluation of precision:
As described in "7. Conclusions and Future Work," a 27.65% improvement has been attained; however, the reviewer wonders whether the new results are satisfactory. Please describe how good the improved results are in practical use.
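
For context on point 7, the 27.65% figure appears to be consistent with the AP50 values quoted in Reviewer 2's summary (30.45 for the previous version, 38.87 for the new one) when read as a relative rather than an absolute gain. The following is a minimal sketch of that arithmetic, assuming this is how the percentage was derived; the function name is hypothetical and not taken from the manuscript.

```python
def relative_gain(old: float, new: float) -> float:
    """Return the relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# AP50 values as quoted in Reviewer 2's summary of the manuscript.
ap50_previous = 30.45
ap50_new = 38.87

# Absolute gain in AP50 points versus relative gain in percent.
absolute_points = round(ap50_new - ap50_previous, 2)
relative_percent = round(relative_gain(ap50_previous, ap50_new), 2)

print(f"absolute gain: {absolute_points} AP50 points")
print(f"relative gain: {relative_percent}%")
```

Under this reading the relative figure says how much the metric grew, while the absolute AP50 of 38.87 is what bears on the reviewer's question about practical adequacy.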

Author Response

Please find the authors' reply to the reviewer in the attached letter.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I maintain that the originality of the manuscript is constrained. Both the diffusion model and the Swin Transformer are resource-intensive regarding time and computational power. The authors have merely implemented these two techniques in the context of car damage detection and classification. This application represents an adaptation rather than a novel contribution to the field. Consequently, the work would be more appropriately classified as a technical report rather than a research paper.

Author Response

We thank the reviewer for the suggestions and insights. Please find attached our reply to the reviewer.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper can be accepted.

Author Response

We thank you very much for the positive evaluation of our revised manuscript.