Sp2PS: Pruning Score by Spectral and Spatial Evaluation of CAM Images
Round 1
Reviewer 1 Report
Dear authors, your idea is very good and will help in deploying deep learning models in real time. Kindly clarify the following points to improve the quality of the manuscript.
1. Keywords should be in alphabetical order.
2. In the proposed methodology, the authors also include the baseline method to obtain the final score. If that is the proposal, how will the network complexity be reduced, and how will the method be effective?
3. You need to compare your method with state-of-the-art techniques.
4. What will happen if the proposed methodology is applied to biomedical images?
5. The manuscript is a little confusing; kindly format it properly.
All the best!
Author Response
Dear Reviewer,
Thank you for your comments. Please refer to the attached file.
Author Response File: Author Response.pdf
Reviewer 2 Report
This article is closely related to reference 8; both are based on CAM. The article should specifically compare the similarities and differences between the proposed method and that reference.
Author Response
Dear Reviewer,
Thank you for your comments. Please refer to the attached file.
Author Response File: Author Response.pdf
Reviewer 3 Report
The paper is easy to understand. It presents a solution to the problem of evaluating the preservation of decision-making regions in pruned CNN models. It addresses the challenge of identifying whether a pruned model retains its ability to attend to the same crucial areas of an image as the un-pruned model when performing inference. To tackle this, the authors propose a metric that leverages model interpretation techniques, specifically CAM-type methods (Grad-CAM, Grad-CAM++, Ablation-CAM), to visualize the regions of importance in an image. By comparing these CAM images spatially (using the Structural Similarity Index) and spectrally (using the Spectral Angle Mapper) and integrating the results using the harmonic mean across the test dataset, the proposed metric, named Sp2PS, quantifies the preservation of decision-making regions. The metric ranges from 0 to 1, where 1 means that the pruned and un-pruned models attend to exactly the same regions.
Strength: It is interesting to explore whether the pruned and un-pruned models have similar attention regions. The paper proposes a metric, Sp2PS, that considers both spatial and spectral information.
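The combination summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the single-window SSIM, the rescaling of the spectral angle to [0, 1], the clipping of negative SSIM values, and the averaging of per-image harmonic means are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """Single-window SSIM over the whole map (assumed simplification of windowed SSIM)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den

def sam_similarity(a, b):
    """Spectral angle between flattened maps, rescaled so that 1 means zero angle.

    Assumes non-negative CAM values, so the angle lies in [0, pi/2].
    """
    u, v = a.ravel(), b.ravel()
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0 or nv == 0:
        # Angle is undefined for an all-zero map; treat two empty maps as identical.
        return 1.0 if nu == nv else 0.0
    cos = np.clip(np.dot(u, v) / (nu * nv), -1.0, 1.0)
    return 1.0 - np.arccos(cos) / (np.pi / 2)

def harmonic_mean(x, y):
    return 0.0 if x + y == 0 else 2 * x * y / (x + y)

def sp2ps_sketch(cams_unpruned, cams_pruned):
    """Average, over the dataset, of the harmonic mean of the two similarity scores."""
    scores = []
    for a, b in zip(cams_unpruned, cams_pruned):
        spatial = max(global_ssim(a, b), 0.0)  # assumed: clip negative SSIM to 0
        spectral = sam_similarity(a, b)
        scores.append(harmonic_mean(spatial, spectral))
    return float(np.mean(scores))
```

Under these assumptions, passing the same list of CAMs for both models gives a score of approximately 1, and the harmonic mean pulls the score toward the lower of the two similarities whenever the spatial and spectral comparisons disagree.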
Weakness:
1. The paper only measures how similar the attention regions are between the pruned and the un-pruned model. However, why do we need them to be similar? Figure 11 shows that accuracy does not change much, yet Sp2PS decreases as the PR increases. Sometimes pruned models outperform un-pruned models after fine-tuning. Moreover, the pruning methods chosen are L1 and random; why not some state-of-the-art methods?
2. The literature review is my major concern. The paper is mainly about pruning, but the related works section focuses on CNN and CAM techniques. More pruning works need to be cited. Below are just some examples of pruning works that are based on CAM or other types of saliency:
(1) Tian, Qing, Tal Arbel, and James J. Clark. "Task dependent deep LDA pruning of neural networks." Computer Vision and Image Understanding 203 (2021): 103154.
(2) Choi, J. I., & Tian, Q. (2023). Visual Saliency-Guided Channel Pruning for Deep Visual Detectors in Autonomous Driving. 2023 IEEE Intelligent Vehicles Symposium (IV).
Overall: It is an interesting paper that proposes a metric to evaluate the preservation of decision-making regions in pruned CNN models. However, the paper reads more like an experimental study. The main innovation is simply the combination of two existing metrics (SSIM and SAM) into a new one (Sp2PS). The literature review section is unacceptable.
There are some minor issues with grammar/wording.
Author Response
Dear Reviewer,
Thank you for your comments. Please refer to the attached file.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The authors may incorporate the latest references in the literature section.
Are there any specific criteria for choosing the pruning rate?
Please address these two comments in the manuscript.
Minor corrections are required.
Author Response
Please see the attached file.
Author Response File: Author Response.pdf
Reviewer 3 Report
I would like to thank the authors for their response. That said, I still have two concerns about the related work section.
(1) "pruning consists of removing weights from the network." -> This is not right. As the authors mention themselves, pruning of individual weights is only one type of pruning.
(2) The following work also uses CAM-like saliency to guide the pruning process.
J. I. Choi and Q. Tian, "Visual-Saliency-Guided Channel Pruning for Deep Visual Detectors in Autonomous Driving," 2023 IEEE Intelligent Vehicles Symposium (IV), Anchorage, AK, USA, 2023, pp. 1-6, doi: 10.1109/IV55152.2023.10186819.
A discussion of the differences is needed. Otherwise, I doubt this work's contribution to the field.
Fine.
Author Response
Please see the attached file.
Author Response File: Author Response.pdf