Geometry-Aware Weight Perturbation for Adversarial Training
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper introduces a geometry-aware method for adversarial training, with the primary objective of leveraging points in an open set surrounding the point of interest during the adversarial attack phase. This approach aims to better capture the underlying data manifold, which in turn could enhance the effectiveness of adversarial training. The authors' experiments provide empirical support for this insight, demonstrating the potential benefits of their proposed method.
Overall, this paper is well-motivated, thoroughly researched, and clearly articulated. The approach is innovative, and the experimental results are promising. However, I have the following comments and suggestions for further improvement:
1. Reproducibility: The reproducibility of this work is of paramount importance. I strongly recommend that the authors consider open-sourcing their project, including code, datasets, and detailed instructions for replication. This would not only strengthen the impact and credibility of the work but also facilitate future research in this area.
2. Norm-Based Attacks: The current study focuses primarily on adversarial attacks under the L∞-norm and L2-norm. While this is a reasonable starting point, it would be beneficial for the authors to explore and discuss the applicability of their method to other types of attacks, such as L1-norm attacks or trace-norm attacks. Including these additional perspectives could provide a more comprehensive evaluation of the proposed method's robustness across different adversarial scenarios.
3. Comparison with Manifold-Based Methods: The proposed method implicitly relates to manifold learning techniques, given its focus on modeling the data manifold during adversarial training. I suggest that the authors expand their discussion to include a comparison with existing manifold-based methods for adversarial robustness. This would not only place their work in the broader context of the field but also highlight the unique contributions and advantages of their geometry-aware approach.
Author Response
We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.
Comments 1: Reproducibility: The reproducibility of this work is of paramount importance. I strongly recommend that the authors consider open-sourcing their project, including code, datasets, and detailed instructions for replication. This would not only strengthen the impact and credibility of the work but also facilitate future research in this area.
Response 1: Thank you for pointing that out. We plan to open-source the code after the paper is accepted.
Comments 2: Norm-Based Attacks: The current study focuses primarily on adversarial attacks under the L∞-norm and L2-norm. While this is a reasonable starting point, it would be beneficial for the authors to explore and discuss the applicability of their method to other types of attacks, such as L1-norm attacks or trace-norm attacks. Including these additional perspectives could provide a more comprehensive evaluation of the proposed method's robustness across different adversarial scenarios.
Response 2: We agree with this comment. We selected the L∞-norm and L2-norm threat models because they are the only two threat models discussed in previous works [A, B]; for a fair comparison, we conducted our comparison experiments under the same settings. We believe the experiments under these two threat models are informative enough to show the effectiveness of the proposed method.
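For readers unfamiliar with these threat models: both constrain the adversarial example x′ to a norm ball around the clean input x. A standard formulation (our notation, not quoted from the manuscript) is

```latex
\mathcal{B}_p(x, \epsilon) \;=\; \{\, x' \,:\, \|x' - x\|_p \le \epsilon \,\}, \qquad p \in \{2, \infty\},
```

and an L1-norm or trace-norm attack would simply substitute a different norm into the same constraint.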
Comments 3: Comparison with Manifold-Based Methods: The proposed method implicitly relates to manifold learning techniques, given its focus on modeling the data manifold during adversarial training. I suggest that the authors expand their discussion to include a comparison with existing manifold-based methods for adversarial robustness. This would not only place their work in the broader context of the field but also highlight the unique contributions and advantages of their geometry-aware approach.
Responses 3: We assume that by "manifold-based methods", the reviewer refers to defense methods that improve model performance on adversarial data by transforming it back to its counterpart on the normal data manifold. This class of methods is also called "Adversarial Purification", and we agree they are related to our paper. Therefore, we have added Section 2.3 (page 5) to discuss the relation and distinction between these methods and ours.
[A] Yu, Chaojian, et al. "Robust weight perturbation for adversarial training". arXiv preprint arXiv:2205.14826 (2022).
[B] Yu, Chaojian, et al. "Understanding robust overfitting of adversarial training and beyond". arXiv preprint arXiv:2206.08675 (2022).
Reviewer 2 Report
Comments and Suggestions for Authors
The study makes a new contribution to the advancement of countermeasures against adversarial threats to neural networks. However, the study could have contributed more to the domain had several issues been addressed: the uneven clarity of some concepts, the absence of theoretical underpinnings, and the need for larger-scale experimental research.
Remarks:
1. Introduction: The section is generally fine. However, it lacks a clear presentation of the problem under consideration and the rationale for the suggested method; there is only a brief discussion of why existing methods may not generalize (robustness generalization) and of what this paper does to counteract those problems.
2. Hyperparameters: While the study points out some areas of concern, such as hyperparameter sensitivity, it is brief in demonstrating how changes to some of the hyperparameters, such as λ and K2, affect the results. For instance, extending the current discussion or carrying out more experiments could prove useful and enlightening.
3. Proposed Algorithm: The work is well written, but a more complete and detailed analysis of how the GAWP algorithm works, particularly a justification of the weight perturbation strategy, would make it far better. Please provide more details.
4. Improve the methodology section, particularly for readers less familiar with the difficulties of adversarial training.
5. The authors compare GAWP with baseline methods such as GAIRAT and AWP; the study should include more comparisons with other state-of-the-art adversarial training methods.
6. The analysis is performed using ResNet architecture on well-known datasets such as CIFAR-10 and CIFAR-100. However, the study would benefit from further tests on different data types and sampling designs to generalize the proposed method.
7. An explanation of why the proposed method should theoretically lead to improved robustness would be beneficial.
8. It would be useful to discuss how the proposed method affects model interpretability and whether it presents any additional challenges in understanding how the model makes decisions.
9. The article discusses robust overfitting, but it would be helpful to include additional extensive analysis on whether the proposed method avoids overfitting not just in terms of robustness but also in terms of generalization to new, unseen data.
10. Improve the final section of Conclusions and Future Work. For example, how could the method be adapted for different types of neural network architectures or different domains?
Comments on the Quality of English Language
The language used by the authors is rather understandable for a non-native English speaker, yet the grammar could be cleaner throughout the manuscript. Careful proofreading would help improve the clarity, flow, and readability of the text.
Author Response
We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.
Comments 1: Introduction: The section is generally fine. However, it lacks a clear presentation of the problem under consideration and the rationale for the suggested method; there is only a brief discussion of why existing methods may not generalize (robustness generalization) and of what this paper does to counteract those problems.
Responses 1: Thank you for the constructive feedback! To improve the clarity of this section, we have added a figure (Figure 1 on page 2) to illustrate the problems under consideration in this paper. Furthermore, we added another figure (Figure 2 on page 3) to explain our rationale for the proposed method. Specifically, by visualizing the weight loss landscapes in Figure 2, we find that GAIRAT converges to a sharp local minimum, which motivates us to impose regularization on the smoothness of the weight loss landscape with AWP.
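As an aside for readers curious about the visualization technique: below is a minimal sketch of how a one-dimensional weight-loss-landscape plot of this kind can be produced (loss evaluated along a random, norm-matched weight direction, in the spirit of standard landscape-visualization work; the function name and the exact normalization are our assumptions, not the paper's code):

```python
import copy
import torch
import torch.nn.functional as F

def loss_along_direction(model, loader, device, num_points=21, radius=1.0):
    """Mean loss at evenly spaced steps along one random weight direction."""
    base = copy.deepcopy(model.state_dict())
    # Random direction, rescaled per tensor so its norm matches the weights'
    # (a simplified stand-in for filter-wise normalization; note this simple
    # version also perturbs floating-point buffers such as BN statistics).
    direction = {}
    for name, w in base.items():
        if torch.is_floating_point(w):
            d = torch.randn_like(w)
            direction[name] = d * w.norm() / (d.norm() + 1e-10)
    alphas = torch.linspace(-radius, radius, num_points)
    losses = []
    for a in alphas:
        perturbed = {k: v + a.item() * direction[k] if k in direction else v
                     for k, v in base.items()}
        model.load_state_dict(perturbed)
        model.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                total += F.cross_entropy(model(x), y, reduction="sum").item()
                count += y.size(0)
        losses.append(total / count)
    model.load_state_dict(base)  # restore the original weights
    return alphas.tolist(), losses
```

A sharp local minimum of the kind described above shows up as a narrow, steep valley in the resulting curve of loss versus step size.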
Comments 2: Hyperparameters: While the study points out some areas of concern, such as hyperparameter sensitivity, it is brief in demonstrating how changes to some of the hyperparameters, such as λ and K2, affect the results. For instance, extending the current discussion or carrying out more experiments could prove useful and enlightening.
Responses 2: Thank you for pointing that out! We have added an ablation study w.r.t. the weight perturbation constraint γ (Figure 5 (c) and the paragraph from line 361 to 366 on page 13). We have also enriched the discussion of K2 with a time complexity analysis (Figure 5 (e)). We selected these three hyper-parameters (λ, γ, and K2) for the ablation studies because they are specific to our method. The other hyper-parameters, such as learning rate and batch size, are common to almost all AT methods; we therefore believe that further experiments on them would not be particularly useful or enlightening.
Comments 3: Proposed Algorithm: The work is well written, but a more complete and detailed analysis of how the GAWP algorithm works, particularly a justification of the weight perturbation strategy, would make it far better. Please provide more details.
Responses 3: Agree. We have, accordingly, added Section 3.1 (Preliminaries) for a detailed introduction to the weight perturbation mechanism. In this section, we articulate the problem formulation of AWP (Equation (7)) and its update rule for determining the weight perturbation (Equation (8)). We have also introduced the update rule of RWP for comparison (Equation (9)). The justification of the weight perturbation strategy is discussed from line 168 to line 170. For a theoretical justification, we refer the reviewer to the AWP paper [A].
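For context, the AWP objective described in [A] is a double perturbation over inputs and weights; in our notation (which may differ cosmetically from the manuscript's Equation (7)):

```latex
\min_{\mathbf{w}} \; \max_{\mathbf{v} \in \mathcal{V}} \;
\frac{1}{n} \sum_{i=1}^{n} \;
\max_{\|\mathbf{x}'_i - \mathbf{x}_i\|_p \le \epsilon}
\ell\big( f_{\mathbf{w}+\mathbf{v}}(\mathbf{x}'_i),\, y_i \big),
\qquad
\mathcal{V} = \{\, \mathbf{v} : \|\mathbf{v}\| \le \gamma \|\mathbf{w}\| \,\}
```

where the inner weight perturbation v is computed by one or a few steps of gradient ascent (layer-wise in practice) and γ controls the size of the weight perturbation ball relative to the weight norm.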
Comments 4: Improve the methodology section, particularly for readers less familiar with the difficulties of adversarial training.
Responses 4: We have improved the methodology section as described above. To help readers less familiar with adversarial training, we have also enriched the discussion of adversarial training (line 133 - line 139, Table 1 on page 5) by providing a detailed comparison of all the AT baseline methods covered in the paper.
Comments 5: The authors compare GAWP with baseline methods such as GAIRAT and AWP; the study should include more comparisons with other state-of-the-art adversarial training methods.
Responses 5: Thank you for the constructive feedback! We have, accordingly, added comparison experiments with MART and MLCAT_WP (updated Tables 3 and 4 on page 10). Note that we have also compared our method with RWP, which is, to the best of our knowledge, the state-of-the-art weight perturbation strategy.
Comments 6: The analysis is performed using ResNet architecture on well-known datasets such as CIFAR-10 and CIFAR-100. However, the study would benefit from further tests on different data types and sampling designs to generalize the proposed method.
Responses 6: The chosen datasets and sampling design are the same as in the previous work on RWP [B]; we follow the same setting for a fair comparison. As for other data types, we agree that it is a limitation of our current work to discuss the application of the proposed method only in the context of image data. Therefore, we have added Section 5 (pages 13-14) to discuss the limitations of this paper and potential future research directions.
Comments 7: An explanation of why the proposed method should theoretically lead to improved robustness would be beneficial.
Responses 7: We agree that the lack of theoretical justification is another limitation of this paper; therefore, we have also added it to Section 5 (pages 13-14).
Comments 8: It would be useful to discuss how the proposed method affects model interpretability and whether it presents any additional challenges in understanding how the model makes decisions.
Responses 8: We respectfully note that the focus of this paper lies in addressing the robustness generalization issue of GAIRAT and proposing an advanced AT method to enhance model robustness. A discussion of model interpretability is therefore out of the scope of this paper.
Comments 9: The article discusses robust overfitting, but it would be helpful to include additional extensive analysis on whether the proposed method avoids overfitting not just in terms of robustness but also in terms of generalization to new, unseen data.
Responses 9: We appreciate the insightful feedback from the reviewer. According to the suggestion, we included additional analysis on regular overfitting in the paper (Figure 4 on page 11 and line 285 - 293).
Comments 10: Improve the final section of Conclusions and Future Work. For example, how could the method be adapted for different types of neural network architectures or different domains?
Responses 10: We have, accordingly, added Section 5 for a better discussion of future work. For model architectures, we selected PreActResNet-18 and Wide ResNet-34-10, the same choices as in [B]; no adaptation is necessary for our method to be applied to either architecture. As for different domains, we assume the reviewer means applying our method to other data types; this limitation has been discussed in Section 5.
[A] Wu, Dongxian, et al. "Adversarial weight perturbation helps robust generalization." arXiv preprint arXiv:2004.05884 (2020).
[B] Yu, Chaojian, et al. "Robust weight perturbation for adversarial training". arXiv preprint arXiv:2205.14826 (2022).
Reviewer 3 Report
Comments and Suggestions for Authors
The overall quality of the paper is good. But I would like to see better argumentation behind your decisions made during the experiments. Why did you choose the training parameters mentioned in the paper? Why did you choose the mentioned adversarial attacks?
Author Response
We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.
Comments 1: Why did you choose the training parameters mentioned in the paper? Why did you choose the mentioned adversarial attacks?
Responses 1: Based on our understanding, the reviewer is asking about the “training parameter” part of Section 4.1. We would like to provide more clarification regarding this part. For the choice of commonly used training parameters, such as batch size, learning rate, and momentum, we follow the settings in previous works [A, B]. For the other training parameters of our method GAWP, we choose the values based on the results of the ablation study in Section 4.3.2.
We chose the PGD attack for training and the AA (AutoAttack) attack for evaluation, the same as previous works [A, B]. In particular, the PGD attack is less time-consuming than AA, which is appealing during training. AA is an ensemble of adversarial attacks that includes the PGD attack as well as attacks unseen during training, which provides a more comprehensive evaluation of the model's robustness. A higher test accuracy under AA indicates improved model robustness, and a narrower gap between the test accuracies under AA and the PGD attack implies enhanced robustness generalization.
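To make the training-time attack concrete, here is a minimal PGD sketch under the L∞ threat model (a generic textbook implementation with assumed hyper-parameter values, not the exact code used in the paper):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: iterated signed-gradient ascent projected onto the eps-ball."""
    # Random start inside the ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss along the gradient sign, then project back to the ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

AA, by contrast, runs a fixed ensemble of parameter-free attacks (including PGD variants and a black-box attack), which is why it is reserved for evaluation.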
[A] Yu, Chaojian, et al. "Robust weight perturbation for adversarial training". arXiv preprint arXiv:2205.14826 (2022).
[B] Yu, Chaojian, et al. "Understanding robust overfitting of adversarial training and beyond". arXiv preprint arXiv:2206.08675 (2022).
Reviewer 4 Report
Comments and Suggestions for Authors
The proposed study on adversarial sample learning methods presents a compelling and relevant topic. However, I would like to suggest several revisions to further enhance the research:
(1) It would be beneficial to include a more comprehensive set of comparative experiments with existing studies. This would provide a clearer understanding of the advantages and limitations of the proposed method relative to the current state-of-the-art.
(2) The inclusion of visualizations for adversarial samples in the dataset would be valuable. Such visualizations could help to better illustrate the effectiveness and characteristics of the proposed approach.
(3) An analysis of the time and space complexity of the proposed method should be incorporated. This would provide a more thorough evaluation of the method’s efficiency and feasibility in practical applications.
(4) The content would be enriched by including a discussion related to the following reference: "Textual Adversarial Training of Machine Learning Models for Resistance to Adversarial Examples"
(5) It is recommended to discuss the limitations of the proposed method and potential areas for future improvements. This would provide a balanced view of the method's current capabilities and suggest directions for further research.
Author Response
We extend our gratitude to the reviewer for taking the time to review this manuscript. Please find the detailed responses below.
Comments 1: It would be beneficial to include a more comprehensive set of comparative experiments with existing studies. This would provide a clearer understanding of the advantages and limitations of the proposed method relative to the current state-of-the-art.
Responses 1: Thank you for the constructive feedback! We have added more comparative experiments with MART and MLCAT_WP (updated Tables 3 and 4 on page 10). Note that we have also compared our method with RWP, which is, to the best of our knowledge, the state-of-the-art weight perturbation strategy.
Comments 2: The inclusion of visualizations for adversarial samples in the dataset would be valuable. Such visualizations could help to better illustrate the effectiveness and characteristics of the proposed approach.
Responses 2: We agree with this comment. Accordingly, we added a figure (Figure 1 on page 2) in the Introduction section for a better illustration of the problems under consideration in this paper. Note that visualizations of adversarial examples are included in Figure 1. We have also added Figure 2 on page 3 for the visualization of the weight loss landscape, which provides a clearer explanation of the motivation behind the proposed approach.
Comments 3: An analysis of the time and space complexity of the proposed method should be incorporated. This would provide a more thorough evaluation of the method’s efficiency and feasibility in practical applications.
Responses 3: We have, accordingly, added Figure 5 (e) on page 13 for the analysis of time complexity. As for space complexity, all the experiments were executed on 4 NVIDIA GeForce 2080Ti GPUs, as mentioned in line 254.
Comments 4: The content would be enriched by including a discussion related to the following reference: "Textual Adversarial Training of Machine Learning Models for Resistance to Adversarial Examples"
Responses 4: We agree that it is a limitation of our paper to discuss the application of the proposed method only in the context of image data. Therefore, we have added Section 5 to discuss the limitations of this paper and potential future work. The adaptation of the proposed method to text data is discussed from line 380 to line 389.
Comments 5: It is recommended to discuss the limitations of the proposed method and potential areas for future improvements. This would provide a balanced view of the method's current capabilities and suggest directions for further research.
Responses 5: Thank you for the constructive comment! We have added Section 5 to discuss the limitations of the proposed method and future research directions.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have addressed my concerns well, and I suggest acceptance.
Author Response
We sincerely appreciate the positive feedback from the reviewer.
Reviewer 2 Report
Comments and Suggestions for Authors
1) The introduction should give a good background; however, on this particular aspect, it fails to emphasize the motivation of the research. Why is addressing the robustness generalization issue in adversarial training so important? A stronger focus on the real-world applications of this research would certainly improve the reader's appreciation of the relevance of this work.
2) Explain the importance of your findings for society at large. In what way could this work be used or extended in future research on adversarial training or in related fields? It may be worth highlighting a few sentences about possible uses or the real-world significance of this investigation.
3) Clarity and Summarization: It is recommended to dedicate this section to briefly reminding readers of the major findings and the main contributions that the paper offers to the scientific community. Currently, it only reasserts the concept, and the substance of what is stated is not entirely concise and could be improved. The results could be made more explicit, particularly when explaining how the proposed GAWP method improves on other approaches.
4) The algorithm is introduced in a very compact manner; it might be useful to explain in more detail how it works, perhaps expanding on the GAWP algorithm for those who are not well versed in it.
5) Future Work: The conclusion could supply more specific recommendations for further studies. Directions for future research are mentioned only briefly in the conclusion, while the paper's limitations are analyzed in a dedicated section.
It is important to keep the language at a level that the general population, including those without a deep understanding of adversarial training, can comprehend. There is also a tendency to use technical terms when common ones would do; these could be put into a better, clearer perspective.
Check for grammatical errors. Proofread the manuscript again.
Author Response
Comments 1: The introduction should give a good background; however, on this particular aspect, it fails to emphasize the motivation of the research. Why is addressing the robustness generalization issue in adversarial training so important? A stronger focus on the real-world applications of this research would certainly improve the reader's appreciation of the relevance of this work.
Responses 1: Thank you for the comments! We have provided a more detailed discussion of the motivation of the research (line 32 - line 43). In summary, GAIRAT follows an intuition that has proven successful in regular training settings. However, the current version of GAIRAT is unreliable because of the robustness generalization issue, and no practical remedy has yet been proposed to solve it. Through this paper, we hope to provide more insight into the underlying cause of this unaddressed issue. Additionally, given the success of similar ideas in regular training, we believe there is untapped potential for GAIRAT in terms of comprehensively improving model robustness.
Comments 2: Explain the importance of your findings for society at large. In what way could this work be used or extended in future research on adversarial training or in related fields? It may be worth highlighting a few sentences about possible uses or the real-world significance of this investigation.
Responses 2: Thank you for the comments! We have added an additional paragraph (line 419 - line 428) to discuss the real-world application of this work.
Comments 3: Clarity and Summarization: It is recommended to dedicate this section to briefly reminding readers of the major findings and the main contributions that the paper offers to the scientific community. Currently, it only reasserts the concept, and the substance of what is stated is not entirely concise and could be improved. The results could be made more explicit, particularly when explaining how the proposed GAWP method improves on other approaches.
Responses 3: By "this section", we assume the reviewer means Section 6 (Conclusions). Given that a detailed comparison between GAWP and all the baseline methods is provided in Section 4, we include references to the main tables (Tables 3 and 4) in line 418 for an explicit presentation. We have also reiterated the main findings in this section: in lines 410 to 411, we note that the robustness generalization issue can be mitigated by imposing AWP regularization, and in lines 413 to 415, we reiterate the main difference between GAWP and the other weight perturbation strategies, which is also why our method outperforms them.
Comments 4: The algorithm is introduced in a very compact manner; it might be useful to explain in more detail how it works, perhaps expanding on the GAWP algorithm for those who are not well versed in it.
Responses 4: We have provided detailed coverage of the proposed algorithm: preliminaries and background information (Section 3.1), the intuition behind GAWP (Section 3.2), and technical details with pseudo-code (Section 3.3).
Comments 5: Future Work: The conclusion could supply more specific recommendations for further studies. Directions for future research are mentioned only briefly in the conclusion, while the paper's limitations are analyzed in a dedicated section.
Responses 5: We have added the second paragraph in Section 6 (line 419 - line 428), discussing real-world applications of our work. Suggested future directions have been discussed in Section 5 (line 391 - line 407).
Comments 6: It is important to keep the language at a level that the general population, including those without a deep understanding of adversarial training, can comprehend. There is also a tendency to use technical terms when common ones would do; these could be put into a better, clearer perspective.
Responses 6: We cannot address this concern because the reviewer did not provide any specific terms requiring more explanation.
Comments 7: Check for grammatical errors. Proofread the manuscript again.
Responses 7: Thank you for the comments! We have hired a native speaker to proofread the manuscript. Hopefully, the manuscript is now free of grammatical errors.
Reviewer 4 Report
Comments and Suggestions for Authors
I recommend acceptance.
Author Response
We sincerely appreciate the positive feedback from the reviewer.
Round 3
Reviewer 2 Report
Comments and Suggestions for Authors
All remarks have been taken into consideration by the authors. The manuscript has significantly improved.
Comments on the Quality of English Language
Minor.