An Evolutionary, Gradient-Free, Query-Efficient, Black-Box Algorithm for Generating Adversarial Instances in Deep Convolutional Neural Networks
 Mathieu Dutour-Sikiric
Round 1
Reviewer 1 Report
I have several issues with this work:
1) The references 32, 33, 34 are presented as state-of-the-art but are dated in 2016 at the latest. That does not seem right.
2) The merit of gradient-free methods, which make it possible to treat black-box models, is clear; does it really need to be said?
3) Attacking a model that is a true black box (such as a web site) means issuing queries to it, possibly a great many. Can you efficiently fool Google Images, TinEye, or Yandex? I have my doubts that this can be done.
4) At a minimum, you need to report the number of function calls needed to reach this performance. This is common for black-box algorithms and is reported, for example, by scipy.optimize.minimize.
Author Response
Ref 1
Thank you for taking the time to read the paper and offer insightful comments.
1) The references 32, 33, 34 are presented as state-of-the-art but are dated in 2016 at the latest.
We fixed this in the abstract to emphasize that our comparison is against the most commonly used models to date.
2) The merit of gradient-free methods, which make it possible to treat black-box models, is clear; does it really need to be said?
We added in the paper "Herein, we present an evolutionary, gradient-free optimization approach for generating adversarial instances, more suitable for real-life scenarios, because usually there is no access to a model's internals, including the gradients; thus, it is important to craft attacks that do not use gradients."
3) Attacking a model that is a true black box (such as a web site) means issuing queries to it, possibly a great many. Can you efficiently fool Google Images, TinEye, or Yandex? I have my doubts that this can be done.
That is a good question, one which we are currently pursuing, as we hope to strengthen our algorithm and its applicability. We hope to report on such findings in the near future.
4) At a minimum, you need to report the number of function calls needed to reach this performance. This is common for black-box algorithms and is reported, for example, by scipy.optimize.minimize.
These are the “queries” columns in all the tables; we have sharpened this point in the captions.
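As an illustration of what a "queries" count measures, here is a minimal sketch, not taken from the paper, that wraps a black-box function so that every call is counted. The names `QueryCounter`, `toy_loss`, and `random_search` are hypothetical, and the toy loss only stands in for a real model query. SciPy reports the analogous number as the `nfev` field of the result returned by `scipy.optimize.minimize`.

```python
import random


class QueryCounter:
    """Wrap a black-box function and count how many times it is called."""

    def __init__(self, fn):
        self.fn = fn
        self.queries = 0

    def __call__(self, x):
        self.queries += 1
        return self.fn(x)


def toy_loss(x):
    # Hypothetical stand-in for one query to a black-box model.
    return sum((xi - 0.5) ** 2 for xi in x)


def random_search(loss, x0, steps, step_size, seed=0):
    """Gradient-free search: keep a random candidate only if it lowers the loss."""
    rng = random.Random(seed)
    best_x, best_loss = list(x0), loss(x0)  # one query for the starting point
    for _ in range(steps):
        cand = [xi + rng.uniform(-step_size, step_size) for xi in best_x]
        cand_loss = loss(cand)  # one query per candidate
        if cand_loss < best_loss:
            best_x, best_loss = cand, cand_loss
    return best_x, best_loss


start = [0.0, 0.0, 0.0]
counted = QueryCounter(toy_loss)
best_x, best_loss = random_search(counted, start, steps=200, step_size=0.1)
print(f"queries used: {counted.queries}, final loss: {best_loss:.4f}")
```

The count is independent of the optimizer: any search routine that only sees the wrapped callable is charged one query per call.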
Reviewer 2 Report
The authors propose the Query-Efficient Evolutionary Attack (QuEry Attack), which requires the output logits of the classifier. It is a well-thought-out and well-explained approach to gradient-free optimization problems. The authors tested the method on standard datasets with commonly used deep-learning models. I have a few comments, detailed below.
1. Please explain Figure 1 in detail so that readers can better understand the differences.
2. Also, please explain how the data were split and used for training and testing. What lines 295-297 say is not clear.
3. It is not clear which perturbation method is used to produce the image x̂. Did the authors try various methods of generating noise and compare their performance? For example, they could generate the noise with the Gaussian method but, when evaluating performance, add Laurent noise to the test dataset to check model performance.
4. Can the authors present the analysis using other metrics, in addition to the accuracy currently reported?
Author Response
Ref 2
Thank you for taking the time to read the paper and offer insightful comments.
The authors propose the Query-Efficient Evolutionary Attack (QuEry Attack), which requires the output logits of the classifier. It is a well-thought-out and well-explained approach to gradient-free optimization problems. The authors tested the method on standard datasets with commonly used deep-learning models. I have a few comments, detailed below.
- Please explain Figure 1 in detail so that readers can better understand the differences.
Caption has been fixed.
- Also, please explain how the data were split and used for training and testing. What lines 295-297 say is not clear.
Added “the models are pretrained and taken from PyTorch”.
Fixed 295-297.
- It is not clear which perturbation method is used to produce the image x̂. Did the authors try various methods of generating noise and compare their performance? For example, they could generate the noise with the Gaussian method but, when evaluating performance, add Laurent noise to the test dataset to check model performance.
We generate the perturbation with our method, which combines a strong initialization with an evolutionary algorithm. We compare the performance against 2 other black-box attacks, defenses, and robust models. Usually, adding Gaussian noise does not fool the model, so “smarter” noise needs to be designed, as we did.
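To make the idea of an evolutionary search over bounded perturbations concrete, here is a toy sketch. It is not the QuEry Attack implementation: the linear scorer, the 8x8 image, the budget `EPS`, the population size, and the elitist selection rule are all assumptions chosen for illustration. It only shows the general pattern of mutating square patches, querying a black-box score, and keeping the best candidates.

```python
import random

SIDE = 8    # toy image is SIDE x SIDE pixels, stored as a flat list (assumption)
EPS = 0.1   # hypothetical L-infinity budget


def make_toy_model(rng):
    """Return a hypothetical black-box scorer; a lower score is better for the attacker."""
    weights = [rng.uniform(-1.0, 1.0) for _ in range(SIDE * SIDE)]

    def score(image):
        return sum(w * p for w, p in zip(weights, image))

    return score


def square_mutation(candidate, original, rng, size=2):
    """Overwrite one random size x size square with original + EPS or original - EPS."""
    child = list(candidate)
    row = rng.randrange(SIDE - size + 1)
    col = rng.randrange(SIDE - size + 1)
    sign = rng.choice((-1.0, 1.0))
    for r in range(row, row + size):
        for c in range(col, col + size):
            i = r * SIDE + c
            # Stay inside both the EPS-ball and the valid pixel range [0, 1].
            child[i] = min(max(original[i] + sign * EPS, 0.0), 1.0)
    return child


def evolve(score, original, generations=30, pop_size=6, seed=0):
    """Elitist evolutionary loop; returns the best image, its score, and the query count."""
    rng = random.Random(seed)
    population = [list(original)]
    population += [square_mutation(original, original, rng) for _ in range(pop_size - 1)]
    scored = [(score(ind), ind) for ind in population]
    queries = pop_size
    for _ in range(generations):
        children = [square_mutation(ind, original, rng) for _, ind in scored]
        scored += [(score(ch), ch) for ch in children]
        queries += len(children)
        # Keep the pop_size lowest-scoring individuals (parents and children compete).
        scored.sort(key=lambda pair: pair[0])
        scored = scored[:pop_size]
    best_score, best = scored[0]
    return best, best_score, queries


score = make_toy_model(random.Random(1))
original = [0.5] * (SIDE * SIDE)
clean_score = score(original)
best, best_score, queries = evolve(score, original)
print(f"clean score {clean_score:.3f}, best score {best_score:.3f}, queries {queries}")
```

Because the unperturbed image is part of the initial population and selection is elitist, the best score can never be worse than the clean score, and every candidate stays within `EPS` of the original by construction.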
- Can the authors present the analysis using other metrics, in addition to the accuracy currently reported?
Since this is a bounded attack, the MSE is always the given epsilon.
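One way to read this reply can be checked numerically. The sketch below assumes the bound is an L-infinity bound of size ε and uses a hypothetical budget of 8/255; neither detail is stated in this exchange. Under those assumptions, a perturbation whose every entry sits at +ε or -ε has both maximum absolute value and root-mean-square value equal to ε, whereas a sparser perturbation has the same maximum but a smaller root-mean-square value.

```python
import math


def linf_norm(delta):
    """Largest absolute entry of the perturbation."""
    return max(abs(d) for d in delta)


def rms(delta):
    """Root-mean-square value of the perturbation."""
    return math.sqrt(sum(d * d for d in delta) / len(delta))


eps = 8 / 255  # hypothetical budget, chosen only for illustration

# Every entry at the boundary of the ball: +eps or -eps.
delta_full = [eps if i % 2 == 0 else -eps for i in range(16)]

# Only one entry perturbed; the rest untouched.
delta_sparse = [eps] + [0.0] * 15

for name, delta in (("full", delta_full), ("sparse", delta_sparse)):
    print(f"{name}: Linf = {linf_norm(delta):.5f}, RMS = {rms(delta):.5f}")
```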
Round 2
Reviewer 1 Report
In my initial criticism, I mentioned that the authors' claim of using state-of-the-art classification models was incorrect, since the models date from 2016.
Now they have removed the statement about "state of the art". But does their methodology still work on state-of-the-art models? Maybe not.
Author Response
Thank you for helping us improve the paper.
The following paragraph has been added:
We chose to use the above models since they are state-of-the-art CNN architectures. QuEry Attack exploits the nature of convolution layers by using square mutations and stripes initialisation, which were shown to be very effective [18]. Further, these architectures serve as a backbone for many other downstream tasks, such as object detection [36], semantic segmentation [37], and image captioning [38]. Recently, vision transformers have beaten CNNs in image classification tasks [39], but they require huge amounts of data and resources that are not available to many (including us). Future work will be dedicated to expanding the attack to vision transformers.
In addition, we want to emphasize that this attack was meant to work on CNN models (we even added this to the title), since it exploits the nature of convolution layers by using square mutations and stripes initialization, which were proven to be effective. ResNet, VGG, and Inception are still the SOTA in terms of CNNs.
Furthermore, these architectures serve as a backbone for many other downstream tasks, such as object detection, semantic segmentation, and image captioning.
(https://iopscience.iop.org/article/10.1088/1742-6596/1544/1/012196/pdf)
Many adversarial-attack papers focus on these models, since results on them tend to generalize to other models. The following are all papers from 2022 that evaluate adversarial attacks against ResNet, VGG, and Inception:
https://arxiv.org/pdf/2210.08159v1.pdf
https://arxiv.org/pdf/2210.05968v1.pdf
https://arxiv.org/pdf/2207.02391v1.pdf
https://arxiv.org/pdf/2206.08316v1.pdf
In recent years, Vision Transformers have beaten CNNs, but they need huge amounts of data and resources, which we don’t have, as noted above.
Thus, we chose to focus on 3 main CNN models and 3 different datasets.
Future work will be dedicated to attacking Vision Transformers.
We hope you find this satisfactory, and hope the paper can be published.
Thank you very much.
Round 3
Reviewer 1 Report
The major weakness of the manuscript remains that it is limited to "Inception-v3 [33], ResNet-50 [34], and VGG-16-BN [35]".
Those are old models with publication dates of 2016, 2016, and 2014.
What happens for newer models? That is a very reasonable question and the authors do not appear to consider it.
Author Response
See editorial reply.