Regularization Meets Enhanced Multi-Stage Fusion Features: Making CNN More Robust against White-Box Adversarial Attacks
Abstract
1. Introduction
- We propose a new network, EMSF2Net, whose enhanced multi-stage fusion feature represents and preserves the global information of each channel.
- We show that regularizing the enhanced multi-stage fusion feature significantly improves the adversarial robustness of a CNN (a hedged sketch of this idea follows this list).
- Extensive experiments on white-box attacks under different settings demonstrate the effectiveness and robustness of the proposed approach.
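To make the contributions concrete, below is a minimal, illustrative sketch of the fusion-plus-regularization idea. The SE-style channel gating, the ResNet50 stage widths, and the L2 clean/adversarial penalty are all assumptions for illustration; the paper's actual MSFE, MSF2, and regularization terms are defined in Section 2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionSketch(nn.Module):
    """Illustrative only: enhance each backbone stage's feature map with an
    SE-style channel gate (an assumption), globally pool it, and concatenate
    the per-stage vectors into one multi-stage fusion feature."""

    def __init__(self, stage_channels=(256, 512, 1024, 2048), r=16):
        super().__init__()
        self.gates = nn.ModuleList(
            nn.Sequential(nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c), nn.Sigmoid())
            for c in stage_channels
        )

    def forward(self, stage_feats):
        fused = []
        for f, gate in zip(stage_feats, self.gates):
            s = f.mean(dim=(2, 3))         # global average pooling keeps per-channel global info
            fused.append(s * gate(s))      # channel-wise enhancement
        return torch.cat(fused, dim=1)     # multi-stage fusion feature

def regularized_loss(logits, y, fused_adv, fused_clean, lam=1.0):
    """Cross-entropy plus a fusion-feature regularizer; an L2 penalty between
    clean and adversarial fusion features is assumed here for illustration."""
    return F.cross_entropy(logits, y) + lam * F.mse_loss(fused_adv, fused_clean)
```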
2. The Proposed Method
2.1. Multi-Stage Features Enhancement (MSFE)
2.2. Multi-Stage Features Fusion (MSF2)
2.3. Regularization
3. Dataset and Adversarial Attacks
3.1. Dataset: CIFAR-10
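For reference, CIFAR-10 consists of 60,000 32×32 color images in 10 classes (50,000 training / 10,000 test). A standard torchvision loading snippet (the paper's exact preprocessing is not reproduced; the attack sketches below assume pixel values in [0, 1]):

```python
import torchvision
import torchvision.transforms as T

transform = T.ToTensor()  # maps images to tensors with values in [0, 1]
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
```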
3.2. Attack Methods
3.2.1. Fast Gradient Sign Method
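For reference, FGSM perturbs the input with a single signed-gradient step, x_adv = x + ε·sign(∇ₓL). A minimal PyTorch sketch (the model, inputs, and ε here are illustrative, not the paper's settings):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step Fast Gradient Sign Method: x_adv = x + eps * sign(grad_x L)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    # Step in the direction that increases the loss, then clip to the valid pixel range.
    return (x + eps * grad.sign()).clamp(0, 1).detach()
```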
3.2.2. Projected Gradient Descent
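PGD iterates signed-gradient steps from a random start and projects each iterate back into the ε-ball around the clean input. A minimal L∞ sketch (the step size α and random start follow the standard formulation; the paper's hyperparameters are not reproduced):

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps, alpha, steps):
    """L-infinity PGD: repeated signed-gradient steps with projection."""
    x = x.detach()
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # project + valid range
    return x_adv
```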
3.2.3. Momentum Iterative Fast Gradient Sign Method
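MI-FGSM stabilizes the update direction by accumulating an L1-normalized gradient with a decay factor μ before taking the sign. A minimal sketch (the step-size schedule α = ε/steps is the common convention, not necessarily the paper's setting):

```python
import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps, steps, mu=1.0):
    """Momentum Iterative FGSM: momentum over L1-normalized gradients."""
    x = x.detach()
    alpha = eps / steps                  # common convention: steps * alpha = eps
    g = torch.zeros_like(x)              # momentum accumulator
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            l1 = grad.abs().sum(dim=(1, 2, 3), keepdim=True) + 1e-12
            g = mu * g + grad / l1       # decay old direction, add normalized gradient
            x_adv = (x + (x_adv + alpha * g.sign() - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv
```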
3.2.4. Diverse Inputs Iterative Fast Gradient Sign Method
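DI2-FGSM applies a random resize-and-pad transformation to the input with probability p before each gradient step of an iterative FGSM attack, which diversifies the gradients. A sketch of the input transformation (the 28–32 size range assumes 32×32 CIFAR-10 inputs and is illustrative):

```python
import torch
import torch.nn.functional as F

def diverse_input(x, low=28, high=32, p=0.5):
    """Random resize-and-pad transform used by DI2-FGSM; with probability
    1 - p the input is passed through unchanged."""
    if torch.rand(1).item() > p:
        return x
    size = torch.randint(low, high + 1, (1,)).item()
    resized = F.interpolate(x, size=size, mode="nearest")
    pad = high - size
    left = torch.randint(0, pad + 1, (1,)).item()
    top = torch.randint(0, pad + 1, (1,)).item()
    # F.pad order for 4D tensors: (left, right, top, bottom), zero-filled.
    return F.pad(resized, (left, pad - left, top, pad - top))
```

Inside an iterative FGSM/PGD loop such as the ones above, the gradient would then be taken through `model(diverse_input(x_adv))` instead of `model(x_adv)`.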
3.2.5. Averaged Projected Gradient Descent
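Averaged PGD (EOTPGD) replaces the single gradient in each PGD step with an average over several stochastic forward passes (Expectation over Transformation), which targets models with randomized components. A minimal sketch; with a deterministic model the samples coincide and this reduces to plain PGD:

```python
import torch
import torch.nn.functional as F

def eot_pgd(model, x, y, eps, alpha, steps, n_samples=10):
    """PGD with the per-step gradient averaged over n_samples forward passes."""
    x = x.detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x)
        for _ in range(n_samples):  # each pass may hit different model randomness
            loss = F.cross_entropy(model(x_adv), y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * (grad / n_samples).sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv
```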
3.2.6. Carlini and Wagner
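The Carlini and Wagner (CW) L2 attack optimizes the perturbation directly, minimizing the L2 distortion plus a margin term on the logits, with the adversarial image parameterized in tanh space so pixels stay in [0, 1]. A compact untargeted sketch (c, κ, the Adam learning rate, and the 10-class setting are illustrative; the full attack also binary-searches over c):

```python
import torch
import torch.nn.functional as F

def cw_l2(model, x, y, steps=100, c=1.0, kappa=0.0, lr=0.01, num_classes=10):
    """CW-L2 sketch: minimize ||x' - x||_2^2 + c * f(x') in tanh space."""
    x = x.detach()
    # Parameterize x' = 0.5 * (tanh(w) + 1) so pixels stay in [0, 1].
    w = torch.atanh((2 * x - 1).clamp(-0.999, 0.999)).requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    one_hot = F.one_hot(y, num_classes).float()
    for _ in range(steps):
        x_adv = 0.5 * (torch.tanh(w) + 1)
        logits = model(x_adv)
        true_logit = (logits * one_hot).sum(1)
        best_other = (logits - 1e4 * one_hot).max(1).values
        # Margin term f(x'): positive while the true class still has the top logit.
        margin = (true_logit - best_other + kappa).clamp(min=0)
        loss = ((x_adv - x) ** 2).flatten(1).sum(1) + c * margin
        opt.zero_grad()
        loss.sum().backward()
        opt.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()
```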
4. Comparison Experiments
4.1. Comparison Methods
4.2. Performance against Adversarial Attacks with the L∞-Norm
4.3. Performance against Adversarial Attacks with the L2-Norm
5. Ablation Analysis
5.1. Performance on Three Methods
5.2. Performance on Each Class of CIFAR-10
5.3. Grad-Cam Visualization
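Grad-CAM highlights the input regions a network relies on by weighting a chosen layer's activations with the spatially averaged gradients of the class score. A minimal hook-based sketch (the target layer and visualization details used in the paper are not reproduced here):

```python
import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, x, class_idx):
    """Minimal Grad-CAM: channel-weight the target layer's activations by the
    pooled gradients of the class score, ReLU, then upsample to input size."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    score = model(x)[:, class_idx].sum()
    score.backward()
    h1.remove(); h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # per-channel importance
    cam = F.relu((weights * acts["a"]).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
```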
5.4. t-SNE Visualization
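t-SNE projects high-dimensional features to 2-D for inspecting class separability. A self-contained sketch with placeholder data (in the paper's setting, `features` would be the fusion features of clean or attacked test images and `labels` the CIFAR-10 classes):

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

features = np.random.randn(1000, 512)          # placeholder feature matrix
labels = np.random.randint(0, 10, size=1000)   # placeholder class labels

emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(features)
plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=5)
plt.title("t-SNE of fusion features (one color per CIFAR-10 class)")
plt.show()
```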
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
[Table: layer configuration of the ResNet50 backbone, Stage 0 through Stage 4 plus three subsequent layers; the per-layer entries are unavailable.]
Classification accuracy with no attack and under FGSM at four increasing attack strengths (ε₁ < ε₂ < ε₃ < ε₄).

Method | No Attack | FGSM: ε₁ | ε₂ | ε₃ | ε₄ |
---|---|---|---|---|---|
MART | 83.6% | 78.4% | 72.9% | 61.6% | 42.6% |
RobNet | 82.7% | 76.9% | 70.6% | 58.4% | 38.0% |
BPFC | 82.4% | 73.7% | 64.6% | 50.1% | 33.7% |
EMSF2Net (Ours) | 92.7% | 83.3% | 81.1% | 73.3% | 42.7% |
Classification accuracy under 10-step L∞ attacks at four increasing attack strengths (ε₁ < ε₂ < ε₃ < ε₄).

Method | PGD-10: ε₁ | ε₂ | ε₃ | ε₄ | MI-FGSM-10: ε₁ | ε₂ | ε₃ | ε₄ |
---|---|---|---|---|---|---|---|---|
MART | 79.7% | 75.6% | 65.5% | 43.3% | 78.3% | 72.4% | 59.1% | 32.9% |
RobNet | 78.3% | 73.4% | 62.1% | 38.6% | 76.8% | 70.1% | 55.7% | 27.8% |
BPFC | 75.7% | 68.2% | 52.0% | 26.9% | 73.4% | 63.2% | 44.5% | 20.5% |
EMSF2Net (Ours) | 82.1% | 80.6% | 74.8% | 53.8% | 82.1% | 81.0% | 77.5% | 66.5% |

Method | DI2-FGSM-10: ε₁ | ε₂ | ε₃ | ε₄ | EOTPGD-10: ε₁ | ε₂ | ε₃ | ε₄ |
---|---|---|---|---|---|---|---|---|
MART | 79.9% | 76.1% | 66.6% | 45.8% | 79.7% | 75.6% | 65.4% | 43.5% |
RobNet | 78.6% | 74.0% | 63.3% | 40.9% | 78.3% | 73.3% | 62.1% | 38.4% |
BPFC | 76.1% | 69.4% | 53.6% | 29.1% | 75.7% | 68.3% | 52.0% | 26.9% |
EMSF2Net (Ours) | 81.2% | 79.5% | 73.8% | 57.2% | 82.1% | 80.7% | 74.7% | 54.1% |
Classification accuracy under 20-step L∞ attacks at four increasing attack strengths (ε₁ < ε₂ < ε₃ < ε₄).

Method | PGD-20: ε₁ | ε₂ | ε₃ | ε₄ | MI-FGSM-20: ε₁ | ε₂ | ε₃ | ε₄ |
---|---|---|---|---|---|---|---|---|
MART | 78.3% | 72.2% | 57.7% | 27.7% | 78.3% | 72.0% | 57.3% | 26.6% |
RobNet | 76.7% | 69.7% | 53.9% | 22.5% | 76.7% | 69.6% | 53.5% | 21.8% |
BPFC | 73.3% | 61.9% | 39.9% | 13.0% | 73.2% | 61.8% | 39.7% | 12.9% |
EMSF2Net (Ours) | 81.4% | 79.0% | 70.2% | 45.9% | 81.5% | 79.0% | 69.6% | 50.5% |

Method | DI2-FGSM-20: ε₁ | ε₂ | ε₃ | ε₄ | EOTPGD-20: ε₁ | ε₂ | ε₃ | ε₄ |
---|---|---|---|---|---|---|---|---|
MART | 78.6% | 73.0% | 59.1% | 30.1% | 78.3% | 72.2% | 57.9% | 27.6% |
RobNet | 77.0% | 70.5% | 55.5% | 24.5% | 76.7% | 69.7% | 53.7% | 22.5% |
BPFC | 73.8% | 63.5% | 41.9% | 14.5% | 73.3% | 62.0% | 39.8% | 13.1% |
EMSF2Net (Ours) | 80.2% | 76.9% | 68.2% | 45.1% | 81.5% | 79.4% | 70.5% | 45.4% |
Classification accuracy under L2-norm PGD at three increasing attack strengths (ε₁ < ε₂ < ε₃).

Method | PGD-L2-10: ε₁ | ε₂ | ε₃ | PGD-L2-20: ε₁ | ε₂ | ε₃ |
---|---|---|---|---|---|---|
MART | 46.2% | 17.2% | 4.7% | 37.5% | 5.5% | 0.4% |
RobNet | 42.8% | 14.4% | 3.9% | 34.5% | 4.9% | 0.5% |
BPFC | 47.1% | 23.0% | 11.3% | 41.7% | 12.9% | 3.3% |
EMSF2Net (Ours) | 78.8% | 72.4% | 64.1% | 74.1% | 62.9% | 52.0% |
Classification accuracy under L2-norm PGD-40 at three increasing attack strengths (ε₁ < ε₂ < ε₃) and under CW with 100, 500, and 1000 iterations.

Method | PGD-L2-40: ε₁ | ε₂ | ε₃ | CW: 100 iter. | 500 iter. | 1000 iter. |
---|---|---|---|---|---|---|
MART | 34.1% | 3.1% | 0.1% | 22.9% | 21.0% | 20.9% |
RobNet | 31.5% | 3.1% | 0.1% | 12.9% | 10.8% | 10.8% |
BPFC | 39.4% | 8.7% | 1.4% | 59.4% | 59.4% | 59.4% |
EMSF2Net (Ours) | 67.6% | 50.7% | 39.6% | 72.5% | 69.1% | 67.7% |
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).