# Methods for Preventing Visual Attacks in Convolutional Neural Networks Based on Data Discard and Dimensionality Reduction

## Abstract


## 1. Introduction

## 2. Materials and Methods

- maximizing the flow of information through the network by carefully balancing depth and width, with the number of feature maps increased before each pooling layer;
- with increasing depth, the number of features (the width of a layer) also increases systematically;
- the width of each layer is increased so that more combinations of features are available to the next layer;
- only 3 × 3 convolutions are used wherever possible, since 5 × 5 and 7 × 7 filters can be decomposed into stacks of 3 × 3 convolutions.
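The last point can be checked numerically. The following sketch (not from the paper; the helper functions are illustrative) compares the receptive field and weight count of stacked 3 × 3 convolutions against single 5 × 5 and 7 × 7 filters for a layer with the same number of input and output channels:

```python
def conv_params(kernel: int, channels: int) -> int:
    """Weight count of one conv layer with `channels` in and out (bias omitted)."""
    return kernel * kernel * channels * channels

def stacked_receptive_field(kernel: int, layers: int) -> int:
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return layers * (kernel - 1) + 1

C = 64  # example channel width

# Two stacked 3x3 layers cover the same 5x5 receptive field with fewer weights.
assert stacked_receptive_field(3, 2) == 5
print(conv_params(5, C), 2 * conv_params(3, C))   # 102400 vs. 73728

# Three stacked 3x3 layers cover the same 7x7 receptive field.
assert stacked_receptive_field(3, 3) == 7
print(conv_params(7, C), 3 * conv_params(3, C))   # 200704 vs. 110592
```

The stacked variant is cheaper (roughly 28% fewer weights for 5 × 5, 45% fewer for 7 × 7) and inserts extra nonlinearities between the layers.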

## 3. Results

## 4. Discussion

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

1. Okarma, K. Applications of Computer Vision in Automation and Robotics. Appl. Sci. 2020, 10, 6783.
2. Đurović, P.; Vidović, I.; Cupec, R. Semantic Component Association within Object Classes Based on Convex Polyhedrons. Appl. Sci. 2020, 10, 2641.
3. Merino, I.; Azpiazu, J.; Remazeilles, A.; Sierra, B. Histogram-Based Descriptor Subset Selection for Visual Recognition of Industrial Parts. Appl. Sci. 2020, 10, 3701.
4. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
5. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet classification with deep convolutional neural networks. In Proceedings of the 26th Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1106–1114.
6. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
7. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. Available online: https://arxiv.org/abs/1409.1556 (accessed on 28 April 2021).
8. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. Available online: https://arxiv.org/abs/1409.4842 (accessed on 28 April 2021).
9. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. Available online: https://arxiv.org/abs/1512.03385 (accessed on 28 April 2021).
10. Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. Available online: https://arxiv.org/abs/1610.02357 (accessed on 28 April 2021).
11. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 24–27 June 2014; pp. 580–587.
12. Girshick, R. Fast R-CNN. In Proceedings of the International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
13. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the 29th Conference on Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 91–99.
14. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. Available online: https://arxiv.org/abs/1703.06870 (accessed on 28 April 2021).
15. Andriyanov, N.A.; Vasiliev, K.K. Use autoregressions with multiple roots of the characteristic equations to image representation and filtering. CEUR Workshop Proc. 2018, 2210, 273–281.
16. Andriyanov, N.A.; Vasiliev, K.K. Optimal filtering of multidimensional random fields generated by autoregressions with multiple roots of characteristic equations. CEUR Workshop Proc. 2019, 2391, 72–78.
17. Aizawa, K. Model-Based Image Coding: Advanced Video Coding Techniques for Very Low Bit-Rate Applications. Proc. IEEE 1995, 83, 259–271.
18. Chen, C.; Li, O.; Tao, D.; Barnett, A.; Rudin, C.; Su, J.K. This Looks Like That: Deep Learning for Interpretable Image Recognition. Adv. Neural Inf. Process. Syst. 2019, 32, 8930–8941.
19. Han, K.; Wen, H.; Zhang, Y.; Fu, D.; Culurciello, E.; Liu, Z. Deep Predictive Coding Network with Local Recurrent Processing for Object Recognition. Adv. Neural Inf. Process. Syst. 2018, 31, 9201–9213.
20. Srivastava, N.; Vul, E. A simple model of recognition and recall memory. Adv. Neural Inf. Process. Syst. 2017, 30, 293–301.
21. Deng, L.; Chu, H.-H.; Shi, P.; Wang, W.; Kong, X. Region-Based CNN Method with Deformable Modules for Visually Classifying Concrete Cracks. Appl. Sci. 2020, 10, 2528.
22. Jiang, J.-R.; Lee, J.-E.; Zeng, Y.-M. Time Series Multiple Channel Convolutional Neural Network with Attention-Based Long Short-Term Memory for Predicting Bearing Remaining Useful Life. Sensors 2020, 20, 166.
23. Andriyanov, N.A. Analysis of the Acceleration of Neural Networks Inference on Intel Processors Based on OpenVINO Toolkit. In Proceedings of the 2020 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO), Kaliningrad, Russia, 1–3 July 2020; pp. 1–6.
24. Kandel, I.; Castelli, M. How Deeply to Fine-Tune a Convolutional Neural Network: A Case Study Using a Histopathology Dataset. Appl. Sci. 2020, 10, 3359.
25. Shlezinger, N.; Eldar, Y.C. Deep Task-Based Quantization. Entropy 2021, 23, 104.
26. Hao-Ting, L.; Shih-Chieh, L.; Cheng-Yeh, C.; Chen-Kuo, C. Layer-Level Knowledge Distillation for Deep Neural Network Learning. Appl. Sci. 2019, 9, 1966.
27. Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.; Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and Flexible Image Augmentations. Information 2020, 11, 125.
28. Pei, Z.; Xu, H.; Zhang, Y.; Guo, M.; Yang, Y.-H. Face Recognition via Deep Learning Using Data Augmentation Based on Orthogonal Experiments. Electronics 2019, 8, 1088.
29. Lorente, M.P.S.; Lopez, E.M.; Florez, L.A.; Espino, A.L.; Martínez, J.A.I.; de Miguel, A.S. Explaining Deep Learning-Based Driver Models. Appl. Sci. 2021, 11, 3321.
30. Edwards, D.; Rawat, D.B. Study of Adversarial Machine Learning with Infrared Examples for Surveillance Applications. Electronics 2020, 9, 1284.
31. Andriyanov, N.A.; Volkov, A.K.; Volkov, A.K.; Gladkikh, A.A.; Danilov, S.D. Automatic X-ray image analysis for aviation security within limited computing resources. IOP Conf. Ser. Mater. Sci. Eng. 2020, 862, 1–6.
32. Gao, X.; Tan, Y.-A.; Jiang, H.; Zhang, Q.; Kuang, X. Boosting Targeted Black-Box Attacks via Ensemble Substitute Training and Linear Augmentation. Appl. Sci. 2019, 9, 2286.
33. Kwon, H.; Kim, Y.; Yoon, H.; Choi, D. Random Untargeted Adversarial Example on Deep Neural Network. Symmetry 2018, 10, 738.
34. Kwon, H.; Lee, J. Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks. Symmetry 2021, 13, 428.
35. Li, Y.; Wang, Y. Defense against Adversarial Attacks in Deep Learning. Appl. Sci. 2019, 9, 76.
36. Tao, P.; Feng, X.; Wen, C. Image Recognition Based on Two-Dimensional Principal Component Analysis Combining with Wavelet Theory and Frame Theory. J. Control Sci. Eng. 2018, 2018, 9061796.
37. Valverde-Albacete, F.J.; Peláez-Moreno, C. The Singular Value Decomposition over Completed Idempotent Semifields. Mathematics 2020, 8, 1577.
38. Andriyanov, N.; Andriyanov, D. Modeling and processing of SAR images. CEUR Workshop Proc. 2020, 2665, 89–92.
39. Vasil'ev, K.K.; Andriyanov, N.A. Image representation and processing using autoregressive random fields with multiple roots of characteristic equations. Intell. Syst. Ref. Libr. 2020, 175, 11–52.
40. Civera, M.; Zanotti Fragonara, L.; Surace, C. Using Video Processing for the Full-Field Identification of Backbone Curves in Case of Large Vibrations. Sensors 2019, 19, 2345.
41. Civera, M.; Fragonara, L.Z.; Surace, C. A Computer Vision-Based Approach for Non-contact Modal Analysis and Finite Element Model Updating. In European Workshop on Structural Health Monitoring; Springer: Cham, Switzerland, 2021; Volume 127, pp. 481–493.
42. The USC-SIPI Image Database. Available online: http://sipi.usc.edu/database/ (accessed on 28 April 2021).
43. Liu, F.; Seinstra, F.J. Adaptive Parallel Householder Bidiagonalization. In European Conference on Parallel Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 821–833.
44. Srinivasa, A.R. On the use of the upper triangular (or QR) decomposition for developing constitutive equations for Green-elastic materials. Int. J. Eng. Sci. 2012, 60, 1–12.
45. Cybenko, G. Reducing Quantum Computations to Elementary Unitary Operations. Comput. Sci. Eng. 2001, 3, 27–32.
46. Cakir, F.; He, K.; Xia, X.; Kulis, B.; Sclaroff, S. Deep Metric Learning to Rank. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 1–6.
47. Kussul, E.; Baidyk, T. Improved method of handwritten digit recognition tested on MNIST database. Image Vis. Comput. 2004, 22, 971–981.
48. Zhang, B.; Srihari, S.N. Fast k-Nearest Neighbor Classification Using Cluster-Based Trees. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 525–528.
49. Dogs vs. Cats Dataset. Available online: https://www.kaggle.com/c/dogs-vs-cats (accessed on 29 April 2021).
50. Tammina, S. Transfer learning using VGG-16 with Deep Convolutional Neural Network for Classifying Images. Int. J. Sci. Res. Publ. 2019, 9, 9420.
51. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J. Rethinking the Inception Architecture for Computer Vision. Available online: https://arxiv.org/pdf/1512.00567.pdf (accessed on 29 April 2021).

**Figure 3.** Images that have been visually attacked using different attack types: (**a**) Attack No. 1; (**b**) Attack No. 2; (**c**) Attack No. 3; and (**d**) Attack No. 4.

**Figure 5.** Handwritten digits from the MNIST dataset [47].

| Architecture | SNR = 0.2 | SNR = 0.5 | SNR = 1 | SNR = 1.5 | SNR = 2 |
|---|---|---|---|---|---|
| VGG-16 | 81.38 | 83.16 | 84.32 | 89.65 | 91.37 |
| Inception-v3 | 82.55 | 84.01 | 85.11 | 90.32 | 92.16 |
| CNN-5 | 62.22 | 65.96 | 67.02 | 71.37 | 78.12 |

| Architecture | $S_c/S_i = 0.0002$ | $S_c/S_i = 0.0007$ | $S_c/S_i = 0.0028$ | $S_c/S_i = 0.0054$ | $S_c/S_i = 0.01$ |
|---|---|---|---|---|---|
| VGG-16 | 89.65 | 88.32 | 85.11 | 83.07 | 80.08 |
| Inception-v3 | 89.98 | 89.02 | 86.49 | 84.22 | 80.65 |
| CNN-5 | 77.35 | 76.04 | 72.25 | 69.76 | 61.73 |

**Table 3.** Percentage of correct MNIST.DIGITS image recognitions without prevention of visual attacks.

| Architecture | No Attack | Attack No. 1 | Attacks No. 1–2 | Attacks No. 1–3 | Attacks No. 1–4 |
|---|---|---|---|---|---|
| VGG-16 | 72.83 | 79.64 | 82.35 | 88.65 | 94.67 |
| Inception-v3 | 74.60 | 80.32 | 83.18 | 90.32 | 95.84 |
| CNN-5 | 63.22 | 72.18 | 73.27 | 81.37 | 90.12 |

**Table 4.** Percentage of correct Kaggle.Dogs vs. Cats image recognitions without prevention of visual attacks.

| Architecture | No Attack | Attack No. 1 | Attacks No. 1–2 | Attacks No. 1–3 | Attacks No. 1–4 |
|---|---|---|---|---|---|
| VGG-16 | 66.91 | 73.65 | 79.78 | 86.63 | 90.08 |
| Inception-v3 | 70.32 | 78.54 | 82.16 | 88.82 | 92.32 |
| CNN-5 | 60.10 | 65.55 | 68.94 | 73.45 | 80.64 |

**Table 5.** Percentage of correct MNIST.DIGITS image recognitions with prevention of visual attacks.

| Architecture | No Attack | Attack No. 1 | Attacks No. 1–2 | Attacks No. 1–3 | Attacks No. 1–4 |
|---|---|---|---|---|---|
| VGG-16 | 72.83 | 81.12 | 83.63 | 90.08 | 97.22 |
| Inception-v3 | 74.60 | 82.54 | 84.90 | 92.64 | 98.55 |
| CNN-5 | 63.22 | 73.07 | 75.25 | 83.37 | 91.09 |

**Table 6.** Percentage of correct Kaggle.Dogs vs. Cats image recognitions with prevention of visual attacks.

| Architecture | No Attack | Attack No. 1 | Attacks No. 1–2 | Attacks No. 1–3 | Attacks No. 1–4 |
|---|---|---|---|---|---|
| VGG-16 | 66.91 | 75.12 | 81.80 | 87.92 | 91.74 |
| Inception-v3 | 70.32 | 79.24 | 83.74 | 90.11 | 94.82 |
| CNN-5 | 60.10 | 68.55 | 70.24 | 75.50 | 84.62 |

**Table 7.** Percentage of correct MNIST.DIGITS image recognitions with prevention of visual attacks and reduction of dimension.

| Architecture | No Attack | Attack No. 1 | Attacks No. 1–2 | Attacks No. 1–3 | Attacks No. 1–4 |
|---|---|---|---|---|---|
| VGG-16 | 72.07 | 81.16 | 83.33 | 90.02 | 96.58 |
| Inception-v3 | 74.31 | 82.73 | 84.76 | 92.78 | 98.02 |
| CNN-5 | 61.18 | 73.08 | 74.19 | 82.11 | 88.58 |

**Table 8.** Percentage of correct Kaggle.Dogs vs. Cats image recognitions with prevention of visual attacks and reduction of dimension.

| Architecture | No Attack | Attack No. 1 | Attacks No. 1–2 | Attacks No. 1–3 | Attacks No. 1–4 |
|---|---|---|---|---|---|
| VGG-16 | 66.94 | 72.11 | 80.95 | 87.96 | 91.76 |
| Inception-v3 | 70.55 | 77.03 | 83.34 | 89.55 | 94.62 |
| CNN-5 | 60.16 | 63.96 | 68.67 | 73.30 | 81.18 |

| Architecture | Accuracy |
|---|---|
| **MNIST** | |
| AlexNet | 71.28 |
| ResNet | 73.32 |
| Xception | 75.61 |
| Ours | 98.02 |
| **Kaggle.Dogs vs. Cats** | |
| AlexNet | 67.75 |
| ResNet | 71.90 |
| Xception | 71.82 |
| Ours | 94.62 |

| Dataset | 3 × 3 | 5 × 5 | 7 × 7 | 10 × 10 | 15 × 15 |
|---|---|---|---|---|---|
| MNIST | 89.96 | 91.90 | 92.95 | 94.52 | 97.71 |
| Dogs vs. Cats | 85.15 | 85.73 | 86.88 | 89.12 | 91.10 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Andriyanov, N. Methods for Preventing Visual Attacks in Convolutional Neural Networks Based on Data Discard and Dimensionality Reduction. *Appl. Sci.* **2021**, *11*, 5235.
https://doi.org/10.3390/app11115235

**AMA Style**

Andriyanov N. Methods for Preventing Visual Attacks in Convolutional Neural Networks Based on Data Discard and Dimensionality Reduction. *Applied Sciences*. 2021; 11(11):5235.
https://doi.org/10.3390/app11115235

**Chicago/Turabian Style**

Andriyanov, Nikita. 2021. "Methods for Preventing Visual Attacks in Convolutional Neural Networks Based on Data Discard and Dimensionality Reduction" *Applied Sciences* 11, no. 11: 5235.
https://doi.org/10.3390/app11115235