# Selective Poisoning Attack on Deep Neural Networks

## Abstract


## 1. Introduction

- We propose a selective poisoning attack method, systematically organize its framework, and describe the principle of the proposed scheme.
- We analyze the chosen-class accuracy as a function of the number of selective malicious data. We also analyze the iteration count, distortion, and accuracy for the selective malicious data.
- Through experiments using MNIST [12], Fashion-MNIST [13], and CIFAR10 [14], we demonstrate the effectiveness of the proposed scheme. We present image samples of the malicious data with the chosen class for each dataset, and the selected malicious data are difficult to detect through human perception.

## 2. Related Work

#### 2.1. Neural Networks

#### 2.2. Exploratory Attack

#### 2.3. Causative Attack

## 3. Proposed Scheme

#### 3.1. Threat Model

#### 3.2. Proposed Method

**Algorithm 1** Selective poisoning attack

Description:

- ${x}_{j}\in X$ with $N$ instances ▹ original training dataset
- ${x}_{i}^{\prime}\in {X}^{\prime}$ with ${N}^{\prime}$ instances ▹ maliciously manipulated training data
- $l$ ▹ number of iterations
- $t$ ▹ test data
- ${y}^{\prime}$ ▹ chosen class

Selective poisoning attack (${x}_{i}$, ${y}^{\prime}$, $l$, ${N}^{\prime}$):

- **for** $i = 1$ **to** ${N}^{\prime}$ **do**
  - Find ${x}_{i}$ with the chosen class ${y}^{\prime}$
  - ${x}_{i}^{\prime} \leftarrow$ Generate malicious instance (${x}_{i}$, ${y}^{\prime}$, $l$)
  - Assign ${x}_{i}^{\prime}$ to ${X}^{\prime}$
- **end for**
- Temporary training set ${X}_{T} \leftarrow X + {X}^{\prime}$
- Build the model $M$ by training on ${X}_{T}$
- Record its classification accuracy on the test dataset $t$
- **return** $M$

Generate malicious instance (${x}_{i}$, ${y}^{\prime}$, $l$):

- ${x}_{i}^{\prime} \leftarrow {x}_{i}$
- **for** $l$ steps **do**
  - $loss \leftarrow Z{\left({x}_{i}^{\prime}\right)}_{{y}^{\prime}} - \max\left\{ Z{\left({x}_{i}^{\prime}\right)}_{i} : i \ne {y}^{\prime} \right\}$
  - Update ${x}_{i}^{\prime}$ by descending the gradient of $loss$
- **end for**
- **return** ${x}_{i}^{\prime}$
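The Generate malicious instance routine can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: `logit_fn` (a function returning the pre-softmax logits $Z(x)$) is an assumed interface, and finite-difference gradients stand in for backpropagation so the example stays framework-free.

```python
import numpy as np

def selective_loss(logits, chosen):
    # loss = Z(x')_{y'} - max{ Z(x')_i : i != y' }, as in Algorithm 1
    others = np.delete(logits, chosen)
    return float(logits[chosen] - others.max())

def generate_malicious_instance(x, chosen, steps, logit_fn, lr=0.1, eps=1e-4):
    """Perturb x so the model's logit for the chosen class falls relative to
    the best competing logit. Gradients are estimated by central finite
    differences; a real implementation would backpropagate through the model."""
    x_adv = np.asarray(x, dtype=float).copy()
    for _ in range(steps):
        grad = np.zeros_like(x_adv)
        for j in range(x_adv.size):
            d = np.zeros_like(x_adv)
            d.flat[j] = eps
            grad.flat[j] = (selective_loss(logit_fn(x_adv + d), chosen)
                            - selective_loss(logit_fn(x_adv - d), chosen)) / (2 * eps)
        x_adv -= lr * grad  # descend the gradient of the loss
    return x_adv
```

Each resulting ${x}_{i}^{\prime}$ keeps its original label ${y}^{\prime}$ when added to ${X}^{\prime}$, which is what degrades the chosen class after retraining.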

## 4. Experiment and Evaluation

#### 4.1. Datasets

#### 4.2. Pretraining of Models

#### 4.3. Generation of Malicious Training Data

#### 4.4. Experimental Results

## 5. Discussion

**Attack considerations.** From the model's perspective, the accuracy of the chosen class varies with the accuracy of the model itself, because the effect of the poisoning attack depends on the classification results of the existing model. Moreover, the attacker must also consider the number of malicious data items, because the accuracy of a particular class depends on that number.

**Applications.** The proposed method can be applied to sensor systems. Such sensors, based on the Internet of Things (IoT), report numerical measurements of the external environment or images, such as CCTV footage. By applying a poisoning attack to such a sensor system, the performance of a specific part can be degraded. The proposed method can similarly be used in military applications and face recognition systems: if an attacker must prevent a particular class from being recognized correctly, the proposed method can lower the accuracy of that class without compromising the overall accuracy.

**Dataset.** The chosen-class accuracy, iteration count, and distortion of the proposed method differ across the datasets used (MNIST, Fashion-MNIST, and CIFAR10). CIFAR10 is a three-channel color image dataset with 3072 pixel values per image (32 × 32 × 3), whereas MNIST and Fashion-MNIST are single-channel image datasets with 784 pixel values per image (28 × 28 × 1). Because its images contain substantially more pixel values, CIFAR10 requires more iterations and incurs more distortion than MNIST and Fashion-MNIST.
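Distortion can be measured, for instance, as the Euclidean (L2) distance between the original and manipulated images. The L2 choice here is an assumption for illustration; the paper's exact metric may differ.

```python
import numpy as np

def distortion(x, x_adv):
    """Illustrative distortion measure: L2 distance between the original
    and the manipulated image, flattened to vectors."""
    a = np.asarray(x, dtype=float).ravel()
    b = np.asarray(x_adv, dtype=float).ravel()
    return float(np.linalg.norm(a - b))

# A CIFAR10-shaped example: a 32 x 32 x 3 image perturbed in one pixel channel
x = np.zeros((32, 32, 3))
x_adv = x.copy()
x_adv[0, 0, 0] = 3.0
print(distortion(x, x_adv))  # 3.0
```

Because CIFAR10 images flatten to 3072 values against 784 for MNIST, the same per-pixel perturbation budget accumulates into larger total distortion.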

**Defense.** To defend against a poisoning attack, the training data must be managed carefully when training the model. Because the malicious data generated by the proposed method are similar to the original samples, they are difficult to identify with the human eye. Therefore, the integrity of the training data must be checked by comparing the number of samples and a hash value. In addition, the model's parameters should be frozen after training is completed so that no additional learning can be performed on the model.
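One concrete form of such an integrity check, sketched here as an assumption rather than the authors' procedure, is to record the sample count and a cryptographic hash of the training set at training time and to verify both before any retraining:

```python
import hashlib

def dataset_fingerprint(samples):
    """Return (count, SHA-256 digest) of the training data so that later
    tampering -- added, removed, or modified samples -- is detectable."""
    h = hashlib.sha256()
    for s in samples:
        h.update(bytes(s))
    return len(samples), h.hexdigest()

# At training time, store the fingerprint of the clean data...
clean = [b"img0", b"img1", b"img2"]
fingerprint = dataset_fingerprint(clean)

# ...and before any retraining, recompute and compare.
poisoned = clean + [b"malicious"]
tampered = dataset_fingerprint(poisoned) != fingerprint  # True
```

Comparing the count alone catches injected samples, while the digest also catches in-place modifications that keep the count unchanged.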

**Comparison with adversarial examples.** Adversarial examples and poisoning attacks differ in what they target. An adversarial example modifies the test data, whereas a poisoning attack modifies the model parameters through malicious training data. For example, in a face recognition system, an adversarial example manipulates a certain person's face image to deceive the deployed model, whereas a poisoning attack tampers with the model's training data in advance to lower the recognition accuracy for a specific person.

## 6. Conclusions

## Author Contributions

## Acknowledgments

## Conflicts of Interest

## Appendix A

| Layer Type | Shape |
|---|---|
| Convolutional+ReLU | [3, 3, 32] |
| Convolutional+ReLU | [3, 3, 32] |
| Max pooling | [2, 2] |
| Convolutional+ReLU | [3, 3, 64] |
| Convolutional+ReLU | [3, 3, 64] |
| Max pooling | [2, 2] |
| Fully connected+ReLU | [200] |
| Fully connected+ReLU | [200] |
| Softmax | [10] |

| Parameter | MNIST and Fashion-MNIST | CIFAR10 |
|---|---|---|
| Learning rate | 0.1 | 0.001 |
| Momentum | 0.9 | 0.9 |
| Batch size | 128 | 128 |
| Epochs | 50 | 50 |
| Dropout/Delay rate | - | 0.5/10 |

| Layer Type | CIFAR10 Shape |
|---|---|
| Convolution+ReLU | [3, 3, 64] |
| Convolution+ReLU | [3, 3, 64] |
| Max pooling | [2, 2] |
| Convolution+ReLU | [3, 3, 128] |
| Convolution+ReLU | [3, 3, 128] |
| Max pooling | [2, 2] |
| Convolution+ReLU | [3, 3, 256] |
| Convolution+ReLU | [3, 3, 256] |
| Convolution+ReLU | [3, 3, 256] |
| Convolution+ReLU | [3, 3, 256] |
| Max pooling | [2, 2] |
| Convolution+ReLU | [3, 3, 512] |
| Convolution+ReLU | [3, 3, 512] |
| Convolution+ReLU | [3, 3, 512] |
| Convolution+ReLU | [3, 3, 512] |
| Max pooling | [2, 2] |
| Convolution+ReLU | [3, 3, 512] |
| Convolution+ReLU | [3, 3, 512] |
| Convolution+ReLU | [3, 3, 512] |
| Convolution+ReLU | [3, 3, 512] |
| Max pooling | [2, 2] |
| Fully connected+ReLU | [4096] |
| Fully connected+ReLU | [4096] |
| Softmax | [10] |

## References

- Schmidhuber, J. Deep learning in neural networks: An overview. *Neural Netw.* **2015**, 61, 85–117.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
- Hinton, G.; Deng, L.; Yu, D.; Dahl, G.E.; Mohamed, A.R.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.N.; et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. *IEEE Signal Process. Mag.* **2012**, 29, 82–97.
- Collobert, R.; Weston, J. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; ACM: New York, NY, USA; pp. 160–167.
- Sun, Y.; Xue, B.; Zhang, M.; Yen, G.G. Automatically designing CNN architectures using genetic algorithm for image classification. *arXiv* **2018**, arXiv:1808.03818.
- Banharnsakun, A. Artificial bee colony algorithm for enhancing image edge detection. *Evolv. Syst.* **2018**, 1–9.
- Deng, L.; Wang, Y.; Han, Z.; Yu, R. Research on insect pest image detection and recognition based on bio-inspired methods. *Biosyst. Eng.* **2018**, 139–148.
- Barreno, M.; Nelson, B.; Joseph, A.D.; Tygar, J. The security of machine learning. *Mach. Learn.* **2010**, 81, 121–148.
- Biggio, B.; Nelson, B.; Laskov, P. Poisoning attacks against support vector machines. In Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, 27 June–3 July 2012; Omnipress: Madison, WI, USA, 2012; pp. 1467–1474.
- McDaniel, P.; Papernot, N.; Celik, Z.B. Machine learning in adversarial settings. *IEEE Secur. Priv.* **2016**, 14, 68–72.
- Kwon, H.; Yoon, H.; Park, K.W. Poisoning attack on deep neural network to induce fine-grained recognition error. In Proceedings of the IEEE International Conference on Artificial Intelligence and Knowledge Engineering, Cagliari, Italy, 5–7 June 2019. Available online: http://hdl.handle.net/10203/262522 (accessed on 6 July 2019).
- LeCun, Y.; Cortes, C.; Burges, C.J. MNIST Handwritten Digit Database. *AT&T Labs* **2010**, 2. Available online: http://yann.lecun.com/exdb/mnist (accessed on 1 July 2019).
- Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. *arXiv* **2017**, arXiv:1708.07747.
- Krizhevsky, A.; Nair, V.; Hinton, G. The CIFAR-10 Dataset. 2014. Available online: http://www.cs.toronto.edu/kriz/cifar.html (accessed on 1 July 2019).
- Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. In Proceedings of the International Conference on Learning Representations 2014, Banff, AB, Canada, 14–16 April 2014.
- Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 39–57.
- Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z.B.; Swami, A. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, Abu Dhabi, United Arab Emirates, 2–6 April 2017; ACM: New York, NY, USA, 2017; pp. 506–519.
- Liu, Y.; Chen, X.; Liu, C.; Song, D. Delving into transferable adversarial examples and black-box attacks. *arXiv* **2017**, arXiv:1611.02770.
- Moosavi-Dezfooli, S.M.; Fawzi, A.; Fawzi, O.; Frossard, P. Universal adversarial perturbations. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
- Meng, D.; Chen, H. MagNet: A two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; ACM: New York, NY, USA, 2017; pp. 135–147.
- Shen, S.; Jin, G.; Gao, K.; Zhang, Y. APE-GAN: Adversarial perturbation elimination with GAN. *arXiv* **2017**, arXiv:1707.05474.
- Papernot, N.; McDaniel, P.; Wu, X.; Jha, S.; Swami, A. Distillation as a defense to adversarial perturbations against deep neural networks. In Proceedings of the 2016 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 582–597.
- Carlini, N.; Mishra, P.; Vaidya, T.; Zhang, Y.; Sherr, M.; Shields, C.; Wagner, D.; Zhou, W. Hidden voice commands. In Proceedings of the 25th USENIX Security Symposium, Austin, TX, USA, 10–12 August 2016; pp. 513–530.
- Carlini, N.; Wagner, D. Audio adversarial examples: Targeted attacks on speech-to-text. In Proceedings of the 2018 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, USA, 24 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–7.
- Li, S.; Neupane, A.; Paul, S.; Song, C.; Krishnamurthy, S.V.; Chowdhury, A.K.R.; Swami, A. Adversarial perturbations against real-time video classification systems. *arXiv* **2018**, arXiv:1807.00458.
- Yang, C.; Wu, Q.; Li, H.; Chen, Y. Generative poisoning attack method against neural networks. *arXiv* **2017**, arXiv:1703.01340.
- Mozaffari-Kermani, M.; Sur-Kolay, S.; Raghunathan, A.; Jha, N.K. Systematic poisoning attacks on and defenses for machine learning in healthcare. *IEEE J. Biomed. Health Inf.* **2015**, 19, 1893–1905.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680.
- Zoph, B.; Vasudevan, V.; Shlens, J.; Le, Q.V. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8697–8710.
- Ding, C.; Tao, D. Trunk-branch ensemble convolutional neural networks for video-based face recognition. *IEEE Trans. Pattern Anal. Mach. Intell.* **2018**, 40, 1002–1014.
- Farag, W.; Saleh, Z. Traffic signs identification by deep learning for autonomous driving. *IET* **2018**.
- Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z.B.; Swami, A. The limitations of deep learning in adversarial settings. In Proceedings of the 2016 IEEE European Symposium on Security and Privacy (EuroS&P), Saarbrucken, Germany, 21–24 March 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 372–387.
- Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. *OSDI* **2016**, 16, 265–283.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. *Proc. IEEE* **1998**, 86, 2278–2324.
- Kingma, D.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.

**Figure 2.** Chosen class accuracy of the model M according to the number of selective malicious data.

**Table 3.** Samples of selective poisoning examples for each chosen class in CIFAR10; each row shows selective poisoning examples of the same class.

**Table 4.** The iteration count, average distortion, total accuracy, and chosen-class accuracy of M when the number of selective malicious data is 2500.

| Description | MNIST | Fashion-MNIST | CIFAR10 |
|---|---|---|---|
| Iteration | 400 | 400 | 6000 |
| Average distortion | 3.56 | 2.58 | 67.24 |
| Total accuracy | 89.7% | 81.2% | 80.9% |
| Accuracy of chosen class | 43.2% | 41.7% | 55.3% |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kwon, H.; Yoon, H.; Park, K.-W. Selective Poisoning Attack on Deep Neural Networks. *Symmetry* **2019**, *11*, 892. https://doi.org/10.3390/sym11070892
