Differentially Private Image Classification Using Support Vector Machine and Differential Privacy
Abstract
1. Introduction
- k-anonymity: this approach was proposed by Sweeney in 2002 [6]. A dataset is said to be k-anonymous if every combination of identity-revealing characteristics occurs in at least k different rows of the dataset. This anonymization approach is vulnerable to attacks such as background-knowledge attacks [7].
- l-diversity: this scheme was proposed by Machanavajjhala et al. [7] to address the weaknesses of k-anonymity. It requires that each group of rows sharing the same quasi-identifiers contain at least l "well-represented" values of the sensitive attribute [7,8].
- t-closeness: this anonymization scheme was proposed by Li et al. in 2007 [9]. It refines the l-diversity scheme discussed above [8] by requiring that the distribution of sensitive attributes within each quasi-identifier group be "close" to their distribution in the entire original dataset; that is, the distance between the two distributions should be no more than a threshold t [9].
- Differential Privacy (DP): this notion was proposed by Dwork et al. in 2006 [10]. Unlike the anonymization schemes discussed above, DP provides an information-theoretic guarantee that the participation of any individual in a statistical database will not be revealed. It has since become a gold standard of privacy-preserving data analysis; a minimal sketch of its canonical Laplace mechanism is given after this list.
- Supervised learning: the model learns from data together with the corresponding labels. Supervised learning algorithms can be divided into classification and regression algorithms.
- Unsupervised learning: in this paradigm, the data have no corresponding labels. The objective is to learn the distribution that represents the data, which is why unsupervised learning models are also known as generative models.
- Reinforcement learning: this machine learning paradigm involves an agent that interacts with an environment and receives either a reward or a penalty for the action it takes in a particular state. The ultimate purpose is for the agent to maximize its cumulative reward.
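To make the DP guarantee referenced above concrete: a randomized mechanism M is ε-differentially private if, for any two datasets D and D′ that differ in a single record and any set S of outputs, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] [10,17]. For a numeric query f, this is typically achieved with the Laplace mechanism, which releases f(D) plus noise drawn from Lap(Δf/ε), where Δf is the sensitivity of f [10]. The sketch below is a minimal illustration of this mechanism; the function name and parameter values are ours, not the paper's.

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Release an epsilon-differentially private answer to a numeric query.

    Noise is drawn from a Laplace distribution with scale
    sensitivity / epsilon, as in Dwork et al. [10].
    """
    scale = sensitivity / epsilon
    return true_answer + np.random.laplace(loc=0.0, scale=scale)

# Example: a counting query ("how many records satisfy P?") has
# sensitivity 1, since adding or removing one record changes the
# count by at most 1.
noisy_count = laplace_mechanism(true_answer=42, sensitivity=1.0, epsilon=0.5)
```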
2. Background Information
2.1. Support Vector Machine
2.2. Differential Privacy
2.3. Privacy-Preserving Machine Learning
3. Privacy-Preserving Image Classification Algorithm
Algorithm 1 Privacy-preserving image classification
Start
1: Load the image dataset
2: Split the dataset into train and test data
3: Add Laplace noise to the train data to privatize it
4: Train the model using a Support Vector Machine (SVM) on the noised data
5: Test the model using the un-noised test data
End
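The following is a minimal Python sketch of Algorithm 1, assuming scikit-learn and NumPy. It uses scikit-learn's bundled digits dataset as a lightweight stand-in for MNIST [15]; the sensitivity value and variable names are illustrative assumptions, and ε = ln 2 is used as the default budget because the results below report DP-SVM accuracies at that value.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

EPSILON = np.log(2)   # privacy budget; results below are reported at ln 2
SENSITIVITY = 1.0     # assumed per-pixel sensitivity after scaling to [0, 1]

# Steps 1-2: load the image dataset and split it into train and test data.
X, y = load_digits(return_X_y=True)
X = X / X.max()  # scale pixel intensities to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Step 3: privatize the training images with Laplace noise of scale Δf/ε.
X_train_noised = X_train + np.random.laplace(
    0.0, SENSITIVITY / EPSILON, X_train.shape)

# Step 4: train SVMs (linear and RBF kernels) on the noised data.
models = {kernel: SVC(kernel=kernel).fit(X_train_noised, y_train)
          for kernel in ("linear", "rbf")}

# Step 5: test each model on the original, un-noised test data.
for kernel, model in models.items():
    print(f"{kernel} kernel accuracy: {model.score(X_test, y_test):.3f}")
```

Note that only the training data are privatized; in keeping with step 5, the models are evaluated on the original test data.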
4. Results and Discussion
5. Conclusions
Funding
Acknowledgments
Conflicts of Interest
References
1. Chaudhuri, K.; Monteleoni, C.; Sarwate, A.D. Differentially private empirical risk minimization. J. Mach. Learn. Res. 2011, 12, 1069–1109.
2. Agrawal, R.; Srikant, R. Privacy-preserving data mining. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 15–18 May 2000; Volume 29.
3. Du, W.; Zhan, Z. Using randomized response techniques for privacy-preserving data mining. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 24–27 August 2003; pp. 505–510.
4. Xu, K.; Yue, H.; Guo, L.; Guo, Y.; Fang, Y. Privacy-preserving machine learning algorithms for big data systems. In Proceedings of the 2015 IEEE 35th International Conference on Distributed Computing Systems, Columbus, OH, USA, 29 June–2 July 2015; pp. 318–327.
5. Shokri, R.; Shmatikov, V. Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 1310–1321.
6. Sweeney, L. k-anonymity: A model for protecting privacy. Int. J. Uncertain. Fuzz. Knowl. Based Syst. 2002, 10, 557–570.
7. Machanavajjhala, A.; Kifer, D.; Gehrke, J.; Venkitasubramaniam, M. l-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 2007, 1.
8. Aggarwal, C.C.; Philip, S.Y. A general survey of privacy-preserving data mining models and algorithms. In Privacy-Preserving Data Mining; Springer: Boston, MA, USA, 2008; pp. 11–52.
9. Li, N.; Li, T.; Venkatasubramanian, S. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering, Istanbul, Turkey, 15–20 April 2007; pp. 106–115.
10. Dwork, C.; McSherry, F.; Nissim, K.; Smith, A. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Theory of Cryptography Conference, New York, NY, USA, 4–7 March 2006; pp. 265–284.
11. Michalski, R.S.; Carbonell, J.G.; Mitchell, T.M. Machine Learning: An Artificial Intelligence Approach; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
12. Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson Education Limited: Kuala Lumpur, Malaysia, 2016.
13. Scholkopf, B.; Smola, A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond; MIT Press: Cambridge, MA, USA, 2001.
14. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Volume 1.
15. LeCun, Y.; Cortes, C. MNIST Handwritten Digit Database. 2010. Available online: http://yann.lecun.com/exdb/mnist/ (accessed on 29 December 2018).
16. Dwork, C.; Kenthapadi, K.; McSherry, F.; Mironov, I.; Naor, M. Our data, ourselves: Privacy via distributed noise generation. In Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, 28 May–1 June 2006; pp. 486–503.
17. Dwork, C.; Roth, A. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 2014, 9, 211–407.
18. Dwork, C. Differential privacy: A survey of results. In Proceedings of the International Conference on Theory and Applications of Models of Computation, Xi'an, China, 25–29 April 2008; pp. 1–19.
19. Zhu, T.; Li, G.; Zhou, W.; Yu, P.S. Differential Privacy and Applications. In Advances in Information Security; Springer: Berlin/Heidelberg, Germany, 2017; Volume 69.
20. Attoh-Okine, N.O. Big Data and Differential Privacy: Analysis Strategies for Railway Track Engineering; John Wiley & Sons: New York, NY, USA, 2017.
21. Chaudhuri, K.; Hsu, D. Sample complexity bounds for differentially private learning. In Proceedings of the 24th Annual Conference on Learning Theory, Budapest, Hungary, 9–11 July 2011; pp. 155–186.
22. Ji, Z.; Lipton, Z.C.; Elkan, C. Differential privacy and machine learning: A survey and review. arXiv 2014, arXiv:1412.7584.
23. Erlingsson, Ú.; Pihur, V.; Korolova, A. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, 3–7 November 2014; pp. 1054–1067.
24. Apple. Learning with privacy at scale. Appl. Mach. Learn. J. 2017, 1, 1–25.
25. Senekane, M.; Mafu, M.; Taele, B.M. Privacy-preserving quantum machine learning using differential privacy. In Proceedings of the 2017 IEEE AFRICON, Cape Town, South Africa, 18–20 September 2017; pp. 1432–1435.
26. Johnson, N.; Near, J.P.; Song, D. Towards practical differential privacy for SQL queries. Proc. VLDB Endow. 2018, 11, 526–539.
27. Choi, W.S.; Tomei, M.; Vicarte, J.R.S.; Hanumolu, P.K.; Kumar, R. Guaranteeing local differential privacy on ultra-low-power systems. In Proceedings of the 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, 1–6 June 2018; pp. 561–574.
28. Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 308–318.
29. Chaudhuri, K.; Monteleoni, C. Privacy-preserving logistic regression. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2009; pp. 289–296.
30. Rubinstein, B.I.; Bartlett, P.L.; Huang, L.; Taft, N. Learning in a large function space: Privacy-preserving mechanisms for SVM learning. J. Priv. Confid. 2012, 4, 65–100.
| SVM | Linear Kernel Accuracy (%) | RBF Kernel Accuracy (%) |
|---|---|---|
| Pure SVM | 97.8 | 98.6 |
| DP-SVM | 97.2 | 98.3 |
[Confusion matrices of the pure SVM and the DP-SVM for the linear and RBF kernels; the matrix figures are omitted.]
| ε (Privacy Budget) | Linear SVM Accuracy (%) | RBF SVM Accuracy (%) |
|---|---|---|
| Pure SVM (no noise) | 98.1 | 98.6 |
| 0.01 | 1.57 | 7.78 |
| 0.1 | 37.6 | 43.9 |
| ln 2 | 97.2 | 98.3 |
| 5 | 97.8 | 98.5 |
| 8 | 98.1 | 98.6 |
| 10 | 98.1 | 98.6 |
© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).