Auguring Fake Face Images Using Dual Input Convolution Neural Network
Abstract
1. Introduction
- A dual-branch CNN architecture is proposed that widens the network's view of the input, yielding stronger performance in auguring fake faces.
- The study probes the black-box behaviour of the DICNN model with SHAP, using Shapley values to construct explanation-driven findings (a minimal usage sketch follows this list).
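The SHAP workflow referenced above can be illustrated with a short sketch. This is a minimal, hypothetical example, not the authors' code: the handles `model` (the trained DICNN), `x_bg` (a small background batch of training images), and `x_test` (the images to explain) are illustrative names, and the exact nesting of the returned attributions varies across `shap` versions.

```python
# Minimal SHAP sketch for a two-input Keras classifier (hypothetical names).
import shap

# One background array per model input; here the same image batch is
# assumed to feed both DICNN branches.
explainer = shap.GradientExplainer(model, [x_bg, x_bg])

# Estimated Shapley values, nested per output class and per input branch
# (the exact nesting depends on the shap version in use).
shap_values = explainer.shap_values([x_test, x_test])

# Overlay attributions for the first class and first branch on the images:
# red regions push the prediction toward that class, blue regions away.
shap.image_plot(shap_values[0][0], x_test)
```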
2. Related Works
2.1. Deep Learning-Based Methods
2.2. Physical-Based Methods
2.3. Human Visual Performance
3. Materials and Methods
3.1. Data Collection and Pre-Processing
3.2. Proposed Method
3.2.1. Dual Input CNN Model
3.2.2. Explainable AI
3.3. Implementation
4. Results and Discussion
4.1. Model Explanation with DICNN
4.2. Model Explanation Using SHAP
4.3. Class-Wise Study of Proposed CNN Model
4.4. Comparison with the State-of-the-Art Methods
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Gaur, L.; Mallik, S.; Jhanjhi, N.Z. Introduction to DeepFake Technologies. In Proceedings of the DeepFakes: Creation, Detection, and Impact, New York, NY, USA, 8 September 2022.
- Vairamani, A.D. Analyzing DeepFakes Videos by Face Warping Artifacts. In Proceedings of the DeepFakes: Creation, Detection, and Impact, New York, NY, USA, 8 September 2022.
- Guo, M.H.; Xu, T.X.; Liu, J.J.; Liu, Z.N.; Jiang, P.T.; Mu, T.J.; Zhang, S.H.; Martin, R.R.; Cheng, M.M.; Hu, S.M. Attention mechanisms in computer vision: A survey. Comput. Vis. Media 2022, 8, 331–368.
- Shahi, T.B.; Sitaula, C. Natural language processing for Nepali text: A review. Artif. Intell. Rev. 2021, 55, 3401–3429.
- Sitaula, C.; Shahi, T.B. Monkeypox virus detection using pre-trained deep learning-based approaches. J. Med. Syst. 2022, 46, 1–9.
- Gaur, L.; Sahoo, B.M. Introduction to Explainable AI and Intelligent Transportation. In Explainable Artificial Intelligence for Intelligent Transportation Systems: Ethics and Applications; Springer International Publishing: Cham, Switzerland, 2022; pp. 1–25.
- Bhandari, M.; Panday, S.; Bhatta, C.P.; Panday, S.P. Image Steganography Approach Based Ant Colony Optimization with Triangular Chaotic Map. In Proceedings of the 2022 2nd International Conference on Innovative Practices in Technology and Management (ICIPTM), Gautam Buddha Nagar, India, 23–25 February 2022; Volume 2, pp. 429–434.
- Wang, D.; Arzhaeva, Y.; Devnath, L.; Qiao, M.; Amirgholipour, S.; Liao, Q.; McBean, R.; Hillhouse, J.; Luo, S.; Meredith, D.; et al. Automated Pneumoconiosis Detection on Chest X-Rays Using Cascaded Learning with Real and Synthetic Radiographs. In Proceedings of the 2020 Digital Image Computing: Techniques and Applications (DICTA), Melbourne, Australia, 29 November–2 December 2020; pp. 1–6.
- Tran, L.; Yin, X.; Liu, X. Representation Learning by Rotating Your Faces. arXiv 2017, arXiv:1705.11136.
- Suwajanakorn, S.; Seitz, S.M.; Kemelmacher-Shlizerman, I. Synthesizing Obama: Learning lip sync from audio. ACM Trans. Graph. 2017, 36, 1–13.
- Thies, J.; Zollhofer, M.; Stamminger, M.; Theobalt, C.; Nießner, M. Face2Face: Real-time face capture and reenactment of RGB videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2387–2395.
- Dang, H.; Liu, F.; Stehouwer, J.; Liu, X.; Jain, A.K. On the detection of digital face manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2020; pp. 5781–5790.
- Rossler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. FaceForensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1–11.
- Tolosana, R.; Vera-Rodriguez, R.; Fierrez, J.; Morales, A.; Ortega-Garcia, J. Deepfakes and beyond: A survey of face manipulation and fake detection. Inf. Fusion 2020, 64, 131–148.
- Li, S.; Dutta, V.; He, X.; Matsumaru, T. Deep Learning Based One-Class Detection System for Fake Faces Generated by GAN Network. Sensors 2022, 22, 7767.
- Wong, A.D. BLADERUNNER: Rapid Countermeasure for Synthetic (AI-Generated) StyleGAN Faces. arXiv 2022, arXiv:2210.06587.
- Zotov, E. StyleGAN-Based Machining Digital Twin for Smart Manufacturing. Ph.D. Thesis, University of Sheffield, Sheffield, UK, 2022.
- Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. arXiv 2018, arXiv:1812.04948.
- Fu, J.; Li, S.; Jiang, Y.; Lin, K.Y.; Qian, C.; Loy, C.C.; Wu, W.; Liu, Z. StyleGAN-Human: A data-centric odyssey of human generation. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin, Germany; pp. 1–19.
- Xu, Y.; Raja, K.; Pedersen, M. Supervised Contrastive Learning for Generalizable and Explainable DeepFakes Detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, Waikoloa, HI, USA, 4–8 January 2022; pp. 379–389.
- Fu, Y.; Sun, T.; Jiang, X.; Xu, K.; He, P. Robust GAN-Face Detection Based on Dual-Channel CNN Network. In Proceedings of the 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Suzhou, China, 19–21 October 2019; pp. 1–5.
- Salman, F.M.; Abu-Naser, S.S. Classification of Real and Fake Human Faces Using Deep Learning. Int. J. Acad. Eng. Res. (IJAER) 2022, 6, 1–14.
- Zhang, Y.; Zheng, L.; Thing, V.L.L. Automated face swapping and its detection. In Proceedings of the 2017 IEEE 2nd International Conference on Signal and Image Processing (ICSIP), Singapore, 4–6 August 2017; pp. 15–19.
- Huang, G.B.; Ramesh, M.; Berg, T.; Learned-Miller, E. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments; Technical Report 07-49; University of Massachusetts: Amherst, MA, USA, 2007.
- Guo, Z.; Hu, L.; Xia, M.; Yang, G. Blind detection of glow-based facial forgery. Multimed. Tools Appl. 2021, 80, 7687–7710.
- Kingma, D.P.; Dhariwal, P. Glow: Generative flow with invertible 1x1 convolutions. Adv. Neural Inf. Process. Syst. 2018, 31.
- Durall, R.; Keuper, M.; Pfreundt, F.J.; Keuper, J. Unmasking deepfakes with simple features. arXiv 2019, arXiv:1911.00686.
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv 2017, arXiv:1710.10196.
- Gandhi, A.; Jain, S. Adversarial perturbations fool deepfake detectors. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
- Yousaf, B.; Usama, M.; Sultani, W.; Mahmood, A.; Qadir, J. Fake visual content detection using two-stream convolutional neural networks. Neural Comput. Appl. 2022, 34, 7991–8004.
- Bhandari, M.; Parajuli, P.; Chapagain, P.; Gaur, L. Evaluating Performance of Adam Optimization by Proposing Energy Index. In Proceedings of the Recent Trends in Image Processing and Pattern Recognition, University of Malta, Msida, Malta, 8–10 December 2021; Santosh, K., Hegadi, R., Pal, U., Eds.; Springer International Publishing: Cham, Switzerland; pp. 156–168.
- Hu, S.; Li, Y.; Lyu, S. Exposing GAN-Generated Faces Using Inconsistent Corneal Specular Highlights. In Proceedings of the ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 2500–2504.
- Nightingale, S.; Agarwal, S.; Härkönen, E.; Lehtinen, J.; Farid, H. Synthetic faces: How perceptually convincing are they? J. Vis. 2021, 21, 2015.
- Boulahia, H. Small Dataset of Real and Fake Human Faces for Model Testing. Kaggle 2022.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. Handwritten digit recognition with a back-propagation network. Adv. Neural Inf. Process. Syst. 1989, 2.
- Sun, Y.; Zhu, L.; Wang, G.; Zhao, F. Multi-input convolutional neural network for flower grading. J. Electr. Comput. Eng. 2017, 2017.
- Dua, N.; Singh, S.N.; Semwal, V.B. Multi-input CNN-GRU based human activity recognition using wearable sensors. Computing 2021, 103, 1461–1478.
- Choi, J.; Cho, Y.; Lee, S.; Lee, J.; Lee, S.; Choi, Y.; Cheon, J.E.; Ha, J. Using a Dual-Input Convolutional Neural Network for Automated Detection of Pediatric Supracondylar Fracture on Conventional Radiography. Investig. Radiol. 2019, 55, 1.
- Jiang, P.; Wen, C.K.; Jin, S.; Li, G.Y. Dual CNN-Based Channel Estimation for MIMO-OFDM Systems. IEEE Trans. Commun. 2021, 69, 5859–5872.
- Naglah, A.; Khalifa, F.; Khaled, R.; Razek, A.A.K.A.; El-Baz, A. Thyroid Cancer Computer-Aided Diagnosis System using MRI-Based Multi-Input CNN Model. In Proceedings of the 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), Nice, France, 13–16 April 2021; pp. 1691–1694.
- Gaur, L.; Bhandari, M.; Shikhar, B.S.; Jhanjhi, N.Z.; Shorfuzzaman, M.; Masud, M. Explanation-Driven HCI Model to Examine the Mini-Mental State for Alzheimer’s Disease. ACM Trans. Multimed. Comput. Commun. Appl. 2022.
- Gaur, L.; Bhandari, M.; Razdan, T.; Mallik, S.; Zhao, Z. Explanation-Driven Deep Learning Model for Prediction of Brain Tumour Status Using MRI Image Data. Front. Genet. 2022, 13.
- Bhandari, M.; Shahi, T.B.; Siku, B.; Neupane, A. Explanatory classification of CXR images into COVID-19, Pneumonia and Tuberculosis using deep learning and XAI. Comput. Biol. Med. 2022, 150, 106156.
- Bachmaier Winter, L. Criminal Investigation, Technological Development, and Digital Tools: Where Are We Heading? In Investigating and Preventing Crime in the Digital Era; Springer: Berlin, Germany, 2022; pp. 3–17.
- Ferreira, J.J.; Monteiro, M. The human-AI relationship in decision-making: AI explanation to support people on justifying their decisions. arXiv 2021, arXiv:2102.05460.
- Hall, S.W.; Sakzad, A.; Choo, K.K.R. Explainable artificial intelligence for digital forensics. Wiley Interdiscip. Rev. Forensic Sci. 2022, 4, e1434.
- Veldhuis, M.S.; Ariëns, S.; Ypma, R.J.; Abeel, T.; Benschop, C.C. Explainable artificial intelligence in forensics: Realistic explanations for number of contributor predictions of DNA profiles. Forensic Sci. Int. Genet. 2022, 56, 102632.
- Edwards, T.; McCullough, S.; Nassar, M.; Baggili, I. On Exploring the Sub-domain of Artificial Intelligence (AI) Model Forensics. In Proceedings of the International Conference on Digital Forensics and Cyber Crime, Virtual Event, Singapore, 6–9 December 2021; pp. 35–51.
- Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: London, UK, 2017; pp. 4765–4774.
- Van Rossum, G.; Drake, F.L. Python 3 Reference Manual; CreateSpace: Scotts Valley, CA, USA, 2009.
- Gulli, A.; Pal, S. Deep Learning with Keras; Packt Publishing Ltd.: Birmingham, UK, 2017.
- Carneiro, T.; Medeiros Da Nóbrega, R.V.; Nepomuceno, T.; Bian, G.B.; De Albuquerque, V.H.C.; Filho, P.P.R. Performance Analysis of Google Colaboratory as a Tool for Accelerating Deep Learning Applications. IEEE Access 2018, 6, 61677–61685.
- Rettberg, J.W.; Kronman, L.; Solberg, R.; Gunderson, M.; Bjørklund, S.M.; Stokkedal, L.H.; Jacob, K.; de Seta, G.; Markham, A. Representations of machine vision technologies in artworks, games and narratives: A dataset. Data Brief 2022, 42, 108319.
Layer Name | Output Shape | Param # | Connected to |
---|---|---|---|
Input 1 | (None, 224, 224, 3) | 0 | – |
Input 2 | (None, 224, 224, 3) | 0 | – |
Conv2D | (None, 222, 222, 32) | 896 | Input 1 |
Flatten 1 | (None, 150,528) | 0 | Input 2 |
Flatten 2 | (None, 1,577,088) | 0 | Conv2D |
Concatenate Layer | (None, 1,727,616) | 0 | [Flatten 1, Flatten 2] |
Dense 1 | (None, 224) | 386,986,208 | Concatenate Layer |
Dropout | (None, 224) | 0 | Dense 1 |
Dense 2 | (None, 2) | 450 | Dropout |
Total params: 386,987,554 |||
Trainable params: 386,987,554 |||
Non-trainable params: 0 |||
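For concreteness, the architecture summarized above can be reproduced with a short Keras sketch. The table does not specify activation functions, the dropout rate, or the optimizer, so the ReLU/softmax activations, 0.5 dropout rate, and Adam optimizer below are illustrative assumptions, not the authors' confirmed settings; the layer shapes and parameter counts, however, match the table.

```python
# Minimal Keras sketch of the DICNN summarized in the table above.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_dicnn(input_shape=(224, 224, 3), num_classes=2):
    # Two parallel 224x224x3 inputs, as in the table.
    input_1 = layers.Input(shape=input_shape, name="input_1")
    input_2 = layers.Input(shape=input_shape, name="input_2")

    # Branch 1: one 3x3 convolution with 32 filters on Input 1
    # (valid padding: 224 - 3 + 1 = 222 -> (222, 222, 32), 896 params).
    conv = layers.Conv2D(32, (3, 3), activation="relu")(input_1)
    flat_conv = layers.Flatten()(conv)   # 222 * 222 * 32 = 1,577,088 units

    # Branch 2: the raw image flattened directly (224 * 224 * 3 = 150,528).
    flat_raw = layers.Flatten()(input_2)

    # Concatenate both views: 150,528 + 1,577,088 = 1,727,616 units.
    merged = layers.Concatenate()([flat_raw, flat_conv])

    dense = layers.Dense(224, activation="relu")(merged)  # 386,986,208 params
    dropped = layers.Dropout(0.5)(dense)                  # rate assumed
    output = layers.Dense(num_classes, activation="softmax")(dropped)  # 450 params

    model = Model(inputs=[input_1, input_2], outputs=output)
    model.compile(optimizer="adam",                        # optimizer assumed
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_dicnn()
model.summary()  # totals should match the table: 386,987,554 params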
Fold | TA | TL | VA | VL | TsA | TsL | BP |
---|---|---|---|---|---|---|---|
K1 | 99.90 | 0.0036 | 100.00 | 9.78 × 10⁻⁵ | 99.00 | 0.04 | 0 |
K2 | 97.99 | 0.6236 | 98.45 | 0.2445 | 100.00 | 0.01 | 2 |
K3 | 99.90 | 7.84 × 10⁻⁴ | 100.00 | 2.11 × 10⁻⁵ | 99.00 | 0.09 | 0 |
K4 | 99.61 | 0.0082 | 100.00 | 0.0036 | 97.67 | 0.04 | 0 |
K5 | 99.32 | 0.9420 | 100.00 | 1.07 × 10⁻⁵ | 99.22 | 0.03 | 0 |
K6 | 98.84 | 0.1851 | 97.67 | 0.3579 | 99.11 | 0.62 | 3 |
K7 | 98.74 | 0.1261 | 99.22 | 0.0632 | 99.22 | 0.07 | 1 |
K8 | 99.61 | 0.0122 | 100.00 | 0.0014 | 99.22 | 0.01 | 0 |
K9 | 99.71 | 0.0254 | 97.67 | 0.2454 | 98.45 | 0.30 | 3 |
K10 | 100.00 | 0.0037 | 100.00 | 0.0039 | 100.00 | 0.01 | 0 |
Mean ± SD | 99.36 ± 0.62 | 0.19 ± 0.31 | 99.30 ± 0.94 | 0.092 ± 0.13 | 99.08 ± 0.64 | 0.122 ± 0.18 | 0.9 ± 1.22 |

TA/TL: training accuracy (%) and loss; VA/VL: validation accuracy (%) and loss; TsA/TsL: test accuracy (%) and loss; BP: number of incorrectly predicted samples.
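The fold-wise protocol behind this table can be outlined in a short sketch. This is a hedged reconstruction, not the authors' training script: the `build_dicnn` factory from the previous sketch, the choice to feed the same image batch to both branches, the epoch count, and the validation split are all assumptions.

```python
# Hedged sketch of a 10-fold evaluation loop producing per-fold test
# accuracy/loss and the final mean ± SD row of the table above.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(X, y, build_fn, n_splits=10, epochs=30):
    """X: images, shape (n, 224, 224, 3); y: one-hot labels, shape (n, 2)."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    accs, losses = [], []
    for train_idx, test_idx in skf.split(X, y.argmax(axis=1)):
        model = build_fn()  # fresh weights for every fold
        # The same image batch is assumed to feed both DICNN branches.
        model.fit([X[train_idx], X[train_idx]], y[train_idx],
                  epochs=epochs, validation_split=0.1, verbose=0)
        loss, acc = model.evaluate([X[test_idx], X[test_idx]],
                                   y[test_idx], verbose=0)
        accs.append(acc * 100)
        losses.append(loss)
    print(f"TsA: {np.mean(accs):.2f} ± {np.std(accs):.2f}")
    print(f"TsL: {np.mean(losses):.3f} ± {np.std(losses):.3f}")
    return accs, losses
```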
Fold | Class | Spec | Sen | Pre | Fsc | Rec |
---|---|---|---|---|---|---|
K1 | Fake | 99.34 | 100.00 | 98.31 | 99.98 | 99.15 |
K1 | Real | 100.00 | 99.34 | 98.56 | 98.78 | 99.56 |
K2 | Fake | 97.26 | 100.00 | 96.55 | 98.25 | 100.00 |
K2 | Real | 100.00 | 97.26 | 100.00 | 98.61 | 97.26 |
K3 | Fake | 100.00 | 100.00 | 100.00 | 99.12 | 99.34 |
K3 | Real | 100.00 | 100.00 | 99.54 | 98.67 | 99.76 |
K4 | Fake | 99.50 | 100.00 | 98.12 | 99.34 | 99.89 |
K4 | Real | 100.00 | 99.50 | 98.90 | 99.38 | 98.86 |
K5 | Fake | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
K5 | Real | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
K6 | Fake | 96.25 | 100.00 | 100.00 | 98.09 | 96.25 |
K6 | Real | 100.00 | 96.25 | 94.23 | 97.03 | 100.00 |
K7 | Fake | 96.10 | 99.25 | 100.00 | 99.20 | 97.34 |
K7 | Real | 99.25 | 96.10 | 95.32 | 98.30 | 99.89 |
K8 | Fake | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
K8 | Real | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
K9 | Fake | 95.71 | 100.00 | 100.00 | 97.81 | 95.71 |
K9 | Real | 100.00 | 95.71 | 95.16 | 97.52 | 100.00 |
K10 | Fake | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
K10 | Real | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
Mean ± SD | Fake | 98.41 ± 1.75 | 99.93 ± 0.23 | 99.23 ± 1.15 | 99.18 ± 0.81 | 98.77 ± 1.59 |
Mean ± SD | Real | 99.93 ± 0.23 | 98.41 ± 1.75 | 98.17 ± 2.20 | 98.83 ± 0.98 | 99.53 ± 0.83 |

Spec: specificity; Sen: sensitivity; Pre: precision; Fsc: F1-score; Rec: recall (all in %).
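Under the standard definitions, the five columns reduce to simple confusion-matrix ratios. The sketch below shows these textbook formulas, not a reconstruction of the paper's exact computation; note that sensitivity and recall are the same quantity by these definitions, and the example counts at the end are hypothetical.

```python
# Per-class metrics (as percentages) from binary confusion-matrix counts.
def class_metrics(tp: int, fp: int, tn: int, fn: int):
    spec = tn / (tn + fp) * 100          # Spec: true negative rate
    sen = tp / (tp + fn) * 100           # Sen: true positive rate
    pre = tp / (tp + fp) * 100           # Pre: positive predictive value
    rec = sen                            # Rec: identical to Sen by definition
    fsc = 2 * pre * rec / (pre + rec)    # Fsc: harmonic mean of Pre and Rec
    return spec, sen, pre, fsc, rec

# Example: treating "Fake" as the positive class, with 128 fake images all
# caught and 1 of 129 real images misflagged (hypothetical counts):
print(class_metrics(tp=128, fp=1, tn=128, fn=0))
```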
Ref | Category | Method | Dataset | Performance | XAI |
---|---|---|---|---|---|
[20] | DL | Xception network | 150,000 images | Acc: 83.99% | No |
[21] | DL | CNN | 60,000 images | Acc: 97.97% | No |
[22] | DL | Dual-channel CNN | 9000 images | Acc: 100% | No |
[23] | DL | CNN | 321,378 face images | Acc: 92% | No |
[27] | DL | Naive classifiers | Faces-HQ | Acc: 100% | No |
[29] | DL | VGG | 10,000 real and fake images | Acc: 99.9% | No |
[29] | DL | ResNet | 10,000 real and fake images | Acc: 94.75% | No |
[30] | DL | Two-stream CNN | 30,000 images | Acc: 88.80% | No |
[32] | Physical | Corneal specular highlights | 1000 images | Acc: 94% | No |
[33] | Human | Visual inspection | 400 images | Acc: 50–60% | No |
Ours | DL | DICNN | 1289 images | Acc: 99.36 ± 0.62% | SHAP |
Share and Cite
Bhandari, M.; Neupane, A.; Mallik, S.; Gaur, L.; Qin, H. Auguring Fake Face Images Using Dual Input Convolution Neural Network. J. Imaging 2023, 9, 3. https://doi.org/10.3390/jimaging9010003