Deepfakes Generation and Detection: A Short Survey
Abstract
1. Introduction
2. Deepfake Generation and Detection
2.1. Identity Swap
2.1.1. Identity Swap Generation
2.1.2. Identity Swap Detection
2.2. Face Reenactment
2.2.1. Face Reenactment Generation
2.2.2. Face Reenactment Detection
2.3. Attribute Manipulation
2.3.1. Attribute Manipulation Generation
2.3.2. Attribute Manipulation Detection
2.4. Entire Face Synthesis
2.4.1. Entire Face Synthesis Generation
2.4.2. Entire Face Synthesis Detection
3. Open Issues and Research Directions
3.1. Generalization Capability
3.2. Explainability of Deepfake Detectors
3.3. Next-Generation Deepfake and Face Manipulation Generators
3.4. Vulnerability to Adversarial Attacks
3.5. Mobile Deepfake Detector
3.6. Lack of Large-Scale ML-Generated Databases
3.7. Reproducible Research
4. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Available online: https://theconversation.com/3-2-billion-images-and-720-000-hours-of-video-are-shared-online-daily-can-you-sort-real-from-fake-148630 (accessed on 4 January 2023).
- Available online: https://www.nbcnews.com/business/consumer/so-it-s-fine-if-you-edit-your-selfies-not-n766186 (accessed on 4 January 2023).
- Dolhansky, B.; Bitton, J.; Pflaum, B.; Lu, J.; Howes, R.; Wang, M.; Ferrer, C. The deepfake detection challenge dataset. arXiv 2020, arXiv:2006.07397. [Google Scholar]
- Akhtar, Z.; Dasgupta, D.; Banerjee, B. Face Authenticity: An Overview of Face Manipulation Generation, Detection and Recognition. In Proceedings of the International Conference on Communication and Information Processing (ICCIP), Talegaon-Pune, India, 17–18 May 2019; pp. 1–8. [Google Scholar]
- Mirsky, Y.; Lee, W. The creation and detection of deepfakes: A survey. ACM Comput. Surv. 2021, 54, 1–41. [Google Scholar] [CrossRef]
- FaceApp Technology Limited. Available online: https://www.faceapp.com/ (accessed on 4 January 2023).
- Laan Labs. Available online: http://faceswaplive.com/ (accessed on 4 January 2023).
- Changsha Shenduronghe Network Technology Co., Ltd. Available online: https://apps.apple.com/cn/app/id1465199127 (accessed on 21 June 2022).
- DeepfakesWeb.com. Available online: https://deepfakesweb.com/ (accessed on 4 January 2023).
- PiVi&Co. Available online: https://apps.apple.com/us/app/agingbooth/id35746779 (accessed on 21 June 2022).
- Anthropics Technology Ltd. Available online: https://www.anthropics.com/portraitpro/ (accessed on 4 January 2023).
- Neocortext. Available online: https://hey.reface.ai/ (accessed on 4 January 2023).
- The Audacity Team. Available online: https://www.audacityteam.org/ (accessed on 4 January 2023).
- Magix Software GmbH. Available online: https://www.magix.com/us/music-editing/sound-forge/ (accessed on 4 January 2023).
- Adobe. Available online: https://www.photoshop.com/en (accessed on 4 January 2023).
- Collins, E.; Bala, R.; Price, B.; Susstrunk, S. Editing in style: Uncovering the local semantics of GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5771–5780. Available online: https://github.com/IVRL/GANLocalEditing (accessed on 4 January 2023).
- He, Z.; Zuo, W.; Kan, M.; Shan, S.; Chen, X. AttGAN: Facial attribute editing by only changing what you want. IEEE Trans. Image Process. 2019, 28, 5464–5478. Available online: https://github.com/LynnHo/AttGAN-Tensorflow (accessed on 4 January 2023). [CrossRef] [Green Version]
- Roettgers, J. How AI Tech Is Changing Dubbing, Making Stars Like David Beckham Multilingual. 2019. Available online: https://variety.com/2019/biz/news/ai-dubbing-david-beckham-multilingual-1203309213/ (accessed on 4 January 2023).
- Lee, D. Deepfake Salvador Dali Takes Selfies with Museum Visitors, The Verge. 2019. Available online: https://www.theverge.com/2019/5/10/18540953/salvador-dali-lives-deepfake-museum (accessed on 4 January 2023).
- Güera, D.; Delp, E.J. Deepfake Video Detection Using Recurrent Neural Networks. In Proceedings of the 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Auckland, New Zealand, 27–30 November 2018; pp. 1–6. [Google Scholar]
- Diakopoulos, N.; Johnson, D. Anticipating and addressing the ethical implications of deepfakes in the context of elections. New Media Soc. 2021, 23, 2072–2098. [Google Scholar] [CrossRef]
- Pantserev, K. The malicious use of AI-based deepfake technology as the new threat to psychological security and political stability. In Cyber Defence in the Age of AI, Smart Societies and Augmented Humanity; Springer: Cham, Switzerland, 2020; pp. 37–55. [Google Scholar]
- Oliveira, L. The current state of fake news. Procedia Comput. Sci. 2017, 121, 817–825. [Google Scholar]
- Zhou, X.; Zafarani, R. A survey of fake news: Fundamental theories, detection methods, and opportunities. ACM Comput. Surv. (CSUR) 2020, 53, 1–40. [Google Scholar] [CrossRef]
- Kietzmann, J.; Lee, L.; McCarthy, I.; Kietzmann, T. Deepfakes: Trick or treat? Bus. Horiz. 2020, 63, 135–146. [Google Scholar] [CrossRef]
- Zakharov, E.; Shysheya, A.; Burkov, E.; Lempitsky, V. Few-Shot Adversarial Learning of Realistic Neural Talking Head Models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9458–9467. [Google Scholar]
- Damiani, J. A Voice Deepfake Was Used to Scam a CEO Out of $243,000. 2019. Available online: https://www.forbes.com/sites/jessedamiani/2019/09/03/a-voice-deepfake-was-used-to-scam-a-ceo-out-of-243000/?sh=173f55a52241 (accessed on 4 January 2023).
- Korshunov, P.; Marcel, S. Vulnerability assessment and detection of Deepfake videos. In Proceedings of the International Conference on Biometrics (ICB), Crete, Greece, 4–7 June 2019; pp. 1–6. [Google Scholar]
- Scherhag, U.; Nautsch, A.; Rathgeb, C.; Gomez-Barrero, M.; Veldhuis, R.N.; Spreeuwers, L.; Schils, M.; Maltoni, D.; Grother, P.; Marcel, S.; et al. Biometric Systems under Morphing Attacks: Assessment of Morphing Techniques and Vulnerability Reporting. In Proceedings of the International Conference of the Biometrics Special Interest Group, Darmstadt, Germany, 20–22 September 2017; pp. 1–7. [Google Scholar]
- Rathgeb, C.; Drozdowski, P.; Busch, C. Detection of Makeup Presentation Attacks based on Deep Face Representations. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Virtual Event, 10–15 January 2021; pp. 3443–3450. [Google Scholar]
- Majumdar, P.; Agarwal, A.; Singh, R.; Vatsa, M. Evading Face Recognition via Partial Tampering of Faces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 17–16 June 2019; pp. 11–20. [Google Scholar]
- Ferrara, M.; Franco, A.; Maltoni, D.; Sun, Y. On the impact of alterations on face photo recognition accuracy. In Proceedings of the International Conference on Image Analysis and Processing, Naples, Italy, 9–13 September 2013; pp. 743–751. [Google Scholar]
- Yang, L.; Song, Q.; Wu, Y. Attacks on state-of-the-art face recognition using attentional adversarial attack generative network. Multimed. Tools Appl. 2021, 80, 855–875. [Google Scholar] [CrossRef]
- Colbois, L.; Pereira, T.; Marcel, S. On the use of automatically generated synthetic image datasets for benchmarking face recognition. arXiv 2021, arXiv:2106.04215. [Google Scholar]
- Huang, C.-Y.; Lin, Y.Y.; Lee, H.-Y.; Lee, L.-S. Defending Your Voice: Adversarial Attack on Voice Conversion. In Proceedings of the IEEE Spoken Language Technology Workshop (SLT), Virtual, 19–22 January 2021; pp. 552–559. [Google Scholar]
- Akhtar, Z.; Mouree, M.R.; Dasgupta, D. Utility of Deep Learning Features for Facial Attributes Manipulation Detection. In Proceedings of the IEEE International Conference on Humanized Computing and Communication with Artificial Intelligence (HCCAI), Irvine, CA, USA, 21–23 September 2020; pp. 55–60. [Google Scholar]
- Akhtar, Z.; Dasgupta, D. A Comparative Evaluation of Local Feature Descriptors for DeepFakes Detection. In Proceedings of the IEEE International Symposium on Technologies for Homeland Security (HST), Woburn, WA, USA, 5–6 November 2019; pp. 1–5. [Google Scholar]
- Bekci, B.; Akhtar, Z.; Ekenel, H.K. Cross-Dataset Face Manipulation Detection. In Proceedings of the 28th Signal Processing and Communications Applications Conference (SIU), Gaziantep, Türkiye, 5–7 October 2020; pp. 1–4. [Google Scholar]
- Khodabakhsh, A.; Akhtar, Z. Unknown presentation attack detection against rational attackers. IET Biom. 2021, 10, 1–20. [Google Scholar] [CrossRef]
- Yavuzkilic, S.; Sengur, A.; Aktar, Z.; Siddique, K. Spotting DeepFakes and Face Manipulations by Fusing Features from Multi-Stream CNNs Models. Symmetry 2021, 13, 1352. [Google Scholar] [CrossRef]
- Wang, T.; Cheng, H.; Chow, K.; Nie, L. Deep convolutional pooling transformer for deepfake detection. arXiv 2022, arXiv:2209.05299. [Google Scholar]
- Kaddar, B.; Fezza, S.; Hamidouche, W.; Akhtar, Z.; Hadid, A. HCiT: Deepfake Video Detection Using a Hybrid Model of CNN features and Vision Transformer. In Proceedings of the 2021 IEEE Visual Communications and Image Processing (VCIP), Munich, Germany, 5–10 December 2021; pp. 1–5. [Google Scholar]
- Yavuzkiliç, S.; Akhtar, Z.; Sengür, A.; Siddique, K. DeepFake Face Video Detection using Hybrid Deep Residual Networks and LSTM Architecture. In AI and Deep Learning in Biometric Security: Trends, Potential and Challenges; CRC Press: Boca Raton, FL, USA, 2021; pp. 81–104. [Google Scholar]
- Hussain, S.; Neekhara, P.; Jere, M.; Koushanfar, F.; McAuley, J. Adversarial deepfakes: Evaluating vulnerability of deepfake detectors to adversarial examples. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 3348–3357. [Google Scholar]
- Lim, S.-Y.; Chae, D.-K.; Lee, S.-C. Detecting Deepfake Voice Using Explainable Deep Learning Techniques. Appl. Sci. 2022, 12, 3926. [Google Scholar] [CrossRef]
- Mehta, V.; Gupta, P.; Subramanian, R.; Dhall, A. FakeBuster: A DeepFakes detection tool for video conferencing scenarios. In Proceedings of the International Conference on Intelligent User Interfaces-Companion, College Station, TX, USA, 13–17 April 2021; pp. 61–63. [Google Scholar]
- Juefei-Xu, F.; Wang, R.; Huang, Y.; Guo, Q.; Ma, L.; Liu, Y. Countering Malicious DeepFakes: Survey, Battleground, and Horizon. Int. J. Comput. Vis. 2022, 130, 1678–1734. [Google Scholar] [CrossRef] [PubMed]
- Lu, Z.; Li, Z.; Cao, J.; He, R.; Sun, Z. Recent progress of face image synthesis. In Proceedings of the 4th IAPR Asian Conference on Pattern Recognition (ACPR), Nanjing, China, 26–29 November 2017; pp. 7–12. [Google Scholar]
- Zhang, T. Deepfake generation and detection, a survey. Multimed. Tools Appl. 2022, 81, 6259–6276. [Google Scholar] [CrossRef]
- Mustak, M.; Salminen, J.; Mäntymäki, M.; Rahman, A.; Dwivedi, Y. Deepfakes: Deceptions, mitigations, and opportunities. J. Bus. Res. 2023, 154, 113368. [Google Scholar] [CrossRef]
- Tolosana, R.; Vera-Rodriguez, R.; Fierrez, J.; Morales, A. Ortega-Garcia. Deepfakes and beyond: A survey of face manipulation and fake detection. Inf. Fusion 2020, 64, 131–148. [Google Scholar]
- Korshunova, I.; Shi, W.; Dambre, J.; Theis, L. Fast Face-Swap Using Convolutional Neural Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–27 October 2017; pp. 3697–3705. [Google Scholar]
- Nirkin, Y.; Masi, I.; Tuan, A.T.; Hassner, T.; Medioni, G. On Face Segmentation, Face Swapping, and Face Perception. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, Xi’an, China, 15–19 May 2018; pp. 98–105. [Google Scholar]
- Mahajan, S.; Chen, L.; Tsai, T. SwapItUp: A Face Swap Application for Privacy Protection. In Proceedings of the IEEE 31st International Conference on Advanced Information Networking and Applications (AINA), Taipei, Taiwan, 27–29 March 2017; pp. 46–50. [Google Scholar]
- Wang, H.; Dongliang, X.; Wei, L. Robust and Real-Time Face Swapping Based on Face Segmentation and CANDIDE-3. In Proceedings of the PRICAI 2018: Trends in Artificial Intelligence, Nanjing, China, 28–31 August 2018; pp. 335–342. [Google Scholar]
- Natsume, R.; Yatagawa, T.; Morishima, S. RSGAN: Face Swapping and Editing Using Face and Hair Representation in Latent Spaces. arXiv 2018, arXiv:1804.03447. [Google Scholar]
- Yan, S.; He, S.; Lei, X.; Ye, G.; Xie, Z. Video face swap based on autoencoder generation network. In Proceedings of the International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, China, 16–17 July 2018; pp. 103–108. [Google Scholar]
- Zhou, H.; Liu, Y.; Liu, Z.; Luo, P.; Wang, X. Talking face generation by adversarially disentangled audio-visual representation. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 9299–9306. [Google Scholar]
- Li, L.; Bao, J.; Yang, H.; Chen, D.; Wen, F. Faceshifter: Towards high fidelity and occlusion aware face swapping. arXiv 2019, arXiv:1912.13457. [Google Scholar]
- Li, L.; Bao, J.; Yang, H.; Chen, D.; Wen, F. Advancing High Fidelity Identity Swapping for Forgery Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 19–13 June 2020; pp. 5073–5082. [Google Scholar]
- Chen, R.; Chen, X.; Ni, B.; Ge, Y. SimSwap: An Efficient Framework For High Fidelity Face Swapping. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 2003–2011. [Google Scholar]
- Koopman, M.; Rodriguez, A.; Geradts, Z. Detection of deepfake video manipulation. In Proceedings of the 20th Irish Machine Vision and Image Processing Conference (IMVIP), Coleraine, UK, 29–31 August 2018; pp. 133–136. [Google Scholar]
- Li, Y.; Lyu, S. Exposing DeepFake Videos by Detecting Face Warping Artifacts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 1–7. [Google Scholar]
- Li, Y.; Chang, M.; Lyu, S. In ictu oculi: Exposing ai generated fake face videos by detecting eye blinking. arXiv 2018, arXiv:1806.02877. [Google Scholar]
- Amerini, I.; Galteri, L.; Caldelli, R.; Del Bimbo, A. Deepfake Video Detection through Optical Flow Based CNN. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 1205–1207. [Google Scholar]
- Fernandes, S.; Raj, S.; Ortiz, E.; Vintila, I.; Salter, M.; Urosevic, G.; Jha, S. Predicting Heart Rate Variations of Deepfake Videos using Neural ODE. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 1721–1729. [Google Scholar]
- Tariq, S.; Lee, S.; Woo, S. A Convolutional LSTM based Residual Network for Deepfake Video Detection. arXiv 2020, arXiv:2009.07480. [Google Scholar]
- Chan, C.C.K.; Kumar, V.; Delaney, S.; Gochoo, M. Combating Deepfakes: Multi-LSTM and Blockchain as Proof of Authenticity for Digital Media. In Proceedings of the IEEE/ITU International Conference on Artificial Intelligence for Good (AI4G), Virtual, 21–23 September 2020; pp. 55–62. [Google Scholar]
- Zhu, K.; Wu, B.; Wang, B. Deepfake Detection with Clustering-based Embedding Regularization. In Proceedings of the IEEE Fifth International Conference on Data Science in Cyberspace (DSC), Hong Kong, 27–30 July 2020; pp. 257–264. [Google Scholar]
- Nirkin, Y.; Wolf, L.; Keller, Y.; Hassner, T. DeepFake detection based on the discrepancy between the face and its context. arXiv 2020, arXiv:2008.12262. [Google Scholar]
- Frick, R.A.; Zmudzinski, S.; Steinebach, M. Detecting “DeepFakes” in H.264 Video Data Using Compression Ghost Artifacts. Electron. Imaging 2020, 32, 116-1. [Google Scholar] [CrossRef]
- Kumar, A.; Bhavsar, A.; Verma, R. Detecting deepfakes with metric learning. In Proceedings of the IEEE International Workshop on Biometrics and Forensics (IWBF), Porto, Portugal, 29–30 April 2020; pp. 1–6. [Google Scholar]
- Bonettini, N.; Cannas, E.; Mandelli, S.; Bondi, L.; Bestagini, P.; Tubaro, S. Video Face Manipulation Detection Through Ensemble of CNNs. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Virtual Event, 10–15 January 2021; pp. 5012–5019. [Google Scholar]
- Cozzolino, D.; Rössler, A.; Thies, J.; Nießner, M.; Verdoliva, L. ID-Reveal: Identity-aware DeepFake Video Detection. arXiv 2020, arXiv:2012.02512. [Google Scholar]
- Wang, J.; Wu, Z.; Ouyang, W.; Han, X.; Chen, J.; Jiang, Y.; Li, S. M2TR: Multi-modal multi-scale transformers for deepfake detection. In Proceedings of the International Conference on Multimedia Retrieval, Newark, NJ, USA, 27–30 June 2022; pp. 615–623. [Google Scholar]
- Chugh, K.; Gupta, P.; Dhall, A.; Subramanian, R. Not made for each other-Audio-Visual Dissonance-based Deepfake Detection and Localization. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 439–447. [Google Scholar]
- Zhao, H.; Zhou, W.; Chen, D.; Wei, T.; Zhang, W.; Yu, N. Multi-attentional deepfake detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2185–2194. [Google Scholar]
- Trinh, L.; Tsang, M.; Rambhatla, S.; Liu, Y. Interpretable and Trustworthy Deepfake Detection via Dynamic Prototypes. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 1973–1983. [Google Scholar]
- Aneja, S.; Nießner, M. Generalized Zero and Few-Shot Transfer for Facial Forgery Detection. arXiv 2020, arXiv:2006.11863. [Google Scholar]
- Liu, S.; Lian, Z.; Gu, S.; Xiao, L. Block shuffling learning for Deepfake Detection. arXiv 2022, arXiv:2202.02819. [Google Scholar]
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of stylegan. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
- Thies, J.; Zollhofer, M.; Stamminger, M.; Theobalt, C.; Nießner, M. Face2face: Real-time face capture and reenactment of RGB videos. In Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2387–2395. [Google Scholar]
- Kim, H.; Garrido, P.; Tewari, A.; Xu, W.; Thies, J.; Niessner, M.; Pérez, P.; Richardt, C.; Zollhofer, M.; Theobalt, C. Deep video portraits. ACM Trans. Graph. (TOG) 2018, 37, 1–4. [Google Scholar] [CrossRef]
- Nirkin, Y.; Keller, Y.; Hassner, T. FSGAN: Subject agnostic face swapping and reenactment. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 7184–7193. [Google Scholar]
- Zhang, J.; Zeng, X.; Wang, M.; Pan, Y.; Liu, L.; Liu, Y.; Ding, Y.; Fan, C. Freenet: Multi-identity face reenactment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5326–5335. [Google Scholar]
- Doukas, M.; Koujan, M.; Sharmanska, V.; Roussos, A.; Zafeiriou, S. Head2Head++: Deep Facial Attributes Re-Targeting. IEEE Trans. Biom. Behav. Identity Sci. 2021, 3, 31–43. [Google Scholar] [CrossRef]
- Cao, M.; Huang, H.; Wang, H.; Wang, X.; Shen, L.; Wang, S.; Bao, L.; Li, L.; Luo, J. Task-agnostic Temporally Consistent Facial Video Editing. arXiv 2020, arXiv:2007.01466. [Google Scholar]
- Cozzolino, D.; Thies, J.; Rossler, A.; Riess, C.; Niener, M.; Verdoliva, L. Forensictransfer: Weakly-supervised domain adaptation for forgery detection. arXiv 2018, arXiv:1812.02510. [Google Scholar]
- Matern, F.; Riess, C.; Stamminger, M. Exploiting Visual Artifacts to Expose DeepFakes and Face Manipulations. In Proceedings of the IEEE Winter Applications of Computer Vision Workshops, Waikoloa Village, HI, USA, 7–11 January 2019; pp. 1–10. [Google Scholar]
- Rossler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 1–11. [Google Scholar]
- Sabir, E.; Cheng, J.; Jaiswal, A.; AbdAlmageed, W.; Masi, I.; Natarajan, P. Recurrent Convolutional Strategies for Face Manipulation Detection in Videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019; pp. 1–8. [Google Scholar]
- Kumar, P.; Vatsa, M.; Singh, R. Detecting face2face facial reenactment in videos. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA, 2–5 March 2020; pp. 2589–2597. [Google Scholar]
- Wang, Y.; Dantcheva, A. A video is worth more than 1000 lies. Comparing 3DCNN approaches for detecting deepfakes. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition (FG), Virtual, 16–20 November 2020; pp. 515–519. [Google Scholar]
- Zhao, X.; Yu, Y.; Ni, R.; Zhao, Y. Exploring Complementarity of Global and Local Spatiotemporal Information for Fake Face Video Detection. In Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 2884–2888. [Google Scholar]
- Berthouzoz, F.; Li, W.; Dontcheva, M.; Agrawala, M. A Framework for content-adaptive photo manipulation macros: Application to face, landscape, and global manipulations. ACM Trans. Graph. 2011, 30, 1–14. [Google Scholar] [CrossRef]
- Lu, J.; Sunkavalli, K.; Carr, N.; Hadap, S.; Forsyth, D. A visual representation for editing face images. arXiv 2016, arXiv:1612.00522. [Google Scholar]
- Ning, X.; Xu, S.; Nan, F.; Zeng, Q.; Wang, C.; Cai, W.; Jiang, Y. Face editing based on facial recognition features. IEEE Trans. Cogn. Dev. Syst. 2022. preprint. [Google Scholar] [CrossRef]
- Xiao, T.; Hong, J.; Ma, J. Elegant: Exchanging latent encodings with gan for transferring multiple face attributes. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 168–184. [Google Scholar]
- Zhang, G.; Kan, M.; Shan, S.; Chen, X. Generative adversarial network with spatial attention for face attribute editing. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 417–432. [Google Scholar]
- Sun, R.; Huang, C.; Zhu, H.; Ma, L. Mask-aware photorealistic facial attribute manipulation. J. Comput. Visual Media 2021, 7, 1–12. [Google Scholar] [CrossRef]
- Choi, Y.; Choi, M.; Kim, M.; Ha, J.; Kim, S.; Choo, J. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8789–8797. [Google Scholar]
- Huang, D.; Tao, X.; Lu, J.; Do, M.N. Geometry-Aware GAN for Face Attribute Transfer. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 729–733. [Google Scholar]
- Wei, Y.; Gan, Z.; Li, W.; Lyu, S.; Chang, M.; Zhang, L.; Gao, J.; Zhang, P. MagGAN: High-Resolution Face Attribute Editing with Mask-Guided Generative Adversarial Network. In Proceedings of the Asian Conference on Computer Vision, Kyoto, Japan, 30 November–4 December 2020; pp. 1–18. [Google Scholar]
- Xu, Z.; Yu, X.; Hong, Z.; Zhu, Z.; Han, J.; Liu, J.; Ding, E.; Bai, X. FaceController: Controllable Attribute Editing for Face in the Wild. arXiv 2021, arXiv:2102.11464. [Google Scholar] [CrossRef]
- Ferrara, M.; Franco, A.; Maltoni, D. The magic passport. In Proceedings of the IEEE International Joint Conference on Biometrics, Clearwater, FL, USA, 29 September–2 October 2014; pp. 1–7. [Google Scholar]
- Bharati, A.; Singh, R.; Vatsa, M.; Bowyer, K. Detecting facial retouching using supervised deep learning. IEEE Trans. Inf. Secur. 2016, 11, 1903–1913. [Google Scholar] [CrossRef]
- Dang, L.M.; Hassan, S.I.; Im, S.; Moon, H. Face image manipulation detection based on a convolutional neural network. Expert Syst. Appl. 2019, 129, 156–168. [Google Scholar] [CrossRef]
- Rathgeb, C.; Satnoianu, C.-I.; Haryanto, N.E.; Bernardo, K.; Busch, C. Differential Detection of Facial Retouching: A Multi-Biometric Approach. IEEE Access 2020, 8, 106373–106385. [Google Scholar] [CrossRef]
- Guo, Z.; Yang, G.; Chen, J.; Sun, X. Fake face detection via adaptive residuals extraction network. arXiv 2020, arXiv:2005.04945. [Google Scholar]
- Mazaheri, G.; Roy-Chowdhury, A. Detection and Localization of Facial Expression Manipulations. arXiv 2021, arXiv:2103.08134. [Google Scholar]
- Kim, D.; Kim, D.; Kim, K. Facial Manipulation Detection Based on the Color Distribution Analysis in Edge Region. arXiv 2021, arXiv:2102.01381. [Google Scholar]
- Scherhag, U.; Debiasi, L.; Rathgeb, C.; Busch, C.; Uhl, A. Detection of Face Morphing Attacks Based on PRNU Analysis. IEEE Trans. Biom. Behav. Identit-Sci. 2019, 1, 302–317. [Google Scholar] [CrossRef]
- Zhao, J.; Mathieu, M.; LeCun, Y. Energy-based generative adversarial network. arXiv 2016, arXiv:1609.03126. [Google Scholar]
- Kossaifi, J.; Tran, L.; Panagakis, Y.; Pantic, M. Gagan: Geometry-aware generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 878–887. [Google Scholar]
- Kaneko, T.; Hiramatsu, K.; Kashino, K. Generative attribute controller with conditional filtered generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6089–6098. [Google Scholar]
- Berthelot, D.; Schumm, T.; Metz, L. Began: Boundary equilibrium generative adversarial networks. arXiv 2017, arXiv:1703.10717. [Google Scholar]
- Liu, M.; Tuzel, O. Coupled generative adversarial networks. In Proceedings of the Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain, 5–10 December 2016; pp. 469–477. [Google Scholar]
- Kingma, D.; Dhariwal, P. Glow: Generative flow with invertible 1 × 1 convolutions. arXiv 2018, arXiv:1807.03039. [Google Scholar]
- Schonfeld, E.; Schiele, B.; Khoreva, A. A u-net based discriminator for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8207–8216. [Google Scholar]
- Choi, H.; Park, C.; Lee, K. From inference to generation: End-to-end fully self-supervised generation of human face from speech. arXiv 2020, arXiv:2004.05830. [Google Scholar]
- Curtó, J.; Zarza, I.; De La Torre, F.; King, I.; Lyu, M. High-resolution deep convolutional generative adversarial networks. arXiv 2017, arXiv:1711.06491. [Google Scholar]
- Lin, J.; Zhang, R.; Ganz, F.; Han, S.; Zhu, J. Anycost gans for interactive image synthesis and editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 14986–14996. [Google Scholar]
- Chen, S.; Liu, F.; Lai, Y.; Rosin, P.; Li, C.; Fu, H.; Gao, L. DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control. arXiv 2021, arXiv:2105.08935. [Google Scholar]
- McCloskey, S.; Albright, M. Detecting gan-generated imagery using color cues. arXiv 2018, arXiv:1812.08247. [Google Scholar]
- Yu, N.; Davis, L.; Fritz, M. Attributing fake images to gans: Learning and analyzing gan fingerprints. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 Octover–2 November 2019; pp. 7556–7566. [Google Scholar]
- Marra, F.; Gragnaniello, D.; Verdoliva, L.; Poggi, G. Do GANs leave artificial fingerprints? In Proceedings of the IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), San Jose, CA, USA, 28–30 March 2019; pp. 506–511. [Google Scholar]
- Nataraj, L.; Mohammed, T.M.; Manjunath, B.S.; Chandrasekaran, S.; Flenner, A.; Bappy, J.H.; Roy-Chowdhury, A. Detecting GAN generated Fake Images using Co-occurrence Matrices. Electron. Imaging. 2019, 2019, 1–7. [Google Scholar] [CrossRef] [Green Version]
- Wang, R.; Juefei-Xu, F.; Ma, L.; Xie, X.; Huang, Y.; Wang, J.; Liu, Y. Fakespotter: A simple yet robust baseline for spotting ai-synthesized fake faces. arXiv 2019, arXiv:1909.06122. [Google Scholar]
- Marra, F.; Saltori, C.; Boato, G.; Verdoliva, L. Incremental learning for the detection and classification of gan-generated images. In Proceedings of the IEEE International Workshop on Information Forensics and Security (WIFS), Delft, The Netherlands, 9–12 December 2019; pp. 1–6. [Google Scholar]
- Li, S.; Dutta, V.; He, X.; Matsumaru, T. Deep Learning Based One-Class Detection System for Fake Faces Generated by GAN Network. Sensors 2022, 22, 7767. [Google Scholar] [CrossRef]
- Guo, H.; Hu, S.; Wang, X.; Chang, M.C.; Lyu, S. Eyes Tell All: Irregular Pupil Shapes Reveal GAN-Generated Faces. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 2904–2908. [Google Scholar]
- Burgos-Artizzu, X.; Perona, P.; Dollar, P. Robust face landmark estimation under occlusion. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 1513–1520. [Google Scholar]
- Sagonas, C.; Tzimiropoulos, G.; Zafeiriou, S.; Pantic, M. 300 faces in-the-wild challenge: The first facial landmark localization challenge. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Columbus, OH, USA, 23–28 June 2014; pp. 397–403. [Google Scholar]
- Learned-Miller, E.; Huang, G.; Chowdhury, A.; Li, H.; Hua, G. Labeled Faces in the Wild: A Survey. Adv. Face Detect. Facial Image Anal. 2016, 1, 189–248. [Google Scholar]
- Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 3730–3738. [Google Scholar]
- Cao, Q.; Shen, L.; Xie, W.; Parkhi, O.M.; Zisserman, A. VGGFace2: A Dataset for Recognising Faces across Pose and Age. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG), Xi’an, China, 15–19 May 2018; pp. 67–74. [Google Scholar]
- Xu, Z.; Hong, Z.; Ding, C.; Zhu, Z.; Han, J.; Liu, J.; Ding, E. MobileFaceSwap: A Lightweight Framework for Video Face Swapping. arXiv 2022, arXiv:2201.03808. [Google Scholar] [CrossRef]
- Shu, C.; Wu, H.; Zhou, H.; Liu, J.; Hong, Z.; Ding, C.; Han, J.; Liu, J.; Ding, E.; Wang, J. Few-Shot Head Swapping in the Wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 10789–10798. [Google Scholar]
- Chung, J.S.; Nagrani, A.; Zisserman, A. Voxceleb2: Deepspeaker recognition. In Proceedings of the IEEE Conf. Conference of the International Speech Communication Association, Hyderabad, India, 2–6 September 2018; pp. 1–6. [Google Scholar]
- Afchar, D.; Nozick, V.; Yamagishi, J.; Echizen, I. Mesonet: A compact facial video forgery detection network. In Proceedings of the IEEE International Workshop on Information Forensics and Security (WIFS), Montpellier, France, 7–10 December 2021; pp. 1–7. [Google Scholar]
- Miao, C.; Chu, Q.; Li, W.; Gong, T.; Zhuang, W.; Yu, N. Towards Generalizable and Robust Face Manipulation Detection via Bag-of-local-feature. arXiv 2021, arXiv:2103.07915. [Google Scholar]
- Li, Y.; Yang, X.; Sun, P.; Qi, H.; Lyu, S. Celeb-df: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3207–3216. [Google Scholar]
- Jiang, L.; Li, R.; Wu, W.; Qian, C.; Loy, C. Deeperforensics-1.0: A large-scale dataset for real world face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2886–2895. [Google Scholar]
- Prajapati, P.; Pollett, C. MRI-GAN: A Generalized Approach to Detect DeepFakes using Perceptual Image Assessment. arXiv 2022, arXiv:2203.00108. [Google Scholar]
- Zhang, Y.; Zhang, S.; He, Y.; Li, C.; Loy, C.C.; Liu, Z. One-shot Face Reenactment. In Proceedings of the British Machine Vision Conference (BMVC), Cardiff, UK, 9–12 September 2019; pp. 1–13. [Google Scholar]
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of gans for improved quality, stability, and variation. arXiv 2017, arXiv:1710.10196. [Google Scholar]
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410. [Google Scholar]
- Li, S.; Deng, W.; Du, J. Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2584–2593. [Google Scholar]
- Ngo, L.; Karaoglu, S.; Gevers, T. Unified Application of Style Transfer for Face Swapping and Reenactment. In Proceedings of the Asian Conference on Computer Vision, Kyoto, Japan, 30 November–4 December 2020; pp. 1–17. [Google Scholar]
- Shen, J.; Zafeiriou, S.; Chrysos, G.G.; Kossaifi, J.; Tzimiropoulos, G.; Pantic, M. The first facial landmark tracking in-the-wild challenge: Benchmark and results. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 7–13 December 2015; pp. 50–58. [Google Scholar]
- Tripathy, S.; Kannala, J.; Rahtu, E. FACEGAN: Facial Attribute Controllable rEenactment GAN. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 1329–1338. [Google Scholar]
- Bounareli, S.; Argyriou, V.; Tzimiropoulos, G. Finding Directions in GAN’s Latent Space for Neural Face Reenactment. arXiv 2022, arXiv:2202.00046. [Google Scholar]
- Nagrani, A.; Chung, J.S.; Zisserman, A. Voxceleb: A large-scale speaker identification dataset. In Proceedings of the INTERSPEECH, Stockholm, Sweden, 20–24 August 2017; pp. 1–6. [Google Scholar]
- Agarwal, M.; Mukhopadhyay, R.; Namboodiri, V.; Jawahar, C. Audio-visual face reenactment. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 5178–5187. [Google Scholar]
- Nguyen, H.; Fang, F.; Yamagishi, J.; Echizen, I. Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos. arXiv 2019, arXiv:1906.06876. [Google Scholar]
- Dang, H.; Liu, F.; Stehouwer, J.; Liu, X.; Jain, A. On the Detection of Digital Face Manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 1–10. [Google Scholar]
- Kim, M.; Tariq, S.; Woo, S. FReTAL: Generalizing Deepfake Detection using Knowledge Distillation and Representation Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 1001–1012. [Google Scholar]
- Yu, P.; Fei, J.; Xia, Z.; Zhou, Z.; Weng, J. Improving Generalization by Commonality Learning in Face Forgery Detection. IEEE Trans. Inf. Forensics Secur. 2022, 17, 547–558. [Google Scholar] [CrossRef]
- Wu, H.; Wang, P.; Wang, X.; Xiang, J.; Gong, R. GGViT: Multistream Vision Transformer Network in Face2Face Facial Reenactment Detection. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022; pp. 2335–2341. [Google Scholar]
- Lample, G.; Zeghidour, N.; Usunier, N.; Bordes, A.; Denoyer, L.; Ranzato, M. Fader networks: Manipulating images by sliding attributes. arXiv 2017, arXiv:1706.00409. [Google Scholar]
- Liu, M.; Ding, Y.; Xia, M.; Liu, X.; Ding, E.; Zuo, W.; Wen, S. STGAN: A unified selective transfer network for arbitrary image attribute editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3673–3682. [Google Scholar]
- Kim, H.; Choi, Y.; Kim, J.; Yoo, S.; Uh, Y. Exploiting Spatial Dimensions of Latent in GAN for Real-Time Image Editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 852–861. [Google Scholar]
- Choi, Y.; Uh, Y.; Yoo, J.; Ha, J. StarGAN v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8188–8197. [Google Scholar]
- Huang, W.; Tu, S.; Xu, L. IA-FaceS: A Bidirectional Method for Semantic Face Editing. Neural Netw. 2023, 158, 272–292. [Google Scholar] [CrossRef] [PubMed]
- Sun, J.; Wang, X.; Zhang, Y.; Li, X.; Zhang, Q.; Liu, Y.; Wang, J. Fenerf: Face editing in neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 7672–7682. [Google Scholar]
- Wang, S.; Wang, O.; Owens, A.; Zhang, R.; Efros, A. Detecting photoshopped faces by scripting photoshop. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 10072–10081. [Google Scholar]
- Du, C.X.T.; Trung, H.T.; Tam, P.M.; Hung, N.Q.V.; Jo, J. Efficient-Frequency: A hybrid visual forensic framework for facial forgery detection. In Proceedings of the IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia, 1–4 December 2020; pp. 707–712. [Google Scholar]
- Deepfake in the Wild Dataset. Available online: https://github.com/deepfakeinthewild/deepfake-in-the-wild (accessed on 4 January 2023).
- Rathgeb, C.; Nichols, R.; Ibsen, M.; Drozdowski, P.; Busch, C. Crowd-powered Face Manipulation Detection: Fusing Human Examiner Decisions. arXiv 2022, arXiv:2201.13084. [Google Scholar]
- Phillips, P.; Wechsler, H.; Huang, J.; Rauss, P.J. The FERET database and evaluation procedure for face-recognition algorithms. Image Vis. Comput. 1998, 16, 295–306. [Google Scholar] [CrossRef]
- Guo, Z.; Yang, G.; Zhang, D.; Xia, M. Rethinking gradient operator for exposing AI-enabled face forgeries. Expert Syst. Appl. 2023, 215. [Google Scholar] [CrossRef]
- Li, Y.; Chen, X.; Wu, F.; Zha, Z.J. Linestofacephoto: Face photo generation from lines with conditional self-attention generative adversarial networks. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2323–2331. [Google Scholar]
- Xia, W.; Yang, Y.; Xue, J.H.; Wu, B. TediGAN: Text-Guided Diverse Face Image Generation and Manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2256–2265. [Google Scholar]
- Song, H.; Woo, S.; Lee, J.; Yang, S.; Cho, H.; Lee, Y.; Choi, D.; Kim, K. Talking Face Generation with Multilingual TTS. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 21425–21430. [Google Scholar]
- Zen, H.; Dang, V.; Clark, R.; Zhang, Y.; Weiss, R.J.; Jia, Y.; Chen, Z.; Wu, Y. LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. Interspeech 2019. [Google Scholar] [CrossRef] [Green Version]
- Shi, Y.; Bu, H.; Xu, X.; Zhang, S.; Li, M. AISHELL-3: A Multi-Speaker Mandarin TTS Corpus. Interspeech 2021. [Google Scholar] [CrossRef]
- Li, Z.; Min, M.; Li, K.; Xu, C. StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 18197–18207. [Google Scholar]
- Wang, S.; Wang, O.; Zhang, R.; Owens, A.; Efros, A. CNN-generated images are surprisingly easy to spot… for now. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8695–8704. [Google Scholar]
- Pu, J.; Mangaokar, N.; Wang, B.; Reddy, C.; Viswanath, B. Noisescope: Detecting deepfake images in a blind setting. In Proceedings of the Annual Computer Security Applications Conference, Austin, TX, USA, 6–10 December 2020; pp. 913–927. [Google Scholar]
- Yousaf, B.; Usama, M.; Sultani, W.; Mahmood, A.; Qadir, J. Fake visual content detection using two-stream convolutional neural networks. Neural Comput. Appl. 2022, 34, 7991–8004. [Google Scholar] [CrossRef]
- Nowroozi, E.; Conti, M.; Mekdad, Y. Detecting high-quality GAN-generated face images using neural networks. arXiv 2022, arXiv:2203.01716. [Google Scholar]
- Ferreira, A.; Nowroozi, E.; Barni, M. VIPPrint: Validating Synthetic Image Detection and Source Linking Methods on a Large Scale Dataset of Printed Documents. J. Imaging 2021, 7, 50. [Google Scholar] [CrossRef]
- Boyd, A.; Tinsley, P.; Bowyer, K.; Czajka, A. CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning-Based Synthetic Face Detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 6108–6117. [Google Scholar]
- Karras, T.; Aittala, M.; Hellsten, J.; Laine, S.; Lehtinen, J.; Aila, T. Training generative adversarial networks with limited data. Adv. Neural Inf. Process. Syst. 2020, 33, 12104–12114. [Google Scholar]
- Karras, T.; Aittala, M.; Laine, S.; Härkönen, E.; Hellsten, J.; Lehtinen, J.; Aila, T. Alias-free generative adversarial networks. Adv. Neural Inf. Process. Syst. 2021, 34, 852–863. [Google Scholar]
- Banerjee, S.; Bernhard, J.S.; Scheirer, W.J.; Bowyer, K.W.; Flynn, P.J. SREFI: Synthesis of realistic example face images. In Proceedings of the IEEE International Joint Conference on Biometrics, Denver, CO, USA, 1–4 October 2017; pp. 37–45. [Google Scholar] [CrossRef] [Green Version]
- Mishra, S.; Shukla, A.K.; Muhuri, P.K. Explainable Fuzzy AI Challenge 2022: Winner’s Approach to a Computationally Efficient and Explainable Solution. Axioms 2022, 11, 489. [Google Scholar] [CrossRef]
- Adadi, A.; Berrada, M. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018, 6, 52138–52160. [Google Scholar] [CrossRef]
- Das, A.; Rad, P. Opportunities and challenges in explainable artificial intelligence (xai): A survey. arXiv 2020, arXiv:2006.11371. [Google Scholar]
Study | Approach | Dataset | Performance | Source Code | Year |
---|---|---|---|---|---|
Deepfake Generation | |||||
Wang et al. [55] | Real-time face swapping using CANDIDE-3 | COFW [132], 300W [133], LFW [134] | SWR = 87.9% | × | 2018 |
Natsume et al. [56] | Face swapping and editing using RSGAN | CelebA [135] | MS-SSIM = 0.087 | × | 2018 |
Chen et al. [61] | High fidelity encoder-decoder | VGGFace2 [136] | Qualitative Analysis | https://github.com/neuralchen/SimSwap (accessed on 4 January 2023) | 2021 |
Xu et al. [137] | Lightweight Identity-aware Dynamic Network | VGGFace2 [136] FaceForensics++ [90] | FID = 6.79 | https://github.com/Seanseattle/MobileFaceSwap (accessed on 4 January 2023) | 2022 |
Shu et al. [138] | Portrait, identity, and pose encoders with generator and feature pyramid network | VoxCeleb2 [139] | PSNR = 33.26 | https://github.com/jmliu88/heser (accessed on 4 January 2023) | 2022 |
Deepfake Detection | |||||
Afchar et al. [140] | CNNs | FaceForensics++ [90] | Acc = 98.40% | https://github.com/DariusAf/MesoNet (accessed on 4 January 2023) | 2018 |
Zhao et al. [77] | Multi-attentional | FaceForensics++ [90] DFDC [3] | Acc = 97.60% LL = 0.1679 | https://github.com/yoctta/multiple-attention (accessed on 4 January 2023) | 2021 |
Miao et al. [141] | Transformers via bag-of-feature for generalization | FaceForensics++ [90], Celeb-DF [142], DeeperForensics-1.0 [143] | Acc = 87.86% AUC = 82.52% Acc = 97.01% | × | 2021 |
Prajapati et al. [144] | Perceptual Image Assessment + GANs | DFDC [3] | AUC = 95% Acc = 91% | https://github.com/pratikpv/mri_gan_deepfake (accessed on 4 January 2023) | 2022 |
Wang et al. [75] | Multi-modal Multi-scale Transformer (M2TR) | FaceForensics++ [90] | Acc = 97.93% | https://github.com/wangjk666/M2TR-Multi-modal-Multi-scale-Transformers-for-Deepfake-Detection (accessed on 4 January 2023) | 2022 |
Reenactment Generation | |||||
Zhang et al. [145] | Decoder + warping | CelebA-HQ [146] FFHQ [147] RAF-DB [148] | AU = 75.1% AU = 70.9% AU = 71.1% | https://github.com/bj80heyue/One_Shot_Face_Reenactment (accessed on 4 January 2023) | 2019 |
Ngo et al. [149] | Encoder-decoder | 300VW [150] | CL = 1.46 | × | 2020 |
Tripathy et al. [151] | Facial attribute controllable GANs | FaceForensics++ [90] | CSIM = 0.747 | × | 2021 |
Bounareli et al. [152] | 3D shape model | VoxCeleb [153] | FID = 0.66 | × | 2022 |
Agarwal et al. [154] | Audio-Visual Face Reenactment GAN | VoxCeleb [153] | FID = 9.05 | https://github.com/mdv3101/AVFR-Gan/ (accessed on 4 January 2023) | 2023 |
Reenactment Detection | |||||
Nguyen et al. [155] | Autoencoder | FaceForensics++ [90] | EER = 7.07% | https://github.com/nii-yamagishilab/ClassNSeg (accessed on 4 January 2023) | 2019 |
Dang et al. [156] | CNNs + Attention mechanism | FaceForensics++ [90] | AUC = 99.4% EER = 3.4% | https://github.com/Jstehouwer/FFD_CVPR2020 (accessed on 4 January 2023) | 2020 |
Kim et al. [157] | Knowledge Distillation | FaceForensics++ [90] | Acc = 86.97% | × | 2021 |
Yu et al. [158] | U-Net Structure | FaceForensics++ [90] | Acc = 97.26% | × | 2022 |
Wu et al. [159] | Multistream Vision Transformer Network | FaceForensics++ [90] | Acc = 94.46% | × | 2022 |
Attribute Manipulation Generation | |||||
Lample et al. [160] | Encoder-decoder | CelebA [135] | RMSE = 0.0009 | https://github.com/facebookresearch/FaderNetworks (accessed on 4 January 2023) | 2018 |
Liu et al. [161] | Selective transfer GANs | CelebA [135] | Acc = 70.80% | https://github.com/csmliu/STGAN (accessed on 4 January 2023) | 2019 |
Kim et al. [162] | Real-time style map GANs | CelebA-HQ [146] AFHQ [163] | FID = 4.03 FID = 6.71 | https://github.com/naver-ai/StyleMapGAN (accessed on 4 January 2023) | 2021 |
Huang et al. [164] | Multi-head encoder and decoder | CelebA-HQ [146] StyleMapGAN [162] | MSE = 0.023 FID = 7.550 | × | 2022 |
Sun et al. [165] | 3D-aware generator with two decoupled latent codes | FFHQ [147] | FID = 28.2 | https://github.com/MrTornado24/FENeRF (accessed on 4 January 2023) | 2022 |
Attribute Manipulation Detection | |||||
Wang et al. [166] | CNNs | Own dataset | Acc = 90.0% | https://github.com/peterwang512/FALdetector (accessed on 4 January 2023) | 2019 |
Du et al. [167] | DFT + CNNs | Deepfake-in-the-wild [168] Celeb-DF [142] DFDC [3] | Acc = 78.00% Acc = 96.00% Acc = 81.00% | × | 2020 |
Akhtar et al. [36] | DNNs | Own dataset | Acc = 99.31% | × | 2021 |
Rathgeb et al. [169] | Human majority voting | FERET [170] | CCR = 62.8% | × | 2022 |
Guo et al. [171] | Gradient operator convolutional network with tensor pre-processing and manipulation trace attention module | FaceForensics++ [90] | Acc = 94.86% | https://github.com/EricGzq/GocNet-pytorch (accessed on 4 January 2023) | 2023 |
Entire Face Synthesis Generation | |||||
Li et al. [172] | Conditional self-attention GANs | CelebA-HQ [146] | KID = 0.62 | https://github.com/LiYuhangUSTC/Lines2Face (accessed on 4 January 2023) | 2019 |
Karras et al. [81] | StyleGAN | FFHQ [147] | FID = 3.31 | https://github.com/NVlabs/stylegan2 (accessed on 4 January 2023) | 2020 |
Xia et al. [173] | Textual descriptions GANs | CelebA-HQ [146] | FID = 106.37 | https://github.com/IIGROUP/TediGAN (accessed on 4 January 2023) | 2021 |
Song et al. [174] | Text-to-speech system | LibriTTS dataset [175] AISHELL-3 [176] | FPS = 30.3 | × | 2022 |
Li et al. [177] | StyleT2I: High-Fidelity Text-to-Image Synthesis | CelebA-HQ [146] | FID = 18.02 | https://github.com/zhihengli-UR/StyleT2I (accessed on 4 January 2023) | 2022 |
Entire Face Synthesis Detection | |||||
Wang et al. [178] | CNNs | StyleGAN2 [81] ProGAN [146] | AP = 99.10% AP = 100% | https://github.com/peterwang512/CNNDetection (accessed on 4 January 2023) | 2020 |
Pu et al. [179] | Incremental clustering | ProGAN [146] | F1 Score = 99.09% | https://github.com/jmpu/NoiseScope (accessed on 4 January 2023) | 2020 |
Yousaf et al. [180] | Two-Stream CNNs | StarGAN [101] | Acc = 96.32% | × | 2021 |
Nowroozi et al. [181] | Cross-band and spatial co-occurrence matrix + CNNs | StyleGAN2 [81] VIPPrint [182] | Acc = 93.80% Acc = 92.56% | × | 2022 |
Boyd et al. [183] | Human-annotated saliency maps into a deep learning loss function | StyleGAN2 [81], ProGAN [146], StyleGAN [147], StyleGAN2-ADA [184], StyleGAN3 [185], StarGANv2 [163], SREFI [186] | AUC = 0.633 | https://github.com/BoydAidan/CYBORG-Loss (accessed on 4 January 2023) | 2023 |
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Akhtar, Z. Deepfakes Generation and Detection: A Short Survey. J. Imaging 2023, 9, 18. https://doi.org/10.3390/jimaging9010018