Open Access Article
Novel Deepfake Image Detection with PV-ISM: Patch-Based Vision Transformer for Identifying Synthetic Media
by Orkun Çınar * and Yunus Doğan
Department of Computer Engineering, Dokuz Eylul University, Izmir 35390, Turkey
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(12), 6429; https://doi.org/10.3390/app15126429
Submission received: 24 April 2025 / Revised: 1 June 2025 / Accepted: 5 June 2025 / Published: 7 June 2025
Abstract
This study presents a novel approach to the increasingly important task of distinguishing AI-generated images from authentic photographs. Detecting such synthetic content is critical for combating deepfake misinformation and for ensuring the authenticity of digital media in journalism, forensics, and online platforms. A custom-designed Vision Transformer (ViT) model, termed Patch-Based Vision Transformer for Identifying Synthetic Media (PV-ISM), is introduced. Its performance is benchmarked against transfer learning methods using 60,000 authentic images from the CIFAKE dataset, which is derived from CIFAR-10, along with a corresponding collection of images generated using Stable Diffusion 1.4. PV-ISM incorporates patch extraction, positional encoding, and multiple transformer blocks with attention mechanisms to identify subtle artifacts in synthetic images. Following extensive hyperparameter tuning, an accuracy of 96.60% was achieved, surpassing a ResNet50 transfer learning approach (93.32%) and other comparable methods reported in the literature. The experimental results demonstrate the model's balanced classification capabilities, with strong recall and precision across both image categories. The patch-based architecture of Vision Transformers, combined with appropriate data augmentation techniques, proves particularly effective for synthetic image detection while requiring less training time than traditional transfer learning approaches.
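The abstract describes the front end of the PV-ISM pipeline: the input image is split into patches, which are flattened into tokens and given positional encodings before entering the transformer blocks. The following is a minimal NumPy sketch of those two steps only, not the authors' implementation; the 8x8 patch size and the sinusoidal encoding are illustrative assumptions (the paper's actual patch size and encoding scheme are not stated in this abstract), and the 32x32 input matches the CIFAR-10-derived CIFAKE resolution.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Split an HxWxC image into non-overlapping, flattened patches (tokens)."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    patches = []
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patches.append(image[i:i + patch_size, j:j + patch_size, :].reshape(-1))
    return np.stack(patches)  # shape: (num_patches, patch_size*patch_size*c)

def add_positional_encoding(tokens):
    """Add sinusoidal positional encodings so patch order is visible to attention."""
    n, d = tokens.shape
    pos = np.arange(n)[:, None]          # patch index
    idx = np.arange(d)[None, :]          # embedding dimension index
    angles = pos / np.power(10000.0, (2 * (idx // 2)) / d)
    enc = np.where(idx % 2 == 0, np.sin(angles), np.cos(angles))
    return tokens + enc

# Example: a 32x32 RGB image (CIFAKE resolution) tiled into 8x8 patches
image = np.random.rand(32, 32, 3)
patches = extract_patches(image, 8)      # (16, 192): 16 patches of 8*8*3 values
tokens = add_positional_encoding(patches)
print(patches.shape, tokens.shape)
```

In a full ViT, each token would additionally pass through a learned linear projection before the transformer blocks; this sketch stops at the tokenization stage the abstract names.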