Article

Novel Deepfake Image Detection with PV-ISM: Patch-Based Vision Transformer for Identifying Synthetic Media

Department of Computer Engineering, Dokuz Eylul University, Izmir 35390, Turkey
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(12), 6429; https://doi.org/10.3390/app15126429
Submission received: 24 April 2025 / Revised: 1 June 2025 / Accepted: 5 June 2025 / Published: 7 June 2025
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)

Abstract

This study presents a novel approach to the increasingly important task of distinguishing AI-generated images from authentic photographs. Detecting such synthetic content is critical for combating deepfake misinformation and ensuring the authenticity of digital media in journalism, forensics, and online platforms. A custom-designed Vision Transformer (ViT) model, termed Patch-Based Vision Transformer for Identifying Synthetic Media (PV-ISM), is introduced. Its performance is benchmarked against transfer learning baselines using 60,000 authentic images from the CIFAKE dataset, which is derived from CIFAR-10, along with a corresponding collection of images generated with Stable Diffusion 1.4. PV-ISM combines patch extraction, positional encoding, and multiple transformer blocks with attention mechanisms to identify subtle artifacts in synthetic images. Following extensive hyperparameter tuning, an accuracy of 96.60% was achieved, surpassing ResNet50 transfer learning approaches (93.32%) and other comparable methods reported in the literature. The experimental results demonstrate the model's balanced classification capabilities, with strong recall and precision across both image categories. The patch-based architecture of Vision Transformers, combined with appropriate data augmentation, proves particularly effective for synthetic image detection while requiring less training time than traditional transfer learning approaches.
Keywords: deep learning; Vision Transformers; image classification; AI-generated images; attention mechanism; transfer learning
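
To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of a patch-based Vision Transformer for real-vs-fake classification on 32×32 CIFAKE-style inputs. This is an illustrative reconstruction, not the authors' published implementation: the patch size, embedding dimension, depth, head count, and MLP width shown here are assumptions, since the paper's tuned hyperparameters are not given in the abstract.

```python
# Minimal sketch of a patch-based ViT in the spirit of PV-ISM.
# All hyperparameters below are illustrative assumptions, not the
# authors' published configuration.
import torch
import torch.nn as nn

class PatchViT(nn.Module):
    def __init__(self, image_size=32, patch_size=4, dim=128,
                 depth=6, heads=4, mlp_dim=256, num_classes=2):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        # Patch extraction + linear embedding via a strided convolution
        self.to_patches = nn.Conv2d(3, dim, kernel_size=patch_size,
                                    stride=patch_size)
        # Learnable [CLS] token and positional encoding
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        # Stack of transformer encoder blocks with multi-head self-attention
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=mlp_dim,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)   # real vs. AI-generated

    def forward(self, x):                    # x: (B, 3, 32, 32)
        p = self.to_patches(x)               # (B, dim, 8, 8)
        p = p.flatten(2).transpose(1, 2)     # (B, 64, dim) patch sequence
        cls = self.cls_token.expand(x.size(0), -1, -1)
        z = torch.cat([cls, p], dim=1) + self.pos_embed
        z = self.encoder(z)
        return self.head(z[:, 0])            # classify from the [CLS] token

model = PatchViT()
logits = model(torch.randn(8, 3, 32, 32))    # dummy CIFAKE-sized batch
print(logits.shape)                          # torch.Size([8, 2])
```

The strided convolution is a standard idiom for patch extraction plus linear projection in one step; how PV-ISM implements this stage is not specified in the abstract, so this choice is also an assumption.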

Share and Cite

MDPI and ACS Style

Çınar, O.; Doğan, Y. Novel Deepfake Image Detection with PV-ISM: Patch-Based Vision Transformer for Identifying Synthetic Media. Appl. Sci. 2025, 15, 6429. https://doi.org/10.3390/app15126429

AMA Style

Çınar O, Doğan Y. Novel Deepfake Image Detection with PV-ISM: Patch-Based Vision Transformer for Identifying Synthetic Media. Applied Sciences. 2025; 15(12):6429. https://doi.org/10.3390/app15126429

Chicago/Turabian Style

Çınar, Orkun, and Yunus Doğan. 2025. "Novel Deepfake Image Detection with PV-ISM: Patch-Based Vision Transformer for Identifying Synthetic Media." Applied Sciences 15, no. 12: 6429. https://doi.org/10.3390/app15126429

APA Style

Çınar, O., & Doğan, Y. (2025). Novel Deepfake Image Detection with PV-ISM: Patch-Based Vision Transformer for Identifying Synthetic Media. Applied Sciences, 15(12), 6429. https://doi.org/10.3390/app15126429

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
