Article

Forensic Analysis of Manipulated Images and Videos

by Sergio A. Falcón-López 1,*, Llanos Tobarra 2, Antonio Robles-Gómez 2 and Rafael Pastor-Vargas 2

1 Programa de Doctorado en Tecnologías Industriales, Escuela Internacional de Doctorado UNED (EIDUNED), Universidad Nacional de Educación a Distancia (UNED), 28040 Madrid, Spain
2 Control and Communication System Department, Computer Science Engineering Faculty, Universidad Nacional de Educación a Distancia (UNED), 28040 Madrid, Spain
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(23), 12664; https://doi.org/10.3390/app152312664
Submission received: 5 November 2025 / Revised: 25 November 2025 / Accepted: 27 November 2025 / Published: 29 November 2025
(This article belongs to the Special Issue AI from Industry 4.0 to Industry 5.0: Engineering for Social Change)

Abstract

The transition from Industry 4.0 to Industry 5.0 emphasizes the need for ethical, transparent, and human-centric artificial intelligence systems. In this context, ensuring the authenticity of digital information has become crucial for maintaining societal trust. This study addresses the challenge of detecting manipulated multimedia content, including synthetic images, videos, and audio generated by artificial intelligence, commonly known as Deepfakes. We analyze and compare general-purpose and Deepfake-specific detection methods to assess their effectiveness in real-world scenarios. This work introduces a refined reference model that integrates both application-oriented and methodological criteria, grouping tools into Blind Forensic, Handcrafted Machine Learning, Deep Learning-based methods, and Toolkits. This structured taxonomy provides a clearer comparative framework than existing works, which typically classify detectors using only one of these dimensions. To ensure reproducible evaluation, all experiments were performed using the SAFL dataset, which consolidates real and synthetic multimedia content generated with publicly available tools under a unified protocol. Among the tested tools, Forensically achieved the highest accuracy in image forgery detection (86.9%), while Autopsy reached 69.5% among Deepfake-specific image detectors. In video analysis, Forensically obtained 98.6% accuracy, whereas Deepware Scanner achieved 91.2% as the most effective Deepfake-focused tool. These results highlight that general-purpose methods remain robust for images, while specialized detectors perform competitively in videos. Overall, the proposed model and dataset establish a consistent foundation for advancing hybrid detection strategies aligned with the ethical and transparent AI principles envisioned in Industry 5.0.

1. Introduction

In recent years, the rapid digital transformation associated with Industry 4.0 has led to a hyper-connected world, where automation and data exchange have redefined how information is produced and consumed. Smartphones and smart devices equipped with high-resolution cameras and large storage capacities have become ubiquitous, enabling users to effortlessly capture and share high-quality multimedia content at low cost.
As we advance toward Industry 5.0, the focus extends beyond technological efficiency toward trust, ethics, and human-centric artificial intelligence. In this context, any piece of information can spread globally within seconds. False information, in particular, can manipulate public opinion with potentially catastrophic consequences [1,2]. Therefore, detecting fake content has become crucial to safeguard individual rights and public trust.
Multimedia forgery detection is a growing field that focuses on identifying visual or auditory content that has been manipulated or synthetically generated. Traditional digital forensic tools detect inconsistencies in compression, lighting, or metadata, whereas more recent Deepfake-specific methods leverage deep learning to capture complex semantic anomalies [3]. This evolution aligns with broader advances in detection technology across other computer vision domains, where deep neural networks and improved feature extraction techniques have demonstrated strong performance in tasks such as foreign object detection [4].
Despite rapid advances in detection techniques, new generation methods have drastically improved the realism of Deepfakes. Initially, these manipulations were easy to spot due to visual imperfections. Today, however, their sophistication has made manual detection nearly impossible, requiring advanced technological tools [5]. Moreover, existing methods often struggle to generalize across manipulation types and real-world conditions.
Most studies focus either on general-purpose detection methods or on Deepfake-specific models, but rarely compare both approaches under unified conditions. A few exceptions, such as [6], attempt to analyze both, but there is still a lack of practical, end-to-end frameworks that allow for objective evaluation on common datasets. Moreover, most existing evaluations are conducted on idealized datasets, which fail to reflect the complexity of real-world manipulations.
This study addresses the following question: How do general-purpose forensic tools and Deepfake-specific detection methods perform when applied to realistic manipulated content generated with publicly available software? We hypothesize that, despite not being specifically designed for synthetic content, some general-purpose tools may still offer competitive performance under certain conditions.
The objectives of this work are to:
  • Categorize detection methods into general-purpose and Deepfake-specific groups.
  • Create a dataset using publicly available generation tools.
  • Evaluate the effectiveness and robustness of each method across manipulation types.
  • Bridge the gap between theoretical comparisons and real-world applicability.
To this end, we generated a dataset of images and videos using Deepfake generation tools such as FaceApp [7] and DeepFaceLab [8]. We then applied both general-purpose tools and Deepfake-specific detectors, analyzing their performance using metrics such as accuracy, precision, recall, F1-score, and confusion matrices.
The specific contributions of our research are as follows:
  • Design and proposal of a reference detection model, organized into four categories: Blind Forensic methods, Handcrafted Feature-based Machine Learning (ML) methods, Deep Learning-based methods, and Toolkits. This model provides a unified framework to categorize and evaluate existing detection approaches.
  • Creation of a realistic Deepfake dataset, composed of manipulated images and videos generated using diverse and publicly available generation tools.
  • Empirical evaluation of Deepfake detection tools, assessing their performance and robustness through standard metrics such as the confusion matrix, precision, recall, F1-score, and accuracy.
  • Comparative analysis of general-purpose vs. Deepfake-specific methods, identifying their respective strengths and limitations when applied to different media types and manipulation techniques.
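The evaluation metrics listed above all derive from the confusion matrix. The following minimal, dependency-free sketch shows how they are computed, with label 1 denoting the "manipulated" class; the label lists are hypothetical:

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN), treating label 1 as 'manipulated'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(y_true),
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 1 = manipulated, 0 = authentic.
m = metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

In the example, 4 of 6 predictions are correct, giving an accuracy of about 0.67; precision and recall additionally disentangle false alarms from missed forgeries.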
Ultimately, this research contributes to the development of ethical and transparent artificial intelligence ecosystems by strengthening the reliability of multimedia content in an era where trust and authenticity are central to the transition from Industry 4.0 to Industry 5.0.
This work is organized as follows. Section 2 presents the research objectives and motivation behind our contribution. Section 3 provides an overview of the current state of research into Deepfake detection techniques, covering both general manipulation methods and techniques specifically designed for Deepfakes. Section 4 describes the research methodology used in our forensic analysis of manipulated multimedia resources, along with a detailed description of the dataset generated using various Deepfake creation techniques. The proposed reference model is detailed in Section 5. The results obtained, evaluated using a set of performance metrics, are presented in Section 6. Finally, Section 7 summarizes the conclusions and outlines potential directions for future research.

2. Objectives

The main objective of this research is to determine whether the image and video manipulation detection tools currently available to users can detect Deepfake multimedia content. Secondly, the study examines whether tools based on general-purpose image and video detection techniques can detect Deepfake content or whether such tools have become obsolete. Finally, the effectiveness of tools using Deepfake-specific detection techniques is evaluated to assess their accuracy and determine whether they outperform tools based on general-purpose techniques.
Given that Deepfakes are becoming an increasingly significant threat to society, this research aims to mitigate the challenges posed by information manipulation through Deepfake multimedia content. The first step toward this objective is to review existing detection techniques for image and video manipulation, including both general-purpose and Deepfake-specific methods. To compare the performance of these techniques, a dataset of multimedia files generated with Deepfake creation methods was first constructed. A case study was then conducted to analyze multimedia forgeries, including a comparative performance evaluation of the two types of detection techniques studied.

3. State of the Art (SOTA)

This section presents a comprehensive review of the current state of multimedia forgery detection, structured along two complementary axes: detection methods and detection tools. In Section 3.1, we first classify and analyze the main methodological approaches used to identify manipulated images and videos. These include Blind forensic methods, Handcrafted feature-based machine learning methods, and Deep learning-based methods, each characterized by different assumptions, capabilities, and requirements. A summary of this taxonomy is provided in Table 1.
In addition to this, we review publicly available forensic tools that implement one or more of these detection strategies in Section 3.2. These tools are evaluated not as methodological contributions, but as practical solutions that reflect how detection techniques are deployed in real-world scenarios.
This two-part structure provides both a theoretical and practical foundation for our experimental comparison, which aims to assess the real-world effectiveness of representative methods and tools under a unified benchmark.

3.1. Detection Methods

Detection methods for multimedia forensics can be grouped into three main methodological categories, as detailed in Section 3.1.1, Section 3.1.2 and Section 3.1.3. This taxonomy reflects varying levels of supervision, automation, and dependence on prior knowledge. Each category is based on distinct detection paradigms and operates under different assumptions and constraints.
In the following subsections, we provide a detailed overview of each method type, highlighting their underlying principles, strengths, and limitations. This analysis forms the basis for identifying current challenges in multimedia forgery detection, which we summarize at the end in Section 3.1.4 to motivate the objectives of our empirical evaluation.

3.1.1. Blind Forensic Methods

Blind forensic methods aim to detect traces of manipulation without requiring prior knowledge of the specific tampering process or training data. Instead, they rely on statistical inconsistencies, signal processing artifacts, or physical modeling of acquisition pipelines to uncover subtle anomalies that are often imperceptible to the human eye.
A common strategy is to analyze imperfections introduced by the camera sensor or compression process. For example, Color Filter Array (CFA) patterns and noise inconsistencies can be disrupted by digital tampering. Several techniques examine local anomalies in color interpolation or statistical noise distributions to reveal manipulated regions, especially in JPEG-compressed images, which tend to accumulate characteristic artifacts through double compression or block misalignments.
One of the most representative blind methods is Photo Response Non-Uniformity (PRNU) analysis. Camera sensors exhibit subtle manufacturing defects that introduce a unique noise pattern, called PRNU, into all images captured by the same device. This pattern acts as a fingerprint that can be estimated from a set of images and then used to authenticate a suspect image. Deviations from the expected PRNU pattern may indicate forgery. However, this approach requires access to a sufficient number of original images from the same device and is often infeasible in real-world internet scenarios [3].
Model-based variants extend this concept by constructing a statistical model of noise residuals extracted from reference images, typically using a sliding window approach. Any significant deviation from the reference distribution in a test image is interpreted as evidence of manipulation [3]. Although these techniques are effective in constrained conditions, they rely heavily on prior access to specific hardware or controlled datasets, limiting their practical application in open environments.
In summary, blind forensic methods are lightweight and interpretable, often requiring no training phase. However, they suffer from limited generalization and are highly sensitive to post-processing operations, especially compression and filtering. Despite their limitations, these methods remain relevant for detecting non-synthetic forgeries and serve as a baseline for comparison with learning-based approaches.

3.1.2. Handcrafted Feature-Based Machine Learning Methods

These methods rely on manually engineered features that capture statistical or structural irregularities in multimedia content. The key idea is to define a set of discriminative features capable of distinguishing between authentic and manipulated data, and then use them to train a supervised classifier.
Feature design typically leverages domain knowledge to identify visual cues that are likely to be affected by manipulation. Common examples include color inconsistencies, edge statistics, frequency-domain coefficients, or noise residuals. Once extracted, these features are fed into standard classifiers such as Support Vector Machines (SVM), Random Forest, or Gradient Boosting models, which learn to separate real and tampered instances.
For instance, in [12], handcrafted features are derived from spatial and frequency domains to capture artifacts introduced during image or video manipulation. The authors demonstrate how these features, though simple and interpretable, can be effective when combined with carefully tuned classifiers. Similarly, ref. [6] proposes the use of residual noise obtained via high-pass filtering to highlight manipulation traces, particularly in low-texture regions where tampering is more difficult to detect.
One of the strengths of this family of methods lies in their transparency and explainability: the features used can often be traced back to interpretable visual patterns. Moreover, they are computationally efficient and require relatively small training datasets.
However, their performance is tightly coupled to the quality and relevance of the chosen features. Designing features that generalize across multiple types of manipulation remains a major challenge. Handcrafted approaches may also be vulnerable to post-processing operations such as filtering, resizing, or compression, which can suppress or distort the signal patterns they rely on.
Despite these limitations, handcrafted feature-based methods offer a viable middle ground between traditional blind forensics and data-intensive deep learning models, and are particularly useful when training data is scarce or explainability is a priority.

3.1.3. Deep Learning-Based Methods

Deep learning techniques have emerged as a dominant approach for detecting manipulated multimedia content due to their ability to automatically learn complex representations directly from data. These methods typically rely on large datasets to train models that can generalize across diverse manipulation types and content domains.
We group deep learning approaches into three subcategories: task-specific supervised models, generic supervised models, and one-class learning techniques.
Task-specific Supervised Models. These methods are designed to detect particular manipulation artifacts by tailoring the network architecture and training data to specific forgery types. For example, some approaches focus on artifacts caused by double JPEG compression or video encoding traces. In [18], the authors propose a Convolutional Neural Network (CNN) model optimized for identifying double compression artifacts in still images. Similarly, ref. [19] presents a method for detecting manipulations in H.264-compressed videos by analyzing inter-frame inconsistencies.
Other task-specific models target concrete scenarios such as copy-move forgeries, where duplicated regions within an image are used to conceal or replicate elements. Techniques like [20] use convolutional layers to learn patch-level correlations and identify duplicated content with high precision.
While these methods achieve excellent performance within their scope, their specialization makes them vulnerable to manipulations outside their target domain, limiting their general applicability.
Generic Supervised Models. Generic approaches aim to detect a wide variety of manipulations by training deep neural networks on heterogeneous datasets containing diverse forgery types. These methods are typically based on standard CNN architectures such as Xception, EfficientNet, or ResNet variants, adapted for classification tasks.
In [21], the authors train a CNN to detect generic image manipulations using a large dataset with various tampering techniques. Likewise, ref. [6] proposes a method based on residual noise maps extracted via high-pass filtering, which are then input into a convolutional architecture to detect subtle anomalies. This approach is particularly effective for capturing low-level inconsistencies across multiple manipulation categories.
The main advantage of generic models is their flexibility and broader coverage. However, their performance depends heavily on the diversity and quality of the training dataset. Overfitting and lack of generalization across unseen manipulations or datasets remain open challenges.
One-Class Learning Techniques. One-class deep learning methods are designed for scenarios where only authentic data is available during training. These models learn a representation of normal content and detect deviations as potential anomalies. Autoencoders and one-class classification frameworks are commonly used for this purpose.
In [12], a one-class autoencoder is trained using frames from original videos, under the assumption that they are unaltered. During inference, the reconstruction error of each frame is used as an anomaly score. This strategy is applicable to both images and video content, and offers the advantage of not requiring labeled manipulated data.
While one-class methods are suitable for limited supervision scenarios, they often struggle with subtle or localized manipulations and may generate false positives in highly variable content.
In conclusion, deep learning-based approaches offer high detection accuracy and flexibility but rely on large, representative datasets and are often difficult to interpret. Their performance may degrade in cross-dataset evaluations or under adversarial conditions. As such, they benefit from complementary analysis alongside blind or feature-based methods in practical forensic workflows.

3.1.4. Summary and Identified Gaps

In this section, we have reviewed the main families of multimedia manipulation detection methods. Blind forensic techniques offer interpretable, lightweight solutions, but are limited in scope and require knowledge of acquisition devices. Handcrafted feature-based approaches provide a compromise between interpretability and detection capability, though their effectiveness relies on well-chosen features and curated datasets. Deep learning-based methods, while powerful and flexible, demand large-scale annotated data and remain vulnerable to domain shifts and adversarial conditions.
Despite the diversity of existing approaches, we identify a critical gap in the current literature: most studies evaluate detection methods in isolation or under different experimental conditions, often focusing on either general-purpose or Deepfake-specific methods, but rarely both. Moreover, few works offer practical, reproducible frameworks that allow a fair and comparative assessment across method categories.
Our work addresses this gap by proposing a reference taxonomy for detection methods, building a unified dataset with controlled manipulations, and conducting an empirical evaluation of representative tools from each category under the same conditions. This comprehensive analysis provides a practical benchmark to guide future research and deployment in multimedia forensics. This is part of one of the research activities of our international chair “Smart Rural IoT and Secured Environments”, supported by INCIBE [22] (National Cybersecurity Institute of Spain).
After reviewing the main methodological families, we now turn to practical implementations that bring these techniques to end users.

3.2. Detection Tools

The use of different methodologies is as important as the existence of tools that implement these methodologies and make them available to analysts.
As a basic tool, we would like to highlight Forensically [10]. This general-purpose tool allows us to apply different techniques based on blind methods. We use Forensically to collect ELA (Error Level Analysis) images and to perform image-noise extraction. The generated results must be manually validated by a forensic analyst; because it requires this human intervention, Forensically serves as our baseline analysis tool for method comparison.
Another general-purpose tool studied is the MantraNet [14] project, which uses Deep Learning techniques. It relies on pre-trained models, so it detects manipulations only on suspicious images, and its performance depends on the training dataset supplied. It has been developed in Python 3.7 using Keras and TensorFlow. As shown in this work, MantraNet is capable of detecting splicing, copy-move, deletion, and forgeries of unknown types, and it can work with images of different dimensions.
Image Forgery Detection with CNNs [15] uses Deep Learning techniques to detect manipulations such as splicing and copy-move, and it also provides a set of pre-trained models. This approach relies on CNNs that are specifically trained to identify local inconsistencies in image content that could indicate manipulation. The pre-trained models are optimized for common forgery scenarios, enabling detection without extensive retraining and making them easy to apply directly to new image datasets.
Several general-purpose forensic tools were also evaluated; although they were not specifically designed for Deepfake detection, they are commonly used in digital image analysis.
JPEGsnoop 1.8.0 [23] is a JPEG decoder that analyzes compression artefacts and quantization tables to infer an image’s editing history or origin. In this study, this tool failed to detect any manipulations in the Deepfake dataset. This result is in line with expectations, as synthetically generated images often lack traditional compression signatures or are post-processed by platforms that normalize compression artefacts. Therefore, JPEGsnoop remains useful for detecting general-purpose manipulations, but not for identifying synthetic content.
Ghiro 0.2.1 [11] is an open-source platform for automated image forensics. It extracts metadata, generates histograms, detects embedded thumbnails, and applies techniques such as ELA. However, Ghiro did not produce significant indicators of manipulation when applied to Deepfake images. This limitation is due to the nature of AI-generated content, which generally lacks general-purpose editing traces or embedded metadata. Its inclusion in the evaluation highlights the reduced effectiveness of traditional forensic frameworks in the context of modern synthetic media.
ExifTool 13.42 [24], a widely used tool for extracting metadata from image and video files, was also used in the analysis. Although the images in the dataset are not sourced from social media, neither the synthetic nor the real content has embedded metadata. This scenario was designed to simulate the conditions often encountered on social platforms, where metadata is frequently removed upon upload. Consequently, this tool did not provide useful information with which to discriminate between authentic and manipulated content. However, its inclusion remains relevant to illustrate the limitations of metadata-based approaches and to reinforce the need for detection strategies focused on content analysis.
As mentioned before, general-purpose tools look for any kind of manipulation in images or videos. Conversely, Deepfake-specific tools focus their attention on detecting face substitution in images or videos.
The well-known forensic analysis tool Autopsy 4.22.1 [25] has a Deepfake detection plugin [13]. This detection system uses the Discrete Fourier Transform for feature extraction and a Support Vector Machine for the classification of fake images and videos. It can be applied to both images and videos. Moreover, it reports the results of the analysis with the corresponding confusion matrix, precision, and recall, as well as a prediction of the probability that an image has been manipulated. Apart from faces, this plugin can also detect object insertion in images or videos.
MesoNet [16] is a method for the automatic detection of facial manipulation in videos. It follows a deep learning-based approach in which two networks are built, both with a low number of layers, to focus on mesoscopic-level properties of the image. The two pre-trained models, Meso-4 and MesoInception4, can be found in the repository. In our case, we adapted these models for image analysis by modifying the code to process images and customizing the displayed result.
The Deepfake-Detection project [26] is based on [27] and has been developed in PyTorch 1.3.1. This work focuses on Deepfake detection using CNN-based models. The provided scripts are capable of detecting Deepfakes in both images and videos. Furthermore, the repository includes three pre-trained models: FF++_c23, FF++_c40, and Deepfake_c0_xception. These scripts were adapted to run the experiments.
Finally, Deepware Scanner [17] is an online tool that is based on several Deepfake video detection projects built on top of PyTorch. These include the FaceNet library which, in conjunction with the Multitask Cascaded Convolutional Network (MTCNN) for in-frame face detection, enables Deepware detection and face recognition in videos. The Deepware Scanner is an easy-to-use tool: users simply upload a video to the Deepware Scanner section, and a report is generated showing whether it is a Deepfake, along with the probability of detection from the different models used.
After studying the existing detection techniques and tools in this section, the next objective is to leverage the tools accessible to the average user to assess their ability to detect manipulation in image and video files. Since AI-based tools have been trained on data from specific repositories and have shown highly effective performance in detecting manipulations of those datasets, a new repository of Deepfake image and video files has been generated for this work. This will allow us to evaluate whether these tools can effectively detect manipulations in newly created content, simulating a real-world scenario where a file originates from social networks or other media sources. The following section outlines the research methodology used to achieve this objective and provides a detailed description of the dataset generated.

4. Materials and Methods

The methodology employed in this study to conduct the forensic analysis of multimedia forgery is outlined first. Then, the repository built to assess the various content manipulation detection methods is described, together with the Deepfake generation methods used to create it.

4.1. Research Methodology

In order to achieve the main objectives of this work, two steps must be taken: (1) reviewing the existing techniques for detecting Deepfakes and classifying the general-purpose and Deepfake-specific tools analyzed in this context, a classification that will serve as a reference model to be expanded with additional tools; and (2) conducting a proof of concept in which the studied tools are applied to multimedia forgery analysis. The obtained results are presented and discussed to serve as a starting point for further studies on this topic. Figure 1 shows the workflow of our current proposal.
The specific steps of the proposed workflow for this work are detailed below:
  • Survey of detection methods. Existing forensic literature on detection techniques capable of spotting visual anomalies in images and videos was reviewed, including both general-purpose and Deepfake-specific approaches.
  • Identification of generation tools. The landscape of Deepfake generation tools was mapped, encompassing face-swap software (DeepFaceLab 11.20.2021 [8]), mobile applications (FaceApp [7]), text-to-image and synthetic face generators (DALL-E [28], ThisPersonDoesNotExist [29,30]), and video reenactment tools (Avatarify Desktop [31]).
  • Dataset generation. A dataset was constructed using a representative tool from each generation category to create mock samples, which were then combined with real examples to form a balanced evaluation set.
  • Selection of detection software. An exhaustive search was conducted to identify tools capable of detecting manipulations in images and videos. General-purpose detection methods were applied to individual multimedia samples without the use of reference models. Additionally, specific Deepfake detection techniques were considered, as introduced in the above section.
  • Experimental execution. All the selected detection tools were run on the generated dataset. Appropriate modifications were made between tools to ensure consistency and allow for meaningful comparisons.
  • Evaluation. The output formats were harmonized, the standard evaluation metrics were computed, and the results were compared to identify the most effective detection techniques.

4.2. Dataset Justification

Although several datasets exist for Deepfake detection research, most are limited either in the types of manipulations they include or in their applicability to both image and video modalities under consistent evaluation conditions. In addition, many are focused exclusively on either images or videos. For our experimental benchmark, we required a dataset that included both image and video modalities, contained manipulations generated with recent and diverse Deepfake techniques, and could be adapted to test both general-purpose and Deepfake-specific detection tools under uniform conditions. No existing dataset fully met these criteria, which motivated the construction of a new dataset tailored to our evaluation objectives.
To assess the relevance of our dataset, we compared its scope and features with commonly used benchmarks in Deepfake detection research [32]. Table 2 summarizes this comparison. We have named the dataset the Synthetic and Authentic Forensic Lab (SAFL), and it is available for public use.

4.3. Dataset Generation

The main existing methods for generating Deepfakes are based on GANs (Generative Adversarial Networks) and several of their variations. The research community distinguishes four methods for Deepfake generation [46]: Entire Face Synthesis, Identity Swap/Faceswap, Attribute Manipulation, and Expression Swap. All four approaches were considered for Deepfake content generation.
The Entire Face Synthesis technique involves creating complete, non-existent faces [47]. In our case, the Dall-E [28] tool and the Thispersondoesnotexist website [29] were used to generate fully synthetic images. The Identity Swap/Faceswap method consists of replacing one person’s face with another person’s face; DeepFaceLab [8] was used to implement it. Because this tool pastes a generated face onto the original content, there is a high probability that tools targeting splicing boundaries will detect the fake region.
The Attribute Manipulation method allows face editing and image/video manipulation. Face attributes such as hair/skin color, apparent age (older or younger), or the addition/removal of elements, among others, can be modified [47]. This was achieved using FaceApp [7] with images from the CelebA dataset, which has been employed as a source in recent studies such as [48]. The Expression Swap technique modifies a facial expression, as done, for instance, by Face2Face [49], which employs computer graphics to capture the first few frames of a video and obtain and track a temporary facial representation. Avatarify [31] was used for this work. Real multimedia content was also added to the repository to determine whether the recognition tools could distinguish between real and manipulated images and videos.
In addition, DeepFaceLab was used to generate Deepfake video frames, and a selection of input videos labeled as “real” was collected from Celeb-DF [36]. A video camera was used to transfer expressions from a real face to an image, and this process was recorded with OBS Studio 31.1.0 [50] to generate the final video.
To ensure legal clarity regarding the generation of synthetic samples, Table 3 summarizes the licenses and usage terms of the tools employed in the dataset creation process. In addition, we specify whether each tool embeds watermarks or automatic markers in the resulting synthetic media.

4.4. Dataset Description

Accordingly, for the specific purposes of this work, a dedicated dataset has been generated and published, including 2095 images and 204 videos produced using the four aforementioned approaches. The 2102 original images were selected from the CelebA dataset, which has been employed in recent studies such as [48], and the 212 original videos were selected from Celeb-DF [36]. In addition, 2000 original audio samples in Spanish and 2000 Deepfake audio samples generated using a Text-To-Speech (TTS) technique have been added to the repository to support future research on Deepfake audio detection.
None of the “real” samples included in the SAFL dataset were newly collected by the authors; all are directly extracted from these well-established public benchmark datasets. No additional personal data of the individuals depicted is collected or inferred, and no new recordings of private persons are created. We do not add identifying metadata (e.g., names or online profiles). Since the source datasets were released under licenses allowing non-commercial academic use, the use of these samples in SAFL constitutes secondary processing of openly available research corpora. Users of the SAFL dataset must comply with the terms and data-protection conditions established by the original dataset providers.
This generated dataset is utilized in this research to assess the selected Deepfake detection tools in a comparative study between general-purpose and Deepfake-specific detection tools [45]. The scripts used for this work are also available in the GitHub repository (oigres5/SAFL, version v1.0, commit 3f6f200, accessed on 22 November 2025).
Basically, the repository consists of the following four parts:
  • /images. It contains two subdirectories: one with the generated Deepfakes and the other with real images.
  • /videos. It contains two subdirectories: one with the generated Deepfakes and the other with real videos.
  • /audios. It contains two subdirectories: one with the generated Deepfakes and the other with real audio files.
  • /src. Scripts developed to test some of the evaluated tools, including Image Forgery Detection, MantraNet, and MesoNet, are stored in this directory.
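The directory layout above can be traversed with a few lines of code to build a labeled evaluation set. The sketch below is illustrative only: the subdirectory names ("fake" and "real") and the helper name are assumptions, not the repository's actual naming.

```python
from pathlib import Path

def collect_samples(root, modality):
    """Gather (path, label) pairs for one SAFL modality.

    Assumes each modality directory ('images', 'videos', 'audios')
    holds a 'fake' and a 'real' subdirectory; the actual names used
    in the repository may differ.
    """
    samples = []
    for label in ("fake", "real"):
        subdir = Path(root) / modality / label
        for f in sorted(subdir.glob("*")):
            if f.is_file():
                samples.append((str(f), label))
    return samples
```

For example, `collect_samples("SAFL", "images")` would return every image path tagged with its ground-truth label, ready to feed the evaluation scripts in /src.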
Table 4 and Table 5 provide specifications for the generated fake files contained in the repository. The former represents the generated images, and the latter represents the generated videos. Table 6 and Table 7 display the real images and videos included in our dataset. Each table details the nomenclature and the contents of the repository. This facilitates the identification of the origin of the images and videos.
Each table is arranged consistently and displays a uniform structure. The first column shows the prefix used to name image or video files. The second column specifies the tool used for creation; for real images and videos, it specifies the repository from which they originate. The third column includes comments regarding their creation. The last column provides the number of generated images and/or videos.

4.5. Experimental Setup

To ensure faithful and transparent reproducibility of the experiments, the configuration used for each application is detailed below.
Before presenting the details of each experiment, it should be noted that the evaluated tools are either end-user applications or pre-trained models; therefore, no additional training was required. For most tools, each experiment was executed five times to check for potential variability. Since the evaluated tools rely on deterministic inference (i.e., no random initialization, dropout, or stochastic components are involved during prediction), all repetitions produced identical results, so the empirical variance and standard deviation of the metrics were zero. This confirms that the reported metrics correspond to deterministic outcomes rather than averages of stochastic runs.
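This determinism check can be reproduced with a small harness: score the same sample several times and confirm that the population standard deviation of the scores is zero. The function below is a generic sketch; `run_inference` is a placeholder standing in for any evaluated tool's prediction call.

```python
import statistics

def check_determinism(run_inference, sample, repetitions=5):
    """Score the same sample several times and measure the spread.

    A deterministic detector returns identical scores on every run,
    so the population standard deviation is exactly zero.
    """
    scores = [run_inference(sample) for _ in range(repetitions)]
    return scores, statistics.pstdev(scores)
```

With a deterministic stub such as `lambda s: 0.87`, the returned spread is 0.0, matching the zero empirical variance reported above.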
However, the Deepfake-Detection application yielded slightly different results across runs. To account for this variability, the experiment was executed five times and the results were averaged.
The hardware environment consisted of a laptop manufactured by Micro-Star International Co., Ltd. (Taipei, Taiwan), equipped with a 12th-generation Intel Core i7-12700H processor, 32 GB of RAM, and an NVIDIA GeForce RTX 3070 graphics card. The host operating system was Windows 11 Pro. In addition, Deepfake-Detection experiments were executed on a virtual machine deployed with VirtualBox running Ubuntu 24.
Regarding parameter settings, for tools that allowed parameter configuration (e.g., thresholds, feature selection, or compression settings), the default values recommended by the developers were used unless explicitly stated otherwise. Hyperparameter optimization was not necessary, as the evaluated tools are distributed with fixed or pre-trained configurations. The tools for which training hyperparameters have been documented are described below:
  • Image Forgery Detection with CNN. Results were obtained with the pre-trained model CASIA2_WithRot_LR001_b128_nodrop.pt. Details of the hyperparameters can be found in the tool’s repository [15]. The authors also provide a report [51] indicating that this model was trained on the Casia V2.0 dataset with data augmentation based on image rotations, a batch size of 128, and a learning rate of 0.001. Furthermore, no dropout was applied in this model.
  • Autopsy plugin (image and video Deepfake detection). This plugin is distributed as a final tool that uses a trained model whose parameters are described in [13]. The model is based on an SVM with an RBF (Radial Basis Function) kernel and a regularization parameter of 6.37.
  • MesoNet. The pre-trained model used in the experiment, Meso4_DF, is the one distributed with the original implementation. The architectural details of this model are specified in [52], although the corresponding hyperparameters are not documented.
Finally, each tool was applied to the dataset under conditions consistent with its functional scope and input requirements. The goal was to ensure a fair comparison across methods; however, certain applications imposed restrictions that limited their applicability. For instance, Deepware Scanner does not process videos shorter than 5 s, and the Autopsy plugin does not support images in PNG format. In such cases, only the supported subset of the dataset was used for evaluation. Performance was assessed separately for images and videos using standard metrics: Precision, Recall, F1-score, and Accuracy. In addition, the Confusion Matrix was used for detailed error analysis. The results presented in Section 6 correspond to these evaluation settings.

5. Our Proposed Reference Model

This section outlines the proposed reference model for classifying various manipulation detection techniques in multimedia content.
Unlike previous works, which typically classify detection approaches either by their application domain (general-purpose vs. Deepfake-specific) or by their methodological basis (e.g., forensic, handcrafted features, deep learning), our model integrates both perspectives into a unified framework. This dual-layer organization highlights not only how each tool operates, but also why certain methods perform differently across manipulation types and media formats.
The main advantage of this reference model is that it provides a clearer and more systematic way to compare heterogeneous detection techniques under common criteria. By grouping tools according to their underlying principles—Blind Forensic, Handcrafted Machine Learning, Deep Learning-based methods, and Toolkits—it becomes possible to identify methodological strengths, limitations, and expected behavior in practical forensic scenarios. This helps bridge a gap in existing studies, where comparisons are often made across incompatible categories or without a structured analytical basis.
The reference model categorizes Deepfake detection tools into four groups: Blind Forensic methods, Handcrafted Feature-based Machine Learning methods, Deep Learning-based methods, and Toolkits.
This taxonomy provides a precise and technically grounded classification of tamper detection tools by focusing on the methodological basis of each approach. Blind Forensic methods rely on signal inconsistencies, statistical traces, or metadata analysis, and do not require training data. Handcrafted Feature-based Machine Learning methods comprise tools that extract handcrafted features and feed them to conventional classifiers. Deep Learning-based methods automatically learn representations from data using deep neural networks. Toolkits, in turn, provide ready-to-use solutions that may internally rely on one or more of these methodologies.
This refined classification model supplements broader application-based categories by determining whether a tool was originally developed for general-purpose forensic analysis or was specifically designed to detect Deepfake content. While some tools were not initially intended for Deepfake detection, they can still provide valuable indicators of manipulation when applied to such content.
Table 8 shows how the tools studied were classified according to this methodological taxonomy. This model forms the basis of our subsequent experimental evaluation, in which we assess the performance of both general-purpose and Deepfake-specific tools across our benchmark dataset.
For the evaluation, a representative subset of tools was selected based on feasibility, redundancy, and relevance of results. In particular, tools such as Ghiro, ExifTool, and JPEGsnoop were not evaluated separately, since their functionality is conceptually covered by the analysis provided by Forensically. Similarly, MantraNet was excluded because it produced irrelevant results in preliminary tests, and the video processing component of MesoNet was discarded because it could not be executed successfully. The final evaluation includes a curated set of tools that demonstrate functional applicability and diverse methodological approaches: Forensically, Image Forgery Detection with CNN, the Autopsy plugin, MesoNet, Deepfake-Detection, and Deepware Scanner.

6. Performance Evaluation

This section summarizes the initial results of all the techniques applied to our image and video repository, together with a comparison between general-purpose and Deepfake-specific detection techniques.

6.1. Goodness Metrics

To evaluate the effectiveness of our work, different metrics have been used in this study (Precision, Recall, F1-score, and Accuracy). However, before that, some definitions must be provided, as described below:
  • True Positive (TP). This metric refers to the samples that are correctly classified as Deepfakes.
  • False Negative (FN). This metric refers to the Deepfake samples that are falsely classified as real images and videos.
  • False Positive (FP). This metric refers to the real samples that are misclassified as Deepfakes.
  • True Negative (TN). This metric refers to the samples that are correctly classified as real images and videos.
Precision. Precision is defined as the number of true positives (TP) divided by the total number of predicted positives (TP + FP). It indicates the quality of the model’s positive predictions. Precision is calculated as detailed in Equation (1).
Precision = TP / (TP + FP)    (1)
Recall. Recall is defined as the number of true positives (TP) divided by the total number of actual positives (TP + FN). It measures the ability of the model to detect positive samples. Recall is calculated as detailed in Equation (2).
Recall = TP / (TP + FN)    (2)
F1-score. The F1-score is the harmonic mean of precision (P) and recall (R). It provides a balanced measure by considering both precision and recall. It is detailed in Equation (3).
F1-score = 2PR / (P + R)    (3)
Accuracy. Accuracy represents the ratio of all correct predictions to the total number of samples. Accuracy is calculated as described in Equation (4).
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (4)
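Equations (1)-(4) translate directly into code. The helper below, a minimal sketch with an assumed name, computes all four metrics from the raw confusion-matrix counts:

```python
def goodness_metrics(tp, fn, fp, tn):
    """Compute Precision, Recall, F1-score, and Accuracy from
    confusion-matrix counts, following Equations (1)-(4)."""
    precision = tp / (tp + fp)                       # Eq. (1)
    recall = tp / (tp + fn)                          # Eq. (2)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (3)
    accuracy = (tp + tn) / (tp + tn + fp + fn)       # Eq. (4)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}
```

For instance, a tool producing 80 true positives, 20 false negatives, 10 false positives, and 90 true negatives over 200 samples scores a precision of 80/90 ≈ 0.889 and an accuracy of 170/200 = 0.85.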

6.2. Results

This section presents the initial results obtained with each of the tested tools using our generated repository. The detailed metrics from the previous section have been calculated. In addition, general-purpose and Deepfake-specific detection methods are compared.
The first tool used was Forensically. It is important to note that human intervention in the role of a forensic analyst is required in this case. Table 9 presents its confusion matrix for images, detailing the TP, FN, FP, and TN percentages. Later, the corresponding goodness metrics are described in a comparative manner.
Regarding Image Forgery Detection with CNNs, Table 10 shows the confusion matrix when using the CASIA2_WithRot_LR001_b128_nodrop.pt model for Deepfake detection in images. Its performance appears to be inferior to that of Forensically. This fact will be confirmed later with the goodness metrics.
Focusing on Deepfake-specific detection methods, we use the Autopsy plugin based on an SVM classifier. Its confusion matrix for images is shown in Table 11, and it exhibits substantially lower true positive and true negative rates compared to Forensically (Table 9). It is important to note that the Autopsy tool can only analyze images in JPG format; for this reason, the experiment does not include images in PNG format.
As for the MesoNet tool, the results obtained with the Meso4_DF model for the confusion matrix can be observed in Table 12. Its performance is worse than that of the Autopsy plugin (SVM).
The confusion matrix obtained for the Deepfake-Detection tool is shown in Table 13. These results were obtained using the Deepfake_c0_xception model. It should be noted that the experiment does not include images in PNG format.
To confirm our previous conclusions regarding images, Table 14 displays the quality metrics for both general-purpose and Deepfake-specific techniques. In general, Precision is higher for general-purpose methods than for Deepfake-specific methods; however, the remaining metrics still show room for improvement. Analysis with MantraNet was not included because its trained model classified all multimedia content as Deepfake. These values may suggest potential overfitting in the employed models, highlighting the need for a more detailed investigation of the underlying algorithms used by these tools. These results are complemented by Figure 2, which provides a visual comparison of tool accuracy in image detection. Beyond the overall accuracy comparison, Figure 3 provides a detailed view of the leading general-purpose and Deepfake-specific tools in image detection, showing how they perform across multiple evaluation metrics.
Among the general-purpose techniques, Image Forgery Detection with CNN achieves the highest precision, surpassing Forensically by 5.9%. However, although it reaches a precision of 91.9%, its accuracy is 36.1% lower than that of Forensically. Forensically achieves a precision of 86.0%; although this is 5.9% lower, the remaining metrics indicate that Forensically is the tool with the best overall results.
Considering the results obtained with Deepfake-specific techniques, Autopsy achieved the highest precision (67.0%) and outperformed the other tools, MesoNet and Deepfake-Detection, in most metrics. Autopsy was 15.3% more accurate than Deepfake-Detection and 18.8% more accurate than MesoNet. MesoNet obtained the worst results, with a recall of only 3.0% and an F1-score of 5.7%. Therefore, Autopsy is the best choice in this category.
Overall, Image Forgery Detection with CNN achieves the highest precision (91.9%). However, given its poor performance on the other metrics, Forensically stands out as the most balanced tool, with a precision of 86.0%, the highest recall (88.0%), and an F1-score of 87.0%, outperforming Image Forgery Detection with CNN by 36.1% in terms of accuracy.
Table 15 shows the goodness metrics for both general-purpose and Deepfake-specific techniques applied to videos. In general, the behavior of the tested tools is similar to that observed with images. However, some tools, such as Image Forgery Detection with CNN, the Autopsy plugin, and Deepfake-Detection, achieved an accuracy close to 50.0%, and their Recall, F1-score, and Accuracy values differ significantly from their precision values. This suggests the presence of overfitting during the configuration process of these tools.
The accuracy of the evaluated tools on video detection is summarized in Figure 4, which facilitates a direct comparison across methods. In addition, Figure 5 offers a complementary perspective by contrasting the best-performing general-purpose and Deepfake-specific tools in video detection, illustrating their behavior across key evaluation metrics.
Among general-purpose techniques, Forensically is the most accurate tool, outperforming Image Forgery Detection with CNN by 47.2% in accuracy and surpassing it in all metrics, with a 96.7% higher recall and a 96.7% higher F1-score. Among the Deepfake-specific techniques, Deepware Scanner performs best, achieving 79.5% precision and 91.2% accuracy.
It can therefore be concluded that the general-purpose tool Forensically achieves the highest precision, surpassing Deepware Scanner by 20.0%, with an 8.5% higher recall, a 14.5% higher F1-score, and a 7.4% higher accuracy.
When image and video results are compared for the general-purpose methods, precision improves by 8.1% for Image Forgery Detection with CNN on videos, while the remaining metrics are similar. This shows that analyzing videos, which contain more information, does not necessarily improve the detection probability of the studied tools: the general-purpose tools analyzed are not designed to process videos and do not take advantage of the additional information they offer, such as frame interpolation, pixel motion distribution analysis, and other techniques commonly used to improve forgery detection.
A comparison of Deepfake-specific tools reveals that the metrics obtained with the Autopsy plugin for images are similar to those obtained with its video detection counterpart; specifically, image analysis shows an increase of 11.3% in the F1-score and 19.2% in accuracy. This result was anticipated, as the video plugin does not utilize any temporal information between frames. Instead, it analyzes each frame independently, and a video is classified as fake when one-third of its frames are found to be manipulated. The same behaviour is observed in Deepfake-Detection, which yields similar metrics in both image and video detection. The only tool that performs significantly better in video detection than in image detection is Deepware Scanner, which achieves a precision of 79.5% and an accuracy of 91.2%. It is worth noting that this tool is built upon multiple existing Deepfake detection projects.
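The frame-level aggregation rule described for the Autopsy video plugin can be sketched as follows; the function name, signature, and the exact comparison operator are assumptions based only on the one-third rule stated above:

```python
def classify_video(frame_is_fake, threshold=1/3):
    """Aggregate independent per-frame verdicts into a video label.

    frame_is_fake: iterable of 0/1 flags, one per analyzed frame.
    The video is labeled fake when the fraction of manipulated
    frames reaches the threshold (one third, per the rule above).
    """
    frames = list(frame_is_fake)
    fake_ratio = sum(frames) / len(frames)
    return "fake" if fake_ratio >= threshold else "real"
```

Because no temporal information is used, a video with, say, 40 of 100 frames flagged is labeled fake regardless of where in the sequence those frames occur.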
Finally, if tools requiring human analysis are excluded, Deepware Scanner achieves the best results of all the analyzed tools, regardless of category. A noteworthy aspect of this result is that every other tool focuses solely on a specific detection method, whereas Deepware Scanner integrates multiple projects to make its final decision. This outcome suggests a promising avenue for future research and development of tools using a similar decision-making approach, as it was the only tool to achieve significantly superior results compared to the rest.

6.3. Discussion

A thorough examination of the findings from this study reveals that the available tools have the potential to enhance the detection of multimedia content created with Deepfake techniques. Regarding image detection, general-purpose detection methods demonstrate an average precision of 89.0%, while Deepfake-specific tools exhibit an average precision of 61.8%, as evaluated in the previous subsection. With regard to video detection, the mean precision of the general-purpose techniques was 99.8%. For the Deepfake-specific tools, the mean precision across the three evaluated detectors was 59.5%, with individual values ranging from 49.4% to 79.5% (see Table 15). This average is only moderately above the 50% random baseline expected in a binary classification setting and should therefore be interpreted with caution. A more informative picture is given by the per-tool precision, recall, F1-score, and accuracy values reported in Table 15, which highlight in particular the strong performance of Deepware Scanner compared to the other Deepfake-specific tools. In this context, the average precision of 59.5% for Deepfake-specific tools does not reflect strong discriminative power by itself, but rather underlines the current limitations of these detectors when applied to heterogeneous, real-world Deepfake manipulations.
Regarding the accuracy values, the results also indicate significant room for improvement. The general-purpose techniques achieved an average accuracy of 68.9% in image detection and 75.0% in video detection, while the Deepfake-specific tools reached 58.1% and 63.8%, respectively. These averages remain relatively close to the 50% random baseline associated with binary classification and therefore must be interpreted with caution. Their proximity to chance-level performance reflects the variability among individual tools and the challenges posed by heterogeneous manipulations that differ from those seen during training.
The effectiveness of these tools depends on the specific models used and their prior training. Therefore, for a more comprehensive study and comparison of these tools under uniform conditions, it is essential to train them on data whose characteristics closely match real-world scenarios.
It is essential to note that all results were obtained from a repository comprising images and videos generated with various Deepfake generation tools. If this dataset differs significantly in its features from the files used to train the detectors, the outcomes of this research may differ from those reported in studies that used similar datasets.
It is also crucial to emphasize that Forensically requires the expertise of a trained forensic analyst. In this case, the visual characteristics and the inherent semantics of the image itself serve as the predominant factors used to distinguish between an authentic image and a manipulated one.
In summary, both general-purpose and Deepfake-specific detection methods can help identify Deepfake content. However, substantial work remains to be done in this emerging field.

7. Conclusions

This study first provides a thorough review of current Deepfake detection techniques and introduces a new reference model that organizes available tools into two categories: general-purpose and Deepfake-specific. This model offers a clear structure that can be expanded to include new tools as they are created. In addition, the model is instantiated through four methodological families (Blind Forensic, Handcrafted Feature-based Machine Learning, Deep Learning-based methods and Toolkits), which allows heterogeneous detectors to be compared under common criteria in a reproducible way.
Building on this framework, we constructed the SAFL dataset, a realistic benchmark composed of manipulated images and videos generated with publicly available tools such as FaceApp, DeepFaceLab, Avatarify and DALL·E. Using this dataset, we conducted a proof-of-concept evaluation that applied both general-purpose and Deepfake-specific tools to the same multimedia content, enabling a fair comparison across methods and media types.
A proof-of-concept evaluation applying the model revealed that only a subset of general-purpose forensic methods remains effective against newer Deepfakes. For images, Forensically emerged as the most balanced detector, achieving an accuracy of 86.9% with an F1-score of 87.0%, while Deepfake-specific tools such as the Autopsy plugin reached lower but still competitive values (69.5% accuracy and 71.3% F1-score). Other methods like Image Forgery Detection with CNN or MesoNet obtained very high precision but extremely low recall, which resulted in poor overall performance despite apparently strong partial metrics. In image detection tasks, general-purpose methods such as Forensically outperformed specialized detectors. In video detection, however, Deepware Scanner showed the most competitive performance among Deepfake-specific tools. In particular, Forensically reached 98.6% accuracy in video analysis, while Deepware Scanner obtained 91.2% accuracy with an F1-score of 84.1%, clearly surpassing the remaining Deepfake-oriented tools. These results confirm that some classical forensic tools remain highly competitive, especially when human expertise is involved, whereas Deepfake-specific detectors still show uneven behavior across different manipulation types and conditions. These findings confirm the need for hybrid approaches that combine general forensic evidence with Deepfake-specific features.
From a practical perspective, our evaluation suggests that no single family of methods is sufficient on its own. Blind forensic techniques and handcrafted-feature approaches offer interpretability and robustness in certain scenarios, while deep learning-based detectors can exploit subtle patterns at the cost of higher sensitivity to dataset shifts and overfitting. The strong performance of Deepware Scanner, which internally integrates several Deepfake detection projects, supports the idea that ensemble or hybrid systems are a promising direction for real-world deployments.
The proposed reference model and the resulting empirical data provide researchers with a structured basis for comparing detection strategies and prioritizing future improvements. The main limitation of this work is the small number of open-source tools with reproducible benchmarks, which restricts the scope of quantitative comparisons. Furthermore, the experiments were conducted on a single curated dataset generated from specific tools, which may not cover the full diversity of Deepfake techniques and post-processing pipelines encountered in the wild. As a consequence, the reported figures should be interpreted as indicative rather than definitive, and cross-dataset validation remains an open challenge.
Future work will focus on expanding the benchmark to include a larger set of detectors and synthetic datasets, integrating the most promising techniques into a unified multimedia analysis application to facilitate rapid deployment and continuous evaluation as Deepfake generation evolves. Another relevant line of research will be the systematic inclusion of audio Deepfakes, robustness studies under different compression levels and social-network style degradations, and the exploration of explainable AI mechanisms that help analysts understand why a given sample is flagged as fake. The experimental protocol will also be extended to widely used public datasets, such as FaceForensics++ and DFDC, to enhance generalizability and enable benchmarking against other methods.
Ultimately, this research contributes to the development of trustworthy and transparent artificial intelligence ecosystems, reinforcing ethical and responsible AI practices that are essential for the transition from Industry 4.0 to Industry 5.0. By clarifying the relative strengths and weaknesses of existing tools and proposing a structured way to analyze them, this work provides a foundation for future Deepfake detection systems that are not only more accurate, but also more reliable and aligned with societal needs.

Author Contributions

Conceptualization, S.A.F.-L. and L.T.; Data curation, S.A.F.-L.; Formal analysis, S.A.F.-L., A.R.-G. and R.P.-V.; Investigation, S.A.F.-L., L.T. and A.R.-G.; Methodology, S.A.F.-L., L.T., A.R.-G. and R.P.-V.; Project administration, L.T., A.R.-G. and R.P.-V.; Resources, L.T. and A.R.-G.; Software, S.A.F.-L.; Supervision, L.T. and A.R.-G.; Validation, S.A.F.-L., L.T. and A.R.-G.; Visualization, S.A.F.-L., L.T., A.R.-G. and R.P.-V.; Writing—original draft, S.A.F.-L.; Writing—review & editing, S.A.F.-L., L.T. and A.R.-G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset and code generated in this study are publicly available in the SAFL repository at https://github.com/oigres5/SAFL (accessed on 22 November 2025).

Acknowledgments

The authors would like to acknowledge the support of the Spanish Government through the Cybersecurity Institute of Spain (INCIBE), with the Strategic Research Project “Analysis of mobile applications from the perspective of data protection: Cyber-protection and Cyber-risks of citizen information” [54] and the International Chair “Smart Rural IoT and Secured Environments” [55], within the context of the Recovery, Transformation and Resilience Plan financed by the European Union (NextGenerationEU/PRTR) [56]. The authors also thank the support of the UNED CiberCSI [57] research group, as well as the UNED CiberGID [58] innovation group. Additionally, we acknowledge the support of the In4Labs Project [59], with reference TED2021-131535B-I00, funded by MCIU/AEI/10.13039/501100011033 and the “European Union NextGenerationEU/PRTR”. During the preparation of this manuscript, the authors used GPT-5 tools to find synonyms for frequently used words, in order to avoid repetition and make the manuscript easier to read and more varied in lexicon. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the published article.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial Intelligence
CFA: Color Filter Array
CNN: Convolutional Neural Network
CQCC: Constant Q Cepstral Coefficients
DFDC: DeepFake Detection Challenge
DL: Deep Learning
ELA: Error Level Analysis
GAN: Generative Adversarial Network
INCIBE: National Cybersecurity Institute of Spain
ML: Machine Learning
MFCC: Mel-Frequency Cepstral Coefficients
MTCNN: Multitask Cascaded Convolutional Networks
PRNU: Photo Response Non-Uniformity
RPLP: Revised Perceptual Linear Prediction
RBF: Radial Basis Function
SVM: Support Vector Machine
TTS: Text-to-Speech
SAFL: Synthetic Audio and Forensic Lab dataset

References

  1. INCIBE. Deepfakes. 2020. Available online: https://www.incibe.es/aprendeciberseguridad/deepfakes (accessed on 22 November 2025). (In Spanish).
  2. Wu, Y.; Ngai, E.W.T.; Wu, P.; Wu, C. Fake News on the Internet: A Literature Review, Synthesis and Directions for Future Research. Internet Res. 2022, 32, 1662–1699. [Google Scholar] [CrossRef]
  3. Wang, T.; Liao, X.; Chow, K.P.; Lin, X.; Wang, Y. Deepfake Detection: A Comprehensive Survey from the Reliability Perspective. ACM Comput. Surv. 2025, 57, 1–35. [Google Scholar] [CrossRef]
  4. Tan, F.; Zhai, M.; Zhai, C. Foreign object detection in urban rail transit based on deep differentiation segmentation neural network. Heliyon 2024, 10, e37072. [Google Scholar] [CrossRef]
  5. Elías, C.; Montoro, P.; Osuna, S.; Fernádez-Roldán, A.; Robles-Gómez, A.; Gálvez, T. Fake News. La fábrica de Mentiras. TVE2 and UNED Canal. 2023. Available online: https://www.rtve.es/play/videos/uned/03-11-23/7003458/ (accessed on 22 November 2025). (In Spanish).
  6. Verdoliva, L. Media Forensics and DeepFakes: An Overview. IEEE J. Sel. Top. Signal Process. 2020, 14, 910–932. [Google Scholar] [CrossRef]
  7. FaceApp. FaceApp. 2025. Available online: https://www.faceapp.com/ (accessed on 22 November 2025).
  8. Perov, I.; Gao, D.; Chervoniy, N.; Liu, K.; Marangonda, S.; Umé, C.; Dpfks, M.; Facenheim, C.S.; RP, L.; Jiang, J.; et al. DeepFaceLab: Integrated, flexible and extensible face-swapping framework. arXiv 2021, arXiv:2005.05535. [Google Scholar]
  9. Shukla, D.K.; Bansal, A.; Singh, P. A Survey on Digital Image Forensic Methods Based on Blind Forgery Detection. Multimed. Tools Appl. 2024, 83, 67871–67902. [Google Scholar] [CrossRef]
  10. Wagner, J. Forensically. 2025. Available online: https://29a.ch/photo-forensics (accessed on 22 November 2025).
  11. Tanasi, A.; Buoncristiano, M. Ghiro: Automated Digital Image Forensics Tool. Available online: https://github.com/Ghirensics/ghiro (accessed on 22 November 2025).
  12. Kingra, S.; Aggarwal, N.; Kaur, N. Emergence of Deepfakes and Video Tampering Detection Approaches: A Survey. Multimed. Tools Appl. 2023, 82, 10165–10209. [Google Scholar] [CrossRef]
  13. Ferreira, S.; Antunes, M.J.G.; Correia, M.E. Exposing Manipulated Photos and Videos in Digital Forensics Analysis. J. Imaging 2021, 7, 102. [Google Scholar] [CrossRef]
  14. Bammey, Q. Analysis and Experimentation on the ManTraNet Image Forgery Detector. Image Process. Line 2022, 12, 457–468. [Google Scholar] [CrossRef]
  15. kPsarakis. Image Forgery Detection with CNN. 2021. Available online: https://github.com/kPsarakis/Image-Forgery-Detection-CNN (accessed on 22 November 2025).
  16. Shanmuganathan, C.; Thamizharasi, M.; Anish, T.P.; Sivasankari, K. Enhancing DeepFake Detection: Leveraging MesoNet for Video Fraud Identification. SN Comput. Sci. 2024, 5, 301. [Google Scholar] [CrossRef]
  17. Deepware. Deepware Scanner. 2025. Available online: https://deepware.ai/ (accessed on 22 November 2025).
  18. Verma, V.; Singh, D.; Khanna, N. Block-Level Double JPEG Compression Detection for Image Forgery Localization. Multimed. Tools Appl. 2024, 83, 9949–9971. [Google Scholar] [CrossRef]
  19. Li, Y.; Gardella, M.; Bammey, Q.; Nikoukhah, T.; Morel, J.M.; Colom, M.; Von Gioi, R.G. A Contrario Detection of H.264 Video Double Compression. In Proceedings of the International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–11 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1765–1769. [Google Scholar]
  20. Zhong, J.L.; Pun, C.M. An End-to-End Dense-InceptionNet for Image Copy-Move Forgery Detection. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2134–2146. [Google Scholar] [CrossRef]
  21. Rana, K.; Singh, G.; Goyal, P. MSRD-CNN: Multi-Scale Residual Deep CNN for General-Purpose Image Manipulation Detection. IEEE Access 2022, 10, 41267–41275. [Google Scholar] [CrossRef]
  22. INCIBE. 2025. Available online: https://www.incibe.es (accessed on 22 November 2025). (In Spanish).
  23. Hass, C. JPEGsnoop: JPEG Image Decoder and Analysis Tool. Available online: https://github.com/ImpulseAdventure/JPEGsnoop (accessed on 22 November 2025).
  24. Harvey, P. ExifTool by Phil Harvey. Available online: https://exiftool.org (accessed on 22 November 2025).
  25. Autopsy. Available online: https://www.autopsy.com/ (accessed on 22 November 2025).
  26. Liu, H.; Toubal, I.; Bonilla, H. Deepfake-Detection. Available online: https://github.com/HongguLiu/Deepfake-Detection (accessed on 22 November 2025).
  27. Rössler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. FaceForensics++: Learning to Detect Manipulated Facial Images. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar]
  28. OPENAI. Dall-E 3. 2025. Available online: https://openai.com/dall-e-3/ (accessed on 22 November 2025).
  29. TPDNE. This Person Does Not Exist. 2025. Available online: https://this-person-does-not-exist.com/ (accessed on 22 November 2025).
  30. Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and Improving the Image Quality of StyleGAN. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 8107–8116. [Google Scholar]
  31. ALIEVK. Avatarify. 2025. Available online: https://github.com/alievk/avatarify-desktop (accessed on 22 November 2025).
  32. Altuncu, E.; Franqueira, V.N.L.; Li, S. Deepfake: Definitions, Performance Metrics and Standards, Datasets, and a Meta-Review. Front. Big Data 2024, 7, 1400024. [Google Scholar] [CrossRef]
  33. Korshunov, P.; Marcel, S. Vulnerability Assessment and Detection of Deepfake Videos. In Proceedings of the 2019 International Conference on Biometrics (ICB), Crete, Greece, 4–7 June 2019; pp. 1–6. [Google Scholar] [CrossRef]
  34. Li, Y.; Chang, M.C.; Lyu, S. In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking. In Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, China, 11–13 December 2018; pp. 1–7. [Google Scholar] [CrossRef]
  35. Rössler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. FaceForensics: A Large-Scale Video Dataset for Forgery Detection in Human Faces. arXiv 2018, arXiv:1803.09179. [Google Scholar]
  36. Li, Y.; Yang, X.; Sun, P.; Qi, H.; Lyu, S. Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 3204–3213. [Google Scholar]
  37. Dolhansky, B.; Bitton, J.; Pflaum, B.; Lu, J.; Howes, R.; Wang, M.; Canton Ferrer, C. The DeepFake Detection Challenge (DFDC) Dataset. arXiv 2020, arXiv:2006.07397. [Google Scholar] [CrossRef]
  38. Zi, B.; Chang, M.; Chen, J.; Ma, X.; Jiang, Y.G. WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection. arXiv 2024, arXiv:2101.01456. [Google Scholar]
  39. Cai, Z.; Ghosh, S.; Adatia, A.P.; Hayat, M.; Dhall, A.; Gedeon, T.; Stefanov, K. AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 28 October–1 November 2024; MM ’24. pp. 7414–7423. [Google Scholar] [CrossRef]
  40. Zhou, P.; Han, X.; Morariu, V.I.; Davis, L.S. Two-Stream Neural Networks for Tampered Face Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1831–1839. [Google Scholar] [CrossRef]
  41. Zhang, Y.; Yin, Z.; Li, Y.; Yin, G.; Yan, J.; Shao, J.; Liu, Z. CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations. In Proceedings of the Computer Vision—ECCV 2020, Virtually, 23–28 August 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Springer: Cham, Switzerland, 2020; pp. 70–85. [Google Scholar]
  42. Dang, H.; Liu, F.; Stehouwer, J.; Liu, X.; Jain, A.K. On the Detection of Digital Face Manipulation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 5780–5789. [Google Scholar] [CrossRef]
  43. Wang, Z.; Bao, J.; Zhou, W.; Wang, W.; Hu, H.; Chen, H.; Li, H. DIRE for Diffusion-Generated Image Detection. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 22388–22398. [Google Scholar] [CrossRef]
  44. Song, H.; Huang, S.; Dong, Y.; Tu, W.W. Robustness and Generalizability of Deepfake Detection: A Study with Diffusion Models. arXiv 2023, arXiv:2309.02218. [Google Scholar] [CrossRef]
  45. Falcón-López, S.A. Synthetic and Authentic Forensic Lab. 2025. Available online: https://github.com/oigres5/SAFL (accessed on 22 November 2025).
  46. Tolosana, R.; Vera-Rodriguez, R.; Fierrez, J.; Morales, A.; Ortega-Garcia, J. Deepfakes and beyond: A Survey of face manipulation and fake detection. Inf. Fusion 2020, 64, 131–148. [Google Scholar] [CrossRef]
  47. Akhtar, Z. Deepfakes Generation and Detection: A Short Survey. J. Imaging 2023, 9, 18. [Google Scholar] [CrossRef] [PubMed]
  48. Yauri-Lozano, E.; Castillo-Cara, M.; Orozco-Barbosa, L.; García-Castro, R. Generative Adversarial Networks for Text-to-Face Synthesis & Generation: A Quantitative–Qualitative Analysis of Natural Language Processing Encoders for Spanish. Inf. Process. Manag. 2024, 61, 103667. [Google Scholar] [CrossRef]
  49. Thies, J.; Zollhöfer, M.; Stamminger, M.; Theobalt, C.; Nießner, M. Face2Face: Real-time Face Capture and Reenactment of RGB Videos. arXiv 2020, arXiv:2007.14808. [Google Scholar] [CrossRef]
  50. OBS Studio. 2025. Available online: https://obsproject.com/ (accessed on 22 November 2025).
  51. kPsarakis. Image Forgery Detection with CNN—Report. 2021. Available online: https://github.com/kPsarakis/Image-Forgery-Detection-CNN/blob/master/reports/Group_10-Image_Forgery_Detection_report.pdf (accessed on 22 November 2025).
  52. DariusAf. MesoNet. 2021. Available online: https://github.com/DariusAf/MesoNet (accessed on 22 November 2025).
  53. Ferreira, S.; Antunes, M.J.G.; Correia, M.E. Photo and Video Manipulations Detector. 2025. Available online: https://github.com/saraferreirascf/Photo-and-video-manipulations-detector (accessed on 22 November 2025).
  54. UNED. Analysis of Mobile Applications from the Perspective of Data Protection. Available online: https://srise.informatica.uned.es/proyecto/ (accessed on 22 November 2025).
  55. UNED. Smart Rural IoT and Secured Environments. Available online: https://srise.informatica.uned.es/ (accessed on 22 November 2025).
  56. NextGenerationEU. Available online: https://next-generation-eu.europa.eu/ (accessed on 22 November 2025).
  57. CiberCSI. Available online: https://blogs.uned.es/cibercsi/ (accessed on 22 November 2025).
  58. CiberGID. Available online: https://blogs.uned.es/cibergid/ (accessed on 22 November 2025).
  59. In4Labs. Available online: https://open.ieec.uned.es/in4labs/ (accessed on 22 November 2025).
Figure 1. Work-Flow for the Forensic Analysis of Multimedia Forgery. Source: authors’ contribution.
Figure 2. Accuracy of Tools in Image Detection. Source: authors’ contribution.
Figure 3. Top General-purpose and Deepfake-specific Tools in Image Detection. Source: authors’ contribution.
Figure 4. Accuracy of Tools in Video Detection. Source: authors’ contribution.
Figure 5. Top general-purpose and Deepfake-specific tools in video detection. Source: authors’ contribution.
Table 1. Classification of the Main Detection Techniques and Analyzed Tools in Digital Images and Videos. Source: authors’ contribution.
Type of Techniques | Previous Knowledge Required | Description | General-Purpose Tools | Deepfake Tools
Blind methods [9] | Honest camera, image and edition software characteristics | Anomaly detection | Forensically [10], Ghiro [11] | -
Supervised methods with handcrafted features [12] | Image/videos datasets | Supervised machine learning algorithms | - | Autopsy plugins [13]
Detection methods based on Deep Learning [3] | Image/videos datasets | CNNs/related approaches | MantraNet [14], Image Forgery Detection with CNNs [15] | MesoNet [16], Deepware Scanner [17]
Table 2. Comparison of Our Proposed Dataset with Existing Benchmarks. Source: authors’ contribution.
Dataset | Resource | Samples | Generation Method | Year
DeepfakeTIMIT [33] | Video | 620 fake | Face manipulation | 2018
UADFV [34] | Video | 49 real, 49 fake | Multiple GANs | 2018
FaceForensics++ [35] | Video | 1000 real, 4000 fake | Multiple face manipulation methods | 2019
Celeb-DF v2 [36] | Video | 590 real, 5639 fake | FaceSwap | 2020
DFDC [37] | Video | 23,654 real, 104,500 fake | FaceSwap/AudioSwap | 2020
WildDeepfake [38] | Video | 707 fake | Online collection | 2020
AV-Deepfake1M [39] | Video | 286,721 real, 860,039 fake | Face reenactment + Text-to-speech | 2023
SwapMe & FaceSwap [40] | Images | 2300 real, 2010 fake | FaceSwap | 2017
CelebA-Spoof [41] | Images | 202,599 real, 625,537 fake | Face spoofing | 2020
DFFD [42] | Images | 58,703 real, 240,336 fake | Face manipulation | 2020
DiffusionForensics [43] | Images | 615,200 fake | Pretrained diffusion models | 2023
DFF [44] | Images | 30,000 real, 90,000 fake | Diffusion models, Face manipulation | 2023
Our Proposed SAFL [45] | Images | 2102 real, 2095 fake | FaceSwap, Entire Face Synthesis, Attribute Manipulation, Expression Swap | 2025
Our Proposed SAFL [45] | Videos | 202 real, 204 fake | FaceSwap, Entire Face Synthesis, Attribute Manipulation, Expression Swap | 2025
Our Proposed SAFL [45] | Audios | 2000 real, 2000 fake | FaceSwap, Entire Face Synthesis, Attribute Manipulation, Expression Swap | 2025
Table 3. Licensing information and watermark behavior of the tools used to generate synthetic data. Source: authors’ contribution.
Tool | License/Terms of Use | Notes on Allowed Usage | Watermarks or Automatic Markers
DALL-E 2/DALL-E 3 [28] | OpenAI Terms of Use | Output media may be used and redistributed; authors retain rights to generated content. | No watermarks in downloaded images.
FaceApp [7] | Proprietary (FaceApp Terms) | Permits transformation and use of processed images; redistribution allowed under app terms. | Adds a visible “FaceApp” text label at the bottom of the image.
DeepFaceLab [8] | GPL-3.0 License | Fully open-source; allowed for modification, redistribution, and academic use. | No watermarks added.
Avatarify [31] | MIT License | Open-source; free use for research and publication. | No watermarks added.
ThisPersonDoesNotExist [29] | Website Terms of Use | Synthetic faces are freely usable; no attribution required. | No watermarks applied.
Table 4. Proposed Dataset Features for the Generated Deepfake Images. Source: authors’ contribution.
Prefix | Tool | Description | Samples
TPNE | Thispersondoesnotexist | Images generated with this web application. | 1009
FARC, FA | FaceApp | Updated images from the CelebA dataset and images from the application itself. | 613
DFL | DeepFaceLab | Images taken from Deepfake videos generated with face swapping. | 453
D | Dall-E | Images generated by Dall-E. | 20
Table 5. Proposed Dataset Features for the Generated Deepfake Videos. Source: authors’ contribution.
Prefix | Tool | Description | Samples
DFL | DeepFaceLab | Videos from the Celeb-DF repository and generated from DeepFaceLab demo videos. | 100
AV | Avatarify | Videos based on real images from the CelebA dataset. | 104
Table 6. Proposed Dataset Features for Real Images. Source: authors’ contribution.
Prefix | Tool | Description | Samples
RC | CelebA | Images from “img_align_celeba.zip” in the CelebA repository. | 420
RCV | Celeb-DF | Images obtained from videos of the Celeb-DF repository: Celeb-real directory in “Celeb-DF.zip”. | 6
RDFL | DeepFaceLab | Images obtained from DeepFaceLab. | 4
Table 7. Proposed Dataset Features for Real Videos. Source: authors’ contribution.
Prefix | Tool | Description | Samples
C | Celeb-DF | Videos of the Celeb-DF repository: Celeb-real directory in “Celeb-DF.zip”. | 212
Table 8. Classification of Tools according to Our Proposed Reference Model. Source: authors’ contribution.
Type of Resource | Method | Selected Tools
Images | Blind Forensic | Forensically [10], JPEGsnoop [23], ExifTool [24]
Images | Handcrafted ML | Autopsy plugin (SVM) [53]
Images | Deep Learning-based | Image Forgery Detection CNN [15], MantraNet [14], MesoNet [16], Deepfake-Detection [26]
Images | Toolkit | Ghiro [11]
Videos | Blind Forensic | Forensically *, JPEGsnoop *, ExifTool
Videos | Handcrafted ML | Autopsy plugin (SVM)
Videos | Deep Learning-based | Image Forgery Detection CNN *, MesoNet, Deepfake-Detection
Videos | Toolkit | Deepware Scanner [17]
* Tools designed for image analysis, usable on video frames.
Table 9. Forensically–Images: Confusion Matrix. Source: authors’ contribution.
Real Value/Prediction | Positive | Negative
Positive (2095) | 1843 | 252
Negative (2102) | 300 | 1802
Table 10. Image Forgery Detection (CASIA2_WithRot_LR001_b128_nodrop.pt model)–Images: Confusion Matrix. Source: authors’ contribution.
Real Value/Prediction | Positive | Negative
Positive (2095) | 34 | 2061
Negative (2102) | 3 | 2099
Table 11. Autopsy plugin (SVM)–Images: Confusion Matrix. Source: authors’ contribution.
Real Value/Prediction | Positive | Negative
Positive (2077) | 1583 | 494
Negative (2100) | 781 | 1319
Table 12. MesoNet–Images: Confusion Matrix. Source: authors’ contribution.
Real Value/Prediction | Positive | Negative
Positive (2095) | 63 | 2032
Negative (2102) | 38 | 2064
Table 13. Deepfake-Detection–Images: Confusion Matrix. Source: authors’ contribution.
Real Value/Prediction | Positive | Negative
Positive (2077) | 772 | 1305
Negative (2100) | 607 | 1493
Table 14. Comparative Results with General-purpose/Deepfake-specific Methods for the Tested Tools (Metrics for Images). Source: authors’ contribution.
Method | Tool | Precision | Recall | F1-Score | Accuracy
General-purpose | Forensically | 86.0% | 88.0% | 87.0% | 86.9%
General-purpose | Image Forgery Detection CNN | 91.9% | 1.6% | 3.2% | 50.8%
Deepfake-specific | Autopsy plugin (SVM) | 67.0% | 76.2% | 71.3% | 69.5%
Deepfake-specific | MesoNet | 62.4% | 3.0% | 5.7% | 50.7%
Deepfake-specific | Deepfake-Detection | 55.9% | 37.2% | 44.7% | 54.2%
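As a cross-check, the reported figures follow directly from the confusion matrices. A minimal Python sketch (the `metrics` helper is ours, not part of any tool evaluated in the study) computes the standard binary-classification metrics from the Forensically image counts in Table 9:

```python
def metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Binary-classification metrics from confusion-matrix counts, in percent (1 decimal)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # a.k.a. true positive rate
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {name: round(value * 100, 1)
            for name, value in [("precision", precision), ("recall", recall),
                                ("f1", f1), ("accuracy", accuracy)]}

# Forensically on images (Table 9): 1843 fakes detected, 252 missed,
# 300 false alarms, 1802 real images correctly passed.
print(metrics(tp=1843, fn=252, fp=300, tn=1802))
# precision 86.0, recall 88.0, F1 87.0; accuracy computes to ~86.8,
# matching the reported 86.9% to within rounding.
```

The same helper reproduces the other rows of the image and video tables from their respective confusion matrices.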
Table 15. Comparative Results with General-purpose/Deepfake-specific Methods for the Tested Tools (Metrics for Videos). Source: authors’ contribution.
Method | Tool | Precision | Recall | F1-Score | Accuracy
General-purpose | Forensically | 99.5% | 97.7% | 98.6% | 98.6%
General-purpose | Image Forgery Detection CNN | 100.0% | 1.0% | 1.9% | 51.4%
Deepfake-specific | Autopsy plugin (SVM) | 49.7% | 75.6% | 60.0% | 50.3%
Deepfake-specific | Deepware Scanner | 79.5% | 89.2% | 84.1% | 91.2%
Deepfake-specific | Deepfake-Detection | 49.4% | 100.0% | 66.1% | 49.9%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Falcón-López, S.A.; Tobarra, L.; Robles-Gómez, A.; Pastor-Vargas, R. Forensic Analysis of Manipulated Images and Videos. Appl. Sci. 2025, 15, 12664. https://doi.org/10.3390/app152312664
