An Investigation of the Effectiveness of Deepfake Models and Tools

Abstract: With the development of computer vision and deep learning technologies, rapidly expanding approaches have been introduced that allow anyone to create videos and pictures that are both fake and incredibly lifelike. The term deepfake is used to describe such technologies. Deepfake techniques can alter faces in both videos and pictures with extreme realism. Deepfake recordings, the majority of them targeting politicians or celebrities, have been widely disseminated online. In response, different strategies have been outlined in the literature to combat the issues raised by deepfakes. In this paper, we carry out a review by analyzing and comparing (1) the notable research contributions in the field of deepfake models and (2) widely used deepfake tools. We have also built two separate taxonomies for deepfake models and tools. These models and tools are also compared in terms of underlying algorithms, the datasets they have used, and their accuracy. A number of challenges and open issues have also been identified.


Introduction
With the advent of deepfake [1], it is now possible to produce very realistic videos that show humans talking and performing actions that never happened. Deepfake is one of the most recent innovations with a severe social impact. Combined with the speed and accessibility of social media, convincing deepfake content can quickly reach a large audience. As a result, many users experience anxiety over the spread of these fake videos, which has made deepfake both feared and infamous. Distinguishing between legitimate and fake media (i.e., video, image, etc.) with the naked eye is becoming impossible as deepfake technologies improve; it is difficult even with the help of available tools. For facial expression swapping in photos and videos, artificial intelligence (AI)-based applications including Face2Face [2] and FaceSwap [3] have been widely utilized. Using such face-altering techniques, it is possible to change someone's appearance, haircut, age, movement of the lips and eyes, and other physical properties. In this paper, we conduct a review that investigates different deepfake models and tools and their respective performance and effectiveness.
The advancement of computer vision and deep learning technologies has significantly contributed to the proliferation of deepfake techniques. Deepfake creators alter facial expressions, gestures, and even entire identities, which is feasible because models can manipulate and generate content by learning from vast datasets [4]. The emergence of deepfake videos and images raises grave concerns in numerous fields. Deepfakes can undermine public trust and manipulate public opinion in politics [5]. The technology can be used to create counterfeit interviews and speeches that misrepresent politicians, resulting in misinformation and the propagation of deceptive narratives. Deepfakes can be used in the entertainment industry to superimpose the images of celebrities onto explicit content, violating their privacy and harming their reputations [5]. Moreover, in journalism, deepfakes can blur the line between reality and fiction, making it harder to verify news integrity and undermining media outlets' credibility. Beyond these domains, deepfake technology threatens security, privacy, and trust in numerous societal contexts. Therefore, understanding the effectiveness of available deepfake-detection tools [6][7][8] is a pressing topic in combating the social, political, and personal issues raised by deepfake images and videos.
In recent times, a significant amount of research [9][10][11] has been performed in the area of deepfake. As a result, various review papers have compiled summaries in various disciplines. To conduct this research, we first looked at papers on deepfake-detection tools. Depending on the area and particular use case, the fundamental issues include privacy [12,13], face-swapping precision [13], model design, social effects [14], and AI threats [15]. The purpose of this paper is to review the existing research literature and to summarize the cutting-edge strategies that have recently been proposed to address these issues. We also examine the accuracy of deepfake detection in a variety of well-known online tools, together with the technologies and datasets they employ. Additionally, by reviewing the most recent advancements in each of the mentioned deepfake fields of study, we pinpoint the gaps in the existing deepfake surveys. We discuss the issues, usages, and design considerations comprehensively and suggest possible directions for future research.
While some studies [16][17][18] have evaluated the quality of specific deepfake videos, these assessments are often subjective and lack uniformity, making it difficult to compare findings across studies or to identify areas for improvement. The specific objectives of this study are to review and evaluate deepfake generative models and detection tools. Instead of focusing solely on the technical details of deepfake tools and technologies, we intend to determine how effectively these tools can generate authentic and convincing synthetic media that are indistinguishable from real footage or audio. This analysis will provide insight into the strengths and shortcomings of current deepfake technology, which may guide future development.
In summary, we make the following contributions:
• An in-depth, up-to-date review is carried out in the field of deepfake models and deepfake tools.
• Two separate taxonomies are proposed that categorize the existing deepfake models and tools.
• The effectiveness of existing deepfake models and detection tools is compared in terms of underlying algorithms, datasets used, and accuracy.
The organization of the paper is presented graphically in Figure 1. Section 2 describes deepfake and its evolution, and Section 3 describes the materials and methods of the paper. Section 4 presents the positive and negative impacts of deepfake tools and techniques. Section 5 introduces the deepfake classification. Section 6 describes different deepfake models, Section 7 describes different deepfake tools, and Section 8 presents deep-learning-based deepfake tools. Sections 9 and 10 discuss the open issues and challenges in deepfake, and Section 11 concludes the paper.

Deepfake and Its Evolution
In this section, we first explore the concept of deepfake technology and its generation process. Additionally, we examine the historical evolution of deepfake technologies.

What Is Deepfake?
Deepfakes [1] are realistic media (i.e., videos and images) that have been digitally manipulated to show individuals performing actions that never actually occurred. The term combines "deep learning" and "fake". Deepfakes use neural networks that learn to replicate a person's facial gestures, characteristics, speech, and intonations by analyzing enormous quantities of data samples. To train a deep learning system to swap faces, video content of two people is fed into the system. Deepfakes employ a facial positioning system and AI to replace one person's face in a video with the face of another individual.
Figure 2 depicts the face swapping of two persons with their respective facial characteristics and expressions, performed with the FaceDancer [19,20] tool. We collected these human photos from a public repository, Pexels (https://www.pexels.com/search/human/ (accessed on 3 February 2023)). To accomplish the face swapping, we provide the source and target photos. Prior to swapping into the target image, the tool first recognizes the source face's facial region and emotion. The facial region is surrounded by a shaded circle in the figure and consists of the eyes, eyebrows, nose, and mouth. The blue-shaded circle represents the source image's facial area that will be swapped with the red-shaded circle area of the target image. The facial attributes of the source image (e.g., eyes, nose, mouth, etc.) will be swapped with the target facial attributes, but the expression of the target image will remain the same. Finally, the green-shaded circle in the output face-swapped image marks the region that differs.
The majority of successful deepfake video-generation and detection algorithms use two subgroups of neural networks: (1) autoencoders and (2) generative adversarial networks (GANs).
An autoencoder [21][22][23] is a form of neural network that is used in deepfakes. It consists of an encoder, which compresses an image into a latent space with fewer dimensions, and a decoder, which reconstructs the image from that latent representation.
Deepfakes make use of this approach by encoding a person into the latent space using a generic encoder. Important details about the person's facial characteristics and body movements are contained in the latent representation. Then, a model is trained specifically to decode the target, as presented in Figure 3. In other words, the latent representation of the source video's body characteristics and facial traits is overlaid with the target's specific information. After training is finished, a latent face created from image A can be sent to decoder B. Figure 4 shows how the decoder will attempt to recreate image B using data related to image A. Autoencoders have some drawbacks, however: an autoencoder may always return the average of the input set, may always rebuild the input set exactly, or may subtly combine the two flaws; it may also be trained for an unsuitable use case, produce decoding errors, or fail to recognize key elements.
To address the potential weaknesses of autoencoders, generative adversarial network (GAN) [24] models are used. A GAN employs a generator and a discriminator as two opposing sub-models trained without supervision. The generator alters the source that it was trained on to produce fake visual or audio outputs. The generator creates new images using the latent representation of the original material, while the discriminator tries to detect whether an image is generated. As a result, the generator produces incredibly realistic images because any flaws would be detected by the discriminator. The fundamental GAN architecture is depicted in Figure 5.
The mentioned deepfake architectures are baseline models used for generating synthetic data. Based on these architectures, several optimized models have been proposed for the creation or detection of deepfakes. Many areas of research are being investigated to produce cutting-edge models, novel datasets, and deepfake-detection methods. Additionally, current research initiatives seek to lessen the main issues with deepfake, including its negative social impacts and privacy and security concerns.
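The data flow of the autoencoder-based face swap described above can be sketched in a few lines. This is a minimal illustration only, assuming a single linear layer for the shared encoder and for each identity's decoder, with random weights standing in for trained parameters; the class name `FaceSwapAutoencoder`, the dimensions, and the flat-list image format are hypothetical and not taken from any of the cited tools.

```python
import random

def make_matrix(rows, cols, rng):
    """Random weight matrix; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def linear(weights, vector):
    """Matrix-vector product: one dense layer without bias or activation."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

class FaceSwapAutoencoder:
    """Hypothetical toy model: one shared encoder, one decoder per identity."""

    def __init__(self, image_dim=16, latent_dim=4, seed=0):
        rng = random.Random(seed)
        # The encoder is shared, so both identities map into the same latent space.
        self.encoder = make_matrix(latent_dim, image_dim, rng)
        self.decoders = {
            "A": make_matrix(image_dim, latent_dim, rng),
            "B": make_matrix(image_dim, latent_dim, rng),
        }

    def encode(self, image):
        return linear(self.encoder, image)

    def decode(self, latent, identity):
        return linear(self.decoders[identity], latent)

    def reconstruct(self, image, identity):
        """Training-time path: a face of X goes through the encoder and decoder X."""
        return self.decode(self.encode(image), identity)

    def swap(self, image, target_identity):
        """Swap path: the latent face of the source goes to the other decoder."""
        return self.decode(self.encode(image), target_identity)

model = FaceSwapAutoencoder()
face_a = [0.5] * 16                 # flattened toy "image" of person A
latent = model.encode(face_a)       # compressed latent face
swapped = model.swap(face_a, "B")   # decoded with person B's decoder
print(len(latent), len(swapped))
```

The sketch shows only the routing of latent faces between decoders; a real system would train the encoder and both decoders on reconstruction loss and use convolutional layers rather than a single linear map.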

Historical Development of Deepfake
The historical developments in deepfake technology by year from 2014 to 2022 are as follows:
2014: First mention of deep learning in deepfake technology [11] and research on deep-learning-based face recognition systems.
2016: Face2Face: Real-time face capture and reenactment of RGB videos and the first significant deepfake viral video (President Obama) [2].
2017: Development of advanced algorithms for automatic face swap technology and the emergence of advanced tools such as DeepFaceLab [25].
2018: Deepfake pornographic videos [26] start to appear on the internet. Researchers develop detection methods for deepfakes and manipulated videos that impact political campaigns.
2019: Social media companies begin to take action against deepfake videos, and AI companies develop deepfake-detection software.
2020: Deepfake technology continues to evolve with the creation of deepfake voice technology [27], and AI companies develop advanced detection software to counter deepfakes.
2021: Deepfake [1] text technology emerges, allowing for the creation of fake news articles and other written content; the use of deepfakes for fraud and scams increases, and the development of deepfake-detection methods and software continues.
2022: Security measures for deepfake technologies emerge [10], addressing the creation of fake news articles and other written content.
Overall, the historical development of deepfake technology has seen significant advances in both the creation and detection of deepfakes. While the technology has many potential uses, such as in the entertainment industry, it also poses significant risks, particularly in regard to its potential for spreading false information and manipulating public opinion. Figure 6 depicts the historical development of deepfake technologies from 2014 to 2022.

Materials and Methods
In this section, we first conduct a systematic review of deepfake tools, then analyze the quality of the papers, and finally discuss the related works.

Systematic Review of Deepfake Tools' Effectiveness
According to Pilares [28], a systematic review is a suitable method for compiling existing studies and identifying the gaps that can suggest a new area of research. A comprehensive review was performed to compile a summary of the effectiveness of the deepfake tools now available and their underlying models' attempts to address the problems with synthetic media. The review methodology has two main parts: the search phase and the definition of the inclusion and exclusion criteria.
The search step entails specifying the academic resources, digital databases, and search engines that may be used to look for appropriate research, as well as the search queries, which use AND/OR Boolean operators to identify all related results. The databases used for this systematic review are displayed in Table 1. All of the searches are included in Table 2. The following stage involves defining the inclusion criteria (IC) and exclusion criteria (EC). These were carefully laid out to improve query results. The breakdown of IC and EC for this investigation is shown in Table 3. All studies matching an EC were immediately disqualified. To obtain a more targeted search outcome, the titles, abstracts, and full texts of the findings were further filtered as follows.
Title: The keywords given in Table 3 were used to filter out studies that failed to include at least one of them.
Abstract: Papers that fulfill no less than 40% of IC were kept for review.
Full text: Papers should describe diverse models and techniques as well as approaches that address deepfake issues.
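The title and abstract stages above can be expressed as a short filter. This is a sketch under stated assumptions: the keyword list and the example record are hypothetical, and each inclusion criterion is assumed to have already been judged as met or not met by a reviewer. The full-text stage is a qualitative judgment and is not modeled.

```python
def passes_title_filter(title, keywords):
    """Keep a study only if its title contains at least one keyword."""
    lowered = title.lower()
    return any(keyword.lower() in lowered for keyword in keywords)

def passes_abstract_filter(criteria_met, threshold_percent=40):
    """Keep a paper whose abstract fulfills at least 40% of the inclusion criteria.

    criteria_met holds one boolean per inclusion criterion.
    Integer arithmetic avoids floating-point issues at the exact threshold.
    """
    if not criteria_met:
        return False
    return 100 * sum(criteria_met) >= threshold_percent * len(criteria_met)

# Hypothetical keywords and record, for illustration only.
keywords = ["deepfake", "face swap"]
record = {
    "title": "A Survey of Deepfake Detection",
    "criteria_met": [True, False, True, False, False],  # 2 of 5 criteria
}

keep = (passes_title_filter(record["title"], keywords)
        and passes_abstract_filter(record["criteria_met"]))
print(keep)
```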

IC1: Should contain at least one of the keywords.
IC2: Must be included in one of the selected databases.

At the start of the study, a total of 1365 records were obtained, where 805 were gathered through database searches and 560 from other resources. After screening, 214 records from database searches and 128 from other resources were retained. Upon applying eligibility criteria, 31 papers from other sources and 38 studies from the database search were deemed suitable. Eventually, a total of 69 studies were considered relevant for this paper. Figure 7 provides a more detailed illustration of this process.
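The record counts reported above can be checked with simple arithmetic. The snippet below only restates the numbers given in the text; the screening total of 342 is derived here and is not stated in the source.

```python
# Record counts as reported in the text, split by source.
identified = {"database": 805, "other": 560}
screened = {"database": 214, "other": 128}
eligible = {"database": 38, "other": 31}

total_identified = sum(identified.values())  # reported in the text as 1365
total_screened = sum(screened.values())      # derived here, not stated in the text
total_eligible = sum(eligible.values())      # reported in the text as 69

print(total_identified, total_screened, total_eligible)
```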

Analyzing the Effectiveness of Systematic Reviews
The quality of papers included in the systematic review is evaluated using two methods. The first method involves quality evaluation (QE) questions, where each question can be marked 0, 1, or 2 per reviewer ("no" (0), "partially" (1), or "yes" (2)). The score for each question is the cumulative score of two markers, with a maximum score of 4 for each question (i.e., if marker1 gives 2 and marker2 gives 1, then the quality evaluation of this question is 3). The second method involves a pass/fail criterion based on the cumulative score: between 0 and 15 is a failure, and between 16 and 28 is a pass. The reviewers decided on these criteria for inclusion/exclusion of papers in the review. Table 4 provides a summary of the quality assessment conducted for the included papers in the review. All 65 papers reviewed met the threshold criteria and were included in the review. A visual representation of these results can be seen in Figure 8. Table 5 illustrates related survey papers, and Table 6 depicts the accuracy of deepfake-detection techniques based on underlying model and dataset.
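The scoring rule above can be written out as a small function. This is a sketch, not the reviewers' actual procedure: the function names are hypothetical, the example marks are invented, and the use of seven questions is an assumption inferred from the maximum cumulative score of 28 (7 questions with up to 4 points each).

```python
PASS_THRESHOLD = 16  # cumulative scores of 16 to 28 pass; 0 to 15 fail

def question_score(mark_reviewer1, mark_reviewer2):
    """Sum two reviewers' marks (0 = no, 1 = partially, 2 = yes) for one question."""
    for mark in (mark_reviewer1, mark_reviewer2):
        if mark not in (0, 1, 2):
            raise ValueError("each mark must be 0, 1, or 2")
    return mark_reviewer1 + mark_reviewer2

def evaluate_paper(marks):
    """marks: list of (reviewer1, reviewer2) pairs, one pair per QE question."""
    total = sum(question_score(a, b) for a, b in marks)
    return total, ("pass" if total >= PASS_THRESHOLD else "fail")

# Invented marks for a single paper across seven assumed QE questions.
marks = [(2, 1), (2, 2), (1, 1), (2, 2), (2, 1), (1, 1), (2, 2)]
print(question_score(2, 1))   # the worked example from the text
print(evaluate_paper(marks))
```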
The second approach involves making sure that the papers selected for the review are of high quality, with a conference ranking of A or B (conference rank obtained from http://www.conferenceranks.com/ (accessed on 11 January 2023)) or a journal ranking of Q1 or Q2. Out of the 38 papers that were included in this systematic review, 14 papers were from conferences ranked as A, 4 papers were from conferences ranked as B, 7 papers were from arXiv, and the remaining 13 papers were from Q1- and Q2-ranked journals. Figure 9 illustrates the distribution of these papers.

Related Works
The study of deepfake technology and its potential impact has gained significant attention in recent years. Previous research has explored the technical aspects of deepfake creation, including the use of machine learning algorithms and neural networks. However, there is a need for more comprehensive surveys that examine the effectiveness of deepfake tools in creating convincing synthetic media. Several studies [66] have evaluated the quality of specific deepfake videos, but these evaluations are often subjective and lack standardization. To address this gap in the literature, this paper presents a survey of the techniques used to measure the effectiveness of deepfake tools.
One of the most important areas of research regarding deepfake is the creation and detection of deepfakes. Several studies have illustrated various techniques for deepfake creation as well as detection. In [9], a thorough survey of deepfake creation and detection using various machine learning techniques was performed. The authors of [67] categorized deepfake methodologies into four groups: deep-learning-based, classical machine-learning-based, statistical, and blockchain-based techniques. They evaluated the detection capability of these methods using various datasets. Their findings indicate that deep-learning-based methods perform better than the other three categories in detecting deepfakes. In [68], the authors summarize the deepfake-detection methods applied to both face images and videos, based on their performance, results, detection type, and methodology. In addition, they classify the existing deepfake creation techniques into five major categories, which will be reviewed in this study. Similarly, the paper [11] surveys the algorithms utilized in the creation of deepfakes, as well as the methods proposed in the current literature to detect such deepfakes. It includes in-depth discussions of the challenges and research trends relevant to the domain of deepfake technologies and offers insights on possible future directions for research in this area. The survey in [9] also emphasizes the network architectures used for deepfake creation and detection, with thorough discussions on the effectiveness of different deep learning networks and their corresponding architectures utilized in various studies. The paper [69] surveys the tools and algorithms employed for creating and detecting deepfakes. Although it briefly touches on the challenges, advances, and strategies related to deepfakes, the number of studies discussed is limited. The paper [70] conducts a systematic literature review (SLR) of deepfake creation and detection techniques, covering images and videos as well as studies related to deepfake tweets, which are not commonly included in similar surveys.
Finally, there is a growing body of research that has examined the ethical implications of deepfake technology. This includes studies on the potential risks and harms associated with deepfakes, as well as discussions of the ethical considerations involved in their creation and use.
Overall, the literature suggests that there is a need for more standardized and objective techniques for measuring the effectiveness of deepfake tools. This survey aims to provide a comprehensive overview of the existing evaluation methods, to identify areas for improvement, and to inform the development of more robust and effective techniques for evaluating deepfake technology.

Social Impact of Deepfake
Deepfake, one of the most well-known recent innovations, raises many ethical concerns and has both positive and negative effects on society. We briefly explain the merits and limitations of this technology as follows.

Positive Impact
Films, curriculum content, electronic communications, videogames, entertainment, social platforms, medical, nanotechnology, and numerous industry sectors along with clothing and e-commerce are just a few of the industries that benefit from deepfake innovation [1,71].
• Deepfake technology has many advantages for the movie business. For instance, it can be used to update film footage rather than reshoot it or to create artificial voices for performers who lost theirs due to illness. Filmmakers can reproduce iconic movie moments and produce new films in which long-dead performers are brought back to life in post-production with the use of cutting-edge facial editing and visual effects. Deepfake technology also enables automatic and lifelike voice dubbing for films in any language, enhancing the viewing experience for different audiences of movies and instructional media.
• Deepfake technology enables digital doubles of individuals, realistic-sounding and smart-looking assistants [72], and enhanced telepresence in online games and virtual chat environments [73]. This promotes improved online communication and interpersonal relationships [74,75]. The technology can also be beneficial in the social and medical spheres. By digitally bringing a deceased friend "back to life," deepfakes can assist a grieving loved one in saying goodbye. This can help people deal with the death of a loved one [76,77]. Additionally, it can be used to digitally replicate an amputee's leg or assist transgender people in better visualizing their preferred gender.
Thanks to deepfake technology, it is even possible to engage with a younger face that the user may remember [76]. Researchers are also investigating the use of GANs to detect anomalies in X-rays [79] and to accelerate the development of new materials and medical treatments [78].

• Businesses are intrigued by the possibility of brand-applicable deepfake technology since it could significantly change e-commerce and marketing [75]. For instance, businesses can use synthetic models and actresses to display fashionable attire on a wide range of models with various heights, weights, and skin colors [80]. Additionally, deepfakes facilitate highly personalized content that turns customers into models; the technology allows for virtual fitting to help customers see how an outfit would appear on themselves before buying it and can produce targeted fashion advertisements that change based on the season, climate, and audience [75,80].

Negative Impact
Nevertheless, deepfakes create serious risks. They have been employed as a prominent weapon in political propaganda efforts, to tarnish the image of journalists by placing them in pornographic films, and to create entirely new types of interpersonal deception. Because of this, decision makers and experts are alerting the public to the threats that deepfakes may pose. We go over a few of them and describe their effects on society, politics, and the economy.
Revenge porn: The list of harms that deepfakes cause cannot be fully expressed without bringing up their pornographic use. As explained earlier in this paper, the term "deepfakes" derives from the technology's deliberate use to create pornographic material. To put this into context, VOX research [81][82][83][84] indicates that 96% of the deepfake videos in online pornography in 2020 were produced to harm the reputations of their victims. In particular, this puts the reputation of female celebrities at risk, which may cause severe harm if not regulated.
Political propaganda: Deepfakes can be used by political opponents to sway the public and to foster mistrust. Barack Obama was caught on camera stating that individuals in hard-hit areas frequently turned to religion and guns, Mitt Romney said that 47% of Americans were content to rely on the government for basic necessities, and Hillary Clinton was caught on camera dismissing a group of Trump supporters and calling them "deplorable" (in 2008, 2012, and 2016, respectively), all of which were later revealed to be forgeries. The political complexities and effects of deepfakes have the potential to harm global democracy if they are not well regulated.
Scamming and phishing: Deepfake technology might potentially lead to a rise in internet scams, fraudulent allegations, or complaints against businesses.A deepfake such as this is created by capturing an actual incident and then editing the audio to add new conversation in order to deceive viewers.

Ethics of Deepfake
Even though deepfakes have become beneficial in areas such as the film industry, expression, and the arts, they have primarily been weaponized for malicious ends. Deepfakes have the potential to weaken democracy, harm people and businesses, and further diminish public confidence in the media. The use of deepfakes to create a false story is risky and may hurt people and the wider community, whether intentionally or accidentally. Deepfakes, which are not just fake but also incredibly lifelike, could exacerbate the post-truth dilemma since they deceive our most basic auditory and visual senses. Deepfakes made with the intention of intimidating, humiliating, or blackmailing a person are categorically unethical, and their effects on the democratic system need to be considered. Different deepfake domains raise different ethical problems.

• A person may be threatened, intimidated, or suffer mental damage as a result of pornographic deepfakes. Women are treated with harshness and discrimination, which results in psychological pain, injury to one's reputation, bullying, and in certain situations even loss of money or career. When it comes to consensual synthetic pornography, the ethical problem is considerably more complicated. Consensual deepfakes could normalize the concept of synthetic pornography, which might increase worries about the harmful effects of pornography on emotional and sexual development. Some may claim that this is similar to the ethically acceptable activity of sexual fantasizing.
• Synthetic resurrection is another area of concern. People have the ability to determine how their likenesses are used for commercial purposes. The biggest issue with public figures is who will control their voice and appearance once they pass away. Most of the time, these are used primarily for marketing, propaganda, and financial benefit. Deepfakes can be employed to misrepresent political leaders after their deaths in order to further political and legal objectives, which raises issues of morality and ethics. Although there are valid barriers against using a dead person's voice or image for profit, relatives who are granted the right to utilize these attributes may do so for their own business advantage.

• Stretching the truth, emphasizing a political platform, and offering alternative facts are common strategies in politics. They aid in organizing, influencing, and persuading individuals in order to collect funds and votes. Although unethical, political opportunism has become the standard. If politicians decide to employ deepfakes and synthetic media, the results of an election could be significantly affected. People who are deceived cannot make judgments that are in their own best interest. Voters are manipulated into supporting the deceiver's agenda when misleading information about the opposing party is purposefully spread or a candidate is presented with a different version of events [85]. There is little legal remedy for these immoral activities. A deepfake that is employed to frighten people into not casting their ballots is also unethical.
Deepfake suppliers and manufacturers need to be certain that they use and apply synthetic media in an ethical manner. Large technology companies such as Microsoft, Google, and Amazon have an ethical responsibility [86] since they offer the cloud infrastructure and tools needed to quickly and efficiently produce deepfakes. Social media platforms such as Facebook, Twitter, LinkedIn, and TikTok, which allow for the mass distribution of deepfakes, as well as news media organizations, news reporters, politicians and policymakers, and civil society, also have an ethical and moral obligation regarding the use of deepfakes. To overcome ethical concerns, paper [70] proposes how to counter the threats from deepfake technology and to alleviate its impact. Figure 10 depicts the ethics of deepfake.

Deepfake Effectiveness
This study has two taxonomies: (i) underlying deepfake models and (ii) deepfake tools. Both of these taxonomies are divided into two sections: detection and creation. In the first section of the first taxonomy (the deepfake-detection models), ML-based and DL-based models are analyzed on benchmark datasets such as FF, FF++, VidTIMIT, Celeb-DF, DFDC, and COHFACE based on accuracy. The second section of the first taxonomy (the creation models) analyzes GAN-based models, with a brief description of each model's process, application area, and limitations. The first section of the second taxonomy (deepfake tools analysis) covers the creation tools and analyzes the underlying model, its main focus, accuracy, speed, usability, and security. The second section covers the detection tools, along with their accuracy, speed, user-friendliness, and scalability. Section 6 analyzes the deepfake models, and Section 7 explores the tools.

Underlying Deepfake Models
This section covers different techniques for creating and detecting deepfake content such as videos, images, audio, and text. Additionally, deepfake models are divided into creation and detection models. In Figure 11, we present the taxonomy of deepfake models.

Detection Models
The existing models in the literature that are being developed to detect deepfake are studied in this section. Based on the underlying detection algorithm, we can broadly categorize these models into two types: (1) conventional machine-learning-based models and (2) deep-learning-based models. Deep-learning-based models can be further classified into CNN, RNN, and transformer-based models.

Conventional ML-Based Deepfake
Modern machine learning techniques are crucial for comprehending the reasoning behind any decision in a way that can be justified from a human perspective. These techniques provide more control over data and procedures, making them appropriate for the deepfake area. Additionally, changing the model's design and hyperparameters is significantly simpler. Decision trees [40,87], random forests [41,43,88], and other tree-based machine learning techniques use a tree to represent the decision-making process. As a result, the tree-based approach has no interpretability issues. Several other studies utilize support vector machine [42][43][44]89,90], logistic regression [43,44,46], and KNN [44] classifiers and some boosting models (e.g., XGBoost [40,45,91], ADABoost [46]) to identify deepfake.
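As an illustration of how one of the conventional classifiers named above operates, the following is a minimal k-nearest-neighbors (KNN) sketch in plain Python. It is not the pipeline of any cited study: the two-dimensional feature vectors and their interpretation (for example, a blur score and a blending-artifact score) are hypothetical, and a real detector would extract features from face crops and use a tested library implementation.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Label a query by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors with known labels.
train_features = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
                  (0.80, 0.90), (0.90, 0.80), (0.85, 0.95)]
train_labels = ["real", "real", "real", "fake", "fake", "fake"]

print(knn_predict(train_features, train_labels, (0.82, 0.88)))
print(knn_predict(train_features, train_labels, (0.12, 0.18)))
```

The prediction can be explained by listing the neighbors that voted, which is the kind of human-readable justification the paragraph above attributes to conventional models.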

RNN-based Models
One of the most used models for sequential data in deep learning is RNN. Numerous RNN models have been employed in the creation and detection of deepfake images and videos. RNN-based models that can be used to create and detect deepfakes include BiLSTM [94], FaceNet, FacenetLSTM [62], Neural-ODE [95], CLRNet [96,97], CNN+(Bidirectional+entropy RNN) [98], and CNN+RNN [99].
We examined different deep-learning-based models to determine which datasets they used and which achieved the best accuracy for detecting and generating deepfakes, as shown in Table 6. Our taxonomy of these models is based on the technologies illustrated in Figure 11, namely traditional machine learning and deep learning (CNN, GAN, RNN, and transformer).

Accuracy of deepfake-detection models
Table 6 presents different deepfake-detection algorithms, their model types, the popular datasets they use, and their respective detection accuracy. The table presents both independent (i.e., CNN, RNN, and Transformer) and hybrid (MTCNN+RNN) models. We find that CNN-based deepfake models show the greatest diversity, with many transfer learning and attention-based techniques. The majority of the models have been tested on FaceForensics++ (FF++), VidTIMIT, and other datasets. The accuracy of the models ranges from 64.10% (DeepRhythm model on the DFDC dataset) to 99.90% (CNN+attention hybrid model on the Celeb-DF dataset).

Creation Models
This section examines the models that are currently used in the literature to generate deepfakes. Based on the underlying creation methodology, we concentrate on GAN-based creation methods.

GANs-Based Deepfake
Deepfakes are typically produced using methods that rely on generative adversarial networks (GANs), which Goodfellow et al. [24] initially introduced. The authors devised a method for estimating generative models in an adversarial setting that involves training two models at once: (1) a discriminative model D and (2) a generative model G. As Figure 5 shows, the discriminative model D estimates the probability that a sample comes from the training data rather than from the generative model G.
The training procedure for G aims to maximize the probability that D makes a mistake, which gives rise to a min-max two-player game. Mathematically, the generator accepts a random input t with density Wt and generates an output x = G(t; θg) that follows a probability distribution Wg (θg denotes the parameters of the generative model).
The discriminator, D(x; θd), determines the likelihood that x originates from the real example data Wdata (θd denotes the parameters of the discriminative model). The main goal of the training phase is to obtain a generator whose output distribution approximates Wdata. As a result, Wg follows the desired probability distribution Wdata, since the discriminator is "deluded" and can no longer tell samples from Wdata and samples from Wg apart.
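For reference, the min-max game described above can be written as a single value function. The form below is the standard objective of Goodfellow et al. [24], rewritten in the notation of this section; the symbol V is introduced here only for illustration.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim W_{\mathrm{data}}}\bigl[\log D(x;\theta_d)\bigr]
  + \mathbb{E}_{t \sim W_{t}}\bigl[\log\bigl(1 - D(G(t;\theta_g);\theta_d)\bigr)\bigr]
```

The first term rewards D for assigning high probability to real samples, and the second rewards D for rejecting generated ones, while G is trained to make the second term small. At the optimum of this game the generator distribution equals the data distribution and D outputs 1/2 for every sample.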
By treating the unsupervised problem as a supervised one, GANs can automatically train a generative model that produces photo-realistic imitation faces in photos or videos. Many different GAN-based techniques are used to generate and detect deepfakes. Table 7 provides an overview of the GAN-based deepfake detection and generation models.
Our systematic review (MDPI) provides a thorough and current evaluation of deepfake models and tools. It proposes two classification systems to categorize these models and tools and compares their effectiveness based on factors such as algorithms, datasets, and accuracy.
STAR-GAN [33] is a generative adversarial network that trains on image data from all domains and learns the mappings between them with only one generator and one discriminator. This technique performs image-to-image translation and can translate images between several domains with just one model, as shown in Figure 12. It has been compared with other currently used techniques [36,102], and the comparison shows that STAR-GAN produces images of higher visual quality. STYLE-GAN [34], or style generative adversarial network, modifies the generator architecture by mapping points in the latent space to an intermediate latent space that controls the style at each point of the generation process, as shown in Figure 13. Additionally, adding noise as a source of variation at each of these points produces superior outcomes.

Table 7.
GAN Model | Process of the Model | Application Area | Limitation | Used in Deepfake
STARGAN [33] | Uses a single model for image-to-image translation across several domains. | | Tries to manipulate the age of the source images and is unable to generate facial expressions when an incorrect mask vector is used. |
 | | Create highly realistic, high-resolution photographs of people's faces. | Low-density regions are underrepresented in the distribution of the training data, making it more challenging for the generator to learn in those areas. |
 | | Attribute intensity control, attribute style manipulation. | Cannot manipulate style attribute. | [105,111,114]
CycleGAN [36] | Converts a visual representation from one domain to another when no paired examples are available. | Style transfer, object transfiguration, season transfer, photo enhancement. | Many outcomes are hazy and do not keep the level of clarity observed in the input, failing to preserve the identity of the input. | [115-117]
GDWCT [37] | Increases the capacity for styling. | Specially designed for image translation. | Makes mistakes while classifying gender. | [37,105,111]
Ic-GAN [38] | Maps a real picture into a latent space and a conditional representation. | Recreating and altering real-world pictures of faces based on random attributes. | When producing photographs of men, the identity in the picture is not preserved. | [38]
VAE/GAN [39] | While translating an image, replaces element-wise errors with feature-wise errors to capture the distribution of the data effectively. | Applied to pictures of faces to identify element-wise matching. | The relationship between the latent representation and the attributes cannot be modeled. |
 | | Adjustments for changes in attitude and expressions in an image or a video clip. | The texture is blurred when the face recreation generator is used too frequently, and the sparse landmark tracking technique fails to capture the complexity of facial emotions. | [31,110,119]

As a result, STYLE-GAN is able to create images of people's faces that are not only astonishingly lifelike and of high quality but also offer control over the image's overall style at various levels of detail. Even though the generated pseudo-portraits seem realistic, tiny details may reveal that the photos are not real. Karras et al. [120] proposed STYLE-GAN2 to address these flaws in STYLE-GAN. They improved the generator by redesigning the regularization, multi-resolution, and normalization approaches.
In ATT-GAN [35], an attribute classification constraint is applied to the generated image to ensure that only the desired attributes are changed, and that they are changed correctly. This technique avoids imposing constraints on the latent representation. Results show that ATT-GAN outperforms the state-of-the-art techniques, i.e., STARGAN [33], CycleGAN [36], Ic-GAN [38], Fader Networks [121], and VAE/GAN [39], in terms of realistically modifying facial attributes. Figure 14 represents the learning framework of ATT-GAN. The objective for the encoder and decoder [35] is as follows:

$$\min_{G_{enc},\,G_{dec}} \; \lambda_1 L_{rec} + \lambda_2 L_{cls_g} + L_{adv_g} \tag{1}$$

and the objective for the discriminator and the attribute classifier is:

$$\min_{D,\,C} \; \lambda_3 L_{cls_c} + L_{adv_d} \tag{2}$$

Here, in Equations (1) and (2), $G_{enc}$ and $G_{dec}$ represent the encoder and decoder networks, respectively. $L_{rec}$ is the reconstruction loss, which measures the difference between the input and the reconstructed output. $L_{cls_g}$ is the attribute classification constraint loss, which encourages the network to classify the attributes of the input correctly. $L_{adv_g}$ is the adversarial loss for the generator (encoder-decoder), which encourages the generated output to be realistic and attribute-preserving. $D$ is the discriminator network, which distinguishes between real and generated samples. $C$ is the attribute classifier network, which predicts the attributes of the input. $L_{cls_c}$ is the attribute classification loss, which measures the difference between the predicted and true attribute labels. $L_{adv_d}$ is the adversarial loss for the discriminator and attribute classifier, which encourages them to distinguish between real and generated samples accurately. $\lambda_1$, $\lambda_2$, and $\lambda_3$ are hyperparameters that balance the different losses in the objectives.
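The way the hyperparameters balance the loss terms can be sketched in a few lines. The functions below assume the weighting of the ATT-GAN formulation [35], in which λ1 scales the reconstruction loss, λ2 scales the generator-side classification constraint, and λ3 scales the classifier loss. The function names, the loss values, and the weights passed in the example are hypothetical placeholders, not results from any experiment.

```python
def encoder_decoder_objective(l_rec, l_cls_g, l_adv_g, lam1, lam2):
    """Weighted sum minimized by the encoder and decoder."""
    return lam1 * l_rec + lam2 * l_cls_g + l_adv_g


def discriminator_classifier_objective(l_cls_c, l_adv_d, lam3):
    """Weighted sum minimized by the discriminator and attribute classifier."""
    return lam3 * l_cls_c + l_adv_d


# Placeholder per-batch loss values and weights (assumptions for illustration only).
gen_total = encoder_decoder_objective(
    l_rec=0.5, l_cls_g=0.25, l_adv_g=1.0, lam1=100.0, lam2=10.0
)
dis_total = discriminator_classifier_objective(l_cls_c=0.75, l_adv_d=0.5, lam3=1.0)

print(gen_total, dis_total)
```

A large weight on the reconstruction term pushes the encoder-decoder to keep everything except the edited attribute unchanged, while the classification weights control how strongly the requested attributes are enforced.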

Summary on GAN
Table 7 presents variants of GAN-based deepfake creation models. The table first shows the process by which these GAN models produce deepfakes in images. These processes largely use image-to-image translation, the transfer of semantic information to a target with a style, latent and conditional visual spaces, face swapping, etc. We also present major application areas of these GAN-based deepfake generative models, such as realistic image creation, intensity control, style transfer, and season transfer. In addition, we highlight the major limitations of these models in the table. We also present different independent studies in which these generative models have been used, depending on the nature of the problem.

Deepfake Tools
This section covers different tools and methods utilized in creating and detecting deepfakes across various types of media, including videos, images, audio, and text. Additionally, the deepfake tools are divided into two main categories: those used for creating deepfakes and those used for detecting them.
The accuracy of the deepfake tools has been evaluated based on their ability to create high-quality deepfakes, with a high accuracy rating being indicative of better performance. SimSwap, Fewshot FT GAN, FaceShifter, DiscoFaceGAN, Faceapp, StarGan-V2, ATTGAN, Style-Gan, Style-Gan2, Style-Gan3, and CycleGAN have been rated high in terms of accuracy, while StarGan and FaceSwap-Gan have been rated moderate.
The speed of the deepfake creation tools refers to the time taken to generate the deepfakes, with a faster speed rating being better. SimSwap, FaceShifter, and CycleGAN have been rated high in terms of speed, while FaceSwap-Gan, StarGan, Style-Gan, Style-Gan2, and Style-Gan3 have been rated slow.
The usability of the tools is based on their ease of use and the availability of tutorials or documentation to assist users. Faceapp has been rated high in terms of usability, while FaceShifter and ATTGAN have been rated moderate. DiscoFaceGAN and the StarGan variants have been rated low due to the lack of documentation.
The security of the deepfake creation tools is evaluated based on their ability to prevent the creation of malicious deepfakes or to protect the privacy of users. Most of the tools listed in the table have low security ratings, with only Fewshot FT GAN and SimSwap having a moderate rating.
Finally, the availability of deepfake tools refers to their accessibility, with open-source tools being more widely available than paid tools. The table shows that most of the deepfake creation tools are open source, with only Faceapp and SimSwap being paid tools.
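The ratings stated in the paragraphs above can be queried programmatically when shortlisting a tool. The sketch below transcribes only the accuracy and speed ratings that the text states explicitly; tools for which a paragraph gives no rating are left out of the corresponding dictionary rather than guessed. The dictionary layout and the helper name are illustrative choices.

```python
# Ratings transcribed from the text above; unrated tools are omitted on purpose.
ACCURACY = {
    "SimSwap": "high", "Fewshot FT GAN": "high", "FaceShifter": "high",
    "DiscoFaceGAN": "high", "Faceapp": "high", "StarGan-V2": "high",
    "ATTGAN": "high", "Style-Gan": "high", "Style-Gan2": "high",
    "Style-Gan3": "high", "CycleGAN": "high",
    "StarGan": "moderate", "FaceSwap-Gan": "moderate",
}
SPEED = {
    "SimSwap": "high", "FaceShifter": "high", "CycleGAN": "high",
    "FaceSwap-Gan": "slow", "StarGan": "slow", "Style-Gan": "slow",
    "Style-Gan2": "slow", "Style-Gan3": "slow",
}


def tools_with(ratings, level):
    """Return the set of tools whose rating equals the given level."""
    return {name for name, rating in ratings.items() if rating == level}


# Tools rated high on both accuracy and speed.
fast_and_accurate = sorted(tools_with(ACCURACY, "high") & tools_with(SPEED, "high"))
print(fast_and_accurate)
```

The same pattern extends to the usability, security, and availability ratings once they are transcribed for every tool.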

Comparative Analysis of Deepfake Creation Tools
In this discussion, we examine and compare various deepfake creation tools based on their computational efficiency, resilience against adversarial attacks, scalability, and usability. We also explore the advantages and limitations associated with these tools.

Computational efficiency
CycleGAN stands out among these techniques for its comparatively efficient computational performance. It achieves this by utilizing a cycle-consistency loss and by not requiring paired training data. CycleGAN is particularly effective in transferring facial attributes and performing image-to-image tasks. On the other hand, Style-GAN, Style-GAN2, and Style-GAN3 are well known for their impressive image-synthesis capabilities, but they tend to be computationally demanding due to their complex architecture and high-resolution image generation.
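The cycle-consistency idea mentioned above can be illustrated with a small numerical sketch. It assumes the usual L1 form of the loss, in which a sample translated to the other domain and back should return to itself. The mappings G, F_exact, and F_poor are toy stand-ins for the two generators, not trained networks, and the data are made-up vectors.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, xs, ys):
    """Mean of |F(G(x)) - x| over xs plus mean of |G(F(y)) - y| over ys."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


def G(x):  # toy mapping from domain X to domain Y
    return [2.0 * v for v in x]


def F_exact(y):  # exact inverse of G
    return [0.5 * v for v in y]


def F_poor(y):  # imperfect inverse that adds a constant offset
    return [0.5 * v + 1.0 for v in y]


xs = [[1.0, 2.0], [3.0, 4.0]]  # unpaired samples from domain X
ys = [[2.0, 4.0], [6.0, 8.0]]  # unpaired samples from domain Y

print(cycle_consistency_loss(G, F_exact, xs, ys))
print(cycle_consistency_loss(G, F_poor, xs, ys))
```

Because the loss only compares each sample with its own round trip, no paired examples across the two domains are needed, which is the property the paragraph above attributes to CycleGAN.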
For face swapping and modification, tools such as FaceSwap-GAN, SimSwap, Fewshot FT GAN, FaceShifter, DiscoFaceGAN, and Faceapp offer varying levels of computational efficiency. SimSwap, for example, stands out for its lightweight network design, delivering good results with faster inference times.
Tools such as StarGan and StarGan-V2 excel in attribute transfer across different domains but may require moderate to high computational resources, especially for large datasets or high-resolution images. ATTGAN strikes a balance between computational efficiency and result quality and is specifically designed for attribute manipulation in face photographs.
Overall, the computational efficiency of these deepfake tools depends on factors such as model complexity, dataset size, image resolution, and specific task requirements. Researchers and practitioners should weigh their computing needs against the available resources and desired performance levels when selecting a tool. It has been observed that integrating cutting-edge techniques with edge-cloud services and energy-harvesting methods can significantly reduce computational complexity (e.g., [127]).

Robustness against adversarial attack
When considering the resistance to adversarial attacks, deepfake techniques are vulnerable to such attacks due to the inherent characteristics of their generative models. Adversarial attacks can manipulate or deceive these models, compromising the credibility and authenticity of the generated content. However, there are several noteworthy observations regarding the resilience of the mentioned tools.
CycleGAN: CycleGAN is recognized for its capability to handle unpaired data and to perform image-to-image translation. Although it may lack specific defenses against adversarial attacks, its reliance on cycle-consistency loss contributes to preserving the overall integrity of the generated images to some extent.
Style-GAN, Style-GAN2, Style-GAN3: These models are extensively used for producing high-quality images but may be more susceptible to adversarial attacks. Their intricate architectures and the generation of high-resolution images make them potential targets for manipulation and alteration.
FaceSwap-GAN, SimSwap, Fewshot FT GAN, FaceShifter, DiscoFaceGAN, Faceapp: These tools primarily focus on face swapping and modification, and their resistance to adversarial attacks can vary depending on the specific techniques employed. It is important to consider the underlying defenses and precautions implemented in each tool to mitigate potential vulnerabilities.
StarGan, StarGan-V2: These models are designed for attribute transfer between different domains and may not possess explicit defenses against adversarial attacks. Their susceptibility to such attacks can vary depending on the particular implementation and the precautions taken during the training process.
ATTGAN: ATTGAN specializes in manipulating facial attributes and may incorporate certain levels of robustness against adversarial attacks. However, the specific defenses implemented may vary depending on the implementation and training strategies employed.
In summary, the robustness against adversarial attacks in deepfake tools is influenced by several factors, including the underlying architecture, training techniques, and the extent to which specific defenses are integrated. Evaluating the robustness of these tools in real-world scenarios requires careful consideration of potential vulnerabilities and countermeasures. Ongoing research in adversarial attacks and defense mechanisms continues to enhance the resilience of these tools against potential threats. In addition, various blockchain-based techniques (e.g., [128]) are used to protect against such issues. The decentralized and distributed characteristics of blockchain technology provide a level of resistance against tampering with the data stored on the blockchain, making it challenging for adversaries to manipulate the information.
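The tamper resistance mentioned above rests on each block committing to the hash of the previous one. The following sketch shows that mechanism with a single-process hash chain built on SHA-256. It is a simplified illustration under stated assumptions: it omits the distribution and consensus layers of a real blockchain, it is not the scheme of [128], and the record strings and function names are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # placeholder previous-hash value for the first block


def block_hash(index, payload, prev_hash):
    """SHA-256 digest over the block's index, payload, and previous hash."""
    data = f"{index}|{payload}|{prev_hash}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads):
    """Link payloads so that every block commits to its predecessor's hash."""
    chain, prev = [], GENESIS
    for index, payload in enumerate(payloads):
        digest = block_hash(index, payload, prev)
        chain.append(
            {"index": index, "payload": payload, "prev_hash": prev, "hash": digest}
        )
        prev = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check the links between consecutive blocks."""
    prev = GENESIS
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        expected = block_hash(block["index"], block["payload"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


chain = build_chain(["media record A", "media record B", "media record C"])
print(is_valid(chain))

chain[1]["payload"] = "altered record"  # simulate tampering with stored data
print(is_valid(chain))
```

Changing any stored record invalidates its own hash, and recomputing that hash would break the link to the next block, so an adversary would have to rewrite every later block as well.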

Scalability and usability
When considering the scalability and usability of the mentioned deepfake tools, there are important aspects to consider. In terms of scalability, CycleGAN stands out, as it can handle unpaired data and perform various image-to-image translation tasks effectively. However, models such as Style-GAN, Style-GAN2, and Style-GAN3 may face scalability limitations due to their complex architectures and high-resolution image generation, demanding substantial computational resources. The scalability of face swapping and attribute manipulation tools, such as FaceSwap-GAN, SimSwap, Fewshot FT GAN, FaceShifter, DiscoFaceGAN, Faceapp, StarGan, StarGan-V2, and ATTGAN, varies depending on their specific implementation and the complexity of the tasks they handle. Regarding usability, CycleGAN is user-friendly, does not require paired training data, and accommodates unpaired image datasets seamlessly. In contrast, models such as Style-GAN, Style-GAN2, and Style-GAN3 can be more challenging to use due to their intricate architectures and advanced image-synthesis capabilities, often necessitating deep learning expertise and sufficient computational resources. The usability of face swapping and attribute manipulation tools depends on factors such as user interface, documentation, and the technical expertise level required. Ultimately, practitioners and researchers should carefully evaluate scalability and usability factors to select the most suitable deepfake tool for their specific needs and available resources.

Advantages and Limitations
Each deepfake tool has its own set of advantages and limitations, making them suitable for different use cases. It is important to consider the specific requirements, desired outcomes, and available resources when selecting the most appropriate tool for a given task. Additionally, ongoing research and advancements in deepfake technologies may further change the advantages and limitations of these tools. Table 9 presents a comprehensive overview of the advantages and limitations associated with deepfake creation tools across different scenarios.

Tools | Advantages | Limitations
DiscoFace-GAN | Facial attribute editing with control. | Data-handling considerations, limitations with complex facial expressions.
Faceapp | User-friendly interface, a range of transformation filters. | Limited customization and fine-grained control over facial transformations.
 | | Computational resource requirements, challenges with high-resolution images.
ATTGAN | Balance between computational efficiency and result quality. | Hyperparameter tuning, limitations with extreme facial transformations.
 | | Computational intensity, limitations with large-scale datasets.
 | | Hyperparameter tuning, challenges in preserving fine details during image translation.

Deepfake Detection Tools
This section evaluates the effectiveness of deepfake-detection tools. We have examined several widely used and effective deepfake-detection tools.
a. Sensity AI. Sensity AI is deepfake-detection software that uses machine learning and artificial intelligence methods to spot altered material. It is known for finding deepfakes quickly and precisely, even in huge datasets. To stop the spread of deepfakes, the tool has been used by a number of groups, including social networking sites and law enforcement organizations. Sensity AI is designed to scan vast volumes of data rapidly and has demonstrated high accuracy, with accuracy scores of up to 95%. The tool's easy-to-use layout makes it accessible to both technical and lay users, and it is scalable and adaptable to various sectors and user needs. In conclusion, Sensity AI is a useful tool for recognizing and mitigating the risks of deepfakes.
b. Truepic. This is a platform that provides services to verify the authenticity of photos and videos and to detect any manipulation or tampering in the media. It offers various features, including cryptographic techniques to verify authenticity, advanced forensic analysis capabilities, real-time verification, integration with different platforms, a user-friendly interface, and high accuracy rates in detecting manipulated media. Truepic can verify media from various sources and formats, ensuring wide coverage. Additionally, it provides a transparent and auditable trail of the verification process for reliable validation of the media's authenticity.
c. D-ID. This is a tool that utilizes advanced algorithms to protect users' privacy by transforming photos and videos in a way that prevents facial recognition systems from identifying the individual in the media. The tool is effective in achieving this goal, with high accuracy rates in facial anonymization. It is easy to integrate into various applications, works quickly, and is compatible with various platforms such as iOS, Android, and web applications. Additionally, D-ID's algorithm is robust against attacks by adversarial machine learning techniques, ensuring that anonymization remains effective even in the face of such attacks.
d. Amber Video. This is a deepfake-detection tool that is known for its high accuracy in detecting deepfakes, particularly those created using generative models. The tool's effectiveness parameters include the use of multiple levels of detection, continuous learning to adapt to emerging threats, real-time detection with scalability, a cloud-based solution for easy deployment and integration, and customization options for detection rules and thresholds to meet specific needs.
e. Deeptrace. This is a tool for detecting deepfake videos that uses machine learning algorithms. The tool is effective in detecting deepfakes due to its high accuracy rate, which is achieved through the use of state-of-the-art machine learning algorithms. It can detect deepfakes in real time, making it useful for identifying fake videos as they are being shared. Deeptrace uses a combination of audio-, visual-, and text-based analysis to detect deepfakes, which makes it more comprehensive than other tools. The tool is scalable and can analyze large datasets, meeting the needs of different organizations. Additionally, Deeptrace continuously learns and adapts to new deepfake techniques, ensuring that it can effectively detect the latest types of deepfakes.
f. FooSpidy's Fake Finder. This is a deepfake-detection tool that uses image forensics and deep neural networks to identify manipulations in images and videos. Its effectiveness parameters include high accuracy, real-time detection, a user-friendly interface, and customizable settings. The tool has an accuracy rate of over 90% and allows users to quickly upload and scan media files for deepfakes. Additionally, users can customize detection settings to enhance accuracy and precision in identifying deepfakes.
g. DeepSecure.ai. This is a deepfake-detection tool that analyzes the semantic content of videos to detect deepfakes. Its effectiveness parameters include high accuracy, real-time speed, scalability, ease of use, and versatility. It has a reported accuracy rate of 96% in detecting deepfakes and can handle large volumes of videos. Additionally, it can detect a wide range of deepfake techniques, including face swapping and voice cloning, among others. The tool is user-friendly and can be easily integrated into existing workflows, making it accessible to a wide range of users.
h. HooYu. This is a platform for digital identity verification that offers fraud protection, customer onboarding, and identity authentication solutions. To ensure high accuracy, the platform's verification technology makes use of a variety of verification methods and data sources. HooYu's automatic verification procedure is rapid and effective, allowing companies to onboard clients quickly. The platform is built to provide clients with a seamless and user-friendly experience while adhering to regulatory requirements. It is adaptable and may be tailored to fit the unique requirements of different businesses, including e-commerce and financial services.
i. iProov. This is a tool designed to detect deepfakes by using a proprietary technology called Flashmark to identify any signs of manipulation in facial biometric data. The tool's effectiveness lies in its ability to accurately detect even the most sophisticated deepfake attempts and to provide real-time verification of users' faces, and it has a user-friendly interface. It is platform-agnostic, working seamlessly across iOS, Android, and web browsers, and it is trusted by various government agencies that use it to verify identities and prevent fraud.
j. Blackbird.AI. This is a tool designed for deepfake detection that combines machine learning algorithms with human intelligence to identify and classify manipulated media. Its effectiveness parameters include a high detection accuracy of 98%, advanced machine learning algorithms to detect signs of manipulation, and real-time monitoring of various online sources. Blackbird.AI also uses a team of trained analysts to review flagged videos and confirm their authenticity. Users can customize their settings to meet their specific needs, such as setting the threshold for deepfake detection or excluding certain sources from monitoring. Overall, Blackbird.AI is a reliable and effective solution to combat the proliferation of manipulated media online.
k. Cogito. This is an AI-based behavioral analytics platform that uses machine learning algorithms to identify and prevent deepfakes in real time. Its effectiveness parameters include high accuracy and precision in detecting even the most advanced deepfakes, real-time detection capabilities for immediate flagging and minimization of potential harm, user-friendly and seamless integration into business workflows, customization of deepfake detection protocols, and scalability to address evolving threats and challenges. Overall, Cogito's platform is an effective and adaptable solution for businesses seeking to combat the spread of manipulated media online.
l. Veracity.ai. This is a deepfake-detection tool that uses algorithms based on ML and AI to accurately pinpoint media that has been altered. The program is especially helpful for live video streams since it can identify deepfakes in real time. To provide thorough coverage, it can also identify deepfakes across a variety of modalities, including video, audio, and pictures. Veracity.ai is simple to use and does not require any technical knowledge. To improve its effectiveness, it is also regularly updated with the most recent deepfake-detection methods.
m. XRVision Sentinel. XRVision Sentinel is a deep learning technology that analyzes the structure and composition of facial photos and videos to detect those that have been altered. It is capable of spotting deepfakes in a variety of contexts, including political campaigns, news media, and social media. The tool uses advanced machine learning techniques to identify minute variations in facial expressions, lip movements, and eye movements. Additionally, it can recognize deepfakes created using a variety of methods, such as GAN-based models and facial reenactment techniques. Testing of XRVision Sentinel on datasets such as the Deepfake Detection Challenge dataset showed that it had a high degree of accuracy in identifying deepfakes and a low proportion of false positives.
n. Amber Authenticate. This is a tool that utilizes cryptographic techniques to validate the authenticity of image and video content and to prevent the spread of deepfakes. The tool has been shown to be highly accurate and efficient in detecting deepfakes in real time. It is compatible with various media file formats and has a user-friendly interface, making it easy for users of all technical levels to operate. Overall, Amber Authenticate is a versatile and reliable tool for detecting deepfakes across various platforms and applications.
o. FaceForensics++. FaceForensics++ is a deepfake-detection program that focuses on identifying face swaps in videos. It is highly effective at identifying deepfakes by studying minute facial movements and can recognize deepfakes produced using a variety of techniques. FaceForensics++ features an easy-to-use interface that requires little technical knowledge, is open source, and may be freely used and modified by researchers. Finally, it is a scalable tool that can be incorporated into real-time deepfake-detection systems because it can effectively analyze large datasets of videos.
p. FakeSpot. This is an AI-powered web-based tool that can identify fake reviews on e-commerce websites with an accuracy rate of 90%. It can analyze thousands of reviews in seconds and is easy to use, even for non-technical users. FakeSpot also offers a browser extension that can be installed on Chrome, Firefox, and Safari, making it more accessible to users. Additionally, it is compatible with multiple e-commerce platforms, including Amazon, Yelp, TripAdvisor, and Walmart, making it a versatile tool for detecting fake reviews across various websites.

Comparative Analysis of Deepfake Detection Tools
In this section, we compare the tools discussed above in terms of computational efficiency, scalability, robustness against adversarial attacks, and usability. Table 10 provides an overview of various deepfake-detection tools and their effectiveness parameters. Each tool is evaluated based on detection accuracy, speed, user-friendliness, scalability, and integration. Most of the tools show high accuracy in detecting deepfakes, with fast speed and easy-to-use interfaces. They are scalable and can be integrated using APIs and SDKs.
FaceForensics++, an open-source tool, shows high detection accuracy but slow speed, and its use requires technical expertise. FakeSpot, a web-based tool, has moderate detection accuracy but fast speed, with an easy-to-use interface and high scalability. Overall, the table demonstrates the effectiveness of these deepfake-detection tools, which can be useful in combating the spread of fake content. We can consider the following insights.

Computational Efficiency
Sensity AI, D-ID, Amber Video, Deeptrace, and XRVision Sentinel are known for their efficient computational capabilities. They employ advanced algorithms and optimizations to process and analyze large volumes of data efficiently. Tools such as Truepic, HooYu, and Veracity.ai also prioritize computational efficiency but may not offer the same level of optimization as the aforementioned tools.

Scalability
Scalability can vary among these tools depending on their architecture and infrastructure. Tools such as Sensity AI, Amber Video, and Deeptrace have built scalable platforms that can handle high volumes of data and scale with increasing demands. Truepic, XRVision Sentinel, and Veracity.ai also focus on scalability, providing solutions that can adapt to varying data volumes and user requirements.

Robustness against Adversarial Attacks
Robustness against adversarial attacks refers to the ability of the tools to detect and mitigate attempts to manipulate or deceive the system. Tools such as Deeptrace, XRVision Sentinel, and FaceForensics++ have advanced techniques to detect deepfakes and other forms of manipulated media, showcasing strong robustness against adversarial attacks. Truepic, iProov, and Amber Authenticate also prioritize robustness by implementing various verification and authentication mechanisms.
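To make the notion of an attempt to deceive a detector concrete, the sketch below applies one step of the fast gradient sign method (FGSM) to a toy logistic-regression detector. FGSM is used here only as a well-known example of an evasion attack; it is not named in the text above, and none of the listed tools is known to use this detector. The weights, the feature vector, and the step size are hypothetical values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_fake_prob(w, b, x):
    """Probability that feature vector x is fake under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move x by eps along the sign of the loss gradient.

    For binary cross-entropy with a logistic model, the gradient of the
    loss with respect to x is (p - y) * w.
    """
    p = predict_fake_prob(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical detector parameters and features of a fake sample (label 1).
w, b = [2.0, -1.0, 0.5], -0.2
x = [0.9, 0.1, 0.4]

before = predict_fake_prob(w, b, x)
x_adv = fgsm(w, b, x, y=1.0, eps=0.6)
after = predict_fake_prob(w, b, x_adv)

print(before, after)
```

In this toy setting a bounded change to every feature lowers the detector's fake probability, which is the kind of manipulation a robust detection tool is expected to withstand.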

Usability
Usability considers how user-friendly and intuitive the tools are in terms of their interfaces, integration capabilities, and ease of adoption. Tools such as Truepic, HooYu, and Veracity.ai focus on providing user-friendly interfaces and seamless integration options for easy adoption and usage. D-ID, DeepSecure.ai, and Cogito also prioritize usability by offering intuitive workflows and easy-to-understand features.

Advantages and Limitations
Here, we summarize the benefits and drawbacks of the deepfake-detection tools discussed above.
Sensity AI is renowned for its deepfake detection and proactive monitoring capabilities, while Truepic specializes in image and video verification for supply chain authentication. D-ID focuses on anonymizing and securing facial images to protect personally identifiable information (PII), and Amber Video offers AI-powered video verification and real-time monitoring. Deeptrace provides comprehensive deepfake detection and analysis with advanced attribution capabilities, while FooSpidy's Fake Finder offers customizable solutions for specific industries. DeepSecure.ai specializes in securing AI models from adversarial attacks, HooYu focuses on identity verification, and iProov offers biometric authentication. Blackbird.AI focuses on disinformation detection, Cogito offers emotion and sentiment analysis, and Veracity.ai provides media verification and fact-checking solutions. XRVision Sentinel specializes in deepfake detection and video analytics. The limitations of these tools include limited coverage of different types of media manipulation, a narrow focus on specific areas, scalability constraints, and potentially higher implementation costs for enterprise-level solutions.

CNN-Based Tools
We discuss a few of the popular CNN-based modeling tools that we located on the internet.

• Face2Face [2,141] performs accurate real-time facial expression transfer from a source video to a target video. It demonstrates live manipulation of a target YouTube video using a source video stream captured with a webcam. Compared with state-of-the-art reenactment techniques, it achieves better final video quality and run-time.

• FaceSwap [3] carries out face swapping using image blending, Gauss-Newton optimization, and deep-neural-network-based face alignment. The algorithm first detects the face and its landmarks in the input photo. A 3D model is then fitted to the landmarks, and its vertices are projected onto the image space and converted into texture coordinates.
• Deepfake Faceswap [142] is a face-swapping platform that consists of a set of encoder-decoder-based deep learning models. Its developers aim to reduce its potential for abuse while enhancing its usefulness as a tool for research, experimentation, and legitimate face swapping.
• FaceSwap_Nirkin [143,144] is an automatic face-swapping tool. It demonstrates that, rather than requiring algorithms designed specifically for face segmentation, a standard fully convolutional network (FCN) can perform remarkably fast and accurate segmentation if it is trained on a sufficiently large and rich set of examples. It uses this segmentation to handle faces under unconstrained conditions, to fit 3D facial shapes, and to assess the impact of intra-subject and inter-subject face swapping on recognition. It achieves a face-swapping accuracy of around 98.12% on the COFW dataset [145].
• Deepware Scanner [6] is a deepfake-detection tool whose results are reported on a variety of deepfake datasets containing both in-the-wild deepfakes and real videos. It uses an EfficientNet B7 [57,146] model pre-trained on the ImageNet dataset, and the classifier is trained only on Facebook's DFDC [110] dataset, which contains 120k videos. The model is tuned for production use, with an emphasis on fewer false positives. It is a frame-based classifier, which means that it does not take temporal coherence into account. Because video is a temporal medium, we believe this is a significant shortcoming that must be addressed.
• DFace [7,147] is a face-recognition and identification toolkit that focuses on efficiency and usability. It is a trimmed-down version of Timesler's FaceNet [148] repository (built on Inception ResNet (V1) models pretrained on VGGFace2 and CASIA-Webface) with certain enhancements, most importantly regarding memory overflows. MTCNN [149] is applied to detect faces, and FaceNet is employed to create facial embeddings.
• MesoNet [8] is a compact network for detecting facial video forgeries. In [52], the authors examined a technique for automatically detecting altered faces in video recordings. Deepfake and Face2Face are two contemporary methods used to produce highly realistic forged clips. Classical image forensic approaches are typically not well suited to videos because of the strong compression, which severely degrades the data. As a result, the authors use deep learning and build two networks, each with a small number of layers, to concentrate on the mesoscopic properties of the image. They evaluated these fast networks on both an existing dataset and a new dataset that they created from online videos. Their tests show a success rate of over 98% for deepfake and 95% for Face2Face.
• OpenPose [150] is the first real-time multi-person system to jointly detect human body, hand, facial, and foot keypoints (135 keypoints in total) on single images. Zhe Cao et al. [151] offer a real-time method for detecting the 2D poses of multiple people in an image. The method learns to associate body parts with the individuals in the image using a nonparametric representation known as part affinity fields (PAFs). This bottom-up approach delivers high accuracy and real-time performance regardless of the number of people in the image. In earlier work, PAFs and body part location estimates were refined simultaneously across the training stages. Here, refining only the PAFs, rather than both the PAFs and the body part locations, leads to a significant improvement in accuracy and runtime efficiency. The work also presents the first combined body and foot keypoint detector, based on an internal annotated foot dataset that has been publicly released. Finally, a multi-stage CNN model is trained with a deepfake-detection accuracy of about 84.9%, utilizing data from the COCO 2016 keypoints challenge [152] and MPII human multi-person [153] datasets.
• DeepfakesON-Phys [154] is a framework that uses physiological measurement to identify deepfakes. Specifically, it considers heart-rate information obtained through remote photoplethysmography (rPPG). To identify fake videos more effectively, DeepfakesON-Phys employs a convolutional attention network (CAN) [55], which extracts both spatial and temporal information from video sequences. It has been thoroughly evaluated on the latest public datasets in the field, Celeb-DF and DFDC. The results, approximately 98% AUC on both datasets, surpass the state of the art and demonstrate the effectiveness of physiologically based fake detectors for spotting the latest deepfake videos.

• EfficientNet_ViT [57,155] combines EfficientNet and Vision Transformer to detect deepfake videos. In contrast to state-of-the-art techniques, the method employs neither distillation nor ensemble methods. In addition, it provides a simple voting-based inference approach for handling multiple faces in a single video frame. On the DFDC dataset, the best model achieved an accuracy of 95.10%.
• DeepFaceLab [25,156] is the most popular program for making deepfakes. It is used to make more than 95% of deepfake videos and is used by well-known YouTube and TikTok channels (e.g., deeptomcruise [157], arnoldschwarzneggar [158], diepnep [159], deepcaprio [160], VFXChris Ume [161], Sham00k [162], NextFace [163], Deepfaker [164], Deepfakes in movie [165], DeepfakeCreator [166], Jarkan [167]). With this tool, it is possible to swap faces, de-age faces, replace heads, and even manipulate politicians' lips. In DeepFaceLab (DFL), S3FD [168] is used to detect faces, 2DFAN [169] and PRNet [170] are used to align faces, and a fine-grained face-segmentation network (TernausNet [171]) is used to segment faces. The DFL model is trained on the FF++ dataset and achieves a deepfake detection accuracy of around 99%, which is better than that of Face2Face, FaceSwap, and deepfake.
• FakeApp [54,172] is a computer program that enables the production of what are now referred to as "deepfakes".
• The Deepfakesweb [173] app is a cloud-based deepfake tool. The app handles everything else; the user only needs to upload videos and photos and then press a button. The app allows a trained model to be reused, so users can create new videos or improve the face-swapping quality of the result without having to train a model again. The quality and duration of the source videos determine how good the resulting deepfake is.
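To make the voting-based inference mentioned for EfficientNet_ViT more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a face detector and a frame-level classifier have already produced one fake probability per detected face crop, and that each crop carries an identity label. The function name `classify_video`, the input layout, and the 0.5 threshold are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def classify_video(face_scores, threshold=0.5):
    """Aggregate per-face fake probabilities into one video-level decision.

    face_scores: iterable of (identity, fake_probability) pairs, one per
    face crop found in the video (hypothetical input layout).
    Returns (is_fake, per_identity_mean_scores).
    """
    per_identity = defaultdict(list)
    for identity, score in face_scores:
        per_identity[identity].append(score)

    # Average the scores of each identity separately so that one
    # manipulated face is not diluted by other, untouched faces.
    identity_means = {i: mean(s) for i, s in per_identity.items()}

    # Assumed rule: the video is flagged if any identity looks fake on average.
    is_fake = any(m > threshold for m in identity_means.values())
    return is_fake, identity_means
```

Averaging per identity is also a simple way to pool evidence across frames for a purely frame-based classifier, although it does not model temporal coherence.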

GAN-Based Tools
Here, we have covered some of the most popular GAN-based modeling tools we could find online.

• FaceShifter [29,124] is a two-stage face-swapping technique for high-fidelity and occlusion-aware face swapping. It is capable of producing high-quality, identity-preserving face-swapping results. In the first stage, an adaptive embedding integration network (AEI-Net) produces a high-quality swapped face based on the integration of information. In the second stage, in contrast to earlier techniques, a heuristic error acknowledging refinement network (HEAR-Net) handles difficult facial occlusions. The CelebA-HQ [174] and FFHQ [34] datasets are used to train AEI-Net. HEAR-Net, on the other hand, makes use of the faces' upper half. It gives a fake classification accuracy of around 97.38%.
• SimSwap [30,123] is a highly effective face-swapping tool. To help the system implicitly preserve facial attributes, SimSwap introduces the Weak Feature Matching Loss. According to experimental findings, it preserves attributes better than earlier state-of-the-art techniques. It employs a GAN-based model, which is trained on the VGGFace2 [175] and FF++ [54] datasets. The generator consists of three components: an Encoder, an ID Injection Module (IIM), and a Decoder. It produces deepfakes with a 96.57% accuracy rate.
• FaceSwap-GAN [31,122] is one of the GAN-based deepfake techniques. It can produce accurate and reliable eye movements, and it produces videos with better face orientation and improved quality. It uses the Segmentation CNN + Recurrent Reenactment Generator model, is trained on the FF++ [54] dataset, and has a deepfake-detection accuracy of over 99%.
• DiscoFaceGAN [32,125] is a technique for generating synthetic face images of people with fully disentangled, precisely controllable latent representations of their identity, expression, pose, and illumination. Adversarial learning is used to incorporate 3D priors, and the network is trained to imitate the image formation of an analytic 3D face deformation and rendering process.

• FaceApp [126] is a mobile application that enables users to make customized deepfakes. It is a deep-learning-based tool that uses CycleGAN as its model and produces highly accurate results. It creates remarkably realistic facial changes in pictures. Using a mobile phone, it is possible to alter the face, haircut, age, gender, and other characteristics.
Table 11 lists the specific models used by these tools, such as the CNN-based EfficientNet and HEAR-Net. The table also presents the popular datasets used and shows the accuracy of these tools.

Challenges
During the development of this study, we encountered a number of difficulties, which are covered in this section. Figure 16 represents five crucial issues and challenges in deepfake.

• The first challenge concerns the body of research covered. In this study, we compiled the related papers from various conferences, journals, websites, and archives of numerous e-libraries. There is still a chance that our collection lacks some of the relevant papers. Additionally, we might have made a few errors while categorizing these papers using the inclusion and exclusion criteria we employed. To remedy such inaccuracies, we double-checked our evaluation of the papers in our collection.

• Detection algorithms frequently show a performance decline on low-quality videos compared to high-resolution videos. In addition to compression, videos may also undergo operations such as resizing and rotation. Robustness to such transformations is therefore a crucial quality that must be taken into account when constructing detection algorithms.
• In real-world settings, time consumption is of substantial significance. Deepfake-detection techniques will be broadly applied to media services in the near future to minimize the harm that deepfake videos cause to society. However, because of their extensive time requirements, existing detection techniques are still far from being widely used in real-world situations.

• Deepfake models are frequently trained on a specific collection of datasets. When we wish to create deepfake videos of a particular person, the model is unable to produce an accurate output if there is not enough data for that person, and finding sufficient data for a single person can be challenging. Retraining the model for each distinct target also takes time.

• The majority of datasets are developed under highly favorable conditions (such as ideal lighting, flawless facial expressions, and high-quality photographs or videos), but the data supplied in the testing phase often do not match this quality. This makes dataset quality one of the challenging areas.
Despite the variety of deepfake-generation tools available, they are not flawless. In practice, the available tools are purpose-built and concentrate solely on specific traits. Because of the above difficulties, creating a deepfake-generation tool remains a difficult process, and additional study is needed to improve the efficiency of such tools.
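The robustness challenge above (lower quality, resizing, rotation) can be probed with a small test harness. The sketch below is only an illustration under stated assumptions: `detector` is a hypothetical callable that maps a NumPy image array to a fake probability, and the coarse quantization step is a crude stand-in for compression loss, not a real codec.

```python
import numpy as np


def perturbations(frame):
    """Yield (name, perturbed_frame) pairs for a 2D or 3D image array."""
    yield "identity", frame
    yield "rot90", np.rot90(frame)            # 90-degree rotation
    yield "hflip", frame[:, ::-1]             # horizontal flip
    yield "downscale2x", frame[::2, ::2]      # naive 2x downsampling
    # Coarse quantization: a rough proxy for compression artifacts (assumption).
    yield "quantize", (frame // 32) * 32


def score_drift(detector, frame):
    """Return the absolute change in detector score for each perturbation."""
    base = detector(frame)
    return {name: abs(detector(p) - base) for name, p in perturbations(frame)}
```

A detector whose score drifts strongly under these mild transformations is likely to degrade on re-encoded or re-shared videos, which is the failure mode described in the list above.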

Open Research and Future Work
We also foresee several potential research directions for deepfake generation and detection that could address issues with the methods now in use. Figure 16 represents three main future directions for deepfake research.

• Antiforensic technology has emerged because of flaws in present face-forensic technology. Neural networks are frequently employed in deepfake detection to identify fake videos, but because of inherent flaws, they are unable to fend off attacks from adversarial samples [176]. To prevent such attacks, researchers must develop more robust strategies that can withstand the threats that have been identified.

• It has been demonstrated that multitask learning, which involves carrying out several tasks at once, improves prediction performance compared to single-task learning. In particular, combining the forgery localization and deepfake-detection tasks can increase the accuracy of deepfake detection. The model completes the two tasks at once while taking into account the loss incurred by each, which significantly enhances its performance. The authors in [177,178] demonstrate that forgery localization is crucial to the deepfake-detection task. Thus, there is considerable room for improving deepfake detection using multitask learning.
• Another key area on which researchers can focus is enhancing deepfake image- or video-generation models, which currently cannot produce realistic deepfake videos from one or a few photographs or videos. Because the majority of deepfake-generation models are trained on high-quality datasets, users face many challenges in supplying a large number of high-quality images or videos as the testing or source data.
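The multitask idea in the list above, in which detection and forgery localization are trained jointly, can be sketched as a weighted sum of two losses. This is a minimal NumPy illustration and not the formulation used in [177,178]: the choice of binary cross-entropy for both tasks, the function names, and the weight `loc_weight` are assumptions made for the example.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 targets."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(target * np.log(pred)
                           + (1.0 - target) * np.log(1.0 - pred))))


def multitask_loss(cls_prob, cls_label, mask_prob, mask_label, loc_weight=1.0):
    """Combine a video/image-level detection loss with a pixel-level
    forgery-localization loss: L = L_cls + loc_weight * L_loc."""
    l_cls = bce(np.asarray(cls_prob, dtype=float),
                np.asarray(cls_label, dtype=float))
    l_loc = bce(np.asarray(mask_prob, dtype=float),
                np.asarray(mask_label, dtype=float))
    return l_cls + loc_weight * l_loc
```

Setting `loc_weight` to zero recovers single-task detection, which makes it easy to compare the two training regimes in an ablation.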
In summary, the advancements in deepfake technology have highlighted the need for more robust detection methods to counter the flaws in existing face-forensic technology. Antiforensic techniques have emerged to exploit the vulnerabilities of neural networks used for deepfake detection. Multitask learning has shown promise in improving detection accuracy by combining forgery location and deepfake detection tasks. Enhancing deepfake-generation models to create more realistic videos from limited source data is another important research direction. Overall, addressing these challenges is crucial to effectively detect and mitigate the risks associated with deepfakes.

Conclusions
Deep-learning-based forgery technologies have been growing at an unexpectedly rapid rate in recent years. The worldwide reach of the Internet makes it possible for illegal face-altered videos produced by deepfake technologies to spread quickly, harming social stability and individual rights. To mitigate the harmful effects of deepfake videos on individuals, business enterprises and various scientific organizations across the globe are conducting a significant number of studies. In this paper, we discussed various deepfake models, including the models that are employed in the development of well-known online deepfake tools. We presented numerous well-known deepfake tools, together with their characteristics, the accuracy of deepfake models and tools, and a model-based taxonomy. Finally, we covered the existing issues, gave insights into unresolved problems, and outlined future research directions for deepfake generation and detection technologies.

Figure 1. Structure of the paper.

Figure 2. Example of face swapping in deepfake.

Figure 7. Flow diagram of systematic review.

Figure 8. Outcome based on the quality assessment of the 38 papers.

Figure 11. Taxonomy of deepfake models.

6.1.2. DL-Based Deepfake
Deep learning models have been extensively utilized in computer vision because of their method for the selection and extraction of features, which allows them to immediately retrieve or learn features from input.

Provides an overview of deepfake creation algorithms and detection methods. It also covers the challenges and future directions of deepfake technology. Improves understanding of deepfakes by discussing their creation, detection, trends, limitations of current defenses, and areas requiring further research. Offers survey of deepfake algorithms and tools, along with discussions on challenges and research trends. Focuses on recent research on deepfake creation and detection methods, covering tweets, pictures, and videos. It also discusses popular deepfake apps and research in the field.

Figure 16. Summary of challenges and future works.

Table 1. Electronic database search.

Table 2. Search queries used for the systematic review.

Table 3. Inclusion and exclusion criteria.

Table 4. Summary of the results for rating the quality of the papers.

Table 5. Related survey papers.
Groups 112 articles into 4 categories: deep learning, classical machine learning, statistical, and blockchain techniques. Evaluates detection performance on various datasets. Explores trends and challenges in deepfake datasets and detection models, as well as challenges in creating and detecting deepfakes.

Table 6. Accuracy of deepfake-detection techniques based on underlying model and dataset.

Table 7. Overview of GAN models.

Table 8. Effectiveness of deepfake creation tools.

Table 9. Advantages and limitations of deepfake creation tools.

Table 11. Accuracy of some deepfake tools based on models and datasets.