Article

A Decade of Deepfake Research in the Generative AI Era, 2014–2024: A Bibliometric Analysis

by
Btissam Acim
1,*,
Mohamed Boukhlif
2,
Hamid Ouhnni
1,
Nassim Kharmoum
1,3,4 and
Soumia Ziti
1,4,5
1
IPSS Team, Faculty of Sciences, Mohammed V University in Rabat, Rabat 10000, Morocco
2
LTI Laboratory, National School of Applied Sciences, Chouaib Doukkali University, El Jadida 24000, Morocco
3
National Center for Scientific and Technical Research (CNRST), Rabat 10000, Morocco
4
SMSD, Moroccan Society of Digital Health, Rabat 08007, Morocco
5
LIAS, Artificial Intelligence and Systems Laboratory, Hassan II University in Casablanca, Casablanca 20360, Morocco
*
Author to whom correspondence should be addressed.
Publications 2025, 13(4), 50; https://doi.org/10.3390/publications13040050
Submission received: 26 July 2025 / Revised: 18 September 2025 / Accepted: 19 September 2025 / Published: 2 October 2025
(This article belongs to the Special Issue AI in Academic Metrics and Impact Analysis)

Abstract

The recent growth of generative artificial intelligence (AI) has brought new possibilities and revolutionary applications in many fields. It has also, however, created important ethical and security issues, especially with the abusive use of deepfakes, which are artificial media that can propagate very realistic but false information. This paper provides an extensive bibliometric, statistical, and trend analysis of deepfake research in the age of generative AI. Utilizing the Web of Science (WoS) database for the years 2014–2024, the research identifies key authors, influential publications, collaboration networks, and leading institutions. Biblioshiny (Bibliometrix R package, University of Naples Federico II, Naples, Italy) and VOSviewer (version 1.6.20, Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands) are utilized in the research for mapping the science production, theme development, and geographical distribution. A minimum threshold of ten occurrences per keyword was applied to the data to retain only relevant terms. This study aims to provide a comprehensive snapshot of the research status, identify gaps in the knowledge, and direct upcoming studies in the creation, detection, and mitigation of deepfakes. The study is intended to help researchers, developers, and policymakers understand the trajectory and impact of deepfake technology, supporting innovation and governance strategies. The findings highlight a strong average annual growth rate of 61.94% in publications between 2014 and 2024, with China, the United States, and India as leading contributors, IEEE Access among the most influential sources, and three dominant clusters emerging around disinformation, generative models, and detection methods.

1. Introduction

In 2024, 78 countries, including the United States, India, and the European Union, held major elections involving nearly a billion voters, according to the Integrity Institute (Harbath & Khizanishvili, 2023), a nonprofit organization dedicated to improving social media integrity.
The rise of generative AI, especially deepfake technology, could render the 2024 and subsequent elections highly contentious. Deepfakes, enabled by advances in AI and deep learning (Acim et al., 2025a), are a global concern because of their capacity to spread false information, sway public opinion, and endanger democratic processes. In particular, Generative Adversarial Networks (GANs) (Acim et al., 2025b), along with other neural architectures and algorithms, are employed to generate hyper-realistic synthetic media, including videos, images, and audio, commonly referred to as deepfakes (Patel et al., 2023a). Both experts and non-experts can now produce deepfakes with increasing ease thanks to the availability of a wide range of open-source frameworks and commercial applications. Some of the key technologies currently in use (Ouhnni et al., 2025) are shown in Figure 1.
Deepfakes show promise in useful fields (Nikkel & Geradts, 2022) such as education, communication, entertainment, and health, despite being frequently linked to malicious use cases. The increasing dual-use nature of this technology, however, emphasizes the necessity of a methodical and scientific comprehension of its creation, application, and effects. Bibliometric analysis provides a robust quantitative approach to evaluate the development of the scientific literature on deepfakes in order to meet this need.
It facilitates the mapping of intellectual structures, the identification of new subjects, the analysis of citation networks, and the discovery of patterns of collaboration between institutions and researchers (Boukhlif et al., 2023). The increasing use of curated databases, like Scopus (Elsevier, Amsterdam, The Netherlands) and Web of Science (Clarivate Analytics, Philadelphia, PA, USA), in conjunction with bibliometric tools, like VOSviewer, CiteSpace (version 6.2.R2, Drexel University, Philadelphia, PA, USA), Bibliometrix (Bibliometrix R package, University of Naples Federico II, Naples, Italy), and Gephi (version 0.10.1, Gephi Consortium, Paris, France), has significantly expanded the breadth and depth of such analyses (Boukhlif et al., 2024a). Under the direction of the main research questions (W. M. Lim & Kumar, 2023) shown in Figure 2, this study conducts an extensive bibliometric review (Dervis, 2019) of deepfake-related research published between 2014 and 2024.
The review questions offer insights into the dissemination of research, dominant trends, authors, citation patterns, and the overall intellectual climate of a field (D. Garg & Gill, 2023). Moreover, research questions guide bibliometric analysis to explore scientific production, researcher collaborations, emerging themes, and publication activity. This helps to grasp the research terrain and provides insights for decision-makers and the research community. The research provides both theoretical and practical contributions to deepfake research (Donthu et al., 2021).
Theoretical analysis involves identifying research trends, including emerging topics, emerging trends, top authors and publications, and trend-setter journals (Saadouni et al., 2025). This helps researchers navigate the research terrain, revealing research gaps, highlighting emerging directions, and consolidating well-established themes.
Integrating different bibliometric approaches (Hamza et al., 2022) helps researchers plan and innovate research activities. It helps create a well-rounded and systematic picture of deepfakes research and thus establishes a robust foundation for future research to be pursued more strategically and innovatively (Mubarak et al., 2023; Twomey et al., 2025).
This study distinguishes itself by
(i)
Combining a selection of bibliometric indicators for more nuanced examination;
(ii)
Covering a full ten years (2014–2024), recognized as a landmark moment in the history of deepfakes and related events, thus providing a pioneering longitudinal overview of deepfake research in the era of generative AI;
(iii)
Identifying underrepresented topics and collaboration hotspots to guide future research trends in generative AI.
The remainder of this paper is structured as follows: A concise review of the literature on deepfake technologies and their effects on society is given in Section 2. The data collection, search query strategy, and bibliometric methodology are explained in Section 3. In Section 4, the primary results are presented. The implications, study limitations, and suggested future research directions are covered in Section 5. The study is concluded in Section 6.

2. A Brief Literature Review on Deepfake Research

The scientific community has become increasingly interested in deepfake technology (Hydara et al., 2024) over the last ten years, which has resulted in an increase in scholarly publications across a variety of fields. Although the technical, ethical, and societal aspects of deepfakes (Ennejjai et al., 2023; Van Eck & Waltman, 2010) have been the subject of numerous studies, very few have offered a thorough bibliometric analysis (Patel et al., 2023b) of the development of this field of study.
Most prior work has focused on specific aspects, such as generation or detection (Ur Rehman Ahmed et al., 2025; Tolosana et al., 2020; Guarnera et al., 2020) without giving a full picture of scientific dynamics, thematic trends, or the structure of research communities. It is against this background that our study stands out; by conducting a decade-long quantitative bibliometric analysis (2014–2024), this study maps the core domains of deepfake research while also examining their temporal evolution, thematic connections, and emerging trends.
This longitudinal framework thus offers an innovative and coherent explanation of the evolution of the field, both technologically and in terms of practical applications, highlighting the growing role of generative AI in the functioning of scientific production.
Early research work has mainly concentrated on the development of generation models, particularly those based on generative adversarial networks (GANs) (Suratkar et al., 2020), autoencoders (Guo et al., 2021), or unsupervised techniques (Mira, 2023; Z. Sun et al., 2025; Kietzmann et al., 2020). These works significantly improved the visual and audio fidelity of deepfakes (Chintha et al., 2020; Raza et al., 2022; Kohli & Gupta, 2021).
At the same time, synthetic content detection has become a leading research direction, employing convolutional neural networks (CNNs) (Ding et al., 2022), visual artifact detection (Ansorge, 2024), and biometric feature extraction (Abbaoui et al., 2024).
However, these studies are typically conducted in isolation, without establishing connections between generation and detection trends over time.
As artificial intelligence, and more specifically generative AI, has arrived, deepfake research has accelerated to an unprecedented extent (Hu et al., 2022). The availability of increasingly sophisticated models has facilitated the creation of photorealistic synthetic content and simultaneously escalated the challenge of detection and regulation (Mao et al., 2022; Nguyen et al., 2022).
The integration of these new technologies has introduced a qualitative shift in the creation of synthetic videos (Waseem et al., 2023), images (S. Y. Lim et al., 2022), and voices (Kasita, 2022), significantly transforming the course of scientific investigation in this space.
In addition to the strictly technical, several literature reviews have weighed the dangers of malicious deepfake use, particularly disinformation, political manipulation, fraud, and non-consensual pornography (Yasrab et al., 2021). They have been invaluable in raising awareness among the community regarding potential misuse, but not many of them have suggested a systematic classification of topics and fields of application over a period of time (Lakshmi & Hemanth, 2024).
Several recent bibliometric reviews have explored the evolution of deepfake research. “A Decade and a Half of Deepfake Research: A Bibliometric Investigation into Key Themes” analyzed 217 articles from the Scopus database covering 2011 to 2024 (Bisht & Taneja, 2024). The study highlights India’s emergence as a leading contributor in terms of publication volume.
VOSviewer and R were used to visualize collaboration networks and emerging trends. However, because the study draws on a single database, the generalizability of its results is limited.
“Living in the Age of Deepfakes: A Bibliometric Exploration of Trends, Challenges, and Detection Approaches” highlighted the time from 2018 to 2023, studying 918 articles on the Web of Science. It revealed institutions, authors, and global collaboration with a focus on deepfake detection methods (Domenteanu et al., 2024).
“Deepfakes: Evolution and Trends” considered 331 articles from both Web of Science and Scopus. The study examined leading research fields, emerging trends, and funding organizations, presenting a general view of the field (Gil et al., 2023).
Finally, “A Bibliometric Analysis of Deepfakes: Trends, Applications and Challenges” similarly observed a rapid spike in articles from January 2019 through July 2023. The study highlighted higher international collaboration, with the United States leading the trend, and enumerated top journals and organizations that contributed to deepfake research (D. P. Garg & Gill, 2024).
In addition to these bibliometric studies, several works have outlined the historical trajectory of deepfake technology. The period between 2018 and 2020 was marked by the democratization of the technology: tools like DeepFaceLab and FakeApp made deepfake creation accessible to a much wider audience, and in 2019, apps like Zao and Reface made it easy for individuals to create deepfakes on their mobile phones.
These years also saw media frenzy, with deepfakes of politicians bringing the danger of misinformation into the limelight. By 2020, deepfakes were widespread in entertainment and financial fraud schemes (Boukhlif et al., 2024b).
Between 2021 and 2024, deepfakes underwent a phase of regulation and maturation. Facebook, Twitter, and YouTube introduced measures to fight and remove such content, and most countries initiated legislation against abuse. Initiatives such as the Deepfake Detection Challenge were created to stimulate deepfake detection research (Whittaker et al., 2023).
Though the technological advancements have been phenomenal, the regulatory and ethical frameworks are yet to keep pace. Such a mismatch puts a premium on interdisciplinary collaboration for establishing ethical boundaries and effective governance mechanisms (Roe et al., 2024).
By 2023–2024, deepfake technology had advanced to the point where synthetic media were nearly imperceptible to the human eye, posing a serious challenge for security and anti-disinformation campaigns. The future trajectory of deepfakes thus appears convoluted and uncertain (Ramluckan, 2024).
At the same time, there have been a few studies that began researching the creative or benign uses of deepfakes (Cover, 2022; Siegel et al., 2021). In the entertainment industry, the technology has been used to create digital replicas of deceased actors’ faces, dub dialogues with different languages, or create personalized visual effects (Kalaiarasu et al., 2024). In education, new uses are designed to personalize learning materials or provide interactive avatars for training (Lu & Ebrahimi, 2024).
In therapy and health, experimental initiatives have explored the use of deepfakes to simulate therapeutic dialogue, facilitate rehabilitation, or assist cognitively impaired patients (Bukar et al., 2023). Other promising applications have emerged in cultural contexts (museology, historical reenactment) (Aria & Cuccurullo, 2017), media (realistic dubbing, synthetic journalism) (Park et al., 2024), and security (training for visual fraud detection) (Kılıç & Kahraman, 2023). Existing research is still sparse in spite of this growing variety of applications.
Compared to these existing bibliometric studies, the present work makes three additional contributions. First, it emphasizes the intersection of deepfakes with generative AI, unlike earlier studies that considered deepfakes either in isolation or in a broader, less focused context. Second, it employs a full ten-year longitudinal period (2014–2024), providing a more comprehensive temporal analysis than studies restricted to shorter intervals (e.g., 2018–2023).
Third, it integrates multiple bibliometric indicators, including co-authorship, co-citation, keyword evolution, and thematic mapping, to offer a multidimensional view of research evolution, highlighting both technological drivers and societal impacts. Together, these features differentiate our study while maintaining its connection to the trajectory established by previous bibliometric analyses.
Guided by tools such as VOSviewer (Öztürk et al., 2024) and Bibliometrix (Abafe et al., 2022), our study identifies the most significant research avenues, visualizes the main thematic groupings, and compares the scientific productivity trends within generation, detection (Birkle et al., 2020), and application domains (like therapeutic, educational, cultural, security-related, media, and health-related usages).
This approach gives a unique overview of the scientific landscape surrounding deepfakes and highlights future opportunities and research gaps to be addressed to create a better-balanced picture of the threats and opportunities.

3. Materials and Methods

3.1. Data Collection and Filtering

Bibliometric analysis constitutes a rigorous method for mapping and evaluating scientific research, identifying trends, and understanding research dynamics (Mongeon & Paul-Hus, 2016). In contrast to qualitative methods in systematic reviews and meta-analyses, bibliometric analysis uses quantitative indicators like publication and citation numbers, minimizing potential biases stemming from subjective interpretation (Thelwall, 2018).
Bibliometric software enhances the classification, organization, and advancement of scientific research. Bibliographic databases are essential for accessing and analyzing research data. Key options include Web of Science (WoS), Scopus (Elsevier, Amsterdam, The Netherlands), PubMed (National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, MD, USA), and Google Scholar (Google LLC, Mountain View, CA, USA), each with distinct strengths (X. Sun et al., 2022).
WoS stands out with advanced search features, like keyword, author, title, affiliation, and cited reference searches. Scopus and PubMed provide comparable functionalities, albeit with narrower scopes. Google Scholar offers a user-friendly interface and broad disciplinary coverage.
To ensure methodological transparency and reproducibility, a systematic bibliometric process was employed to identify, choose, and incorporate pertinent publications for analysis (Radha & Arumugam, 2021). The process commenced with an exploratory search, which identified the most frequently co-occurring keywords related to deepfakes and generative artificial intelligence from the Web of Science (WoS) Core Collection database (Arruda et al., 2022).
To assess dataset representativeness, we conducted a targeted cross-validation in Scopus and IEEE Xplore (IEEE, Piscataway, NJ, USA) using the same Boolean query. Although Scopus retrieved a greater number of conference papers, especially from major venues like CVPR and ICCV, the overarching thematic trends, collaboration networks, and leading authors remained consistent with those identified in the WoS dataset. These findings confirm that WoS reliably captures the scholarly core of peer-reviewed literature, supporting the validity of our bibliometric analysis.
Based on these outcomes, a narrower Boolean query was constructed by combining the significant keywords with the logical operators “AND” and “OR”, covering thematic topics such as generation, manipulation, detection, and media-related concepts. The query was sufficiently comprehensive to ensure accurate and extensive retrieval of relevant literature (Wang et al., 2021).
The final Boolean query used in the Web of Science was
TS = ((“deepfake” OR “deep fake” OR “fake news” OR “disinformation” OR “manipulation” OR “misinformation” OR “forgery” OR “spoofing”) AND (“deep learning” OR “machine learning” OR “transfer learning” OR “self-supervised learning” OR “gan” OR “generative adversarial networks” OR “cnn” OR “convolutional neural networks” OR “feature extraction”) AND (“artificial intelligence” OR “ai” OR “computer vision” OR “generative ai” OR “generative artificial intelligence”) AND (“detection” OR “identification” OR “recognition”) AND (“face” OR “image” OR “video” OR “voice” OR “speech”)).
The search was limited to English-language articles published between 2014 and 2024, excluding reviews, early access, and retracted papers. Following the extraction of the dataset, a multi-stage filtering procedure was applied to ensure quality and relevance of the selected records. In addition to making the bibliometric analysis more robust, this strict and reproducible process also ensures that the dataset represents mature and peer-reviewed scientific publications.
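The assembly of such a query can be expressed programmatically. The following sketch (not the authors' actual script; the function name and structure are illustrative) joins each of the five keyword groups with OR and the groups with each other with AND, yielding a Web of Science topic-search (TS) string:

```python
# Sketch: assembling the five keyword groups from Section 3.1 into a
# single Web of Science TS query. Group contents are taken verbatim
# from the query reported in the text.
groups = [
    ["deepfake", "deep fake", "fake news", "disinformation",
     "manipulation", "misinformation", "forgery", "spoofing"],
    ["deep learning", "machine learning", "transfer learning",
     "self-supervised learning", "gan", "generative adversarial networks",
     "cnn", "convolutional neural networks", "feature extraction"],
    ["artificial intelligence", "ai", "computer vision",
     "generative ai", "generative artificial intelligence"],
    ["detection", "identification", "recognition"],
    ["face", "image", "video", "voice", "speech"],
]

def build_ts_query(keyword_groups):
    """Join each group with OR, then all groups with AND."""
    clauses = ['(' + " OR ".join(f'"{kw}"' for kw in group) + ')'
               for group in keyword_groups]
    return "TS = (" + " AND ".join(clauses) + ")"

query = build_ts_query(groups)
```

Constructing the query from explicit keyword lists makes the search strategy reproducible and easy to re-run against other databases such as Scopus or IEEE Xplore.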
Figure 3 presents a graphical representation of the search strategy, keyword structure, filtering criteria, and an inclusion and exclusion process that led to the final dataset used in the analysis.
Figure 3 illustrates a three-stage process of article selection: identification, screening, and inclusion. In the identification phase (dark blue section), 1867 articles were retrieved from the WoS database using keywords related to deepfakes, selected based on their frequency of occurrence.
At the screening stage (light blue), the set of articles was refined using an advanced query built from the most frequent keywords (≥10 occurrences) that were grouped into five semantic categories: (1) deepfake-specific terminology, (2) AI techniques, (3) AI general terms, (4) detection methods, and (5) media types. They were combined using the Boolean operators “AND” and “OR” to ensure a comprehensive span of the concept space. Each category serves a distinct purpose:
(1)
Capture the broad spectrum of deceptive practices associated with deepfakes, ensuring the inclusion of diverse forms of media falsification.
(2)
Reflect the technological foundations used both to generate and detect deepfakes.
(3)
Position the study within the broader context of intelligent systems and content synthesis.
(4)
Emphasize the focus on methods for identifying manipulated content, a core concern in deepfake research.
(5)
Specify the formats most affected by deepfakes, enabling precise targeting of relevant studies.
This refined query narrowed the dataset to 1082 documents, and then to 510 after topic filtering was applied. Additional exclusion criteria (language, publication period, and document type) removed a further 100 documents.
Finally, during the inclusion phase (aqua blue section), 410 papers were retained, constituting the final dataset for the bibliometric analysis.
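The selection funnel described above can be summarized in a few lines of code. This is an illustrative reconstruction using the stage names and counts reported in the text; the actual filtering logic lived in the database query interface, not in a script:

```python
# Selection funnel from Figure 3: stage names and document counts as
# reported in Section 3.1 (the filter logic itself is hypothetical).
funnel = [
    ("Identification: initial WoS keyword search", 1867),
    ("Screening: refined Boolean query", 1082),
    ("Screening: topic filtering", 510),
    ("Screening: language / period / document type", 410),
]

def report(stages):
    """Return one summary line per stage, with documents removed per step."""
    lines = []
    prev = None
    for name, count in stages:
        removed = "" if prev is None else f" (-{prev - count})"
        lines.append(f"{name}: {count}{removed}")
        prev = count
    return lines

for line in report(funnel):
    print(line)
```

Tabulating the removals per stage (785, 572, and 100 documents, respectively) makes the attrition at each step explicit, which aids reproducibility audits of the dataset.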

3.2. Bibliometric Methodology

A bibliometric analysis evaluates scientific research production and impact, focusing on deepfake studies by examining paper quantity and quality, leading authors and institutions, prevailing research themes, emerging trends, and impact indicators (Aghaei Chadegani et al., 2013).
It also identifies research gaps, collaboration networks, geographical distribution, and keyword usage. This comprehensive analysis provides a detailed view of deepfake research dynamics and the relevance of published work. The graphical analyses were primarily generated using data from the WoS database with the Bibliometrix R Package (biblioshiny v4.3.2) (Anker et al., 2019).
Additional graphical representations were produced with VOSviewer, a free tool for bibliometric mapping (Dhiman, 2023). The analysis began with a descriptive overview of the dataset elements, which consisted of the following components: the main dataset characteristics, annual scientific production trends, average citations per year, and a three-field plot showing relationships between countries, authors, and cited references.
Scientific mapping involves analyzing and visually representing a field of study using various elements such as scientific observations, measuring instruments, bibliometric metrics, and data analysis tools (El-Gayar et al., 2024). This approach helps researchers identify trends, patterns, and transformations, supporting the development of conceptual frameworks and analytical approaches. During this stage, we focused on some of the most significant areas of bibliometric analysis, like pioneer institutions; the most relevant and cited authors, journals, and countries; the world’s most highly cited papers; productivity patterns based on Lotka’s law; the most frequently used keywords and their co-occurrence networks; and collaboration and co-citation networks at the author and journal levels.
The third section addresses network analysis, which analyzes connections among entities represented as nodes and edges within a graph theoretical framework. Key metrics considered included proximity and centrality indices, with centrality indicating the social importance of nodes based on their role and position in network communications. During this stage, analysis brought to light several elements: thematic mapping of keyword interrelationships, multiple correspondence analysis (MCA) to explore relationships among categorical data, correspondence analysis (CA) to visually represent the relationship between items in frequency tables, and multidimensional scaling (MDS) to reduce data dimensionality and map the network structure.
All of these elements are discussed in Section 5 of this paper. The research approach began with data collection from Web of Science (WoS) using detailed keyword queries for deepfakes and generative artificial intelligence, assisted by a Python (version 3.10, Python Software Foundation, Wilmington, DE, USA) script for keyword extraction (Qu et al., 2024).
The dataset was limited to English-language articles between 2014 and 2024, excluding review articles, retracted articles, early-access articles, and 2025-listed articles. These methodological elements constitute the core of the Discussion and Conclusion Sections of this paper.
Whereas previous bibliometric studies have typically relied on single dimensions, such as citation frequency or keyword co-occurrence patterns, this study integrates multiple dimensions to provide a broader comparative view of deepfake studies. This multidimensional approach permits comparison with current work, the recognition of under-researched areas, and the identification of long-term trends. A full comparison with the relevant literature is provided in Section 5.

4. Results

4.1. Overview Analysis

4.1.1. Main Data

The main dataset on deepfake research was derived from the Web of Science (WoS) database. It covers the period 2014–2024, including 289 sources and 410 documents, with an average annual growth rate of 61.94%. A total of 1354 authors contributed to the dataset, with only 11 single-authored documents and an international co-authorship rate of 26.83%, resulting in an average of 3.08 co-authors per paper. The dataset also includes 1420 author-selected keywords, complemented by the WoS-exclusive ‘Keywords Plus’, which significantly enrich the corpus. It also has 13,831 references, an average document age of 2.81 years, and an average of 11.98 citations per document. These metrics offer a thorough summary of the dataset’s productivity, collaboration trends, and scope (Martin-Rodriguez et al., 2023).

4.1.2. Annual Scientific Production and Average Citation per Year

The annual scientific output on deepfake research has steadily increased over the past decade. Annual production rose from a single article in 2014 to 124 articles in 2024, confirming a strong upward trend in research output (Alnaim et al., 2023). Distinct peaks can be observed, with a sharp rise in 2019–2020 linked to public controversies, such as the DeepNude incident, followed by stabilization in 2021–2022, and another surge in 2023–2024 driven by advances in multimodal generative AI models (e.g., DALL·E, Sora) and the increasing prevalence of voice-based deepfake fraud (Lee et al., 2021). Technological developments in GANs, autoencoders, and widely available open-source tools, along with the proliferation of deepfake applications in social, political, and cybersecurity contexts, have been key drivers of this growth (Li et al., 2018).
At the same time, citation patterns indicate that the earliest years were cited more intensively, with an average of 1.25 citations per paper in 2014 and a peak of 8.08 citations per paper in 2017. Excluding the most recent three years, whose counts are depressed by citation delays, the overall rate averaged 4.80 citations per paper through 2021. Across the whole dataset, the average is 11.98 citations per paper, marking the growing academic impact of this research domain. These trends are illustrated in Figure 4, which plots annual scientific output alongside average annual citations (2014–2024).
However, it should be noted that these results are affected by the well-known time lag effect, which systematically reduces the number of citations for the most recent publications (2023–2024). To address this limitation, we introduced an indicator of the Annualized Citation Rate (ACR), defined as
ACR_i = C_i / Y_i
where ACR_i is the annualized citation rate of publication i, C_i is the number of times the document has been cited, and Y_i is the number of years since its publication.
Figure 5 provides a comparison of absolute citation counts and annualized citation rates for 2014–2024. The results reflect that although 2023–2024 publications indicate lower absolute citation counts, they have comparable, or even greater, annualized citation rates than other years and thereby substantiate their scientific merit despite the time bias.

4.2. Sources Analysis

4.2.1. Top 10 Most-Cited and Productive Sources

The bibliometric analysis identified the top sources in deepfake research in terms of both productivity and impact, as presented in Table 1. IEEE Access emerges as the most productive journal with 30 publications, followed by Multimedia Tools and Applications and Sensors, each with 7 documents. In terms of citations, the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (PROC CVPR IEEE) leads with 1353 local citations, followed by arXiv with 1250 citations and the IEEE Transactions on Information Forensics and Security with 455 citations. Notably, although arXiv is not among the most prolific outlets in terms of publication count, it ranks second in terms of citations, revealing the high visibility of a small group of highly cited preprints. In this context, “local citations” refer to the number of times a source is cited within the analyzed dataset (deepfake research, 2014–2024), whereas “global citations” denotes the total number of citations a source has received across the entire Web of Science database. It is important to note that productivity (number of documents) and impact (local citations) are presented side by side in the table but are independent measures and do not correspond line by line.

4.2.2. Top 10 Most Relevant Authors

Figure 6 presents the top ten most relevant authors (based on number of publications) in deepfake research. Among the 1354 contributing authors across 410 publications, Li X led with ten articles and a fractionalized score of 1.95. It is important to note that this ranking reflects bibliometric productivity (publication count and fractionalized value) and does not assess the originality or impact of the contributions. Zhao Y and Javed A ranked second and third, with six publications each and fractionalized scores of 1.53 and 1.39, respectively.
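Fractionalized scores divide the credit for each paper equally among its co-authors, so an author of an n-author paper receives 1/n credit for it. A sketch of this counting scheme (the author lists here are made up for illustration, not taken from the dataset):

```python
# Fractionalized authorship counting, as used in Bibliometrix:
# each paper contributes 1/n_authors credit to each of its n authors.
from collections import defaultdict

def fractionalized_counts(papers):
    """papers: list of author-name lists; returns author -> fractional credit."""
    scores = defaultdict(float)
    for authors in papers:
        share = 1.0 / len(authors)
        for author in authors:
            scores[author] += share
    return dict(scores)

# Hypothetical example: three papers with overlapping author sets.
demo_papers = [["Li X", "Zhao Y"], ["Li X"], ["Zhao Y", "Javed A", "Li X"]]
demo_scores = fractionalized_counts(demo_papers)
# Li X: 1/2 + 1 + 1/3 ~= 1.83, while the raw publication count is 3.
```

This explains why an author's fractionalized score (e.g., 1.95 for ten articles) can be far below the raw publication count: most of those papers were heavily co-authored.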

4.2.3. Top 10 Corresponding Author’s Countries

As illustrated in Table 2 and Figure 7, this study also examines the corresponding authors’ countries in relation to their contributions to the deepfake field. China ranked first with 64 single-country publications, 19 multi-country collaborations, and the highest frequency score of 0.202. The top three countries in terms of scientific productivity are China, India, and the United States.
The analysis also highlights India’s emergence as a major participant in the study of deepfakes. Ranking second to China, India now surpasses the United States in the number of published papers. This shift reflects the rapid growth of Indian research in generative AI and signals the country’s increasingly prominent role in the global scientific community on emerging technologies.

4.2.4. Top 10 Most Cited Authors Globally

To provide a more detailed analysis of the source papers, this study employed both local and global citation scores. The local citation score counts how often an author's work is cited by other publications within the analyzed collection, whereas the global citation score represents the total number of citations received across the entire WoS database. A higher local citation score indicates a publication's greater significance to the field. Bibliometric techniques were also applied to examine the publishing output of the most influential authors in this field. According to Table 3, CALDELLI R ranked first with 188 citations, averaging 26.857 citations per year. Further analysis indicates that deepfake detection is a key theme in many of the most-cited papers.

4.2.5. Leading Contributors and Influential Publications

Table 4 presents two complementary perspectives in deepfake research. The first part synthesizes the three-field analysis linking top countries, authors, and keywords. The results show that the top contributors are China, the United States, and India, and prolific authors such as Li X, Zhao Y, Zhang Y, Caldelli R, and Javed A are deeply linked with core themes, including deep learning, deepfake detection, and computer vision. This synopsis shows not only the spatial concentration of scientific production but also these researchers' leading roles in shaping the evolution of the discipline. To avoid confusion, the term "deepfake" refers to the technology in general, "deepfakes" refers to individual instances of fake media, and "deepfake detection" refers to research that seeks to develop identification methods. Technical categories, such as feature extraction, represent methodologies applied in detection pipelines.
The second part of Table 4 reports the most globally cited documents in deepfake research (2014–2024). “Globally cited documents” are those that have received citations from other papers across the entire WoS Core Collection. The bibliometric analysis shows that (Mirsky & Lee, 2021), published in ACM Computing Surveys, ranks first with 326 global citations, followed by (Hussain et al., 2019) and (Cozzolino et al., 2017). It is important to note that productivity and influence indicators (authors, countries, keywords vs. globally cited documents) are presented side by side for compactness, but they represent independent dimensions of analysis and do not correspond line by line. This dual presentation provides both a structural overview of the field (countries, authors, keywords) and a clear indication of its most influential publications.

4.2.6. Top 10 Most Cited Documents Locally

The local citation count of a deepfake-related document refers to the number of times it is cited by other articles within the dataset. Figure 8 presents a list of papers with the highest local citations. The top ranked work, with 25 citations, is “The Creation and Detection of Deepfakes: A Survey,” published by Mirsky Y in 2021.

4.2.7. Lotka’s Frequency Distribution of Scientific Productivity

The coefficients of Lotka’s law (Coile, 1977) were calculated for articles in this bibliometric analysis on deepfakes (Amerini et al., 2019). Lotka’s law explains the connection between the quantity of publications produced by authors and the number of authors producing that output. It describes the distribution of authors across specific informatics domains or over time (Mirsky & Lee, 2021). Figure 9 displays the frequency distribution of scientific output according to Lotka’s law. Table 3 provides an analysis of how closely Lotka’s law is adhered to in terms of the quantity of publications and the frequency of authors within the subject under investigation. The solid line represents the theoretical distribution of author productivity expected under Lotka’s law, while the dashed line illustrates the observed distribution in our dataset. The proximity between the two curves indicates the degree to which the data conforms to Lotka’s law.
The findings indicate that Lotka’s law applies to deepfake research, as the comparison between observed and theoretical author productivity distribution reveals that most authors contributed a single publication, whereas a small minority produced multiple works. This pattern highlights the predominance of occasional contributors, with a limited number of highly productive authors generating the bulk of scientific output in the field.
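The comparison between observed and theoretical curves rests on fitting the power law f(x) = C / x^n to the author-productivity distribution. A minimal sketch of such a fit via least squares on the log-log scale is shown below; the author counts are illustrative, not the study's actual data:

```python
# Sketch: fitting Lotka's law f(x) = C / x**n by ordinary least squares
# on log-log scale. Input counts are illustrative.
import math

def fit_lotka(productivity_counts):
    """productivity_counts: {papers_per_author: number_of_authors}.
    Returns (C, n) for the model f(x) = C / x**n."""
    xs = [math.log(x) for x in productivity_counts]
    ys = [math.log(f) for f in productivity_counts.values()]
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / \
            sum((a - mean_x) ** 2 for a in xs)
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # C, exponent n

# Near inverse-square data: 1000 single-paper authors, 250 with two, ...
counts = {1: 1000, 2: 250, 3: 111, 4: 62}
C, n = fit_lotka(counts)
print(round(n, 2))  # close to 2, the classical Lotka exponent
```

An exponent near 2 reproduces the pattern reported above: most authors contribute a single publication, while a small minority account for multiple works.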

4.2.8. Top 10 Most Frequent Words

The most frequently used terms were 'deep learning,' 'computer vision,' and 'deepfake,' which together appeared 240 times, compared with 44 occurrences of 'artificial intelligence'. Figure 10 illustrates the most popular terms in the deepfake field.
Artificial intelligence (AI) has played a central role in deepfake technology development and dissemination. The invention of AI-based techniques, notably deep learning and generative adversarial networks (GANs), significantly enhanced the realism and democratization of synthetic media generation. While “artificial intelligence” was used less frequently than “deep learning” or “computer vision,” its use is essential because it encompasses the general computational paradigm enabling these innovations.

4.2.9. Keywords Co-Occurrence Network

The co-occurrence network of terms is displayed in Figure 11. Each node in this network represents a keyword, and the size of the node indicates how frequently the keyword occurred. The connections between nodes illustrate how often these terms appear together, highlighting the relationships and co-occurrence of keywords within the dataset. The thickness of the connections indicates how frequently two or more terms appear together. A topic cluster is represented by each color. Among the most frequently recurring keywords across several clusters are deepfake, video, deepfake detection, machine learning, and artificial intelligence techniques.
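The network construction described above can be sketched in a few lines: node weights are keyword frequencies and edge weights are the number of papers in which two keywords co-occur. The keyword lists below are illustrative, not the study's data:

```python
# Sketch: building a keyword co-occurrence network of the kind shown
# in Figure 11. Keyword lists per paper are illustrative.
from collections import Counter
from itertools import combinations

papers = [
    ["deepfake", "deep learning", "detection"],
    ["deepfake", "GAN"],
    ["deep learning", "detection", "deepfake"],
]

# Node size ~ keyword frequency
nodes = Counter(kw for kws in papers for kw in kws)

# Edge thickness ~ number of papers in which a pair co-occurs
edges = Counter(
    frozenset(pair)
    for kws in papers
    for pair in combinations(sorted(set(kws)), 2)
)

print(nodes["deepfake"])                            # 3
print(edges[frozenset({"deepfake", "detection"})])  # 2
```

Clustering such a weighted graph (e.g., by modularity, as VOSviewer does) then yields the colored topic clusters described in the figure.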

4.2.10. Countries’ Collaboration Network

China leads the most prominent collaboration cluster, as evidenced in Figure 12 and Figure 13, followed closely by the USA, which anchors a separate yet highly connected cluster. While China and the USA are linked, they serve as the central nodes of distinct major collaboration networks. Other notable regional collaborations include Pakistan–USA and India–USA partnerships, both contributing to the red cluster. The colors indicate the intensity of scientific output by country (darker shades correspond to higher productivity), while the lines represent international collaboration links (thicker lines indicate stronger collaboration).

4.2.11. Organizations Co-Authorship

Figure 14 presents the co-authorship networks among institutions engaged in deepfake research. King Saud University and Shenzhen University emerge as the most central institutions, forming strong collaborative clusters with regional partners, including King Abdulaziz University, Princess Nourah Bint Abdulrahman University, and Taif University in Saudi Arabia, as well as Nanjing University of Information Science and Technology and Hangzhou Dianzi University in China. Additionally, institutions such as the University of Michigan and the University of California, Berkeley participate in international collaborations. These patterns highlight the significant role of both national and cross-border institutional partnerships in advancing research on deepfakes.

4.2.12. Author Co-Citation Network

Figure 15 presents a network illustrating the co-citations of authors, with nine clusters capturing researcher relationships, co-publications, and key contributors within this academic network. “Zhao Y”, “Caldelli R”, “Li Y”, “Liu X,” and “Coccomini Da” emerge as the most influential authors, highlighting their central roles and the collaborative dynamics within the scientific community in the context of this deepfake study.

4.2.13. Network Visualization Map of Journal Co-Citations

Figure 16 exhibits a network illustrating co-citations among journals, where six clusters represent the relationships between these publications. The PROC CVPR IEEE (IEEE Conference on Computer Vision and Pattern Recognition) is the most cited source in this network, followed by ARXIV, the IEEE I CONF COMP VIS (IEEE International Conference on Computer Vision), and the IEEE Access journal.

4.3. Network Analysis

4.3.1. Thematic Mapping

Thematic maps depict clusters of keywords, where each keyword is plotted within a circle based on centrality and density metrics, creating a two-dimensional representation (Singh et al., 2021). As demonstrated in Figure 17, the map is split into quadrants. The upper right quadrant contains “motor themes,” which encompass keywords showing peak development and highest relevance. The lower right quadrant includes “basic themes,” featuring keywords with high relevance but varying levels of development. Keywords with moderate values in both relevance and development are situated in the lower left quadrant, indicating emerging or declining topics. Niche themes in the upper left quadrant represent keywords with notable development potential but potentially lower relevance.
The ‘motor themes’ quadrant appears relatively sparse visually, as key terms such as GANs, detection, and misinformation overlap with the ‘basic themes’ cluster. This reflects a transitional phase in which central topics shift between thematic categories.
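The two axes of such a map, centrality (strength of a cluster's links to other clusters) and density (strength of its internal links), can be sketched as follows. The clusters and co-occurrence weights are illustrative, and real implementations (e.g., Callon's measures in Bibliometrix) normalize these quantities differently:

```python
# Sketch: the centrality/density axes behind a thematic map.
# `cooc` maps keyword pairs to co-occurrence weights; all values
# are illustrative.

def centrality_density(cluster, cooc):
    """cluster: set of keywords; cooc: {frozenset({a, b}): weight}.
    Returns (centrality, density) for plotting the cluster."""
    internal = sum(w for pair, w in cooc.items() if pair <= cluster)
    external = sum(
        w for pair, w in cooc.items()
        if len(pair & cluster) == 1  # exactly one endpoint inside
    )
    density = internal / max(len(cluster), 1)
    return external, density  # x-axis, y-axis

cooc = {
    frozenset({"gan", "detection"}): 5,
    frozenset({"gan", "autoencoder"}): 3,
    frozenset({"detection", "forensics"}): 2,
}
print(centrality_density({"gan", "autoencoder"}, cooc))  # (5, 1.5)
```

High centrality and high density place a cluster in the motor-themes quadrant; high centrality with lower density places it among the basic themes, which matches the transitional pattern noted above.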

4.3.2. Conceptual Mapping of Keywords (MCA, CA, MDS)

To streamline the presentation of results, the results of the factorial and multivariate analyses are integrated in a single block. Figure 18a–c brings together complementary visualizations: (a) multiple correspondence analysis (MCA) factorial map, (b) correspondence analysis (CA) factorial map, and (c) multidimensional scaling (MDS) projection. Overall, the MCA factorial map and dendrogram reveal the presence of two dominant clusters: one centered on deepfakes and misinformation (fake news, media manipulation, political content) and another on detection technologies (generative adversarial network (GAN)-based detection, convolutional neural networks (CNNs), and forensics). These clusters illustrate the dual orientation of the field toward creation and mitigation. The CA factorial map and dendrogram confirm these findings, highlighting the structural proximity between keywords associated with generation (GANs, autoencoders, diffusion models) and those linked to detection and authentication. This indicates that, although often studied separately, these two domains are increasingly interconnected. Finally, the MDS visualization synthesizes the conceptual landscape, mapping thematic proximities in a reduced two-dimensional space. It shows a stable triangle between generation models, detection strategies, and societal/ethical concerns. Together, these analyses provide a coherent conceptual mapping of the domain. They highlight that deepfake research is structured around three poles: creation of synthetic content, development of detection methods, and societal/ethical implications, reflecting both technological drivers and interdisciplinary concerns.
Figure 18 provides an integrated view of the conceptual structure of the field. It confirms that deepfake research is mainly organized around three poles: content creation, detection methods, and societal/ethical implications.

5. Discussion

This study offers a structured bibliometric overview of the evolution of deepfake research in the context of generative AI, reflecting the increasing academic attention to both the risks and the potential of this technology. The bibliometric analysis of 410 articles related to deepfake technology, authored by 1354 researchers (2014–2024), reveals strong collaboration, with only 11 articles being single-authored. Publications show an average annual growth rate of 61.94%, with each article being cited an average of 11.98 times. Key findings include leading Chinese authors (Li X, Zhao Y, and Javed A) and significant connections between research conducted in China, India, and the USA. The analysis identified six distinct groups of keywords, revealing the main themes of research on deepfakes. The top three core topics are as follows:
  • The first group includes 13 keywords related to “deepfakes”, “fake news” scenarios, and their effects, highlighting their role in accelerating the spread of misinformation.
  • The second group of 12 keywords is predominately AI-related, particularly generative AI and GANs (generative adversarial networks), which are extremely important for creating deepfakes.
  • The third group includes 11 keywords pertaining to the detection of “deepfakes”, emphasizing methods for identifying and combating manipulated content.
These findings provide direct responses to the research questions outlined in Figure 2. RQ1, concerning global trends, is reflected in the substantial annual growth of publications from 2014 to 2024. RQ2 and RQ3, regarding the most influential authors, institutions, countries, and sources, are addressed through the identification of leading Chinese researchers and prominent outlets, such as IEEE Access. RQ4, on citation dynamics, is illustrated by observed co-citation patterns and average citation rates. RQ5, on scientific collaborations, is supported by the extensive co-authorship networks spanning countries like China, India, and the United States. RQ6, on emerging themes, is captured by keyword clusters emphasizing disinformation, generative models, and detection methods. Finally, RQ7, on research gaps, highlights underexplored topics, such as multimodal deepfakes and adversarial detection, offering guidance for future studies.
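The 61.94% average annual growth rate cited under RQ1 corresponds to a compound growth calculation over the first and last years of the window. A minimal sketch, with illustrative yearly counts rather than the study's actual data:

```python
# Sketch: the compound annual growth rate behind Biblioshiny's "annual
# growth rate" indicator. Only the first and last counts enter the
# formula; the values here are illustrative.

def annual_growth_rate(first_count, last_count, n_years):
    """Percentage compound growth rate over n_years yearly intervals."""
    return ((last_count / first_count) ** (1 / n_years) - 1) * 100

# e.g., 1 paper in 2014 growing to 124 papers in 2024 (10 intervals)
rate = annual_growth_rate(1, 124, 2024 - 2014)
print(f"{rate:.1f}%")
```

With these illustrative counts the result lands near the 62% range reported in the study, showing how even a modest starting count compounds into a steep decade-long trajectory.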
The presence of distinct clusters of keywords signals the specialization of deepfake research into specialized subdomains. Subsequent research could be advanced by connecting these clusters, for instance, uniting detection methods with media literacy studies to enable societal resistance to manipulated media (Alkhammash, 2023).
The study indicates the ubiquitous use of AI in deepfakes and the necessity of strict controls due to the potential for abuse. Thematic maps depict the composition and evolution of the research, highlighting leading authors and the substantial citation counts of influential studies. The top producers of deepfake research are China, India, and the United States, mirroring their overall scientific productivity.
This study also highlights the rapid ascension of India as a predominant contributor: the country ranks high in publication output and now surpasses the United States, establishing its position among the major powerhouses in deepfake research. These findings match previous bibliometric studies of AI-based technologies, in which similar surges in publication volume and multi-country collaboration have been observed for China, the USA, and India. However, this study demonstrates a sharper focus on disinformation and deepfake detection compared with broader AI bibliometric work.
These findings are consistent with comparative bibliographic database analyses. Singh et al. (2021) demonstrated that 99.11% of the journals covered by WoS are also indexed in Scopus, thus confirming that WoS accurately captures the bulk of the academic literature. Similarly, Pranckutė (2021) emphasized that WoS, being less comprehensive but selective, having rigorous indexing guidelines, and being historically stable, is particularly suited to longitudinal bibliometric analyses. Recently, Maddi et al. (2025) confirmed that WoS and Scopus have overall congruent disciplinary and geographical coverage, which legitimizes applying WoS to determine core academic trends.
In comparison to earlier bibliometric studies, our research contributes additional depth by explicitly examining the intersection of generative AI and deepfakes. This focus demonstrates how rapid advances in generative models, such as GANs, diffusion models, and transformers, have directly driven both the volume and scope of deepfake research, an aspect not highlighted in prior works. By employing a decade-long longitudinal framework (2014–2024) and incorporating multiple indicators (co-authorship, co-citation, keyword evolution, and thematic mapping), this study offers a more comprehensive and multidimensional perspective on the evolution of the field.
Unlike previous bibliometric studies, which have generally adopted a broader perspective by focusing on artificial intelligence as a whole or on image generation techniques, the present analysis provides a dedicated longitudinal focus on deepfakes in the context of generative AI over the full decade of 2014–2024, a period coinciding with the emergence and development of the deepfake phenomenon.
While previous studies have pointed out the proliferation of international AI research and the identification of falsified content, the unique socio-technological dimensions of deepfakes have not received sufficient attention. Our study thus provides significant and unique added value by highlighting the decisive role of generative technologies, such as GANs, in driving the rapid growth of deepfake research. It also shows that this dynamic is not merely a short-lived technological trend but the result of interdisciplinary cross-pollination driven by pressing ethical, security, and media concerns.
In this manner, this analysis offers a unified and objective perspective on the phenomenon, one that facilitates improved understanding of the structural determinants shaping and directing the evolution of deepfake research throughout the period of generative AI.
This study also underscores the interdisciplinary nature of deepfake research, with productive scholars employing diverse methodological approaches. The best journals to publish research on this topic are the IEEE Access Journal and IEEE Transactions on Information Forensics and Security. Analysis has been carried out on deepfake papers procured from the Web of Science Core Collection, noting that the use of different databases may yield varying outcomes.
It is noteworthy that the focus on the decade 2014–2024 reflects the widespread emergence of deepfakes and significant advances in digital content creation and manipulation. The revolution began in 2014 with the introduction of generative adversarial networks (GANs) by Ian Goodfellow (Di Franco, 2016), marking a milestone in generating realistic images and videos. Between 2015 and 2016, the earliest experimental applications of GANs were demonstrated, but deepfakes remained mostly academic.
A breakthrough occurred in 2017 when a Reddit user, ‘deepfakes,’ posted manipulated videos using GANs to superimpose celebrity faces onto explicit content. This broke news around the globe and triggered a wave of bans on different platforms for such content, paradoxically making them more popular (Nenadić & Greenacre, 2007).
Although this technology presents substantial risks, including disinformation, privacy violations, and malicious use, it also offers considerable opportunities for virtualizing the real world across multiple domains. In entertainment and media, deepfake techniques allow for realistic visual content creation, revolutionizing the film industry. In education and training, these technologies provide rich simulations, enriching experiential learning.
In video games and virtual reality, deepfakes contribute to lifelike avatars, enhancing user immersion (Hagele et al., 2023). The technologies are utilized in psychology to create controlled virtual worlds for exposure therapy, assisting patients in overcoming phobias (Goodfellow et al., 2020). Deepfakes are thus changing numerous industries while contributing to a growing virtualization of the real world, opening up opportunities as well as ethical challenges (Maddocks, 2020).
But these prospects must be weighed against the enormous dangers. Under insufficient regulation, these technologies could be misappropriated for manipulation and dissemination of false information in education and medicine (Liu et al., 2023). Therefore, achieving an ethical balance is essential, highlighting responsible design and transparent deployment (Malik et al., 2022). This study not only maps the evolving landscape of deepfake research but also underscores the need for ethical reflection and interdisciplinary collaboration to ensure that innovations in generative AI benefit society (Apolo & Michael, 2024).

5.1. Implications

This bibliometric map offers a comprehensive and current overview of the scientific evolution of deepfake research, emphasizing major research dynamics, global collaborations, and key sources. The findings provide concrete implications for academic research, technological innovation, and public policy.
The identification of premier thematic areas (deepfake detection, GANs, deep learning algorithms, and explainable methods) provides a strategic foundation for guiding future development. The identification of scientific clusters captures the growing importance of hybrid approaches combining artificial intelligence, computer vision, and digital forensics, thereby creating a pathway toward tangible innovations in automated detection.
However, this study also found significant underrepresentation of genuinely interdisciplinary work that integrates AI with legal, ethical, psychological, and cybersecurity considerations. Very few publications address both the technical difficulties of automatic detection and the field-level challenges of deployment in high-stakes contexts such as social networks, election processes, or courtrooms. This gap represents a significant opportunity for socially oriented, high-impact innovation.
In addition, the concentration of publications in certain countries and institutions and the greater visibility of specialist journals reflect a lively and well-consolidated scientific community. These results may encourage researchers, policymakers, and funding agencies to pursue strategic partnerships and foster greater transnational collaboration. They also underscore the need to broaden participation from underrepresented countries, particularly those severely threatened by disinformation based on deepfakes.
Finally, the mapping of scientific contributions reveals a significant geographical imbalance. There is considerable potential for local pilot programs in at-risk regions such as North Africa, Latin America, and South Asia, where deepfakes could have a devastating effect on public trust. These results point to the need to develop anticipatory and regulatory capacities in response to the proliferation of synthetic content. Future studies should focus on explainable and scalable solutions that directly address today’s information integrity and digital trust issues.

5.2. Limitations

Although the bibliometric analysis presented here offers a rigorous and detailed overview of deepfake research within the broad category of generative artificial intelligence, several limitations should be considered.
Although this study relied exclusively on the WoS Core Collection, which ensures data quality and methodological reproducibility, this choice may exclude certain conference proceedings more comprehensively indexed in Scopus and IEEE Xplore. Nonetheless, WoS already captures the academic core of peer-reviewed literature, as confirmed by comparative studies. Future bibliometric research should therefore adopt a multi-database approach (WoS + Scopus + IEEE Xplore) to broaden coverage, reduce potential omissions, and better integrate the growing importance of conferences on deepfake, artificial intelligence, and computer vision.
Additionally, this study was limited to English-language publications. This language-based limitation, which is common in bibliometric studies, ensures data consistency and enables result interpretation at the international level. However, this limitation may result in the exclusion of valuable works written in other languages.
Furthermore, the 2014–2024 time frame was chosen to capture the decade in which deepfake research grew exponentially, especially starting in 2017 with the widespread adoption of GANs. The chosen time frame enhances the relevance of the findings by capturing the core period of accelerating scientific activity in this area. Nevertheless, very recent work (published in 2025) and foundational works (prior to 2014) may have been excluded.
Finally, although the Boolean query was constructed from the most frequently used keywords (≥10 occurrences) grouped into uniform semantic categories, papers that employ non-standardized or less frequent terms may have been missed. This limitation is inherent to any keyword-based search method.
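Assembling such a query amounts to joining quoted terms with OR inside a WoS topic-search clause. The sketch below is illustrative only: the keyword list and the resulting string are hypothetical, not the actual query used in this study.

```python
# Sketch: assembling a WoS-style Boolean topic query from a keyword
# list. The terms below are illustrative, not the study's actual query.
keywords = [
    "deepfake",
    "deepfakes",
    "deepfake detection",
    "generative adversarial network",
]

query = "TS=(" + " OR ".join(f'"{k}"' for k in keywords) + ")"
print(query)
# TS=("deepfake" OR "deepfakes" OR "deepfake detection" OR ...)
```

Quoting each term enforces exact-phrase matching, which is precisely why non-standardized or infrequent variants fall outside the retrieved set.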
Furthermore, one must recognize that the findings regarding new publications (2023–2024) remain affected by the time lag effect since these publications have had little time to gain citations. The inclusion of the annualized citation rate serves to partially offset this bias; nevertheless, conclusions drawn in relation to this period ought to be viewed as tentative and interpreted cautiously. Future longitudinal studies, conducted after 2025, will need to verify the long-term scientific validity and power of these recent results.
Despite these limitations, the methodological choices made here increase the robustness of the study on the basis of established criteria: a reliable data source, a scientifically relevant time frame, and a carefully structured query. As a result, this study presents a solid and orderly perspective on the scientific progression, collaborative networks, and main themes regarding deepfakes.

5.3. Future Research Directions

This bibliometric study offers a rigorous mapping of research on deepfakes and generative artificial intelligence from 2014 to 2024. Based on a robust methodology (semantic query, WoS database, targeted time frame), it serves as a reliable foundation for guiding future research. This mapping may assist researchers in the following:
  • Determining highly concentrated research areas and unexplored gaps (e.g., bias in generative models, ethical issues, or deepfakes in mental health).
  • Targeting emerging niches and generating novel insights in under-researched subfields.
  • Examining interdisciplinary dimensions through co-citation analysis, thematic evolution, or the study of emerging research communities.
This work can support the following:
  • Mixed-method or qualitative studies to complement bibliometric findings.
  • Ethical, legal, and societal considerations for the implications of deepfakes.
  • Longitudinal studies beyond 2025 to monitor the evolution of multimodal generative technologies.
An emerging research direction involves the intersection of large language models (LLMs) and deepfakes. Recent developments in text-driven video generation platforms (e.g., Sora, Runway, Pika Labs) highlight the rapid growth of multimodal deepfakes, in which textual prompts produce highly realistic dynamic videos. However, our bibliometric analysis indicates that this theme is currently underrepresented. Future studies should address this gap by investigating adversarial detection strategies specifically designed for LLM-driven multimodal content, including techniques such as model watermarking for provenance verification and anomaly detection in attention mechanisms. These strategies are crucial for anticipating next-generation threats and ensuring the integrity of digital ecosystems.
Overall, this study is a strategic point of reference for researchers, policymakers, and practitioners who want to create effective work based on current and emerging scientific trends.

6. Conclusions

This paper presents a comprehensive bibliometric analysis of 410 deepfake technology articles published between 2014 and 2024. Using data retrieved from the Web of Science (WoS) database, this study conducted a comparative analysis of global research output, the most prolific authors and organizations, and thematic directions. The analysis employed VOSviewer and Biblioshiny (Bibliometrix R package), enabling high-quality visualization and facilitating effective interpretation.
The results exhibit a staggering average annual growth rate of 61.94% in publications and an average citation rate of 11.98 citations per article. China, the USA, and India are the top three countries in terms of research output. Of the 1354 contributing authors, only 11 papers were single-authored, reflecting close international collaboration. Chinese researchers, especially Li X, Zhao Y, and Javed A, are revealed to be leading actors in the area. The most prolific and highly cited journals include IEEE Access, ACM Computing Surveys (ACM COMPUT SURV), IEEE Transactions on Information Forensics and Security (IEEE T INF FOREN SEC), and the IEEE Sensors Journal.
Keyword co-occurrence analysis identified six thematic clusters in deepfake research. The most prominent among them are as follows: (1) deepfakes and disinformation, highlighting the role played by synthetic media in the spread of misinformation; (2) the central role of generative models, particularly GANs (generative adversarial networks), in the generation of deepfakes; and (3) detection algorithms and methods that are able to identify falsified content. These clusters of themes reflect a dual focus in the literature: as the creation and realism of deepfakes are improving at a rapid rate, detection techniques are also evolving to keep up with these advancements.
Technologically, this study found the widespread use of AI methods such as GANs, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and natural language processing (NLP) methods. The integration of these models produces increasingly advanced and realistic deepfake content that is harder to detect. This technological development poses significant ethical, legal, and societal challenges. Researchers are encouraged to adopt interdisciplinary strategies that go beyond technical solutions, integrating explainable AI detection techniques with perspectives from media studies, legal frameworks, and ethics. Policymakers should implement robust governance measures, such as digital watermarking, provenance tracking, and public awareness initiatives, to effectively mitigate associated risks.
The timeline illustration reveals a major historical progression from the origins of GANs in 2014 and early academic research to the public dissemination of deepfakes in 2017–2019 via online platforms like Reddit, FakeApp, Zao, and Reface. This democratization of deepfake technologies triggered both social alarm and additional media attention. Deepfake technology entered a new phase between 2021 and 2024, marked by international legislation, regulation, and detection campaigns like the Deepfake Detection Challenge. Deepfakes had become almost indistinguishable to the naked eye by 2023–2024, creating new security, privacy, and trust issues.
Despite all its potential for misuse, deepfake technology also holds a transformative upside. It supports applications in virtual reality, experiential learning, psychology, and therapy use cases, such as exposure therapy. In entertainment, deepfakes are revolutionizing visual effects and simulation experiences. These positive use cases highlight the need to balance innovation with ethical safeguards.
In conclusion, this study provides a comprehensive mapping of deepfake research while emphasizing its future implications. For research, it highlights critical gaps, including multimodal deepfakes and adversarial detection approaches. For policy, it underscores the need for international coordination and the implementation of ethical-by-design regulations. For practice, it emphasizes the enhancement of detection methods, preparedness for the impact of large generative models, and promotion of global collaboration. Future research should balance technical advancements with ethical design, user education, and legal oversight. These integrative strategies are essential to ensure that deepfake technologies develop responsibly and contribute positively to the digital ecosystem.

Author Contributions

Conceptualization, B.A. and S.Z.; Methodology, B.A., M.B., N.K. and S.Z.; Software, B.A., M.B. and H.O.; Validation, B.A., H.O., N.K. and S.Z.; Formal analysis, B.A., N.K. and S.Z.; Investigation, B.A.; Resources, N.K. and S.Z.; Data curation, B.A., M.B., H.O. and S.Z.; Writing—original draft, B.A. and H.O.; Writing—review & editing, B.A., M.B. and H.O.; Visualization, B.A., H.O., N.K. and S.Z.; Supervision, N.K. and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding. The APC was funded by the corresponding author, Btissam Acim.

Data Availability Statement

The data supporting the findings of this study are available within the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Abafe, E. A., Bahta, Y. T., & Jordaan, H. (2022). Exploring Biblioshiny for historical assessment of global research on sustainable use of water in agriculture. Sustainability, 14(17), 10651. [Google Scholar] [CrossRef]
  2. Abbaoui, W., Retal, S., El Bhiri, B., Kharmoum, N., & Ziti, S. (2024). Towards revolutionizing precision healthcare: A systematic literature review of artificial intelligence methods in precision medicine. Informatics in Medicine Unlocked, 46, 101475. [Google Scholar] [CrossRef]
  3. Acim, B., Kharmoum, N., Ezziyyani, M., & Ziti, S. (2025a). Mental health therapy: A comparative study of generative AI and deepfake technology. In M. Ezziyyani, J. Kacprzyk, & V. E. Balas (Eds.), International conference on advanced intelligent systems for sustainable development (AI2SD 2024) (Vol. 1403, pp. 351–357). Lecture Notes in Networks and Systems. Springer. [Google Scholar] [CrossRef]
  4. Acim, B., Kharmoum, N., Lagmiri, S. N., & Ziti, S. (2025b). The role of generative AI in deepfake detection: A systematic literature review. In S. N. Lagmiri, M. Lazaar, & F. M. Amine (Eds.), Smart business and technologies. ICSBT 2024 (Vol. 1330, pp. 349–357). Lecture Notes in Networks and Systems. Springer. [Google Scholar] [CrossRef]
  5. Agarwal, S., Farid, H., Fried, O., & Agrawala, M. (2020, June 14–19). Detecting deep-fake videos from phoneme–Viseme mismatches. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2020) (pp. 2814–2822), Seattle, WA, USA. [Google Scholar] [CrossRef]
  6. Aghaei Chadegani, A., Salehi, H., Md Yunus, M. M., Farhadi, H., Fooladi, M., Farhadi, M., & Ale Ebrahim, N. (2013). A comparison between two main academic literature collections: Web of Science and Scopus databases. Asian Social Science, 9(5), 18–26. [Google Scholar] [CrossRef]
  7. Alkhammash, R. (2023). Bibliometric, network, and thematic mapping analyses of metaphor and discourse in COVID-19 publications from 2020 to 2022. Frontiers in Psychology, 13, 1062943. [Google Scholar] [CrossRef] [PubMed]
  8. Alnaim, N. M., Almutairi, Z. M., Alsuwat, M. S., Alalawi, H. H., Alshobaili, A., & Alenezi, F. S. (2023). DFFMD: A deepfake face mask dataset for infectious disease era with deepfake detection algorithms. IEEE Access, 11, 16711–16722. [Google Scholar] [CrossRef]
  9. Amerini, I., Galteri, L., Caldelli, R., & Del Bimbo, A. (2019, October 27–28). Deepfake video detection through optical flow based CNN. IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (pp. 1205–1207), Seoul, Republic of Korea. [Google Scholar] [CrossRef]
  10. Anker, M. S., Hadzibegovic, S., Lena, A., & Haverkamp, W. (2019). The difference in referencing in Web of Science, Scopus, and Google Scholar. ESC Heart Failure, 6(6), 1291–1312. [Google Scholar] [CrossRef]
  11. Ansorge, L. (2024). Bibliometric studies as a publication strategy. Metrics, 1(1), 5. [Google Scholar] [CrossRef]
  12. Apolo, Y., & Michael, K. (2024). Beyond a reasonable doubt? Audiovisual evidence, AI manipulation, deepfakes, and the law. IEEE Transactions on Technology and Society, 5(2), 156–168. [Google Scholar] [CrossRef]
  13. Aria, M., & Cuccurullo, C. (2017). Bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975. [Google Scholar] [CrossRef]
  14. Arruda, H., Silva, E. R., Lessa, M., Proença, D., & Bartholo, R. (2022). VOSviewer and Bibliometrix. Journal of the Medical Library Association, 110(4), 482–488. [Google Scholar] [CrossRef]
  15. Birkle, C., Pendlebury, D. A., Schnell, J., & Adams, J. (2020). Web of Science as a data source for research on scientific and scholarly activity. Quantitative Science Studies, 1(1), 363–376. [Google Scholar] [CrossRef]
  16. Bisht, V., & Taneja, S. (2024). A decade and a half of deepfake research: A bibliometric investigation into key themes. In G. Lakhera, S. Taneja, E. Ozen, M. Kukreti, & P. Kumar (Eds.), Navigating the world of deepfake technology (pp. 1–25). IGI Global. [Google Scholar] [CrossRef]
  17. Boukhlif, M., Hanine, M., & Kharmoum, N. (2023). A decade of intelligent software testing research: A bibliometric analysis. Electronics, 12(9), 2109. [Google Scholar] [CrossRef]
  18. Boukhlif, M., Hanine, M., Kharmoum, N., Ruigómez Noriega, A., García Obeso, D., & Ashraf, I. (2024a). Natural language processing-based software testing: A systematic literature review. IEEE Access, 12, 79383–79400. [Google Scholar] [CrossRef]
  19. Boukhlif, M., Kharmoum, N., Hanine, M., Elasri, C., Rhalem, W., & Ezziyyani, M. (2024b). Exploring the application of classical and intelligent software testing in medicine: A literature review. In Lecture notes in networks and systems (Vol. 904, pp. 37–46). Springer. [Google Scholar] [CrossRef]
  20. Bukar, U. A., Sayeed, M. S., Razak, S. F. A., Yogarayan, S., Amodu, O. A., & Mahmood, R. A. R. (2023). A method for analyzing text using VOSviewer. MethodsX, 11, 102339. [Google Scholar] [CrossRef]
  21. Chintha, A., Thai, B., Sohrawardi, S. J., Bhatt, K., Hickerson, A., & Wright, M. (2020). Recurrent convolutional structures for audio spoof and video deepfake detection. IEEE Journal of Selected Topics in Signal Processing, 14(5), 1024–1037. [Google Scholar] [CrossRef]
  22. Coile, R. C. (1977). Lotka’s frequency distribution of scientific productivity. Journal of the American Society for Information Science, 28(6), 366–370. [Google Scholar] [CrossRef]
  23. Cover, R. (2022). Deepfake culture: The emergence of audio-video deception as an object of social anxiety and regulation. Continuum, 36(4), 609–621. [Google Scholar] [CrossRef]
  24. Cozzolino, D., Poggi, G., & Verdoliva, L. (2017, June 20–22). Recasting residual-based local descriptors as convolutional neural networks: An application to image forgery detection. 5th ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec ’17) (pp. 159–164), Philadelphia, PA, USA. [Google Scholar] [CrossRef]
  25. Dervis, H. (2019). Bibliometric analysis using Bibliometrix: An R package. Journal of Scientific Research, 8(3), 156–160. [Google Scholar] [CrossRef]
  26. Dhiman, B. (2023). Exploding AI-generated deepfakes and misinformation: A threat to global concern in the 21st century. Qeios. [Google Scholar] [CrossRef]
  27. Di Franco, G. (2016). Multiple correspondence analysis: One only or several techniques? Quality & Quantity, 50, 1299–1315. [Google Scholar] [CrossRef]
  28. Ding, F., Zhu, G., Li, Y., Zhang, X., Atrey, P. K., & Lyu, S. (2022). Anti-forensics for face swapping videos via adversarial training. IEEE Transactions on Multimedia, 24, 3429–3441. [Google Scholar] [CrossRef]
  29. Domenteanu, A., Tătaru, G.-C., Crăciun, L., Molănescu, A.-G., Cotfas, L.-A., & Delcea, C. (2024). Living in the age of deepfakes: A bibliometric exploration of trends, challenges, and detection approaches. Information, 15(9), 525. [Google Scholar] [CrossRef]
  30. Donthu, N., Kumar, S., Mukherjee, D., Pandey, N., & Lim, W. M. (2021). How to conduct a bibliometric analysis: An overview and guidelines. Journal of Business Research, 133, 285–296. [Google Scholar] [CrossRef]
  31. El-Gayar, M. M., Abouhawwash, M., Askar, S. S., & Sweidan, S. (2024). A novel approach for detecting deep fake videos using graph neural network. Journal of Big Data, 11(1), 27. [Google Scholar] [CrossRef]
  32. Ennejjai, I., Ariss, A., Kharmoum, N., Rhalem, W., Ziti, S., & Ezziyyani, M. (2023). Artificial intelligence for fake news. In J. Kacprzyk, M. Ezziyyani, & V. E. Balas (Eds.), International conference on advanced intelligent systems for sustainable development (AI2SD) (Vol. 637, pp. 65–74). Lecture Notes in Networks and Systems. Springer. [Google Scholar] [CrossRef]
  33. Garg, D., & Gill, R. (2023, December 1–3). Deepfake generation and detection—An exploratory study. 2023 10th IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON) (pp. 888–893), Gautam Buddha Nagar, India. [Google Scholar] [CrossRef]
  34. Garg, D. P., & Gill, R. (2024). A bibliometric analysis of deepfakes: Trends, applications and challenges. ICST Transactions on Scalable Information Systems, 11(6), e4883. [Google Scholar] [CrossRef]
  35. Gil, R., Virgili-Gomà, J., López-Gil, J. M., & García, R. (2023). Deepfakes: Evolution and trends. Soft Computing, 27(14), 11295–11318. [Google Scholar] [CrossRef]
  36. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11), 139–144. [Google Scholar] [CrossRef]
  37. Guarnera, L., Giudice, O., & Battiato, S. (2020). Fighting deepfake by exposing the convolutional traces on images. IEEE Access, 8, 165085–165098. [Google Scholar] [CrossRef]
  38. Guo, Z., Yang, G., Chen, J., & Sun, X. (2021). Fake face detection via adaptive manipulation traces extraction network. Computer Vision and Image Understanding, 204, 103170. [Google Scholar] [CrossRef]
  39. Hagele, D., Krake, T., & Weiskopf, D. (2023). Uncertainty-aware multidimensional scaling. IEEE Transactions on Visualization and Computer Graphics, 29(1), 23–32. [Google Scholar] [CrossRef]
  40. Hamza, A., Javed, A. R., Iqbal, F., Kryvinska, N., Almadhor, A. S., & Jalil, Z. (2022). Deepfake audio detection via MFCC features using machine learning. IEEE Access, 10, 134018–134028. [Google Scholar] [CrossRef]
  41. Harbath, K., & Khizanishvili, A. (2023). Insights from data: What the numbers tell us about elections and future of democracy. Integrity Institute. Available online: https://integrityinstitute.org/blog/insights-from-data (accessed on 15 July 2025).
  42. Hu, J., Liao, X., Wang, W., & Qin, Z. (2022). Detecting compressed deepfake videos in social networks using frame-temporality two-stream convolutional network. IEEE Transactions on Circuits and Systems for Video Technology, 32(3), 1089–1102. [Google Scholar] [CrossRef]
  43. Huang, L., Li, J., Hao, H., & Li, X. (2018). Micro-seismic event detection and location in underground mines by using Convolutional Neural Networks (CNN) and deep learning. Tunnelling and Underground Space Technology, 81, 265–276. [Google Scholar] [CrossRef]
  44. Hussain, M., Bird, J. J., & Faria, D. R. (2019). A study on CNN transfer learning for image classification. In A. Lotfi, H. Bouchachia, A. Gegov, C. Langensiepen, & M. McGinnity (Eds.), Advances in computational intelligence systems. UKCI 2018; Advances in intelligent systems and computing (Vol. 840, pp. 191–202). Springer. [Google Scholar] [CrossRef]
  45. Hydara, E., Kikuchi, M., & Ozono, T. (2024). Empirical assessment of deepfake detection: Advancing judicial evidence verification through artificial intelligence. IEEE Access, 12, 151188–151203. [Google Scholar] [CrossRef]
  46. Kalaiarasu, S., Rahman, N. A. A., & Harun, K. S. (2024). Deepfake impact, security threats and potential preventions. In AIP conference proceedings (Vol. 2802, p. 050020). AIP Publishing. [Google Scholar] [CrossRef]
  47. Kasita, I. D. (2022). Deepfake pornografi: Tren kekerasan gender berbasis online (KGBO) di era pandemi COVID-19 [Deepfake pornography: The trend of online gender-based violence in the COVID-19 pandemic era]. Jurnal Wanita dan Keluarga, 3(1), 16–26. [Google Scholar] [CrossRef]
  48. Kietzmann, J., Lee, L. W., McCarthy, I. P., & Kietzmann, T. C. (2020). Deepfakes: Trick or treat? Business Horizons, 63, 135–146. [Google Scholar] [CrossRef]
  49. Kılıç, B., & Kahraman, M. E. (2023). Current usage areas of deepfake applications with artificial intelligence technology. İletişim ve Toplum Araştırmaları Dergisi, 3(2), 301–332. [Google Scholar] [CrossRef]
  50. Kohli, A., & Gupta, A. (2021). Detecting DeepFake, FaceSwap and Face2Face facial forgeries using frequency CNN. Multimedia Tools and Applications, 80, 18461–18478. [Google Scholar] [CrossRef]
  51. Lakshmi, D., & Hemanth, D. J. (2024). An overview of deepfake methods in medical image processing for health care applications. In Frontiers in artificial intelligence and applications (Vol. 383, pp. 304–311). IOS Press. [Google Scholar] [CrossRef]
  52. Lee, S., Tariq, S., Shin, Y., & Woo, S. S. (2021). Detecting handcrafted facial image manipulations and GAN-generated facial images using Shallow-FakeFaceNet. Applied Soft Computing, 105, 107256. [Google Scholar] [CrossRef]
  53. Li, Y., Chang, M.-C., & Lyu, S. (2018, December 11–13). In ictu oculi: Exposing AI-created fake videos by detecting eye blinking. IEEE International Workshop on Information Forensics and Security (WIFS) (pp. 1–7), Hong Kong, China. [Google Scholar] [CrossRef]
  54. Lim, S. Y., Chae, D. K., & Lee, S. C. (2022). Detecting deepfake voice using explainable deep learning techniques. Applied Sciences, 12(8), 3926. [Google Scholar] [CrossRef]
  55. Lim, W. M., & Kumar, S. (2023). Guidelines for interpreting the results of bibliometric analysis: A sensemaking approach. Global Business and Organizational Excellence, 43(1), 17–26. [Google Scholar] [CrossRef]
  56. Liu, K., Perov, I., Gao, D., Chervoniy, N., Zhou, W., & Zhang, W. (2023). DeepFaceLab: Integrated, flexible and extensible face-swapping framework. Pattern Recognition, 141, 109628. [Google Scholar] [CrossRef]
  57. Lu, Y., & Ebrahimi, T. (2024). Assessment framework for deepfake detection in real-world situations. EURASIP Journal on Image and Video Processing, 2024(1), 16. [Google Scholar] [CrossRef]
  58. Maddi, A., Maisonobe, M., & Boukacem-Zeghmouri, C. (2025). Geographical and disciplinary coverage of open access journals: OpenAlex, Scopus, and WoS. PLoS ONE, 20(4), e0320347. [Google Scholar] [CrossRef]
  59. Maddocks, S. (2020). ‘A deepfake porn plot intended to silence me’: Exploring continuities between pornographic and ‘political’ deep fakes. Porn Studies, 7(4), 415–423. [Google Scholar] [CrossRef]
  60. Malik, A., Kuribayashi, M., Abdullahi, S. M., & Khan, A. N. (2022). Deepfake detection for human face images and videos: A survey. IEEE Access, 10, 18757–18775. [Google Scholar] [CrossRef]
  61. Mao, D., Zhao, S., & Hao, Z. (2022). A shared updatable method of content regulation for deepfake videos based on blockchain. Applied Intelligence, 52(14), 15557–15574. [Google Scholar] [CrossRef]
  62. Martin-Rodriguez, F., Garcia-Mojon, R., & Fernandez-Barciela, M. (2023). Detection of AI-created images using pixel-wise feature extraction and convolutional neural networks. Sensors, 23(22), 9037. [Google Scholar] [CrossRef] [PubMed]
  63. Masood, M., Nawaz, M., Malik, K. M., Javed, A., Irtaza, A., & Malik, H. (2023). Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward. Applied Intelligence, 53(4), 3974–4026. [Google Scholar] [CrossRef]
  64. Mira, F. (2023, May 19–21). Deep learning technique for recognition of deep fake videos. 2023 IEEE IAS Global Conference on Emerging Technologies (GlobConET) (pp. 1–4), London, UK. [Google Scholar] [CrossRef]
  65. Mirsky, Y., & Lee, W. (2021). The creation and detection of deepfakes: A survey. ACM Computing Surveys (CSUR), 54(1), 1–41. [Google Scholar] [CrossRef]
  66. Mongeon, P., & Paul-Hus, A. (2016). The journal coverage of Web of Science and Scopus: A comparative analysis. Scientometrics, 106(1), 213–228. [Google Scholar] [CrossRef]
  67. Mubarak, R., Alsboui, T., Alshaikh, O., Inuwa-Dutse, I., Khan, S., & Parkinson, S. (2023). A survey on the detection and impacts of deepfakes in visual, audio, and textual formats. IEEE Access, 11, 144497–144529. [Google Scholar] [CrossRef]
  68. Nenadić, O., & Greenacre, M. (2007). Correspondence analysis in R, with two- and three-dimensional graphics: The ca package. Journal of Statistical Software, 20(3), 1–13. [Google Scholar] [CrossRef]
  69. Nguyen, T. T., Nguyen, Q. V. H., Nguyen, D. T., Nguyen, D. T., Huynh-The, T., Nahavandi, S., & Nguyen, C. M. (2022). Deep learning for deepfakes creation and detection: A survey. Computer Vision and Image Understanding, 223, 103525. [Google Scholar] [CrossRef]
  70. Nikkel, B., & Geradts, Z. (2022). Likelihood ratios, health apps, artificial intelligence and deepfakes. Forensic Science International: Digital Investigation, 41, 301394. [Google Scholar] [CrossRef]
  71. Ouhnni, H., Acim, B., Belhiah, M., El Bouchti, K., Seghroucheni, Y. Z., Lagmiri, S. N., Benachir, R., & Ziti, S. (2025). The evolution of virtual identity: A systematic review of avatar customization technologies and their behavioral effects in VR environments. Frontiers in Virtual Reality, 6, 1496128. [Google Scholar] [CrossRef]
  72. Öztürk, O., Kocaman, R., & Kanbach, D. K. (2024). How to design bibliometric research: An overview and a framework proposal. Review of Managerial Science, 18, 3333–3361. [Google Scholar] [CrossRef]
  73. Park, J., Park, L. H., Ahn, H. E., & Kwon, T. (2024). Coexistence of deepfake defenses: Addressing the poisoning challenge. IEEE Access, 12, 11674–11687. [Google Scholar] [CrossRef]
  74. Patel, Y., Goel, A., Mehra, S., Singh, R., Kumar, V., & Gupta, A. (2023a). An improved dense CNN architecture for deepfake image detection. IEEE Access, 11, 22081–22095. [Google Scholar] [CrossRef]
  75. Patel, Y., Tanwar, S., Gupta, R., Bhattacharya, P., Davidson, I. E., & Nyameko, R. (2023b). Deepfake generation and detection: Case study and challenges. IEEE Access, 11, 143296–143323. [Google Scholar] [CrossRef]
  76. Pranckutė, R. (2021). Web of Science (WoS) and Scopus: The titans of bibliographic information in today’s academic world. Publications, 9(1), 12. [Google Scholar] [CrossRef]
  77. Qu, Z., Yin, Q., Sheng, Z., Wu, J., Zhang, B., Yu, S., & Lu, W. (2024). Overview of deepfake proactive defense techniques. Journal of Image and Graphics, 29(2), 318–342. [Google Scholar]
  78. Radha, L., & Arumugam, J. (2021). The research output of bibliometrics using Bibliometrix R package and VOSviewer. Shanlax International Journal of Arts, Science and Humanities, 9(2), 44–49. [Google Scholar] [CrossRef]
  79. Ramluckan, T. (2024, March 26–27). Deepfakes: The legal implications. 19th International Conference on Cyber Warfare and Security (ICCWS) (pp. 120–130), Johannesburg, South Africa. [Google Scholar] [CrossRef]
  80. Raza, A., Munir, K., & Almutairi, M. (2022). A novel deep learning approach for deepfake image detection. Applied Sciences, 12, 9820. [Google Scholar] [CrossRef]
  81. Roe, J., Perkins, M., & Furze, L. (2024). Deepfakes and higher education: A research agenda and scoping review of synthetic media. Journal of University Teaching and Learning Practice, 21(3), 15. [Google Scholar] [CrossRef]
  82. Saadouni, C., Jaouhari, S. E., Tamani, N., Ziti, S., Mroueh, L., & El Bouchti, K. (2025). Identification techniques in the Internet of Things: Survey, taxonomy and research frontier. IEEE Communications Surveys & Tutorials. [Google Scholar] [CrossRef]
  83. Shorten, C., Khoshgoftaar, T. M., & Furht, B. (2021). Deep learning applications for COVID-19. Journal of Big Data, 8(1), 18. [Google Scholar] [CrossRef]
  84. Siegel, D., Kraetzer, C., Seidlitz, S., & Dittmann, J. (2021). Media forensics considerations on deepfake detection with handcrafted features. Journal of Imaging, 7(7), 108. [Google Scholar] [CrossRef]
  85. Singh, V. K., Singh, P., Karmakar, M., Leta, J., & Mayr, P. (2021). The journal coverage of Web of Science, Scopus and Dimensions: A comparative analysis. Scientometrics, 126, 5113–5142. [Google Scholar] [CrossRef]
  86. Sun, X., Ge, S., Wang, X., Lu, H., & Herrera-Viedma, E. (2022). A bibliometric analysis of IEEE T-ITS literature between 2010 and 2019. IEEE Transactions on Intelligent Transportation Systems, 23(4), 17157–17166. [Google Scholar] [CrossRef]
  87. Sun, Z., Ruan, N., & Li, J. (2025). DDL: Effective and comprehensible interpretation framework for diverse deepfake detectors. IEEE Transactions on Information Forensics and Security, 20, 3601–3615. [Google Scholar] [CrossRef]
  88. Suratkar, S., Kazi, F., Sakhalkar, M., Abhyankar, N., & Kshirsagar, M. (2020, December 10–13). Exposing deepfakes using convolutional neural networks and transfer learning approaches. 2020 IEEE 17th India Council International Conference (INDICON) (pp. 1–8), New Delhi, India. [Google Scholar] [CrossRef]
  89. Thelwall, M. (2018). Dimensions: A competitor to Scopus and the Web of Science? Journal of Informetrics, 12(2), 430–435. [Google Scholar] [CrossRef]
  90. Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., & Ortega-Garcia, J. (2020). Deepfakes and beyond: A survey of face manipulation and fake detection. Information Fusion, 64, 131–148. [Google Scholar] [CrossRef]
  91. Twomey, J., Ching, D., Aylett, M. P., Quayle, M., Linehan, C., & Murphy, G. (2025). What is so deep about deepfakes? A multidisciplinary thematic analysis of academic narratives about deepfake technology. IEEE Transactions on Technology and Society, 6, 64–79. [Google Scholar] [CrossRef]
  92. Ur Rehman Ahmed, N., Badshah, A., Adeel, H., Tajammul, A., Daud, A., & Alsahfi, T. (2025). Visual deepfake detection: Review of techniques, tools, limitations, and future prospects. IEEE Access, 13, 1923–1961. [Google Scholar] [CrossRef]
  93. Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84, 523–538. [Google Scholar] [CrossRef]
  94. Wang, Y., Zhang, F., Wang, J., Liu, L., & Wang, B. (2021). A bibliometric analysis of edge computing for internet of things. Security and Communication Networks, 2021, 5563868. [Google Scholar] [CrossRef]
  95. Waseem, S., Abu Bakar, S. A. R. S., Ahmed, B. A., Omar, Z., Eisa, T. A. E., & Dalam, M. E. E. (2023). Deepfake on face and expression swap: A review. IEEE Access, 11, 117865–117906. [Google Scholar] [CrossRef]
  96. Whittaker, L., Mulcahy, R., Letheren, K., Kietzmann, J., & Russell-Bennett, R. (2023). Mapping the deepfake landscape for innovation: A multidisciplinary systematic review and future research agenda. Technovation, 125, 102780. [Google Scholar] [CrossRef]
  97. Xu, Z., Liu, J., Lu, W., Xu, B., Zhao, X., Li, B., & Huang, J. (2021). Detecting facial manipulated videos based on set convolutional neural networks. Journal of Visual Communication and Image Representation, 77, 103119. [Google Scholar] [CrossRef]
  98. Yasrab, R., Jiang, W., & Riaz, A. (2021). Fighting deepfakes using body language analysis. Forecasting, 3(2), 303–321. [Google Scholar] [CrossRef]
Figure 1. Techniques used in deepfake creation.
Figure 2. Review questions.
Figure 3. Features of the search strategy’s flow.
Figure 4. Annual scientific production and average citations per year (2014–2024).
Figure 5. Comparison of absolute and annualized citation counts (2014–2024).
Figure 6. Top 10 most relevant authors (based on number of publications).
Figure 7. Top 10 most relevant corresponding author’s countries (based on number of documents, SCP vs. MCP).
Figure 8. Top 10 locally cited documents.
Figure 9. The frequency distribution of scientific productivity based on Lotka’s law.
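Figure 9 compares the corpus against Lotka’s law of scientific productivity (Coile, 1977), under which the number of authors contributing exactly n papers falls off roughly as 1/n². The following sketch is purely illustrative: the count of one-paper authors and the exponent below are hypothetical, since in practice the exponent is fitted to the observed author distribution.

```python
# Lotka's law (classic form): the number of authors contributing exactly
# n papers is roughly proportional to 1/n^2.
def lotka_expected(single_paper_authors: float, n: int, exponent: float = 2.0) -> float:
    """Expected count of authors with exactly n publications."""
    return single_paper_authors / n ** exponent

# Hypothetical field with 1000 one-paper authors:
for n in (1, 2, 3):
    print(n, round(lotka_expected(1000, n)))
# 1 1000
# 2 250
# 3 111
```

Biblioshiny reports the fitted exponent for the analyzed corpus together with a goodness-of-fit test, which is how the deviation visible in Figure 9 is quantified.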
Figure 10. Top 10 most relevant words used in deepfake research.
Figure 11. Map of collaborations between countries.
Figure 12. Global map of international collaborations.
Figure 13. Network of collaborations between countries.
Figure 14. Co-authorship between organizations.
Figure 15. The co-citation network of authors.
Figure 16. Journal co-citation network.
Figure 17. Thematic Map of deepfake research.
Figure 18. Conceptual mapping of deepfake research keywords using multivariate analyses: (a) multiple correspondence analysis (MCA) factorial map; (b) correspondence analysis (CA) factorial map; (c) multidimensional scaling (MDS) projection.
Table 1. Top 10 sources in deepfake research.

Most Relevant Sources | N. of Documents | Most Local Cited Sources | N. of Local Citations
IEEE Access | 30 | Proceedings CVPR IEEE | 1353
Multimedia Tools and Applications | 7 | arXiv | 1250
Sensors | 7 | IEEE Transactions on Information Forensics and Security | 350
IEEE Transactions on Information Forensics and Security | 6 | Lecture Notes in Computer Science | 148
ACM Transactions on Multimedia Computing Communication | 6 | IEEE Conference on Computer Vision | 101
Expert Systems with Applications | 5 | IEEE Access | 80
Journal of Imaging | 5 | Advances in Neural Information Processing Systems | 72
PeerJ Computer Science | 5 | IEEE International Workshop on Information Forensics and Security | 72
Applied Sciences-Basel | 4 | IEEE Computer Society Conference | 51
Electronics | 4 | International Conference on Acoustics, Speech, and Signal Processing | 50
Table 2. Corresponding author’s countries.

Country | Articles | Single-Country Publications (SCP) | Multiple-Country Publications (MCP) | Frequency | Multiple-Country Publication Ratio
CHINA | 83 | 64 | 19 | 0.202 | 0.229
INDIA | 62 | 52 | 10 | 0.151 | 0.161
USA | 59 | 48 | 11 | 0.144 | 0.186
ITALY | 25 | 21 | 4 | 0.061 | 0.160
PAKISTAN | 20 | 8 | 12 | 0.049 | 0.600
KOREA | 16 | 12 | 4 | 0.039 | 0.250
UNITED KINGDOM | 14 | 7 | 7 | 0.034 | 0.500
SAUDI ARABIA | 11 | 7 | 4 | 0.027 | 0.364
GERMANY | 9 | 5 | 4 | 0.022 | 0.444
SPAIN | 9 | 6 | 3 | 0.022 | 0.333
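Table 2’s derived columns follow directly from the raw counts: the article total is SCP + MCP, and the multiple-country publication ratio is MCP / (SCP + MCP). A brief sketch reproducing three rows (the SCP/MCP pairs are copied from the table):

```python
# Recompute Table 2's derived columns: Articles = SCP + MCP and
# MCP ratio = MCP / (SCP + MCP), rounded to three decimals as in the table.
rows = {"CHINA": (64, 19), "PAKISTAN": (8, 12), "UNITED KINGDOM": (7, 7)}

for country, (scp, mcp) in rows.items():
    articles = scp + mcp
    ratio = round(mcp / articles, 3)
    print(f"{country}: {articles} articles, MCP ratio = {ratio}")
# CHINA: 83 articles, MCP ratio = 0.229
# PAKISTAN: 20 articles, MCP ratio = 0.6
# UNITED KINGDOM: 14 articles, MCP ratio = 0.5
```

A high ratio with few articles (e.g., Pakistan at 0.600) signals a strongly internationalized output, whereas China’s low ratio reflects predominantly domestic collaborations.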
Table 3. Top 10 globally cited authors in deepfake research.

Author | Year | Title | Journal | Global Citations | Average Citations per Document
Caldelli R. | 2019 | Deepfake video detection through optical flow based CNN | 2019 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | 188 | 27.0
Javed A. | 2023 | Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward | Applied Intelligence | 129 | 43.0
Li X. | 2018 | Micro-seismic event detection and location in underground mines by using Convolutional Neural Networks (CNN) and deep learning | Tunnelling and Underground Space Technology | 120 | 15.0
Li M. | 2016 | An original face anti-spoofing approach using partial convolutional neural network | 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA) | 65 | 6.5
Caldelli R. | 2021 | Optical flow based CNN for detection of unlearnt deepfake manipulations | Pattern Recognition Letters | 57 | 11.4
Guarnera L. | 2020 | Fighting deepfake by exposing the convolutional traces on images | IEEE Access | 39 | 6.5
Giudice O. | 2020 | Fighting deepfake by exposing the convolutional traces on images | IEEE Access | 39 | 6.5
Battiato S. | 2020 | Fighting deepfake by exposing the convolutional traces on images | IEEE Access | 39 | 6.5
Zhao Y. | 2021 | Practical attacks on deep neural networks by memory trojaning | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 19 | 3.8
Guarnera L. | 2022 | The face deepfake detection challenge | Journal of Imaging | 17 | 4.25
Table 4. Top countries, authors, keywords, and globally cited documents in deepfake research.

Top Authors | Top Countries | Top Keywords | Most Globally Cited Documents | Global Citations
Li JX | China | Deep Learning | Mirsky Y, 2021, ACM COMPUT SURV (Mirsky & Lee, 2021) | 326
Zhao Y | USA | Deepfake Detection | Hussain M, 2019, ADV COMPUT INTELL SYST (Hussain et al., 2019) | 310
Liu M | India | Deepfake | Cozzolino D, 2017, ACM Workshop on Info. Hiding & Multimedia Security (Cozzolino et al., 2017) | 200
Liu X | Pakistan | Computer Vision | Shorten C, 2021, J BIG DATA (Shorten et al., 2021) | 194
Zhang Y | Italy | Feature Extraction | Amerini I, 2019, IEEE/CVF ICCVW (Amerini et al., 2019) | 158
Caldelli R | Saudi Arabia | Artificial Intelligence | Masood M, 2023, APPL INTELL (Masood et al., 2023) | 92
Battiato S | Korea | Deepfakes | Nguyen TT, 2022, COMPUT VIS IMAGE UNDERST (Nguyen et al., 2022) | 82
Giudice O | Australia | Machine Learning | Huang L, 2018, TUNN UNDERGR SPACE TECHNOL (Huang et al., 2018) | 62
Guarnera L | Spain | Learning | Xu Z, 2021, J VIS COMMUN IMAGE REPRESENT (Xu et al., 2021) | 51
Javed A | United Kingdom | Detection | Agarwal S, 2020, IEEE/CVF CVPRW (Agarwal et al., 2020) | 48
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
