Review

Image Forgery Detection with Focus on Copy-Move: An Overview, Real World Challenges and Future Directions

by Issam Shallal 1,2, Lamia Rzouga Haddada 3,* and Najoua Essoukri Ben Amara 4
1 LATIS—Laboratory of Advanced Technology and Intelligent Systems, Higher Institute of Computer Science and Communication Technologies, University of Sousse, Sousse 4002, Tunisia
2 Department of Computer Science, University of Anbar, Ramadi 31001, Anbar, Iraq
3 LATIS—Laboratory of Advanced Technology and Intelligent Systems, Institut Supérieur des Sciences Appliquées et de Technologie de Sousse, University of Sousse, Sousse 4002, Tunisia
4 LATIS—Laboratory of Advanced Technology and Intelligent Systems, Ecole Nationale d’Ingenieurs de Sousse, University of Sousse, Sousse 4002, Tunisia
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(21), 11774; https://doi.org/10.3390/app152111774
Submission received: 27 September 2025 / Revised: 25 October 2025 / Accepted: 29 October 2025 / Published: 5 November 2025

Abstract

The rapid expansion of digital imagery, combined with increasingly sophisticated editing tools, has made image forgery a widespread and critical concern in fields such as journalism, forensics, and social media. This study provides a comprehensive review of Copy-Move Forgery Detection (CMFD) methods, focusing on the latest advances in deep learning-based techniques. We analyze key real-world challenges, summarize the most relevant recent solutions, and highlight persistent limitations that hinder robustness, accuracy, and practical deployment. A comparative review and qualitative analysis of prominent deep learning architectures reported in the literature is conducted to examine their relative efficiency, resilience, and trade-offs under diverse forgery scenarios. Finally, the paper highlights future research directions, including the development of more adaptable and generalizable models, the design of comprehensive benchmark datasets, the pursuit of real-time detection frameworks, and the enhancement of interpretability and transparency in CMFD systems.

1. Introduction

Digital images have become an integral part of contemporary life, serving as essential tools for communication, entertainment and documentation. However, the proliferation of sophisticated editing software has simultaneously facilitated the manipulation of visual content, enabling the creation of highly realistic forgeries that are increasingly difficult to detect [1,2,3]. Image forgery refers to the deliberate modification of digital content to deceive or misrepresent reality [4]. Such alterations may involve the addition, removal or modification of image components, often executed with great precision to conceal any visible evidence of tampering [5].
Methods for detecting image forgery are generally classified into two categories: active and passive approaches. In active detection, supplementary information such as digital watermarks or cryptographic signatures is embedded within the image, thereby enabling the subsequent verification of its authenticity and integrity [6,7,8,9]. Although effective under certain conditions, this approach requires the prior insertion of metadata during image acquisition or the involvement of trusted entities, making it impractical in situations where the original image data are unavailable or unverifiable [10].
In contrast, passive techniques analyze the intrinsic statistical and structural properties of an image without requiring prior knowledge about its origin, authenticity, or acquisition device [11]. These approaches are specifically valuable for analyzing images captured in uncontrolled or unverified environments. Passive forgery detection methods are commonly divided into three subcategories: Copy–Move Forgery (CMF), image splicing, and image retouching [12].
  • CMF is one of the most straightforward and widely used forms of image manipulation [13]. It involves duplicating specific regions of an image and pasting them elsewhere within the same image. This technique is often employed to conceal critical details or introduce misleading information. Since the duplicated areas originate from the same image, they naturally share similar color, texture, and illumination characteristics, making detection particularly challenging [14]. Beyond the basic copy-paste operation, additional transformations, such as rotation, scaling, or post-processing effects (e.g., blurring, brightness adjustment, compression, or noise addition), are often applied to further obscure manipulation and reduce the visibility of tampered regions to the human eye.
  • Image splicing involves merging two or more distinct images into a single composite image [15,16]. The boundaries between the spliced regions may reveal visual inconsistencies such as abrupt transitions, mismatched textures, or unnatural edges that indicate manipulation. However, advanced image editing software can effectively smooth these transitions, producing composites that appear seamless and realistic even under close inspection [17].
  • Image retouching, although considered the least malicious form of forgery, still represents an intentional alteration of visual information. It focuses on enhancing or diminishing specific features within an image, and it is widely used in advertising, fashion and photography industries to improve visual appeal. Typical operations include adjusting color balance, brightness or contrast to achieve the desired aesthetic effect.
Ensuring the authenticity and reliability of digital imagery has become increasingly important in a world where visual content plays a central role in shaping public perception and decision-making. As image editing technologies continue to evolve, there is a growing necessity for robust and efficient methods to detect tampering. Among various image forgery detection approaches, CMFD has emerged as a particularly active research domain [13].
However, despite its significance, CMFD has received limited attention in previous surveys with respect to a systematic investigation of its challenges, methodologies, and application domains. To address this gap, the main contributions of this paper are summarized as follows:
  • We present a comprehensive and structured survey of CMFD techniques, encompassing traditional (block-based and keypoint-based) and hybrid approaches, along with recent deep-learning-based models.
  • We review the benchmark datasets and evaluation metrics commonly used in CMFD research, highlighting their importance in assessing the robustness and generalization of detection algorithms.
  • We provide a critical analysis of the strengths, weaknesses and distinct features of each category of methods, offering an integrated perspective on their relative advantages and limitations.
  • We identify and synthesize emerging research directions and key challenges, providing insights that can guide future developments towards more reliable and scalable CMFD systems.
The remainder of this paper is organized as follows: Section 2 reviews the most influential surveys on CMFD, summarizing major contributions and identifying existing research gaps. Section 3 focuses on benchmarking CMFD, describing commonly used datasets and evaluation metrics while discussing their limitations and implications for model assessment. Section 4 explores real-world application domains and evaluates the relevance of existing methods in these contexts. Section 5 provides a comprehensive review of CMFD techniques, ranging from conventional feature-based methods to recent deep learning-driven architectures. Section 6 discusses the findings from the comparative review and qualitative analysis, addressing ongoing challenges and future perspectives such as enhancing interpretability, achieving real-time detection, incorporating multimodal data, and improving dataset diversity and quality. Finally, Section 7 and Section 8 provide a comprehensive discussion and concluding remarks. Figure 1 illustrates the overall structure of the paper.

2. Review of Reviews

Within the spectrum of digital image forgery techniques reported in the literature, CMFD has attracted substantial research attention. CMF involves replicating a segment of an image and inserting it elsewhere within the same image to obscure or duplicate objects or regions [18]. This form of manipulation is typically challenging to detect because the duplicated area often shares similar visual characteristics with its surroundings, making it difficult to distinguish it from the original content. CMFD techniques have important practical applications in numerous domains where image integrity is critical.
Thanks to the rapid growth of research in the field of image forensics, a considerable number of studies have recently addressed the problem of image forgery detection [11,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33]. However, the majority of existing reviews cover all types of forgery detection without providing sufficient focus on CMF. This is a significant limitation because CMFD represents one of the most prevalent and practically threatening forms of manipulation, yet it often receives limited detailed analysis. Even among studies that address deep-learning-based CMFD, discussions regarding network architectures, dataset realism, robustness, and practical applicability remain superficial.
A critical observation across the literature is that many studies emphasize algorithmic novelty rather than problem-driven analysis. Little work has examined the trade-off between handcrafted and deep features, the reasons behind robustness failures under transformations, or the implications of dataset bias on model generalization. Understanding these aspects is essential to interpret why certain CMFD techniques have succeeded in constrained benchmarks but failed in uncontrolled environments.
Strengths of existing CMFD research include the development of advanced deep-learning models capable of detecting subtle manipulations and the availability of benchmark datasets that support comparative evaluation. Weaknesses include reliance on synthetic datasets, limited generalization to real-world scenarios, inconsistent evaluation protocols, and insufficient consideration of robustness against transformations such as compression, scaling and rotation. Research gaps remain in designing interpretable architectures, integrating multimodal or contextual information, and establishing standardized evaluation frameworks.
From a broader perspective, the evolution of CMFD research reflects a transition from algorithm-centric development towards learning-based adaptability. Yet, this transition remains incomplete: While deep learning models improve accuracy, they introduce new vulnerabilities, including dependence on data distribution and adversarial susceptibility. More holistic analysis integrating technical, forensic, and operational factors is therefore required to define the next generation of CMFD frameworks.
Figure 2 and Figure 3 present the annual publication trends in CMFD research from 2020 to 2025, categorized into Copy-Move, Splicing, and Retouching detection techniques. As shown, there is a noticeable increase in Copy-Move publications from 84 in 2020 to 146 in 2024, followed by a decline to 119 in 2025, which may reflect shifts in research focus, dataset availability, or methodological saturation. Splicing publications generally rise from 58 in 2020 to 114 in 2024, before slightly decreasing to 95 in 2025. Retouching publications fluctuate throughout the period, with peaks in 2021 (24) and 2024 (23), and lower values in other years. These figures not only quantify annual trends in CMFD research but also highlight shifts in attention across different types of forgery, providing a basis for identifying gaps in 2025 and motivating further investigation in areas where research activity shows variations or declines.
Based on this data, our survey classifies CMFD methods and analyzes trends over the period 2020–2025. The statistical insights, illustrated in Figure 2 and Figure 3, reveal a clear research gap: while many studies focus on algorithmic novelty, few comprehensively evaluate methods under diverse real-world conditions or address robustness, interpretability, and dataset biases.
To address these gaps, we conduct a systematic review of image forgery detection studies published between 2020 and 2025, synthesizing recent developments, analyzing trends, and identifying challenges and opportunities for future work. Our statistical analysis, based on data collected from the relevant literature, classifies forgery detection research and CMFD methods over this period. The findings indicate that CMF is the most extensively studied type of image manipulation and highlight the superior performance of deep-learning-based approaches in detection tasks. By providing a critical evaluation of methodological strengths, weaknesses, and open challenges, this survey aims to guide future research directions in CMFD.

3. Benchmarking CMFD: Datasets and Evaluation Metrics

3.1. Datasets

Datasets serve as essential tools for evaluating and comparing various algorithms across research domains [34]. When researchers employ different datasets or create their own, it becomes challenging to compare their findings with those of others. The use of standardized and widely accepted datasets greatly benefits the research community, as it enables consistent and reproducible comparisons. In the area of image forgery detection, particularly CMFD, several benchmark datasets have been introduced, each with distinct characteristics, strengths and limitations. Some of these datasets are outlined below.

3.1.1. MICC Dataset

The MICC dataset is one of the earliest and most widely used benchmarks in CMFD research. It comprises four subsets: MICC-F220, MICC-F2000, MICC-F600, and MICC-F8multi [35]. These subsets contain images forged via copy–move operations incorporating geometric modifications such as rotation and scaling. Despite its historical relevance, MICC presents several drawbacks. First, it lacks post-processing operations (e.g., compression and blurring), which significantly limits its realism and fails to emulate real-world tampering scenarios. Second, ground-truth masks are missing in MICC-F220, MICC-F2000 and MICC-F8multi, restricting performance evaluation to the image level rather than fine-grained pixel-level analysis. Third, the high-resolution images make block-based detection methods computationally expensive, favoring keypoint-based approaches instead. Moreover, the limited and uneven distribution of authentic and forged images across the subsets may reduce the reliability and generalizability of the results [35]. Furthermore, since all tampered samples are synthetically generated, the dataset fails to capture the complexity of real-world manipulations, such as mixed post-processing artifacts, irregular lighting variations, and semantically coherent object movements, which often occur in real forgeries.

3.1.2. CoMoFoD Dataset

The CoMoFoD dataset, introduced by Tralic et al. in 2013 [36], consists of two subsets derived from 260 base images, one small (512 × 512) and one large (3000 × 2000). It provides a balanced collection of original and forged images (10,000 in the small set and 3000 in the large one). Forgeries were generated using the copy-move technique with various geometric transformations such as translation, rotation, scaling, distortion, and their combinations [37]. Additionally, both original and forged images underwent post-processing operations, enhancing realism. The CoMoFoD dataset is particularly valuable because it provides ground-truth masks, supports pixel-level evaluation and covers a broad range of transformation types. However, its large image subset poses computational challenges for block-based methods, and the absence of precise transformation parameters (e.g., rotation angles) hinders reproducibility. Although CoMoFoD is often regarded as one of the most comprehensive datasets for CMFD, its controlled setting and synthetic manipulations still limit its ecological validity compared with naturally tampered images collected from the web or social media.

3.1.3. CASIA Dataset

The CASIA dataset, widely used for splicing and CMFD tasks, is available in two versions: CASIA V1.0 and CASIA V2.0. The first comprises 800 authentic images and 925 forged ones (384 × 256), while the second contains 7491 authentic and 5123 forged ones with varying resolutions. Both include post-processing operations such as blurring to enhance realism [38]. However, the absence of ground-truth masks limits its utility for pixel-level evaluation [39]. Moreover, annotation inconsistencies between CASIA and other public datasets introduce significant bias in cross-method comparisons. For instance, some datasets annotate copied regions only, while others include both source and target areas, making metric-based benchmarking unreliable. Hence, establishing a unified annotation standard is crucial for ensuring fairness and reproducibility across CMFD studies.

3.1.4. COVERAGE Dataset

The COVERAGE dataset, proposed by Wen et al. [40], was designed to increase realism by incorporating visually similar objects, making forgery more challenging to detect. It includes six transformation categories: rotation, scaling, freeform, illumination, translation, and combination. Nevertheless, COVERAGE is limited by its small scale (only 100 original images and 100 forged ones) and lack of post-processing operations, which reduce complexity and realism. Although it provides ground-truth masks for both source and target regions, these must be merged prior to evaluation, adding an extra preprocessing step. Despite these limitations, its inclusion of visually similar objects introduces a degree of semantic realism absent in earlier datasets. However, its limited diversity and lack of complex contextual scenes make it insufficient for training deep architectures capable of generalizing to uncontrolled real-world image forgery.

3.1.5. Inpainting and General Purpose Datasets

In addition to CMFD-specific datasets, several large-scale image collections are used for forgery and inpainting research. These include COCO [41], ImageNet [42], MIT Places [43], UCID [44], and facial datasets like CelebA [45], Celeb-HQ [46], FFHQ [47] and Caltech Faces [48]. These datasets enable model training on diverse visual concepts and facilitate robustness analysis under different imaging conditions [49,50]. However, their heterogeneous annotation schemes, demographic biases, and idealized acquisition conditions may limit their representativeness of real-world manipulations. Although these general-purpose datasets provide a broad visual foundation, they are not explicitly designed for tampering detection tasks and therefore lack the structural correlations and ground-truth annotations required for CMFD benchmarking.

3.1.6. Critical Analysis and Dataset Suitability Criteria

Although numerous CMFD datasets exist, most are synthetically generated and fail to capture the complexity and diversity of authentic manipulations involving compression, illumination inconsistencies or composite forgery. Furthermore, the absence of standardized ground-truth annotations and uniform evaluation protocols hampers objective benchmarking and cross-dataset generalization.
To address these gaps, a dataset’s suitability for CMFD research should be assessed according to several complementary criteria:
  • Realism: The extent to which manipulations emulate real-world scenarios, including post-processing, mixed compression, and context-aware editing.
  • Annotation consistency: The availability of accurate, standardized ground-truth masks enabling fair cross-method evaluation.
  • Diversity: Inclusion of varied image content, resolutions, manipulation scales, and scene complexities.
  • Balance: A proportional representation of authentic and tampered images to prevent model bias.
  • Reproducibility: Transparent documentation of manipulation parameters and forgery-generation processes.
Future dataset design should move towards hybrid approaches that combine synthetic control with realistic manipulations collected from diverse environments. This would bridge the gap between laboratory performance and real-world applicability, enabling more robust and generalizable CMFD evaluation frameworks.

3.2. Evaluation Metrics

Evaluation metrics are employed to quantitatively assess the performance of a model or algorithm on a benchmark dataset. In CMFD, each image is typically classified as either forged or authentic. These metrics are derived from a confusion matrix [10], which comprises four components: True Positive (TP), representing instances in which forged images are correctly identified as forged; False Positive (FP), corresponding to authentic images that are incorrectly classified as forged; False Negative (FN), indicating forged images that are mistakenly classified as authentic; and True Negative (TN), denoting authentic images that are accurately recognized as authentic. Using these components, standard evaluation metrics, such as Accuracy (Acc), Precision, Recall, and F1-score [39], are calculated as follows:
  • Accuracy: It measures the ratio of accurate predictions to total predictions (Equation (1)):
    $\mathrm{Acc} = \dfrac{TP + TN}{TP + TN + FP + FN}$
  • Precision: It measures the accuracy of positive predictions (Equation (2)):
    $\mathrm{Precision} = \dfrac{TP}{TP + FP}$
  • Recall: It is also known as sensitivity or True Positive Rate (TPR). It measures the model’s ability to find all relevant positive cases (Equation (3)):
    $\mathrm{Recall} = \dfrac{TP}{TP + FN}$
  • F1-score: It is the harmonic mean of precision and recall (Equation (4)):
    $\mathrm{F1\text{-}score} = \dfrac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
The F1-score ranges from 0 to 1, where 1 indicates perfect precision and recall, and 0 indicates the worst performance.
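As a concrete illustration, Equations (1)–(4) can be computed directly from the four confusion-matrix counts. The function below is a minimal sketch (the name `image_level_metrics` and the example counts are ours, not taken from any cited study):

```python
def image_level_metrics(tp, fp, fn, tn):
    """Global CMFD metrics (Equations (1)-(4)) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical evaluation run: 80 forgeries flagged correctly, 10 authentic
# images wrongly flagged, 20 forgeries missed, 90 authentic images passed.
m = image_level_metrics(tp=80, fp=10, fn=20, tn=90)
# m["accuracy"] = 0.85, m["recall"] = 0.80
```
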
However, while these global metrics are informative, they are insufficient for evaluating pixel-level CMF localization. To assess spatial accuracy, we denote by $P$ the set of pixels predicted as forged and by $G$ the set of ground-truth forged pixels.
The Intersection over Union (IoU), also called the Jaccard Index, quantifies the overlap between the predicted and ground-truth forged areas (Equation (5)):
$\mathrm{IoU} = \dfrac{|P \cap G|}{|P \cup G|}$
The Dice Similarity Coefficient (DSC) measures the spatial similarity between $P$ and $G$, emphasizing the balance between Precision and Recall (Equation (6)):
$\mathrm{DSC} = \dfrac{2\,|P \cap G|}{|P| + |G|}$
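Both pixel-level measures reduce to set operations on binary masks. The NumPy sketch below computes IoU and DSC as defined in Equations (5) and (6); the function name `iou_and_dsc` and the toy masks are illustrative choices of ours, not taken from any cited study:

```python
import numpy as np

def iou_and_dsc(pred_mask, gt_mask):
    """IoU (Jaccard) and Dice coefficient between two boolean forgery masks."""
    p = np.asarray(pred_mask, dtype=bool)
    g = np.asarray(gt_mask, dtype=bool)
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    # Convention: two empty masks count as a perfect match.
    iou = inter / union if union else 1.0
    denom = p.sum() + g.sum()
    dsc = 2 * inter / denom if denom else 1.0
    return float(iou), float(dsc)

# Toy example: two 2x2 forged regions overlapping in a single pixel.
pred = np.zeros((4, 4), dtype=bool); pred[0:2, 0:2] = True
gt = np.zeros((4, 4), dtype=bool); gt[1:3, 1:3] = True
iou, dsc = iou_and_dsc(pred, gt)  # intersection 1 px, union 7 px
```
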
For region-based evaluation, the Average Precision (AP) is often employed. Let $TP_{bbox}$, $FP_{bbox}$ and $FN_{bbox}$ denote the numbers of true, false and missed detections at the bounding-box level. The precision–recall curve is generated, and the AP corresponds to the area under this curve (Equation (7)):
$\mathrm{AP} = \int_{0}^{1} p(r)\,dr$
where $p(r)$ is the precision as a function of recall $r$.
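In practice, the integral in Equation (7) is approximated from a finite list of ranked detections. The helper below is a minimal sketch of ours (benchmarks such as COCO use interpolated variants); it accumulates the precision-recall step function, with the total number of ground-truth regions `n_gt` accounting for missed detections ($FN_{bbox}$):

```python
import numpy as np

def average_precision(scores, is_true, n_gt):
    """Approximate AP (Equation (7)) from per-detection confidence scores,
    TP/FP labels, and the total number of ground-truth regions n_gt."""
    order = np.argsort(scores)[::-1]          # rank detections by confidence
    tp = np.asarray(is_true, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / n_gt                    # FN_bbox enters via n_gt
    precision = cum_tp / (cum_tp + cum_fp)
    # Integrate p(r) as a step function over the recall axis.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

# Three ranked detections (TP, FP, TP) against two ground-truth regions.
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1], n_gt=2)
```
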
To evaluate robustness under real-world perturbations (e.g., compression or noise), a Metric Drop Rate (MDR) can be introduced. Let $M_{ref}$ denote the metric value on pristine images and $M_{deg}$ its value under a degradation (e.g., JPEG compression of 30%). The drop rate is given by Equation (8):
$\mathrm{MDR} = \dfrac{M_{ref} - M_{deg}}{M_{ref}} \times 100\%$
This indicator helps quantify how performance degrades under adverse conditions, providing insight into the model’s robustness and generalization capability. Finally, it is essential to recognize that no single metric can comprehensively capture all aspects of CMFD performance; thus, the choice of metrics should depend on the task objective, global detection, pixel-level localization, or robustness analysis.
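Equation (8) translates directly into code; the function name and example numbers below are hypothetical, chosen only to illustrate the computation:

```python
def metric_drop_rate(m_ref, m_deg):
    """Metric Drop Rate (Equation (8)): relative performance loss, in percent."""
    return (m_ref - m_deg) / m_ref * 100.0

# Hypothetical robustness check: F1 drops from 0.92 on pristine images
# to 0.69 after heavy JPEG compression.
mdr = metric_drop_rate(0.92, 0.69)  # ~25.0 (% drop)
```
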
Table 1 provides a concise overview of the evaluation metrics discussed above, including their specific use in CMFD assessment, objectives and limitations.
This overview underscores the importance of selecting evaluation metrics that align with the specific task objective, whether for global image-level detection, pixel-level localization or robustness assessment. Employing a combination of these metrics enables a more comprehensive and nuanced evaluation of CMFD performance across diverse scenarios.

4. Applications and Real-World Domains

CMFD represents a critical research domain with diverse and significant applications, encompassing tasks from verifying image authenticity to addressing the escalating challenges posed by digital manipulation. Its practical relevance extends across multiple fields, contributing to the preservation of trust and integrity in an increasingly digital visual environment (Figure 4). The following section provides a detailed overview of these applications:
  • Forensic and law enforcement investigations: In criminal and judicial contexts, CMFD serves as a crucial forensic tool to verify the authenticity of digital images submitted as evidence. Moreover, by confirming the integrity of visual materials, investigators can strengthen legal proceedings and ensure the reliability of digital proof. Domain-specific constraints in forensic applications include handling low-resolution, highly compressed surveillance footage while generating interpretable and legally admissible results. Traditional feature-based methods (e.g., Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF)) and interpretable deep architectures remain preferred over purely black-box approaches. For example, a recent forensic validation framework has applied CMFD to closed-circuit television footage within simulated judicial investigations. The system successfully localized tampered regions in 240p images, producing explainable heatmaps suitable for forensic reporting and evidentiary use. In law enforcement settings, attention-based CMFD models have also demonstrated high robustness against noise, compression artifacts, and partial occlusions, thereby enabling traceable and explainable image analysis.
  • Journalism, media and entertainment: In the media and entertainment sectors, CMFD is essential for ensuring the authenticity of published visuals, preventing misinformation, and maintaining audience trust. Recent studies have emphasized the importance of explainability and provenance tracking in journalistic contexts, leading to the integration of CMFD within blockchain-based verification systems and deepfake detection frameworks. For instance, hybrid CMFD–GAN detection systems have been adopted by several news agencies to validate both copy–move and synthetic image manipulations before publication. A lightweight CMFD model integrated into an editorial verification workflow achieved near real-time tamper localization, reducing verification time by 80% while maintaining interpretability, an essential requirement for editorial decision-making.
  • Social media and online platforms: Social networks employ CMFD to combat misinformation and image-based manipulation, protecting users from deceptive or malicious content. Real-time inference and large-scale deployment represent the primary technical challenges in this domain. Lightweight CNN architectures such as MobileNetV2 and ShuffleNet, enhanced through pruning, quantization, and knowledge distillation, enable near real-time CMFD on resource-limited systems. For instance, deploying a quantized MobileNetV2 variant in a social-media moderation pipeline can substantially reduce inference latency, enabling near-real-time flagging of manipulated images while maintaining comparable detection performance.
  • E-commerce and corporate integrity: In e-commerce and brand management, CMFD ensures that product images and promotional materials are authentic and untampered. By detecting manipulated visuals, CMFD safeguards consumers from misleading claims, supports fair competition, and preserves corporate reputation. In this domain, systems must handle highly heterogeneous image resolutions, lighting conditions, and compression levels. Texture- and region-based features combined with CNN-based feature matching have proven effective for product image verification. Cloud-integrated CMFD frameworks now support large-scale automated screening across online marketplaces.
  • Healthcare and research integrity: In healthcare, CMFD plays an important role in ensuring the authenticity of diagnostic and clinical imagery, helping prevent potentially harmful misinterpretations. Unlike typical forensic tasks, healthcare applications demand privacy-preserving and domain-adaptive solutions. Federated learning-based CMFD models enable collaborative detection across institutions without centralizing sensitive data. In scientific publishing, CMFD assists in detecting manipulated figures, safeguarding research credibility and ethical standards. Recent integrity-assurance frameworks combine CMFD with metadata consistency checks to flag manipulated or duplicated visuals in academic publications, reinforcing transparency and reproducibility in research.
  • Art, heritage, and archival preservation: In digital art and heritage management, CMFD contributes to verifying authenticity and detecting unauthorized modifications in digitized artworks. Hybrid feature-learning approaches have achieved promising results in detecting copy–move forgery in paintings and historical images, producing interpretable localization maps that assist expert verification. For example, CMFD has been successfully applied to digitized Renaissance paintings to identify subtle restoration or retouching interventions, highlighting its relevance in cultural heritage protection.
  • Government, defense, and public policy: Government and defense institutions rely on CMFD to authenticate official documents, surveillance images, and strategic visual data, thus supporting transparency and national security. Explainable and traceable CMFD frameworks are essential in these domains to ensure accountability in automated decisions. Multi-modal CMFD approaches that integrate image, metadata, and geolocation verification have been successfully employed in defense intelligence and counter-propaganda analysis.

5. Comprehensive Overview of CMFD Techniques

Identifying CMF in digital images remains a challenging task that demands dedicated methodologies and specialized tools. Over the years, researchers have developed a wide range of strategies and advanced techniques to address this problem [10,51,52,53]. These approaches can be broadly categorized into conventional CMFD methods (such as block-based and keypoint-based techniques), hybrid strategies, and those leveraging deep-learning frameworks (Figure 5).

5.1. Conventional CMFD Techniques

Forgery detection methods are generally categorized into two primary types: block-based and keypoint-based approaches. Typically, these methods follow a three-step process: descriptor extraction, similarity mapping, and identification of tampered regions.
  • Block-based methods segment an image into overlapping or non-overlapping blocks [54]. Features are extracted from each block using techniques such as the discrete cosine transform (DCT), principal component analysis (PCA), or local binary patterns (LBP), and are subsequently matched using correlation measures or Euclidean distance [55]. Forgery localization is typically achieved through geometric transformations, often employing Random Sample Consensus (RANSAC) to filter out mismatches [56,57,58]. In [51], preprocessing began with converting the RGB image to grayscale, which was then partitioned into overlapping 4 × 4 pixel blocks. Each block underwent the Tetrolet transform, producing four low-pass and 12 high-pass coefficients that captured local structural information effectively. For matching, feature vectors were lexicographically sorted, and similar blocks were identified based on Euclidean distance comparisons against predefined thresholds. Although this block-based approach provided a structured methodology for CMFD, it exhibited limitations: the small block size amplified sensitivity to noise and compression artifacts, and the combination of sorting and distance calculations incurred high computational costs on large images. Therefore, scalability and robustness remained challenging in practical, real-world applications.
    From an analytical standpoint, block-based methods demonstrated strong local correspondence detection but lacked global contextual reasoning. They often treated image regions as independent entities, ignoring semantic relationships between copied and original areas. Consequently, their performance deteriorated when the copied region overlapped complex textures or underwent multiple transformations. The authors in [14] suggested that hybrid strategies, combining block-based local similarity measures with deep feature representations, could mitigate these weaknesses by providing both fine-grained matching and semantic awareness. Such integration also enhanced interpretability, as handcrafted descriptors offered explainable cues while deep networks improved adaptability. In the postprocessing stage of the pipeline in [51], false matches caused by self-similar or homogeneous regions (such as the sky) were addressed by applying the RANSAC algorithm to eliminate inaccurately matched blocks; morphological operations such as opening and closing were then applied to precisely locate the forged regions in the image.
    In a related study [53], the preprocessing stage began by converting the image from RGB to grayscale, followed by dividing it into overlapping square blocks. Gaussian–Hermite Moments (GHMs) were employed for feature extraction on each block. GHMs proved particularly effective due to their scale and rotation invariance, making them a robust descriptor for forgery detection. During block matching, descriptors were first sorted in lexicographic order, and analogous blocks were identified through Euclidean distances and thresholding. In the postprocessing phase, the RANSAC algorithm filtered out mismatched blocks, while morphological operations were applied to refine the detection of forged regions. The summary of several forgery detection studies using block-based approaches is presented in Table 2. These studies demonstrate the effectiveness of block-based methods in CMFD, especially when integrated with advanced feature extraction techniques and machine learning algorithms. However, they remain computationally demanding and exhibit limited robustness against geometric transformations, namely rotation, scaling, and affine distortions. In addition, while these methods effectively capture local similarities, they often fail to exploit global contextual relationships within the image, which limits their adaptability to complex or multi-region forgery. Future research should therefore focus on hybrid frameworks that combine block-based precision with deep contextual understanding to improve both robustness and interpretability.
Table 2. Overview of CMFD methods utilizing block-based approaches.
Ref/Year | Dataset | Techniques | Performance
[53] (2019) | Small CoMoFoD, GRIP, CMF | RGB to gray-level conversion, overlapping square blocks, GHM, Euclidean distance matching, morphological operations, RANSAC | High robustness to geometric transformations and post-processing (rotation, blurring, color adjustments, JPEG compression).
[57] (2019) | CMFD, FAU | Image resizing, overlapping circular blocks, Zernike moments, KD-tree, RANSAC | Strong resistance to arbitrary transformations but high computational complexity.
[51] (2020) | GRIP, CoMoFoD | RGB to gray conversion, overlapping square blocks, Contourlet transform, absolute difference matching, Fast Outliers filtering | Highly effective against noise, post-processing, and geometric distortions.
[52] (2020) | CoMoFoD, COVERAGE, DVMM | RGB to YCbCr conversion, Y-component square blocks, Tchebichef moments, GHV, Euclidean distance matching, morphological operations | Robust against brightness variations, color distortions, and compression artifacts.
[59] (2020) | CoMoFoD, CASIA | Image resizing, RGB to gray conversion, circular blocks, PCET-SVD, Euclidean distance matching, block distance filtering | Good performance against rotation, scaling, and compression, but struggles with highly textured regions.
[60] (2022) | CoMoFoD | RGB to gray conversion, SWT, overlapped blocks, DCT-SVD, Euclidean distance matching, block distance, morphological operations | Resistant to post-processing operations and effectively detects complex forgery patterns. Performance data not available.
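The generic block-based pipeline described above (overlapping blocks, transform-domain descriptors, lexicographic sorting, and thresholded Euclidean matching) can be sketched in a few lines of Python. The sketch below uses low-frequency DCT coefficients as the block descriptor; the block size, coefficient count, and thresholds are illustrative assumptions, not values taken from any of the surveyed papers:

```python
import numpy as np
from scipy.fft import dctn

def block_features(gray, bs=8, n_coeffs=9):
    """Slide a bs x bs window over the image and keep the first few
    flattened DCT coefficients of each block as its descriptor (a crude
    low-frequency proxy for the handcrafted descriptors surveyed above)."""
    h, w = gray.shape
    feats, coords = [], []
    for y in range(h - bs + 1):
        for x in range(w - bs + 1):
            c = dctn(gray[y:y + bs, x:x + bs], norm='ortho')
            feats.append(c.flatten()[:n_coeffs])
            coords.append((y, x))
    return np.array(feats), coords

def find_duplicates(feats, coords, dist_thr=0.5, min_offset=8):
    """Lexicographically sort the descriptors, compare neighbours in the
    sorted order, and report block pairs that are near-identical but
    spatially far apart (a spatial guard against trivial self-matches)."""
    order = np.lexsort(feats.T[::-1])  # column 0 is the primary sort key
    pairs = []
    for i, j in zip(order[:-1], order[1:]):
        if np.linalg.norm(feats[i] - feats[j]) < dist_thr:
            (y1, x1), (y2, x2) = coords[i], coords[j]
            if abs(y1 - y2) + abs(x1 - x2) >= min_offset:
                pairs.append((coords[i], coords[j]))
    return pairs
```

On a synthetic image with one copied patch, the duplicated block pair surfaces as two identical descriptors that become adjacent after sorting; a real system would add RANSAC filtering and morphological cleanup, as described above.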
  • Keypoint-based methods avoid dividing the image into blocks. Instead, they detect keypoints such as corners and edges using algorithms like SIFT [14,61] or SURF [6], and match them through clustering or nearest-neighbor techniques. While computationally efficient and robust to transformations, these methods struggle in uniform regions and may misclassify genuinely similar regions as forgeries [62].
    Wang et al. [63] introduced a keypoint-based approach for detecting CMF in digital images, combining the SURF algorithm with Polar Complex Exponential Transform (PCET) techniques. The method began by dividing the image into non-overlapping irregular blocks through superpixel segmentation, which were subsequently categorized as either smooth or textured regions. Keypoints were then extracted using the SURF algorithm, and PCET coefficients were calculated to facilitate the identification of similar features via feature matching. To reduce false positives (FPs) and accurately pinpoint areas with a high concentration of matching points, a filtering strategy coupled with the RANSAC algorithm was applied. The detected tampered regions were further refined through mathematical morphology and an iterative procedure. Comparative evaluations indicated that this approach outperformed other CMFD methods, particularly in high-brightness smooth areas and in images containing visually similar genuine regions. Moreover, it demonstrated robustness against various distortions, such as rotation, scaling, blurring, JPEG compression, and added noise.
    In [64], the authors investigated CMFs that exploit homogeneous regions to conceal or replicate objects, making them difficult to detect; conventional keypoint-based CMFD methods struggle in such areas due to a lack of distinctive features. The study evaluated SIFT, SURF, and A-KAZE for CMFD, showing that A-KAZE achieved the highest accuracy on the NB-CASIA dataset (98.98%), outperforming SURF (93.9%) and SIFT (89.2%). A-KAZE proved particularly effective for large-scale forgeries in uniform regions while maintaining low computational cost.
    In [65], Yang et al. implemented the SIFT algorithm during the feature extraction phase to identify keypoints in the original image. For feature matching, the k-Nearest Neighbor (kNN) algorithm was used to determine potential matching pairs from the detected keypoints. During the post-processing stage, a custom Two-Stage Filtering (TSF) algorithm was introduced to filter out false matches while retaining true ones. The TSF process involved two steps: (1) the grid-based filter algorithm, which performed an initial refinement of the keypoints, and (2) the clustering-based filter algorithm, which further refined the keypoints by grouping them, identifying inliers and outliers and discarding keypoints in outlier groups. Finally, the Delaunay triangulation method was utilized for image matting to highlight the forged regions.
    An overview of keypoint-based methods for CMFD is illustrated in Table 3.
    Despite their effectiveness, conventional methods suffered from several inherent limitations, including the need for extensive manual parameter tuning, strong dependency on specific datasets, and limited generalizability across diverse image conditions. Moreover, their performance was often constrained by handcrafted feature design, which lacked adaptability to unseen manipulation patterns or varying acquisition settings. Such dependence on fixed thresholds and manually engineered descriptors hindered scalability and reproducibility, making these methods less reliable in real-world forensic scenarios where noise, compression, and complex transformations are prevalent.
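The matching stage shared by these keypoint-based pipelines can be illustrated with a self-matching two-nearest-neighbour ratio test (in the spirit of SIFT/SURF matching, followed by a spatial-distance check against trivial self-similarity). The descriptors below are synthetic, and the ratio and offset thresholds are illustrative assumptions:

```python
import numpy as np

def copy_move_matches(desc, pts, ratio=0.6, min_offset=10.0):
    """For each keypoint, find its two nearest neighbours among the other
    keypoints of the same image; keep the match when the nearest one is
    much closer than the second (Lowe-style ratio test) and lies far
    enough away spatially to suggest a duplicated region."""
    matches = []
    for i in range(len(desc)):
        d = np.linalg.norm(desc - desc[i], axis=1)
        d[i] = np.inf                    # exclude the keypoint itself
        j, k = np.argsort(d)[:2]         # nearest and second nearest
        spatially_far = np.linalg.norm(pts[i] - pts[j]) >= min_offset
        if d[j] < ratio * d[k] and spatially_far:
            matches.append((i, int(j)))
    return matches
```

A planted near-duplicate descriptor pair at distant coordinates is recovered as a symmetric match; in a full pipeline, clustering or RANSAC would then estimate the geometric transform between the matched clouds.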

5.2. Hybrid CMFD Techniques

Hybrid methods [73,74] combine two or more complementary detection strategies [75], such as block-based and keypoint-based techniques, or handcrafted features with deep-learning models [63,73,76,77], to improve both the accuracy and efficiency of CMFD. Such approaches can detect multiple forgeries, handle various geometric and photometric transformations, and perform well on images with complex backgrounds or overlapping regions, though they often incur higher computational costs and require careful parameter selection and tuning [78,79].
Alhaidery et al. [73] proposed a passive image forensic framework that combined adaptive and hybrid strategies. The framework comprised a preparatory phase followed by three processing layers. In the preparatory phase, Haar DWTs were used for automated image segmentation. The first layer applied simple linear iterative clustering to segment images and classified the segments using entropy measures. The second layer employed SURF and Histogram of Oriented Gradients (HOG) descriptors to perform region-based keypoint detection. The third layer incorporated a probabilistic false-positive suppression module that refined the true-positive rate (TPR) and false-positive rate (FPR) to enhance recognition accuracy. The pipeline was evaluated on the IMD and MICC-F220 benchmark datasets, achieving promising results and demonstrating robustness against photometric and geometric transformations. In a subsequent study [74], the same authors developed a CMFD scheme focusing specifically on detecting duplicated regions. This approach integrated SURF and Maximally Stable Extremal Region (MSER) detectors with unique vector representations, complemented by newly introduced parallel and distance-ratio filters to enhance detection performance. Experiments on the MICC-F220 dataset and the F8 multi-datasets demonstrated the method’s robustness to post-processing attacks, achieving a high TPR and a low FPR, and improving upon the previous framework by specifically targeting CMF.
Nirmal et al. [80] introduced a CMFD approach that combined the Histogram of Oriented Gradients (HOG) descriptor with Local Binary Pattern Variance (LBPV) algorithms. The method located forged regions through post-processing procedures and was evaluated on six benchmark datasets (UCID, MICC-F220, CASIA, MICC-F2000, TIDE, and CoMoFoD), which included a wide variety of transformations such as translation, flipping, scaling, rotation, color adjustments, brightness variations, and JPEG compression. Performance was assessed using pixel-level metrics, namely the FPR and True Detection Rate (TDR). Table 4 illustrates an overview of hybrid CMFD methods.
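To make the texture side of such hybrid descriptors concrete, a minimal local binary pattern histogram can be computed as follows. This is a simplified, non-variance cousin of the LBPV cue used above; the 8-neighbour sampling and 256-bin histogram are illustrative simplifications, not the exact formulation of any surveyed method:

```python
import numpy as np

def lbp_histogram(gray):
    """8-neighbour local binary pattern: each interior pixel gets an
    8-bit code from thresholding its neighbours against it, and the
    image is summarised as a normalised 256-bin histogram of codes."""
    c = gray[1:-1, 1:-1]
    code = np.zeros_like(c, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

In a hybrid pipeline this histogram would be concatenated with a gradient descriptor such as HOG before matching or classification.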

5.3. Deep-Learning-Based CMFD Methods

Over the last few years, deep-learning techniques have gained significant attention in the field of computer vision, specifically in forgery detection [87]. These methods excel in automatically learning hierarchical feature representations directly from data, enabling the extraction of rich semantic information without manual feature engineering. However, a major challenge associated with deep-learning approaches is their strong dependence on large datasets for effective training. To mitigate this limitation, several strategies have been put forward in the literature, as outlined below.
  • CMFD using CNNs: Researchers have adapted deep-learning models [88,89,90,91,92,93,94,95] such as the VGG series, DenseNet, and GoogleNet architectures for forgery detection by fine-tuning their layers and training them on domain-specific datasets [96]. AlexNet is a widely used pre-trained CNN comprising eight learnable layers, five convolutional and three fully connected, with a softmax output that distinguishes between 1000 classes. Trained on the large-scale ImageNet dataset with approximately 60 million parameters, it demonstrates strong generalization capability. In such pipelines, the Relief feature selection algorithm can then be applied to extract relevant features from the network, which are subsequently classified as authentic or forged using logistic regression [93].
    Several pre-trained networks have been evaluated in combination with different classifiers on the MICC-F600 and MICC-F2000 datasets. In a related study, Hebbar et al. [97] explored the use of transfer learning for detecting copy-move and splicing forgery, employing models such as VGG16, VGG19, ResNet50, and DenseNet. To enhance detection accuracy, images were preprocessed using Error Level Analysis (ELA), which highlighted forgery regions by comparing the original image with its recompressed version at 90% quality. The preprocessed images were then used for model fine-tuning. During classification, feature maps extracted by the models passed through a Global Average Pooling (GAP) layer, followed by a dense layer with 512 neurons, a dropout layer (0.25) and a final dense layer with a sigmoid activation function to distinguish between authentic and forged images. This method was evaluated on the CASIA V2 dataset.
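The ELA preprocessing step described above can be reproduced in a few lines. This is a sketch, assuming Pillow is available, using the 90% recompression quality mentioned in the study; the difference amplification and the network head that follows it are omitted:

```python
from io import BytesIO

import numpy as np
from PIL import Image

def error_level_analysis(img, quality=90):
    """Recompress the image as JPEG at the given quality and return the
    per-pixel absolute difference; regions pasted from another source
    tend to leave a different error level than their surroundings."""
    buf = BytesIO()
    img.convert('RGB').save(buf, format='JPEG', quality=quality)
    buf.seek(0)
    recompressed = np.asarray(Image.open(buf), dtype=np.int16)
    original = np.asarray(img.convert('RGB'), dtype=np.int16)
    return np.abs(original - recompressed).astype(np.uint8)
```

The resulting difference map is what would be fed to the fine-tuned network in place of the raw image.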
    Accordingly, Table 5 provides an overview of several forgery detection studies employing this approach. The analysis of this table highlights the growing adoption of CNN architectures for CMFD and reveals a clear trend toward repurposing deep-learning models originally designed for image classification to extract discriminative features and identify duplicated regions. While these architectures have demonstrated competitive performance across several datasets, a closer examination indicates that their effectiveness remains highly dependent on network depth, dataset diversity, and post-processing refinement. Their primary strength, moreover, lies in feature extraction and classification rather than precise pixel-level localization. This limitation becomes particularly evident when dealing with subtle or small duplicated regions, where global features may overlook fine-grained spatial inconsistencies. Furthermore, most CNN-based CMFD frameworks rely heavily on supervised learning, requiring large, well-annotated datasets that are often unavailable or domain-specific. As a result, their generalization capability remains limited when confronted with unseen manipulations or cross-dataset variations. In addition, the lack of explicit spatial constraints within convolutional layers can lead to localization ambiguity, especially in cases involving geometric transformations such as rotation, scaling, or partial occlusion. Therefore, despite their strong representational power, CNN-based methods still face challenges in achieving robust, interpretable, and pixel-accurate forgery detection. To overcome these constraints, recent research has shifted toward hybrid architectures that integrate CNNs with attention mechanisms or transformer-based encoders.
Table 5. Overview of CMFD studies using CNNs.
Ref | Year | Dataset | Techniques | Performance
[98] | 2019 | GRIP | Pre-trained AlexNet model with block-based method | Combines the advantages of a pre-trained AlexNet and a block-based approach
[99] | 2019 | CASIA | CNN with block-based method and Zernike moments | Combines the advantages of deep learning and a block-based approach, detecting both forged images and their types
[97] | 2021 | CASIA V2 | ResNet50, VGG16, VGG19, and DenseNet networks | Utilizes pre-trained networks and transfer learning with ELA images to emphasize forgery regions
[96] | 2021 | MICC-F2000 | Dual-branch CNN with different filters and kernel sizes | Uses a dual-branch CNN with a functional API, enhancing robustness to scaling through filters of varying kernel sizes
[93] | 2022 | MICC-F600, MICC-F2000 | AlexNet with logistic regression | Employs the pre-trained AlexNet model as the feature extractor
[100] | 2025 | 4 publicly available datasets | Lightweight CNN-based CMFD network (ST-Net) with Selective Sampling Attention (SSA), Two-Step Self-Correlation Calculation (TS-SCC), dual-branch adaptive feature fusion, multiscale atrous convolutions | Outperforms several related CMFD networks in detection accuracy, number of parameters, computational cost, and inference time
[101] | 2025 | CoMoFoD, CASIA, COVERAGE | Recursive wavelet transform network: multi-stage wavelet transform, sorted convolution, adaptive multi-scale attention fusion, diagonal-guided self-correlation, U-Net localization | High localization accuracy; robust to geometric transformations
  • CMFD using object detection networks: Scientists have adapted object detection frameworks, including R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN [102], to detect forged regions by fine-tuning their layers and training them on specialized datasets. Detecting objects at different scales, especially small ones, remains challenging. The Feature Pyramid Network (FPN) addresses this issue by combining features from multiple convolutional layers to improve multi-scale detection performance [103]. A copy-move and splicing forgery detection approach integrating the FPN with Mask R-CNN was explored in [102], following three stages: feature extraction (using ResNet-101), region proposal and alignment (via Region of Interest (RoI) Align), and prediction (using an FCN framework). To enhance forgery localization accuracy, the Sobel edge detection filter was incorporated into the loss function. The method was evaluated on the COVERAGE and DVMM datasets.
    Another study [104] proposed a two-stream Faster R-CNN for forgery detection, utilizing both RGB and noise images. The RGB stream extracted features using ResNet-101, which were then fed into the Region Proposal Network (RPN) and RoI pooling for bounding-box regression and manipulation classification. Since RGB features alone were insufficient, a second stream processed noise images generated via a steganalysis rich model (SRM) filter. The extracted noise features were then combined with the RGB features to improve the detection of copy-move, splicing, and object-removal forgeries. This method was tested on the NIST16, COVERAGE, CASIA, and Columbia datasets. However, despite its promising dual-stream design, the approach remained limited by its reliance on supervised learning and the computational overhead of the Faster R-CNN framework. Moreover, its performance degraded when confronted with subtle manipulations or low-quality images, indicating that noise-based cues alone may not provide sufficient discriminative power for generalized forgery detection.
    Additionally, a two-stage constrained R-CNN with Mask R-CNN was introduced in [105]. In the first stage, a constrained convolution layer and early ResNet-101 blocks extracted a unified feature representation, which was passed to an attention-based RPN for RoI identification. In the second stage, a skip structure merged low-level (Conv-3x) features with high-level Convolutional Block Attention Module (CBAM) features, enhancing global representations, and a softmax layer then classified boundary pixels. This approach was evaluated on the COVERAGE, Columbia, NIST16, and CASIA datasets. The integration of CBAM significantly improved feature discrimination and spatial awareness, allowing the network to better capture subtle boundary cues between authentic and tampered regions. However, this two-stage framework also increased computational complexity and inference time due to the additional attention and mask generation processes. Moreover, its dependency on region proposals limited scalability and reduced robustness when applied to high-resolution or heavily compressed images. As a result, although the method achieved high detection accuracy on benchmark datasets, its practical deployment in real-world forensic analysis remains challenging. Table 6 provides an overview of several studies employing object detection networks for CMFD, highlighting the architectures, datasets, and key findings. The analysis of Table 6 shows that object detection-based CMFD methods increasingly exploit advanced region proposal and segmentation frameworks such as Faster R-CNN and Mask R-CNN to achieve precise localization of tampered areas. While these approaches demonstrate strong detection accuracy and robustness to geometric and compression variations, they often suffer from high computational cost and dependency on large annotated datasets.
Moreover, the reliance on bounding-box or mask-based localization can introduce coarse boundary estimations, limiting their effectiveness for subtle or small-scale manipulations. Overall, these models illustrate a promising evolution toward region-aware detection but still face challenges related to efficiency, generalization, and scalability in real-world forensic scenarios.
Table 6. Overview of CMFD methods utilizing object detection networks.
Ref | Year | Dataset | Techniques | Performance
[104] | 2018 | COVERAGE, CASIA, NIST16 | Two-stream Faster R-CNN with RGB and noise images | Robust to resizing/compression, leveraging color/noise to detect and specify forgery types
[102] | 2019 | COVERAGE, DVMM | Feature pyramid network with ResNet-101 backbone, Mask R-CNN, and Sobel filter | Robust to compression/resizing, with Sobel loss improving accuracy for multi-scale forgery detection
[105] | 2020 | NIST16, COVERAGE, CASIA, Columbia | Constrained R-CNN | Detects pixel forgery and its type, using constrained convolution and feature fusion for robust detection
[106] | 2022 | CoMoFoD, MICC-F2000, CASIA V2 | DenseNet-41 with Mask R-CNN | Robust to transformations, but sensitive to large light variations
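Since the methods in Table 6 localize forgeries with bounding boxes or masks, their localization quality is commonly scored with intersection-over-union (IoU). A minimal sketch for axis-aligned boxes given as (x1, y1, x2, y2):

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2),
    the standard overlap criterion for box-level forgery localization."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A predicted box is usually counted as a correct detection when its IoU with the ground-truth tampered region exceeds a fixed threshold (0.5 is a common choice), which is one reason coarse box boundaries penalize small or subtle manipulations.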
  • CMFD using autoencoder networks: An autoencoder consists of an encoder, which compresses the input image into a latent feature representation, and a decoder, which reconstructs the image from it; inconsistencies between the input and its reconstruction can reveal tampering. The encoder–decoder structure is often symmetric, though some studies [109] introduced asymmetric designs or multiple input streams to improve forgery detection. For forgery detection, instead of reconstructing the entire image, the network can be trained, typically on datasets with ground-truth masks, to output a binary mask that separates forged from authentic pixels, which allows various forgery types to be detected [107,108].
    Unlike traditional reconstruction-based autoencoders, forgery detection models generate binary masks to accurately distinguish manipulated from authentic pixels. For instance, Ding et al. [110] suggested DCU-Net, a dual-channel U-shaped network specifically designed for splicing forgery detection. It consisted of encoder, feature-fusion, and decoder stages. Unlike standard autoencoders, DCU-Net’s encoder incorporated two input channels: an RGB image and a residual image (extracted via high-pass filtering). This dual-channel design improved edge and content feature extraction. Feature fusion occurred in two steps: first, deep features from both channels were combined; second, multiscale dilated convolution layers refined forgery detection across various region sizes. In the decoder, low-resolution features were upsampled, while skip connections restored lost semantic information. The final stage applied conditional random fields and morphological operations to enhance pixel-level localization accuracy. DCU-Net was evaluated on the CASIA and Columbia datasets.
    In [111], the authors proposed an encoder–decoder CNN framework for copy-move forgery detection, which utilized images at multiple scales to capture both global and local features. The network employed varied layers and kernel sizes, enabling more comprehensive feature extraction across different image resolutions. This multi-scale design enhanced the model’s robustness to common image degradations, such as brightness variations, additive noise, and geometric scaling. Evaluations on the CoMoFoD and CMFD datasets demonstrated that the method effectively detected forged regions, even under challenging conditions. However, the approach was limited by increased computational complexity due to multi-scale processing, and its reliance on CNN-based feature extraction potentially restricted adaptability to manipulations exhibiting subtle or highly textured patterns. Overall, while this encoder–decoder design improved resilience to typical image variations, further work would be needed to enhance generalization across diverse forgery types and real-world scenarios. Table 7 summarizes recent CMFD studies employing autoencoder-based networks. These methods leveraged encoder–decoder architectures, often enhanced with CNNs, LSTMs, or hybrid modules, to capture both spatial and temporal dependencies and improve pixel-level forgery localization. While they demonstrated robustness to brightness variations, noise, compression, and geometric scaling, most approaches suffered from high computational complexity, long training times, and dependency on large annotated datasets. Moreover, despite their ability to detect various forgery types, subtle manipulations in highly textured or low-resolution regions remained challenging, indicating that autoencoder-based CMFD methods still require further refinement for scalable, real-world deployment.
Table 7. Overview of CMFD studies using autoencoder networks.
Ref | Year | Dataset | Techniques | Performance
[112] | 2019 | NIST16, COVERAGE | (CNN with LSTM) encoder and decoder | Integrates spatial and frequency information to enhance performance and detect all forgery types
[109] | 2020 | MIT dataset | (CNN with LSTM) encoder and (CNN) decoder | The CNN detects similarities and the LSTM eliminates wrongly identified forgery regions, making it robust to compression and noise
[113] | 2021 | GRIP, DVMM, CMFD, BSDS300 | VGG16, CNN for splicing and copy-move, and encoder–decoder for pixel forgery | Detects and classifies forgery types but has high computational complexity, preprocessing, and training time
[111] | 2022 | CoMoFoD, CMFD | Encoder (CNN)–decoder (CNN) | Uses images at different scales and networks with varied layers and kernel sizes for improved feature extraction; robust to brightness changes, noise, and scaling
[108] | 2022 | COVERAGE, NIST16, CASIA | Encoder (LSTM with rotating residual units)–decoder | Combines LSTM resampling and rotating residual units to highlight inconsistencies between authentic and forged regions, detecting all forgery types
[107] | 2022 | COVERAGE, NIST16, CASIA, CoMoFoD | U-Net (encoder (ResNet)–decoder) | Identifies all forgery types but has high computational complexity
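Because these autoencoder-style models output binary forgery masks, they are usually evaluated with pixel-level precision, recall, and F1 against the ground-truth mask. A minimal sketch:

```python
import numpy as np

def mask_scores(pred, gt):
    """Pixel-level precision, recall, and F1 of a predicted binary
    forgery mask against the ground-truth mask."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Pixel-level scoring is stricter than image-level classification accuracy: a model can classify an image as forged correctly while still producing a poorly aligned mask.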
  • CMFD using generative adversarial networks (GANs): GANs consist of a generator, which creates samples resembling the training data, and a discriminator, which evaluates their authenticity. GAN-based methods for CMFD [114] leverage techniques such as one-class classification to identify anomalies. Abdalla et al. [115] put forward a three-branch framework for CMFD based on a GAN architecture, comprising a generator and a discriminator. The generator produced forged variations of the input data, whereas the discriminator classified pixels as genuine or manipulated. During training, the generator aimed to deceive the discriminator, which simultaneously learned to accurately detect forged regions. Complementing the GAN, a tailored CNN model extracted features to identify similar regions, particularly those affected by copy-move operations. The network integrated convolution operations, self-correlation mechanisms, pooling layers, and dense connections to generate a binary mask representing the duplicated regions. A linear Support Vector Machine (SVM) was subsequently utilized to perform classification using the combined outputs of the GAN and the CNN. The GAN was pre-trained and evaluated on CIFAR-10 and MNIST, while the CNN and SVM were trained and tested on a custom dataset containing authentic and forged images. Although this GAN-based framework demonstrated innovative integration of generative and discriminative components, it remained limited by the reliance on synthetic pre-training, potential overfitting to small custom datasets, and high computational demands, which may hinder generalization to complex, real-world forgeries. Islam et al. [114] proposed a CMFD and localization technique based on a GAN-driven deep model, called DOA-GAN. In this architecture, the generator employed a dual-order attention module, enabling the extraction of forgery-aware attention maps and capturing co-occurrence relationships across image patches.
The discriminator ensured the accuracy of the predicted masks. The results revealed that DOA-GAN outperformed state-of-the-art methods, providing finer copy-move masks and accurately distinguishing source and target regions. However, despite its improved localization and attention mechanisms, the method remained computationally intensive and heavily dependent on the availability of large, well-annotated training datasets, which may limit its applicability to diverse real-world images and subtle forgery scenarios.
    Collectively, these studies have demonstrated the effectiveness of GAN-based deep-learning approaches for CMFD. Such methods are capable of learning complex image representations and generating corresponding masks that precisely delineate duplicated regions, showing strong potential for diverse multimedia applications.
    The summary of several CMFD studies using GANs is presented in Table 8. From an analytical perspective, while GAN-based approaches demonstrate notable robustness to geometric distortions and limited availability of forged samples, they are inherently constrained by substantial data and computational requirements. Furthermore, their effectiveness diminishes when addressing subtle, small-scale, or homogeneous duplicated regions, indicating that additional methodological advancements are required to ensure reliable performance in practical forensic applications.
Table 8. Overview of CMFD methods utilizing GANs.
Ref | Year | Dataset | Techniques | Performance
[116] | 2018 | European credit card transactions | GAN with one-class classification approach | Suitable when forgery samples are lacking or training datasets are imbalanced
[117] | 2018 | NASA satellite images | Block-based with GAN and one-class approach | Relies only on authentic samples for training, with no assumptions regarding forgery
[115] | 2019 | MICC-F600, CoMoFoD | GAN with SVM | Requires a large amount of data to train the GAN network
[114] | 2020 | CASIA, CoMoFoD | Dual-order attentive GAN | Resilient to geometric distortions and subsequent processing, though its performance decreases in the presence of small or homogeneous regions
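The adversarial training these methods rely on optimizes the standard GAN minimax objective, which in its generic form reads:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```

In mask-oriented CMFD variants such as DOA-GAN, the generator maps an input image to a predicted forgery mask rather than sampling from a noise prior, and the discriminator scores the plausibility of the predicted masks, but the alternating minimax training principle remains the same.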
  • CMFD using Recurrent Neural Networks (RNNs): RNNs are designed to capture dependencies within data and, in image analysis [118], can model relationships between pixels. In authentic images, pixel color dynamics exhibit strong dependencies, whereas forged regions introduce spatial inconsistencies. Researchers therefore use RNNs [119], often in combination with CNNs and autoencoders, for pixel-level forgery detection; in particular, Long Short-Term Memory (LSTM) networks are employed to learn spatial relationships and classify image patches as forged or authentic.
    Researchers [112,118,119] have employed RNNs to detect forged patches or pixels, often combined with CNNs or autoencoders to refine detection and reduce false positives. For example, a pixel-level forgery detection method combining CNN and LSTM was suggested in [118]. First, image patches were extracted using a sliding window. The model consisted of five convolutional layers and a three-layer LSTM network. Initial convolutional layers captured low-level features, which were then processed into 8×8 blocks and fed into the LSTM to learn spatial relationships. A softmax classifier categorized patches as forged or authentic, followed by additional convolutional layers to generate a confidence score map. The model, trained end-to-end, detected multiple forgery types and was evaluated on the COVERAGE and NIST datasets.
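The patch-extraction and blocking stages described above can be sketched in a few lines of NumPy. The sizes (32-pixel windows, 16-pixel stride, 8×8 blocks for the LSTM input) follow the description in [118]; the function names are illustrative, not from the original work:

```python
import numpy as np

def extract_patches(image, patch=32, stride=16):
    """Slide a window over a grayscale image and collect overlapping patches."""
    h, w = image.shape
    patches = [image[y:y + patch, x:x + patch]
               for y in range(0, h - patch + 1, stride)
               for x in range(0, w - patch + 1, stride)]
    return np.stack(patches)

def to_blocks(feature_map, block=8):
    """Split a square map into non-overlapping block x block tiles,
    the form fed to the LSTM stage described above."""
    n = feature_map.shape[0] // block
    tiles = feature_map[:n * block, :n * block]
    return tiles.reshape(n, block, n, block).swapaxes(1, 2).reshape(-1, block, block)

img = np.random.rand(64, 64)     # stand-in for a grayscale image
patches = extract_patches(img)   # 3 x 3 window positions -> 9 patches
blocks = to_blocks(patches[0])   # raw patch stands in for the CNN feature map here
```

In the full pipeline, each patch would first pass through the convolutional layers; the raw patch above merely stands in for the resulting feature map.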
    Another approach [119] integrated CNN with ConvLSTM for CMFD. ConvLSTM, which replaced fully connected layers with convolutional ones, encoded spatial–temporal information. This method, tested on the MICC-F220, MICC-F600, MICC-F2000 and SATs-130 datasets, enhanced detection accuracy by preserving spatial correlations across pixels.
    Table 9 summarizes forgery detection studies using RNNs. These architectures, particularly when combined with CNNs, effectively capture both spatial and temporal dependencies, improving the detection of sequential or contextual inconsistencies across image regions. However, despite their ability to handle complex feature correlations and enhance generalization, RNN-based methods often incur higher computational costs and longer training times. Moreover, their reliance on temporal modeling (originally designed for sequential data) may limit efficiency in purely spatial forgery detection tasks, suggesting that hybrid or attention-based extensions could offer more balanced solutions.

6. Challenges and Future Directions

The field of CMFD continues to face numerous technical, methodological, and ethical challenges that make it a complex and rapidly evolving research area. Existing approaches, ranging from traditional feature-based algorithms to modern deep-learning architectures, often encounter inherent limitations that restrict their robustness, scalability, and generalization capabilities. These obstacles underscore the urgent need for continuous innovation and methodological refinement to ensure the reliability and adaptability of CMFD systems. This section provides an in-depth discussion of key challenges while highlighting possible future directions, integration strategies, and emerging opportunities for improvement.
  • Computational complexity and lack of real-time performance: Most CMFD models remain computationally demanding and are not yet optimized for real-time deployment, especially when processing large, high-resolution images. This limitation is critical in time-sensitive or large-scale contexts such as social media monitoring, CCTV surveillance, and digital media forensics, where latency and scalability are major constraints. Conventional feature-based methods (e.g., SIFT and SURF) often generate thousands of keypoints per image, while deep CNN models (e.g., ResNet and EfficientNet) impose heavy GPU and memory loads during inference. Future research should focus on lightweight and energy-efficient architectures (such as MobileNetV3, ShuffleNet, or quantized CNNs), combined with pipeline optimization and on-the-fly feature matching, to achieve near real-time performance. Techniques like model pruning, knowledge distillation, dynamic inference, and the integration of GPU acceleration or edge frameworks (e.g., TensorRT and ONNX Runtime) can further enhance scalability, reduce latency, and enable efficient CMFD deployment on resource-constrained devices.
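As a toy illustration of the pruning idea mentioned above, magnitude-based sparsification of a dense weight matrix can be sketched in pure NumPy. The sparsity level is arbitrary, and real pruning pipelines operate inside the training framework and usually fine-tune afterwards:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights; the surviving entries can
    then be stored and multiplied sparsely to cut inference cost."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    threshold = np.partition(flat, k)[k] if k < len(flat) else np.inf
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))            # stand-in for a dense layer
pruned, mask = magnitude_prune(w, sparsity=0.9)
# roughly 10% of the weights survive; accuracy is typically recovered by fine-tuning
```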
  • Vulnerability to complex attacks: CMFD algorithms often lose robustness against sophisticated manipulations involving scaling, rotation, compression, or added noise. Adversarial attacks or compound transformations can disguise duplicated regions, reducing detection accuracy. For example, attackers can introduce local geometric distortions or apply post-processing filters that obscure copy-move traces, causing deep networks to misclassify or overlook tampered regions. Future work should emphasize adversarially trained CMFD models and data augmentation strategies that simulate realistic forgeries. The integration of self-supervised or contrastive learning could also enhance robustness by teaching models to recognize intrinsic structural consistencies even under degradation.
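One way to realize such an augmentation strategy is to synthesize copy-move forgeries on the fly during training. The sketch below (with illustrative parameters) copies a patch, applies a random 90° rotation and mild noise, and pastes it elsewhere, returning the tampered image with a ground-truth mask; note that real benchmarks often annotate both the source and target regions:

```python
import numpy as np

def synthesize_copy_move(image, rng, size=16):
    """Create a synthetic copy-move forgery with light post-processing."""
    img = image.copy()
    h, w = img.shape
    sy, sx = rng.integers(0, h - size), rng.integers(0, w - size)
    patch = img[sy:sy + size, sx:sx + size].copy()
    patch = np.rot90(patch, k=rng.integers(0, 4))     # random 90-degree rotation
    patch = patch + rng.normal(0, 0.02, patch.shape)  # mild sensor-like noise
    ty, tx = rng.integers(0, h - size), rng.integers(0, w - size)
    img[ty:ty + size, tx:tx + size] = np.clip(patch, 0.0, 1.0)
    mask = np.zeros_like(img, dtype=bool)
    mask[ty:ty + size, tx:tx + size] = True           # marks the pasted region only
    return img, mask

rng = np.random.default_rng(42)
clean = rng.random((64, 64))
forged, mask = synthesize_copy_move(clean, rng)
```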
  • Localization precision: Accurate localization of forged regions remains a persistent issue, particularly when copied areas are small, overlapping, or seamlessly integrated into complex backgrounds. Most patch-based CNNs or block-matching methods still suffer from boundary uncertainty and tend to generate coarse detection masks. To improve spatial accuracy, attention-based networks or transformer architectures can be employed to focus on fine-grained inconsistencies. Hybrid strategies combining pixel-level refinement with contextual analysis, such as U-Net or Mask R-CNN variants adapted for forgery detection, can further improve localization precision.
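Localization precision is usually quantified with the IoU and Dice (DSC) measures used throughout this survey; a minimal NumPy version, applied to two slightly misaligned masks, makes the boundary sensitivity concrete (the example masks are hypothetical):

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union between binary localization masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def dice(pred, gt):
    """Dice similarity coefficient (DSC), somewhat more forgiving near boundaries."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2 * inter / total if total else 1.0

gt = np.zeros((8, 8), dtype=bool); gt[2:6, 2:6] = True      # 16-pixel forged region
pred = np.zeros((8, 8), dtype=bool); pred[3:7, 3:7] = True  # detection shifted by 1 pixel
# a one-pixel shift already drops IoU to 9/23 and Dice to 0.5625
```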
  • Scale and rotation invariance: Manipulations involving scaling and rotation still present challenges, especially for CNN-based models trained on limited transformation variations. Feature-invariant descriptors (e.g., ORB and Zernike moments) have shown partial success, but deep-learning alternatives must explicitly model geometric transformations. Future research should explore Spatial Transformer Networks (STNs) or equivariant CNNs capable of learning transformation-invariant representations directly, thereby enhancing robustness against diverse manipulation scenarios.
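The idea of transformation-invariant representations can be illustrated with a toy descriptor: mean intensities over concentric rings are unchanged by rotation, loosely mirroring what Zernike-moment or equivariant-CNN features achieve in a principled way. The function below is illustrative, not a production descriptor:

```python
import numpy as np

def radial_descriptor(patch, n_rings=4):
    """Mean intensity per concentric ring around the patch centre:
    a toy rotation-invariant feature for a square patch."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    y, x = np.mgrid[0:h, 0:w]
    r = np.hypot(y - cy, x - cx)                      # distance to the centre
    edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    ring = np.digitize(r, edges) - 1                  # ring index per pixel
    return np.array([patch[ring == i].mean() for i in range(n_rings)])

rng = np.random.default_rng(1)
p = rng.random((17, 17))
d0 = radial_descriptor(p)
d90 = radial_descriptor(np.rot90(p))  # rotation permutes pixels within rings only
```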
  • High FPR: Many CMFD techniques produce high FPRs, flagging genuine textures as duplicated regions, particularly in images with repetitive patterns such as foliage or bricks. Reducing the FPR without sacrificing sensitivity remains a challenging trade-off. Future systems may benefit from uncertainty quantification or confidence-aware fusion, integrating statistical thresholds with semantic-level context to distinguish authentic repetitions from manipulations. For instance, combining CNN-based similarity maps with structural priors (edges, symmetry cues) could substantially reduce false detections in textured backgrounds.
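Tracking this trade-off requires pixel-level TPR and FPR; a minimal computation from binary masks might look as follows (the masks are hypothetical example data):

```python
import numpy as np

def tpr_fpr(pred, gt):
    """Pixel-level true-positive and false-positive rates of a detection mask."""
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    return tp / (tp + fn), fp / (fp + tn)

gt = np.zeros(100, dtype=bool); gt[:20] = True     # 20 truly forged pixels
pred = np.zeros(100, dtype=bool); pred[10:40] = True  # detector over-flags texture
tpr, fpr = tpr_fpr(pred, gt)   # tpr = 10/20, fpr = 20/80
```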
  • Generalization to diverse manipulation scenarios: Generalization remains one of the weakest points of current CMFD systems. Models trained on specific datasets (e.g., CoMoFoD or MICC-F220) often fail on unseen forgery types. This is partly due to overfitting to specific data distributions and lack of domain diversity. Future directions should explore cross-dataset learning, domain adaptation, and generative data synthesis (using GANs) to improve robustness to new manipulations. Meta-learning or few-shot CMFD frameworks could further enable rapid adaptation to novel manipulation styles with minimal retraining.
  • Dependence on training data quality: Machine learning-based CMFD methods are highly dependent on dataset quality. Current datasets often contain limited manipulation styles, artificial post-processing, or non-representative textures, reducing real-world applicability. Efforts should focus on building large-scale, balanced and publicly available datasets that simulate real-life forgery across multiple compression levels, devices and contexts. For example, a “CMFD-RealWorld” benchmark integrating authentic camera pipelines and complex environmental noise would substantially improve the ecological validity of future evaluations.
  • Integration challenges with existing systems: Embedding CMFD solutions into existing digital ecosystems, such as social media moderation, journalism verification pipelines or forensic software, remains technically difficult. Challenges include compatibility, data security, and scalability across heterogeneous infrastructures. Developing modular API-based CMFD frameworks that can plug into existing verification platforms is essential. Edge-AI deployment and federated learning approaches may also help preserve data privacy while maintaining efficient detection across distributed networks.
  • Ethical and privacy considerations: The adoption of the CMFD technology raises important ethical and privacy questions. Misclassification could lead to reputational harm, false accusations, or censorship, especially in journalism or legal contexts. To mitigate these risks, CMFD systems must incorporate explainability mechanisms that allow transparent interpretation of detection results. Collaborations with ethicists and policymakers are also required to establish responsible AI guidelines, ensuring fair and accountable use of image forensics technologies.
By addressing the aforementioned challenges through innovation in algorithm design, dataset creation, system integration, and ethical governance, CMFD can evolve into a more reliable, efficient, and socially responsible technology. Future research should also investigate hybrid CMFD architectures that integrate complementary paradigms, combining deep-learning features with handcrafted or statistical cues to simultaneously enhance interpretability, robustness, and computational efficiency. Concrete examples of such hybrid designs in related domains, such as deepfake detection, demonstrate their potential to enhance both accuracy and resilience.

7. Discussion

Numerous techniques for CMFD have been proposed [90,120,121,122], which can be broadly grouped into three main categories: conventional methods, hybrid techniques, and deep-learning-based approaches. Across these categories, researchers have explored diverse strategies to address different types of manipulations, reflecting the richness and rapid evolution of the field. In this study, we have conducted a comprehensive survey on CMFD methods to synthesize and compare existing approaches. In our comparison, we relied on aggregated findings from the literature, and it is important to acknowledge that such comparisons inherently involve challenges due to differing experimental setups across studies.
Conventional methods are mainly divided into block-based and keypoint-based techniques. Block-based approaches, which compare overlapping image blocks, are valued for their simplicity and effectiveness in detecting large, contiguous duplicated regions [53,57,59]. Keypoint-based methods, which rely on salient points or feature descriptors, are more robust to geometric transformations and scaling [62,66,67,68,72]. Despite advantages such as interpretability and low computational costs, these methods are often sensitive to noise, compression, and post-processing and tend to fail in detecting subtle or high-quality forgery. From an analytical perspective, conventional methods reveal an early trade-off between computational efficiency and robustness to transformations. Block-based techniques excel in structured duplication detection but lack geometric flexibility, whereas keypoint-based approaches handle transformations effectively at the expense of dense region coverage. This dichotomy illustrates a fundamental design tension that continues to influence modern CMFD architectures.
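The core of a block-based pipeline can be condensed into a short sketch: extract overlapping blocks, sort their features, and flag near-identical blocks at distinct positions. For brevity the version below uses raw pixels as the "feature", whereas the surveyed methods use DCT coefficients, Zernike or Gaussian–Hermite moments, and similar robust descriptors:

```python
import numpy as np

def block_matches(image, block=8, stride=4, tol=1e-6):
    """Minimal block-based CMFD sketch over a grayscale image."""
    h, w = image.shape
    feats, pos = [], []
    for y in range(0, h - block + 1, stride):
        for x in range(0, w - block + 1, stride):
            feats.append(image[y:y + block, x:x + block].ravel())
            pos.append((y, x))
    feats = np.array(feats)
    order = np.lexsort(feats.T[::-1])      # identical blocks become neighbours
    matches = []
    for a, b in zip(order[:-1], order[1:]):
        if pos[a] != pos[b] and np.allclose(feats[a], feats[b], atol=tol):
            matches.append(tuple(sorted((pos[a], pos[b]))))
    return matches

rng = np.random.default_rng(0)
img = rng.random((32, 32))
img[20:28, 20:28] = img[0:8, 0:8]          # plant an exact copy-move
matches = block_matches(img)               # finds the planted pair
```

This toy version only catches (near-)exact duplicates; robust descriptors and distance thresholds are what let real block-based methods survive compression and noise, at the cost of the geometric sensitivity discussed above.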
Hybrid techniques combine block-based analysis with keypoint descriptors, thereby leveraging the dense coverage of block partitioning together with the geometric invariance of keypoints. This integration enables a more balanced and reliable detection of CMF [63,73,76,77]. Nevertheless, as most of these traditional hybrid approaches still rely on hand-crafted features, they may lack robustness and fail to generalize when exposed to previously unseen manipulations or complex post-processing operations. To overcome these limitations, recent research has shifted toward hybrid frameworks that integrate traditional feature extraction with deep learning architectures, aiming to combine the interpretability of classical methods with the powerful representation capabilities of convolutional networks [123].
A key research direction lies in defining effective integration strategies for hybrid models. Successful fusion can occur at various stages, such as feature-level, decision-level, or architecture-level fusion, depending on the complementary nature of the combined methods. For example, CNN-based features may be fused with keypoint descriptors to enhance localization precision, while handcrafted and learned features can be jointly optimized through multi-branch architectures. However, conflicts may arise when the feature spaces or optimization objectives of different modules are not well aligned, leading to instability or redundant learning. Careful normalization, attention-based weighting, or adaptive fusion layers can mitigate these issues and ensure that hybrid designs exploit synergies rather than amplify inconsistencies. Despite their conceptual appeal, most hybrid strategies remain experimentally underexplored. There is limited comparative evidence on which fusion level (feature, decision, or architecture) yields the best generalization under cross-domain settings. This research gap highlights the need for systematic benchmarking frameworks and ablation studies that go beyond descriptive comparisons.

Deep-learning-based approaches have recently emerged as the dominant paradigm, typically achieving higher accuracy, better generalization, and improved robustness against adversarial manipulations. Among them, CNNs remain foundational, GANs enhance robustness through synthetic training data, and autoencoders enable lightweight anomaly detection. Future CMFD systems are likely to adopt hybrid architectures combining these complementary strengths, which can improve both detection accuracy and resilience to complex manipulations [78,79,86,124].
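As a concrete instance of feature-level fusion, mismatched scales between handcrafted and learned descriptors can be aligned by per-branch normalization before concatenation. The sketch below (with illustrative shapes and names) shows the simplest variant of the careful normalization mentioned above:

```python
import numpy as np

def fuse_features(handcrafted, learned):
    """Feature-level fusion with per-branch z-score normalization, so that
    neither branch dominates purely because of its numeric scale."""
    def zscore(f):
        mu, sd = f.mean(axis=0), f.std(axis=0) + 1e-8
        return (f - mu) / sd
    return np.concatenate([zscore(handcrafted), zscore(learned)], axis=1)

rng = np.random.default_rng(0)
hc = rng.random((10, 64)) * 100.0   # e.g. keypoint descriptors, large scale
dl = rng.random((10, 128)) * 0.01   # e.g. CNN embeddings, small scale
fused = fuse_features(hc, dl)       # shape (10, 192), comparable scales
```

Attention-based weighting or adaptive fusion layers generalize this idea by letting the network learn, rather than fix, the relative contribution of each branch.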
Integrating resilience-inspired recovery mechanisms, as suggested in complex-network security research and exemplified by the resilience recovery method for complex traffic networks [125], could further enhance CMFD model robustness. Such approaches illustrate how predictive forecasting and adaptive recovery strategies can maintain performance under faults or unexpected image degradations, enabling accurate detection even in challenging conditions and thereby complementing conventional training and evaluation protocols. In contrast, deep-learning methods face critical challenges: they require large and diverse datasets, incur high computational costs, and often suffer from limited interpretability, which is problematic in forensic or legal contexts where explainability is essential. We observe that most models are evaluated on relatively constrained datasets, which raises concerns regarding their robustness in real-world scenarios. A more critical synthesis reveals that while deep-learning dominates current CMFD research, the field risks over-reliance on accuracy metrics without sufficient examination of interpretability, fairness, and energy efficiency. Moreover, the limited adoption of cross-dataset evaluation hinders an accurate assessment of models’ true generalization ability. Future reviews should therefore move beyond descriptive summaries to emphasize comparative insights, clarifying the evolutionary trends, trade-offs, and underlying factors that determine the superior performance of certain architectures, while providing analytical guidance for advancing CMFD research. We believe that addressing dataset diversity and standardizing evaluation metrics is as important as proposing new models.

8. Conclusions

This work presents a comparative review of the CMFD domain, covering key applications of image forgery detection, the main challenges reported in the literature along with proposed solutions, and potential future research directions. We have highlighted major factors that would affect system reliability, model performance, and generalization, including image variations (texture, scale, compression, and blurring), algorithmic and scene-related difficulties (complex backgrounds, occlusions, and multiple objects), and dataset limitations (imbalanced samples, insufficient annotations, and lack of diversity). We have reviewed approaches ranging from traditional block and keypoint-based techniques to modern CNN- and GAN-driven models, emphasizing their respective strengths and limitations. A critical examination reveals that although deep-learning has revolutionized CMFD, progress has often been incremental, with much of the work focusing on architectural variations rather than addressing fundamental limitations such as explainability, cross-domain robustness, and computational scalability. Our analysis indicates that hybrid architectures integrating handcrafted features with deep representations offer a promising direction, as they combine interpretability with high discriminative power and adaptability. However, the absence of standardized benchmarks and unified evaluation protocols continues to hinder objective comparison and reproducibility. Future research should therefore emphasize analytical synthesis over descriptive reporting, fostering a deeper understanding of how architectural design, training strategies, and data realism jointly influence model generalization. In addition, extending CMFD frameworks towards multimodal and cross-domain scenarios could enhance resilience to novel manipulations and better align forensic detection systems with real-world constraints.
We believe that this survey provides a comprehensive overview of the CMFD problem and will assist the community in further investigating its challenges. Ultimately, advancing CMFD requires a paradigm shift (from isolated technical improvements to a more integrated, explainable, and benchmark-driven approach) capable of ensuring trust, transparency and scalability in digital image forensics.

Author Contributions

Conceptualization, I.S. and L.R.H.; methodology, I.S. and L.R.H.; investigation, I.S.; writing—original draft preparation, I.S.; writing—review and editing, L.R.H. and N.E.B.A.; supervision, L.R.H. and N.E.B.A.; project administration, L.R.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AP: Average Precision
CLAHE: Contrast-Limited Adaptive Histogram Equalization
CMF: Copy-Move Forgery
CMFD: Copy-Move Forgery Detection
CNN: Convolutional Neural Network
DCT: Discrete Cosine Transform
DCU-Net: Dual-Channel U-shaped Network
DHE: Dynamic Histogram Equalization
DoG: Difference of Gaussian
DOA-GAN: Dual-Order Attentive Generative Adversarial Network
DSC: Dice Similarity Coefficient
DWT: Discrete Wavelet Transform
ELA: Error Level Analysis
FN: False Negatives
FP: False Positives
FPN: Feature Pyramid Network
FPR: False Positive Rate
GAN: Generative Adversarial Network
GHM: Gaussian–Hermite Moments
HOG: Histogram of Oriented Gradients
IoU: Intersection over Union
LSTM: Long Short-Term Memory
MSE: Mean Squared Error
MSER: Maximally Stable Extremal Regions
PCET: Polar Complex Exponential Transform
RPN: Region Proposal Network
RoI: Region of Interest
RGB: Red Green Blue
RNNs: Recurrent Neural Networks
SIFT: Scale-Invariant Feature Transform
SURF: Speeded-Up Robust Features
SLIC: Simple Linear Iterative Clustering
STNs: Spatial Transformer Networks
SVD: Singular Value Decomposition
SVM: Support Vector Machine
TN: True Negatives
TP: True Positives
TPR: True Positive Rate
TSF: Two-Stage Filtering
UCID: Uncompressed Color Image Database

References

  1. Priya, G.N.; Kumar, K.S.; Suganthi, N.; Muppidi, S. Squirrel Henry Gas Solubility Optimization driven Deep Maxout Network with multi-texture feature descriptors for digital image forgery detection. Concurr. Comput. Pract. Exp. 2024, 36, e7965. [Google Scholar] [CrossRef]
  2. Kulkarni, D.; Dixit, V.V. EEG-based emotion classification model: Combined model with improved score level fusion. Biomed. Signal Process. Control. 2024, 95, 106352. [Google Scholar] [CrossRef]
  3. Wang, J.; Gao, X.; Nie, J.; Wang, X.; Huang, L.; Nie, W.; Jiang, M.; Wei, Z. Strong robust copy-move forgery detection network based on layer-by-layer decoupling refinement. Inf. Process. Manag. 2024, 61, 103685. [Google Scholar] [CrossRef]
  4. Khalil, A.H.; Ghalwash, A.Z.; Elsayed, H.A.; Salama, G.I.; Ghalwash, H.A. Enhancing digital image forgery detection using transfer learning. IEEE Access 2023, 11, 21435–21450. [Google Scholar] [CrossRef]
  5. Assiri, M. Synergy of Internet of Things and Software Engineering Approach for Enhanced Copy—Move Image Forgery Detection Model. Electronics 2025, 14, 692. [Google Scholar] [CrossRef]
  6. Badr, A.; Youssif, A.; Wafi, M. A robust copy-move forgery detection in digital image forensics using SURF. In Proceedings of the 2020 8th International Symposium Digital Forensics and Security (ISDFS), Beirut, Lebanon, 1–2 June 2020; pp. 1–6. [Google Scholar]
  7. Jabra, S.B.; Farah, M.B. Deep learning-based watermarking techniques challenges: A review of current and future trends. Circuits Syst. Signal Process. 2024, 43, 4339–4368. [Google Scholar] [CrossRef]
  8. Haddada, L.R.; Amara, N.E.B. Double watermarking-based biometric access control for radio frequency identification card. Int. J. RF Microw. Comput.-Aided Eng. 2019, 29, e21905. [Google Scholar]
  9. Abir, N.A.M.; Warif, N.B.A.; Zainal, N. An automatic enhanced filters with frequency-based copy-move forgery detection for social media images. Multimed. Tools Appl. 2024, 83, 1513–1538. [Google Scholar] [CrossRef]
  10. Mehrjardi, F.Z.; Latif, A.M.; Zarchi, M.S.; Sheikhpour, R. A survey on deep learning-based image forgery detection. Pattern Recognit. 2023, 140, 109778. [Google Scholar] [CrossRef]
  11. Ferreira, W.D.; Ferreira, C.B.R.; Júnior, G.d.C.; Soares, F. A review of digital image forensics. Comput. Electr. Eng. 2020, 85, 106685. [Google Scholar] [CrossRef]
  12. Pham, N.T.; Park, C.-S. Toward deep-learning-based methods in image forgery detection: A survey. IEEE Access 2023, 11, 11224–11237. [Google Scholar] [CrossRef]
  13. Xiong, L.; Xu, J.; Yang, C.-N.; Zhang, X. CMCF-Net: An End-to-End Context Multiscale Cross-Fusion Network for Robust Copy-Move Forgery Detection. IEEE Trans. Multimed. 2024, 26, 6090–6101. [Google Scholar] [CrossRef]
  14. Fu, G.; Zhang, Y.; Wang, Y. Image copy-move forgery detection based on fused features and density clustering. Appl. Sci. 2023, 13, 7528. [Google Scholar] [CrossRef]
  15. Qazi, E.U.H.; Zia, T.; Almorjan, A. Deep learning-based digital image forgery detection system. Appl. Sci. 2022, 12, 2851. [Google Scholar] [CrossRef]
  16. Meena, K.B.; Tyagi, V. Image splicing forgery detection techniques: A review. In Proceedings of the ICACDS: International Conference on Advances in Computing and Data Sciences, Nashik, India, 23–24 April 2021; pp. 364–388. [Google Scholar]
  17. Islam, M.M.; Karmakar, G.; Kamruzzaman, J.; Murshed, M. A robust forgery detection method for copy–move and splicing attacks in images. Electronics 2020, 9, 1500. [Google Scholar] [CrossRef]
  18. Maashi, M.; Alamro, H.; Mohsen, H.; Negm, N.; Mohammed, G.P.; Ahmed, N.; Ibrahim, S.S.; Alsaid, M.I. Modeling of Reptile Search Algorithm With Deep Learning Approach for Copy Move Image Forgery Detection. IEEE Access 2023, 11, 87297–87304. [Google Scholar] [CrossRef]
  19. Meena, K.B.; Tyagi, V. Image forgery detection: Survey and future directions. Data Eng. Appl. 2019, 2, 163–194. [Google Scholar]
  20. Srivastava, M.; Varshney, K. A Review of Copy-Move Image Forgery Detection. In Proceedings of the 2025 5th International Conference on Pervasive Computing and Social Networking (ICPCSN), Salem, India, 14–16 May 2025; pp. 1586–1591. [Google Scholar] [CrossRef]
  21. Wang, Y.; Fu, H.; Wu, T. Research on the Face Forgery Detection Model Based on Adversarial Training and Disentanglement. Appl. Sci. 2024, 14, 4702. [Google Scholar] [CrossRef]
  22. Zhang, Z.; Wang, C.; Zhou, X. A survey on passive image copy-move forgery detection. J. Inf. Process. Syst. 2018, 14, 6–31. [Google Scholar]
  23. Sadeghi, S.; Dadkhah, S.; Jalab, H.A.; Mazzola, G.; Uliyan, D. State of the art in passive digital image forgery detection: Copy-move image forgery. Pattern Anal. Appl. 2018, 21, 291–306. [Google Scholar] [CrossRef]
  24. Walia, S.; Kumar, K. Digital image forgery detection: A systematic scrutiny. Austral. J. Forensic Sci. 2018, 51, 488–526. [Google Scholar] [CrossRef]
  25. Warif, N.B.A.; Idris, M.Y.I.; Wahab, A.W.A.; Ismail, N.-S.-N.; Salleh, R. A comprehensive evaluation procedure for copy-move forgery detection methods: Results from a systematic review. Multimed. Tools Appl. 2022, 81, 15171–15203. [Google Scholar] [CrossRef]
  26. Kaur, G.; Singh, N.; Kumar, M. Image forgery techniques: A review. Artif. Intell. Rev. 2022, 56, 1577–1625. [Google Scholar] [CrossRef]
  27. Zheng, L.; Zhang, Y.; Vrizlynn, L. A survey on image tampering and its detection in real-world photos. J. Vis. Commun. Image Represent. 2019, 58, 380–399. [Google Scholar] [CrossRef]
  28. Niu, L.; Cong, W.; Liu, L.; Hong, Y.; Zhang, B.; Liang, J.; Zhang, L. Making images real again: A comprehensive survey on deep image composition. arXiv 2021, arXiv:2106.14490. [Google Scholar]
  29. Roy, A.; Dixit, R.; Naskar, R.; Chakraborty, R.S. Copy-Move Forgery Detection in Digital Images—Survey and Accuracy Estimation Metrics. In Digital Image Forensics: Theory and Implementation; Springer: Singapore, 2020; pp. 27–56. [Google Scholar]
  30. Korus, P. Digital image integrity—A survey of protection and verification techniques. Digit. Signal Process. 2017, 71, 1–26. [Google Scholar] [CrossRef]
  31. Anushree, R.; Vinay Kumar, S.B.; Sachin, B.M. A Survey on Copy Move Forgery Detection (CMFD) Technique. In Proceedings of the 2023 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bengaluru, India, 27–28 January 2023; pp. 439–443. [Google Scholar]
  32. Camacho, I.C.; Wang, K. A comprehensive review of deep-learning-based methods for image forensics. J. Imaging 2021, 7, 69. [Google Scholar] [CrossRef]
  33. Abidin, A.B.Z.; Majid, H.B.A.; Samah, A.B.A.; Hashim, H.B. Copy-move image forgery detection using deep learning methods: A review. In Proceedings of the 2019 6th International Conference on Research and Innovation in Information Systems (ICRIIS), Johor Bahru, Malaysia, 2–3 December 2019; pp. 1–6. [Google Scholar]
  34. Haddada, L.R.; Rmida, F.M.; Ouarda, W.; Maalej, I.K.; Masmoudi, R.; Alimi, A.M.; Amara, N.E.B. A benchmark tetra-modal biometric score database. Biomed. Signal Process. Control 2024, 98, 106778. [Google Scholar]
  35. Al-Qershi, O.M.; Khoo, B.E. Evaluation of copy-move forgery detection: Datasets and evaluation metrics. Multimed. Tools Appl. 2018, 77, 31807–31833. [Google Scholar] [CrossRef]
  36. Tralic, D.; Zupancic, I.; Grgic, S.; Grgic, M. CoMoFoD—New database for copy-move forgery detection. In Proceedings of the 55th International Symposium ELMAR-2013, Zadar, Croatia, 25–27 September 2013; pp. 49–54. [Google Scholar]
  37. Rodriguez-Ortega, Y.; Ballesteros, D.M.; Renza, D. Copy-move forgery detection (CMFD) using deep learning for image and video forensics. J. Imaging 2021, 7, 59. [Google Scholar] [CrossRef]
  38. Dong, J.; Wang, W.; Tan, T. CASIA Image Tampering Detection Evaluation Database. In Proceedings of the 6th International Conference on Image and Graphics (ICIG), Hefei, China, 12–15 August 2011; pp. 26–31. [Google Scholar]
  39. Gazzah, S.; Haddada, L.R.; Shallal, I.; Amara, N.E.B. Digital image forgery detection with focus on a copy-move forgery detection: A survey. In Proceedings of the 2023 International Conference on Cyberworlds (CW), Sousse, Tunisia, 3–5 October 2023; pp. 240–247. [Google Scholar]
  40. Wen, B.; Zhu, Y.; Subramanian, R.; Ng, T.-T.; Shen, X.; Winkler, S. COVERAGE—A novel database for copy-move forgery detection. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 161–165. [Google Scholar]
  41. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; pp. 740–755. [Google Scholar]
  42. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
  43. Zhou, B.; Lapedriza, A.; Khosla, A.; Oliva, A.; Torralba, A. Places: A 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 1452–1464. [Google Scholar] [CrossRef]
  44. Schaefer, G.; Stich, M. UCID: An uncompressed color image database. In Storage and Retrieval Methods and Applications for Multimedia; International Society for Optics and Photonics (SPIE): Bellingham, WA, USA, 2003; Volume 5307, pp. 472–480. [Google Scholar]
  45. Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 3730–3738. [Google Scholar]
  46. Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  47. Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410. [Google Scholar]
  48. Weber, M. The Caltech Face Database; California Institute of Technology: Pasadena, CA, USA, 1999. [Google Scholar]
  49. Yu, B.; Li, W.; Li, X.; Lu, J.; Zhou, J. Frequency-aware spatiotemporal transformers for video inpainting detection. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 8188–8197. [Google Scholar]
  50. Criminisi, A.; Pérez, P.; Toyama, K. Region filling and object removal by exemplar-based image inpainting. IEEE Trans. Image Process. 2004, 13, 1200–1212. [Google Scholar] [CrossRef]
  51. Meena, K.B.; Tyagi, V. A copy-move image forgery detection technique based on tetrolet transform. J. Inf. Secur. Appl. 2020, 52, 102481. [Google Scholar] [CrossRef]
  52. Mahmood, T.; Shah, M.; Rashid, J.; Saba, T.; Nisar, M.W.; Asif, M. A passive technique for detecting copy-move forgeries by image feature matching. Multimed. Tools Appl. 2020, 79, 31759–31782. [Google Scholar] [CrossRef]
  53. Meena, K.B.; Tyagi, V. A copy-move image forgery detection technique based on Gaussian-Hermite moments. Multimed. Tools Appl. 2019, 78, 33505–33526. [Google Scholar] [CrossRef]
  54. Koritala, S.P.; Chimata, M.; Polavarapu, S.N.; Vangapandu, B.S.; Manikandan, V.M. An Efficient Copy-Move Forgery Detection using Discrete Cosine Transform with Block-wise Peak-Pixel-based Block Clustering. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024; pp. 1–6. [Google Scholar]
  55. Hamidon, N.R.; Salleh, S.F.M.; Warif, N.B.A. An Analysis of Copy-Move Forgery Detection based on Discrete Cosine Transform against JPEG Compression and Brightness Changes Attack. In Proceedings of the 2024 The 1st International Conference on Cyber Security and Computing (CyberComp2024), Melaka, Malaysia, 6–7 November 2024; pp. 45–50. [Google Scholar]
  56. Shehin, A.U.; Sankar, D. Copy move forgery detection and localisation robust to rotation using block-based discrete cosine transform and eigenvalues. J. Vis. Commun. Image Represent. 2024, 99, 104075. [Google Scholar] [CrossRef]
  57. Ouyang, J.; Liu, Y.; Liao, M. Robust copy-move forgery detection method using pyramid model and Zernike moments. Multimed. Tools Appl. 2019, 78, 10207–10225. [Google Scholar] [CrossRef]
  58. Alkawaz, M.H.; Sulong, G.; Saba, T.; Rehman, A. Detection of copy-move image forgery based on discrete cosine transform. Neural Comput. Appl. 2018, 30, 183–192. [Google Scholar] [CrossRef]
  59. Wang, Y.; Kang, X.; Chen, Y. Robust and accurate detection of image copy-move forgery using PCET-SVD and histogram of block similarity measures. J. Inf. Secur. Appl. 2020, 54, 102536. [Google Scholar] [CrossRef]
  60. Kumar, S.; Mukherjee, S.; Pal, A.K. An improved reduced feature-based copy-move forgery detection technique. Multimed. Tools Appl. 2023, 82, 1431–1456. [Google Scholar] [CrossRef]
  61. Chen, C.-C.; Lu, W.-Y.; Chou, C.-H. Rotational copy-move forgery detection using SIFT and region growing strategies. Multimed. Tools Appl. 2019, 78, 18293–18308. [Google Scholar] [CrossRef]
  62. Chen, H.; Yang, X.; Lyu, Y. Copy-move forgery detection based on keypoint clustering and similar neighborhood search algorithm. IEEE Access 2020, 8, 36863–36875. [Google Scholar] [CrossRef]
  63. Wang, C.; Zhang, Z.; Li, Q.; Zhou, X. An image copy-move forgery detection method based on SURF and PCET. IEEE Access 2019, 7, 170032–170047. [Google Scholar] [CrossRef]
  64. Bisht, G.S.; Jain, A.; Bansla, V.; Sharma, K.; Bhutia, R.; Kumar, V. Enhanced keypoint-based approach for identifying copy-move forgery in digital images. In Proceedings of the 2024 7th International Conference on Contemporary Computing and Informatics (IC3I), Greater Noida, India, 18–20 September 2024; pp. 1101–1106. [Google Scholar]
  65. Yang, J.; Liang, Z.; Gan, Y.; Zhong, J. A novel copy-move forgery detection algorithm via two-stage filtering. Digit. Signal Process. 2021, 113, 103032. [Google Scholar] [CrossRef]
  66. Lin, C.; Lu, W.; Huang, X.; Liu, K.; Sun, W.; Lin, H. Region duplication detection based on hybrid feature and evaluative clustering. Multimed. Tools Appl. 2019, 78, 20739–20763. [Google Scholar] [CrossRef]
  67. Bilal, M.; Habib, H.A.; Mehmood, Z.; Saba, T.; Rashid, M. Single and multiple copy–move forgery detection and localization in digital images based on the sparsely encoded distinctive features and DBSCAN clustering. Arab. J. Sci. Eng. 2020, 45, 2975–2992. [Google Scholar] [CrossRef]
  68. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  69. Hegazi, A.; Taha, A.; Selim, M.M. An improved copy-move forgery detection based on density-based clustering and guaranteed outlier removal. J. King Saud Univ. Comput. Inf. Sci. 2021, 33, 1055–1063. [Google Scholar] [CrossRef]
  70. Diwan, A.; Sharma, R.; Roy, A.K.; Mitra, S.K. Keypoint based comprehensive copy-move forgery detection. IET Image Process. 2021, 15, 1298–1309. [Google Scholar] [CrossRef]
  71. Bilal, M.; Habib, H.A.; Mehmood, Z.; Yousaf, R.M.; Saba, T.; Rehman, A. A robust technique for copy-move forgery detection from small and extremely smooth tampered regions based on the DHE-SURF features and mDBSCAN clustering. Aust. J. Forensic Sci. 2021, 53, 459–482. [Google Scholar] [CrossRef]
  72. Wang, X.; Zhang, H.; Wang, D.; Niu, P. Adaptive copy move forgery detection based on new keypoint feature and matching. Appl. Intell. 2025, 55, 1–24. [Google Scholar] [CrossRef]
  73. Alhaidery, M.M.A.; Taherinia, A.H. A passive image forensic scheme based on adaptive and hybrid techniques. Multimed. Tools Appl. 2022, 81, 12681–12699. [Google Scholar] [CrossRef]
  74. Alhaidery, M.M.A.; Taherinia, A.H.; Yazdi, H.S. Cloning detection scheme based on linear and curvature scale space with new FP removal filters. Multimed. Tools Appl. 2022, 81, 8745–8766. [Google Scholar] [CrossRef]
  75. Vaishnavi, V.; Shreya, S.; Yugender, K.; Reddy, K.B.S.; Sreelakshmi, R. A Novel Approach to Detect Copy-Move Forgery Using Deep Learning. J. Comput. Allied Intell. JCAI 2025, 3. [Google Scholar] [CrossRef]
  76. Narayanan, S.S.; Gopakumar, G. Recursive block based keypoint matching for copy move image forgery detection. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 1–3 July 2020; pp. 1–6. [Google Scholar]
  77. Sunitha, K.; Krishna, A.N. Efficient keypoint based copy move forgery detection method using hybrid feature extraction. In Proceedings of the 2020 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), Bangalore, India, 5–7 March 2020; pp. 670–675. [Google Scholar]
  78. Shallal, I.; Haddada, L.R.; Amara, N.E.B. Enhanced Detection of Copy-Move Forgery by Fusing Scores From Handcrafted and Deep Learning-Based Detection Systems. In Proceedings of the 2024 17th International Conference on Development in eSystem Engineering (DeSE), Khorfakkan, United Arab Emirates, 6–8 November 2024; pp. 113–118. [Google Scholar]
  79. Shallal, I.; Haddada, L.R.; Amara, N.E.B. Lightweight Hybrid Model Combining MobileNetV2 and PCA for Copy-Move Forgery Detection. In Proceedings of the 2024 17th International Conference on Development in eSystem Engineering (DeSE), Khorfakkan, United Arab Emirates, 6–8 November 2024; pp. 107–112. [Google Scholar]
  80. Nirmal Jothi, J.; Letitia, S. Tampering detection using hybrid local and global features in wavelet-transformed space with digital images. Soft Comput. 2020, 24, 5427–5443. [Google Scholar] [CrossRef]
  81. Lin, C.; Lu, W.; Huang, X.; Liu, K.; Sun, W.; Lin, H.; Tan, Z. Copy-move forgery detection using combined features and transitive matching. Multimed. Tools Appl. 2019, 78, 30081–30096. [Google Scholar] [CrossRef]
  82. Meena, K.B.; Tyagi, V. A hybrid copy-move image forgery detection technique based on Fourier-Mellin and scale invariant feature transforms. Multimed. Tools Appl. 2020, 79, 8197–8212. [Google Scholar] [CrossRef]
  83. Niyishaka, P.; Bhagvati, C. Copy-move forgery detection using image blobs and BRISK feature. Multimed. Tools Appl. 2020, 79, 26045–26059. [Google Scholar] [CrossRef]
  84. Jaiswal, A.K.; Gupta, D.; Srivastava, R. Detection of copy-move forgery using hybrid approach of DCT and BRISK. In Proceedings of the 7th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 27–28 February 2020; pp. 471–476. [Google Scholar]
  85. Sushir, R.D.; Wakde, D.G.; Bhutada, S.S. Enhanced blind image forgery detection using an accurate deep learning based hybrid DCCAE and ADFC. Multimed. Tools Appl. 2024, 83, 1725–1752. [Google Scholar] [CrossRef]
  86. Diwan, A.; Roy, A.K. CNN-keypoint based two-stage hybrid approach for copy-move forgery detection. IEEE Access 2024, 12, 43809–43826. [Google Scholar] [CrossRef]
  87. Kuznetsov, O.; Frontoni, E.; Romeo, L.; Rosati, R. Enhancing copy-move forgery detection through a novel CNN architecture and comprehensive dataset analysis. Multimed. Tools Appl. 2024, 83, 59783–59817. [Google Scholar] [CrossRef]
  88. Wang, J.; Nie, J.; Jing, N.; Liang, X.; Wang, X.; Chi, C.-H.; Wei, Z. Copy-move forgery image detection based on cross-scale modeling and alternating refinement. IEEE Trans. Multimed. 2025, 27, 5452–5465. [Google Scholar] [CrossRef]
  89. Dell’Olmo, P.V.; Kuznetsov, O.; Frontoni, E.; Arnesano, M.; Napoli, C.; Randieri, C. Dataset dependency in CNN-based copy-move forgery detection: A multi-dataset comparative analysis. Mach. Learn. Knowl. Extr. 2025, 7, 54. [Google Scholar] [CrossRef]
  90. Dar, N.A.; Atsya, R.; Kumar, A. Examining Deep Learning Models for Reliable Detection of Copy-Move Forgeries in Digital Images. In Proceedings of the 2025 12th International Conference on Computing for Sustainable Global Development (INDIACom), Delhi, India, 2–4 April 2025; pp. 1–6. [Google Scholar]
  91. Farhan, M.H.; Shaker, K.; Al-Janabi, S. Efficient approach for the localization of copy-move forgeries using PointRend with RegNetX. Baghdad Sci. J. 2024, 21, 1416. [Google Scholar] [CrossRef]
  92. Li, Y.; He, Y.; Chen, C.; Dong, L.; Li, B.; Zhou, J.; Li, X. Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning. IEEE Trans. Image Process. 2025, 34, 425–440. [Google Scholar] [CrossRef]
  93. Hammad, B.T.; Ahmed, I.T.; Jamil, N. An secure and effective copy move detection based on pretrained model. In Proceedings of the 2022 IEEE 13th Control and System Graduate Research Colloquium (ICSGRC), Shah Alam, Malaysia, 23 July 2022; pp. 66–70. [Google Scholar]
  94. Devi, K.J.; Dinesh, J.; Neha, K.L.; Teja, K.V.N.S.R.; Jayashree, M.; Devi, J.R. A Novel Approach to Enhancing Identity Document Authentication with Copy-Move Forgery Detection using CNN. In Proceedings of the 2024 4th Interdisciplinary Conference on Electrics and Computer (INTCEC 2024), Chicago, IL, USA, 11–13 June 2024; pp. 1–6. [Google Scholar]
  95. Hosny, K.M.; Mortda, A.M.; Fouda, M.M.; Lashin, N.A. An Efficient CNN Model to Detect Copy-Move Image Forgery. IEEE Access 2022, 10, 48622–48632. [Google Scholar] [CrossRef]
  96. Goel, N.; Kaur, S.; Bala, R. Dual branch convolutional neural network for copy move forgery detection. IET Image Process. 2021, 15, 656–665. [Google Scholar] [CrossRef]
  97. Hebbar, N.K.; Kunte, A.S. Transfer learning approach for splicing and copy-move image tampering detection. ICTACT J. Image Video Process. 2021, 11, 2447–2452. [Google Scholar] [CrossRef]
  98. Muzaffer, G.; Ulutas, G. A new deep learning-based method to detection of copy-move forgery in digital images. In Proceedings of the 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), Istanbul, Turkey, 24–26 April 2019; pp. 1–4. [Google Scholar]
  99. Rajini, N.H. Image forgery identification using convolution neural network. Int. J. Recent Technol. Eng. 2019, 8, 311–320. [Google Scholar]
  100. Shi, Y.; Weng, S.; Yu, L.; Li, L. A Copy-Move Forgery Detection Network Based on Selective Sampling Attention and Low-Cost Two-Step Self-Correlation Calculation. IEEE Trans. Multimed. 2025, 27, 4084–4094. [Google Scholar] [CrossRef]
  101. Niu, Y.; Wu, X.; Liu, C. Recursive Wavelet Transform Network for Robust Copy-Move Forgery Detection. Neurocomputing 2025, 641, 130373. [Google Scholar] [CrossRef]
  102. Wang, X.; Wang, H.; Niu, S.; Zhang, J. Detection and localization of image forgeries using improved mask regional convolutional neural network. Math. Biosci. Eng. 2019, 16, 4581–4593. [Google Scholar] [CrossRef]
  103. Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
  104. Zhou, P.; Han, X.; Morariu, V.I.; Davis, L.S. Learning rich features for image manipulation detection. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 1053–1061. [Google Scholar]
  105. Yang, C.; Li, H.; Lin, F.; Jiang, B.; Zhao, H. Constrained R-CNN: A general image manipulation detection model. In Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK, 6–10 July 2020; pp. 1–6. [Google Scholar]
  106. Nazir, T.; Nawaz, M.; Masood, M.; Javed, A. Copy move forgery detection and segmentation using improved mask region-based convolution network (RCNN). Appl. Soft Comput. 2022, 131, 109778. [Google Scholar] [CrossRef]
  107. Biach, F.Z.E.; Iala, I.; Laanaya, H.; Minaoui, K. Encoder-decoder based convolutional neural networks for image forgery detection. Multimed. Tools Appl. 2022, 81, 22611–22628. [Google Scholar] [CrossRef]
  108. Chen, H.; Chang, C.; Shi, Z.; Lyu, Y. Hybrid features and semantic reinforcement network for image forgery detection. Multimed. Syst. 2022, 28, 363–374. [Google Scholar] [CrossRef]
  109. Lu, M.; Niu, S. A detection approach using LSTM-CNN for object removal caused by exemplar-based image inpainting. Electronics 2020, 9, 858. [Google Scholar] [CrossRef]
  110. Ding, H.; Chen, L.; Tao, Q.; Fu, Z.; Dong, L.; Cui, X. DCU-Net: A dual-channel U-shaped network for image splicing forgery detection. Neural Comput. Appl. 2023, 35, 5015–5031. [Google Scholar] [CrossRef]
  111. Jaiswal, A.K.; Srivastava, R. Detection of copy-move forgery in digital image using multi-scale, multi-stage deep learning model. Neural Process. Lett. 2022, 54, 75–100. [Google Scholar] [CrossRef]
  112. Bappy, J.H.; Simons, C.; Nataraj, L.; Manjunath, B.S.; Roy-Chowdhury, A.K. Hybrid LSTM and encoder–decoder architecture for detection of image forgeries. IEEE Trans. Image Process. 2019, 28, 3286–3300. [Google Scholar] [CrossRef]
  113. Abhishek; Jindal, N. Copy move and splicing forgery detection using deep convolution neural network, and semantic segmentation. Multimed. Tools Appl. 2021, 80, 3571–3599. [Google Scholar] [CrossRef]
  114. Islam, A.; Long, C.; Basharat, A.; Hoogs, A. DOA-GAN: Dual-order attentive generative adversarial network for image copy-move forgery detection and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 4676–4685. [Google Scholar]
  115. Abdalla, Y.; Iqbal, M.T.; Shehata, M. Copy-move forgery detection and localization using a generative adversarial network and convolutional neural-network. Information 2019, 10, 286. [Google Scholar] [CrossRef]
  116. Chen, J.; Shen, Y.; Ali, R. Credit card fraud detection using sparse autoencoder and generative adversarial network. In Proceedings of the IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, Canada, 1–3 November 2018; pp. 1054–1059. [Google Scholar]
  117. Yarlagadda, S.K.; Güera, D.; Bestagini, P.; Zhu, F.M.; Tubaro, S.; Delp, E.J. Satellite image forgery detection and localization using GAN and one-class classifier. arXiv 2018, arXiv:1802.04881. [Google Scholar] [CrossRef]
  118. Bappy, J.H.; Roy-Chowdhury, A.K.; Bunk, J.; Nataraj, L.; Manjunath, B.S. Exploiting spatial structure for localizing manipulated image regions. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4970–4979. [Google Scholar]
  119. Elaskily, M.A.; Alkinani, M.H.; Sedik, A.; Dessouky, M.M. Deep learning based algorithm (ConvLSTM) for copy move forgery detection. J. Intell. Fuzzy Syst. 2021, 40, 4385–4405. [Google Scholar] [CrossRef]
  120. Liang, E.; Zhang, K.; Hua, Z.; Li, Y.; Jia, X. TransCMFD: An adaptive transformer for copy-move forgery detection. Neurocomputing 2025, 638, 130110. [Google Scholar] [CrossRef]
  121. Bharathi, T.; Reddy, V.L. Quantifying Deep Learning Based Image Forgery Detection System: A Survey. In Proceedings of the 2nd International Conference on Artificial Intelligence Trends and Pattern Recognition (ICAITPR), Hyderabad, India, 6–8 December 2024; pp. 1–8. [Google Scholar]
  122. Goel, A.; Chakraverti, A.K.; Gupta, K. A Comprehensive Analysis of Copy-Move Forgery Detection Method. In Proceedings of the 2025 4th OPJU International Technology Conference (OTCON) on Smart Computing for Innovation and Advancement in Industry 5.0, Raigarh, India, 9–11 April 2025; pp. 1–5. [Google Scholar]
  123. Agarwal, R.; Verma, O.P. Robust copy-move forgery detection using modified superpixel based FCM clustering with emperor penguin optimization and block feature matching. Evol. Syst. 2022, 13, 27–41. [Google Scholar] [CrossRef]
  124. Ibrahim, Z.S.; Hasan, T.M. Copy-Move Image Forgery Detection Using Deep Learning Approaches: An Abbreviated Survey. Bilad Alrafidain J. Eng. Sci. Technol. 2025, 4, 137–154. [Google Scholar] [CrossRef]
  125. Hong, S.; Yue, T.; You, Y.; Lv, Z.; Tang, X.; Hu, J.; Yin, H. A Resilience Recovery Method for Complex Traffic Network Security Based on Trend Forecasting. Int. J. Intell. Syst. 2025, 40, 3715086. [Google Scholar] [CrossRef]
Figure 1. Hierarchy adopted for CMFD survey.
Figure 2. Classification of forgery detection research from 2020 to 2025.
Figure 3. Classification of CMFD methods from 2020 to 2025.
Figure 4. CMFD applications.
Figure 5. An overview of CMFD techniques.
Table 1. Comparison of CMFD metric types, objectives and limitations.

Image-level metrics. Objective: evaluating whether an entire image is forged or authentic (e.g., Accuracy, Precision, Recall, F1-score). Limitations:
  • Do not provide information about the location or shape of the forgery
  • Can mask small or partial manipulations

Pixel-level metrics. Objective: assessing the spatial accuracy of detected forgery at the pixel level (e.g., IoU and DSC). Limitations:
  • Require precise ground-truth masks
  • Sensitive to annotation inconsistencies
  • Computationally expensive for large images

Robustness metrics. Objective: quantifying performance under perturbations or real-world conditions (e.g., MDR under compression, noise and geometric distortions). Limitations:
  • Dependent on the type and severity of perturbations
  • Cannot fully reflect generalization to unseen manipulations
  • Choice of degradation levels affects interpretation
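The image- and pixel-level metrics in Table 1 can be illustrated with a minimal pure-Python sketch over flattened binary masks (the function names are ours, not from any cited work; note that at the pixel level the F1-score coincides with the Dice similarity coefficient, DSC):

```python
def binary_counts(pred, truth):
    """True positives, false positives, false negatives for two
    flattened binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn

def f1(pred, truth):
    """F1-score (= DSC at the pixel level): 2TP / (2TP + FP + FN)."""
    tp, fp, fn = binary_counts(pred, truth)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def iou(pred, truth):
    """Intersection over Union: TP / (TP + FP + FN)."""
    tp, fp, fn = binary_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0
```

The same functions apply at image level (one prediction per image) or pixel level (one prediction per pixel), which is exactly the distinction the first two rows of Table 1 draw.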
Table 3. Overview of CMFD methods utilizing keypoint-based methods.

[66] (2019). Datasets: FAU, MICC-F600. Techniques: Harris Laplace, Hessian Laplace, SIFT, G2NN, RANSAC, Bag of Words. Performance: robust against geometric transformations and affine matrix distortions but struggles with complex background textures.

[67] (2019). Datasets: CoMoFoD, MICC-F200, MICC-F220. Techniques: RGB-to-gray conversion, Discrete Wavelet Transform (DWT), 2-Nearest-Neighbor (2NN) search, DBSCAN clustering, RANSAC refinement. Performance: resilient to geometric variations yet suffers from significant computational complexity; reliable in high-keypoint regions but prone to duplicate region detection.

[62] (2020). Datasets: GRIP, FAU. Techniques: SIFT, G2NN, HAC, J-Linkage. Performance: robust against image transformations but computationally expensive; effective in detecting minor mismatches.

[68] (2020). Datasets: CoMoFoD, CMH, CIVF, COVERAGE. Techniques: SIFT + LBP, 2NN, correlation coefficient with thresholding, RANSAC. Performance: highly robust against geometric distortions and noise; performance affected by method selection and keypoint redundancy.

[29] (2020). Dataset: COVERAGE. Techniques: SIFT, rotated LBP, G2NN, hierarchical clustering, RANSAC. Performance: strong against geometric transformations but sensitive to illumination variations.

[69] (2021). Dataset: MICC-F220. Techniques: contrast-limited adaptive histogram equalization (CLAHE), keypoint extraction, RANSAC. Performance: reduces noise and improves keypoint detection, but additional filtering is needed for complex backgrounds.

[70] (2021). Datasets: CoMoFoD, CMFD, COVERAGE, MICC-F200, MICC-F220. Techniques: SIFT, 2NN, spectral clustering, geometric constraints, RANSAC. Performance: robust against geometric transformations and affine distortions, with high precision in detecting forgery regions.

[71] (2021). Datasets: CoMoFoD, MICC-F220, MICC-F200. Techniques: Dynamic Histogram Equalization (DHE), SIFT, keypoint attraction, geometric constraints. Performance: DHE preprocessing enhances keypoint detection, improving robustness against transformations.

[72] (2025). Datasets: GRIP, MICC-F600. Techniques: adaptive CMFD using uniform keypoint extraction (SLIC + MDML-DCP), AQFPJFM + GLTP descriptors, ITQ-PTH matching. Performance: improved robustness to geometric and illumination variations, balanced precision–recall performance, and faster large-scale matching compared to conventional methods.
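Several entries in Table 3 rely on the generalized 2NN (G2NN) test to match keypoint descriptors even when a region is cloned more than once. The following is a minimal sketch of the idea, assuming plain Euclidean distances between descriptor tuples (helper names are illustrative, not taken from any cited paper): for each keypoint, distances to all other descriptors are sorted, and neighbors are accepted up to the first ratio d_i/d_{i+1} that reaches a threshold.

```python
import math

def euclid(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def g2nn_matches(descriptors, threshold=0.5):
    """Generalized 2NN test (sketch): for each descriptor, sort the
    distances to all other descriptors and keep the neighbors that
    precede the first ratio d_i / d_{i+1} >= threshold. Unlike plain
    2NN, this can return several matches per keypoint, which is what
    allows multiple cloned copies of a region to be detected."""
    matches = []
    for i, q in enumerate(descriptors):
        ranked = sorted(
            (euclid(q, d), j) for j, d in enumerate(descriptors) if j != i
        )
        for k in range(len(ranked) - 1):
            # multiply instead of dividing to avoid a zero denominator
            if ranked[k][0] >= threshold * ranked[k + 1][0]:
                matches.extend((i, ranked[m][1]) for m in range(k))
                break
    return matches
```

With two cloned descriptor pairs, each keypoint matches its near-duplicate while distant descriptors are rejected by the ratio test; real pipelines would then prune spatially close pairs and verify the survivors with RANSAC.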
Table 4. Overview of CMFD methods utilizing hybrid approaches.

[63] (2019). Datasets: GRIP, FAU. Techniques: image partitioning into irregular blocks (smooth and texture regions), SURF + PCET feature extraction, improved G2NN matching, RANSAC and morphological post-processing. Performance: robust to geometric transformations, blurring, JPEG compression, and noise; detects forgery in visually similar regions; parameter tuning is difficult.

[81] (2019). Dataset: IMD. Techniques: SIFT + Local Intensity Order Pattern (LIOP) for feature extraction, G2NN + transitive matching, SLIC segmentation, RANSAC. Performance: resistant to geometric transformations, noise, and JPEG compression; effective for cloned regions with few keypoints.

[76] (2020). Dataset: MICC-F600. Techniques: non-overlapping block division, SIFT feature extraction, Euclidean distance matching, morphological operations. Performance: robust to geometric transformations; struggles with smooth images and small forgeries within single blocks.

[77] (2020). Dataset: MICC-F220. Techniques: square block division, SIFT + SURF feature extraction, hierarchical clustering with spatial distances, RANSAC refinement. Performance: handles geometric transformations well; effective for smooth images and small forgeries with SIFT and SURF.

[82] (2020). Datasets: GRIP, IMD. Techniques: image segmentation into smooth and texture parts, FMT- and SIFT-based feature extraction, PatchMatch and G2NN matching, dense linear fitting and RANSAC filtering. Performance: robust to scaling, noise, and JPEG compression.

[83] (2020). Datasets: MICC-F220, MICC-F8multi, CoMoFoD. Techniques: grayscale conversion, Sobel edge detection, DoG blob detection, BRISK feature extraction, Hamming distance matching, RANSAC filtering. Performance: resistant to geometric transformations and post-processing operations; detects multiple forgery regions with reduced computational complexity.

[84] (2020). Dataset: CoMoFoD. Techniques: grayscale conversion, DCT- and BRISK-based feature extraction, Euclidean and FLANN-based matching, clustering via Euclidean distance. Performance: maintains stability under geometric changes, blur, and noise, but struggles to detect forgery in smooth regions and after advanced post-processing.

[85] (2024). Datasets: CASIA V1, COVERAGE, GRIP. Techniques: WE-CLAHE pre-processing, hybrid DTT + VGGNet feature extraction, IHH-based dimensionality reduction, DCCAE classification, ADFC segmentation. Performance: robust to geometric transformations and achieves high accuracy.

[86] (2024). Dataset: custom. Techniques: CenSurE keypoint detection combined with CNN-based feature learning for CMFD and localization. Performance: robust against geometric transformations, compression, noise, and various post-processing operations; performs well on both smooth and textured images.
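The block-division step that several hybrid methods above build on can be sketched in miniature: slide a fixed-size window over the image, compute one feature per block, and report positions whose features collide. This toy version uses the raw pixel tuple as the feature and compares by exact equality (all names are ours); practical systems substitute a robust descriptor per block, such as DCT coefficients or Zernike moments, sort or hash the features rather than storing them all, and discard spatially adjacent pairs before post-verification.

```python
def block_duplicates(image, rows, cols, b=2):
    """Find pairs of distinct top-left positions whose b x b blocks are
    pixel-identical. `image` is a flat, row-major grayscale list."""
    feats = {}   # feature tuple -> list of (row, col) positions seen so far
    pairs = []
    for r in range(rows - b + 1):
        for c in range(cols - b + 1):
            block = tuple(
                image[(r + i) * cols + (c + j)]
                for i in range(b) for j in range(b)
            )
            # every earlier block with the same feature is a candidate pair
            for pos in feats.get(block, []):
                pairs.append((pos, (r, c)))
            feats.setdefault(block, []).append((r, c))
    return pairs
```

On a 4x4 image containing a duplicated 2x2 patch, the sketch reports the cloned positions (plus any accidental collisions in flat regions, which is why real detectors filter matches by spatial distance afterwards).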
Table 9. Overview of forgery detection studies using RNNs.

[118] (2017). Datasets: COVERAGE, NIST. Techniques: CNN with LSTM. Performance: this end-to-end network detects all types of image forgery.

[112] (2019). Datasets: NIST16, COVERAGE. Techniques: CNN with LSTM encoder and decoder. Performance: combines spatial and frequency information for enhanced performance, detecting all types of image forgery.

[119] (2021). Datasets: MICC-F220, MICC-F600, MICC-F2000, SATs-130. Techniques: CNN with ConvLSTM. Performance: the ConvLSTM-CNN hybrid enhances performance, while combining four datasets improves generalization and reduces overfitting.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
