Copy-Move Forgery Detection Using Scale Invariant Feature and Reduced Local Binary Pattern Histogram

Abstract: Because digitized images are easily replicated or manipulated, copy-move forgeries can be produced with minimal expertise. Furthermore, it is difficult to verify the authenticity of images. Therefore, numerous efforts have been made to detect copy-move forgeries. In this paper, we present an improved region duplication detection algorithm based on keypoints. The proposed algorithm utilizes the scale invariant feature transform (SIFT) and the reduced local binary pattern (LBP) histogram. The LBP values with 256 levels are obtained from the local window centered at the keypoint and are then reduced to 10 levels. For each keypoint, a 138-dimensional descriptor is generated to detect copy-move forgery. We test the proposed algorithm on various image datasets and compare the detection accuracy with those of existing methods. The experimental results demonstrate that the performance of the proposed scheme is superior to that of other tested copy-move forgery detection methods. Furthermore, the proposed method exhibits a uniform detection performance for various types of test datasets.


Introduction
Copy-move forgery (CMF) is a popular image tampering method, wherein a portion of an image is copied from one section of the image and is pasted elsewhere in the same image. An image can be forged to conceal or change its meaning by the copy-move process. Therefore, it is important to verify the authenticity of the image and localize the copied and moved regions. Because the copied portion of an image is generally scaled or rotated, it is difficult to verify the authenticity of the image based on visual inspection alone. For this reason, the development of reliable copy-move forgery detection (CMFD) methods has become an important issue [1][2][3]. The common framework of CMFD comprises five steps: preprocessing (optional), feature extraction, matching, false match removal, and localization.
The optional first step of the CMFD process is preprocessing. In this step, the conversion of RGB color channels to grayscale [4,5] is often exploited to reduce the dimensionality of the input images. Alternatively, RGB colors can be converted into the YCbCr [6,7] or the HSV [8] color space to use both the luminance and chrominance information. Various block division and segmentation methods can be considered for use in preprocessing. An image can be divided into overlapping square blocks [9][10][11], non-overlapping square blocks [12], or circular blocks [13,14]. Image segmentation techniques [15,16] are often included in the preprocessing step to separate the copied source region from the pasted target region.
In the conventional SIFT descriptor, the gradients in each 4 × 4 sub-window are used to generate an 8-dimensional descriptor for that sub-window. Next, the descriptors in all 16 sub-windows are arranged in series to create the final 128-dimensional descriptor. The limitation of the conventional descriptor generation method is that it only provides local information about a single keypoint. Because this method cannot represent global information around the keypoint, it may be difficult to cope with pixel changes, such as compression or differences in the background area caused by the copy-move process.
In this paper, we present an improved CMFD algorithm by adding a new descriptor. The proposed additional descriptor based on the LBP feature is capable of capturing global information associated with the keypoints. LBP is acknowledged as a feature that is sufficiently robust to handle small pixel changes. For this reason, the proposed descriptor is generated using a histogram of LBP values for a 16 × 16 window centered on the keypoint. The LBP values are not generated at every pixel in the image but only around the keypoints detected by SIFT. Because a typical LBP has 256 levels for a pixel, we reduce the LBP levels to 10 to limit the increase in the descriptor dimension. In total, the proposed descriptor for a keypoint has 138 dimensions. By means of experiments using various test datasets, we demonstrate that the proposed method detects CMFs more accurately than conventional methods.
The remainder of this paper is organized as follows. Section 2 introduces the basic process of the SIFT-based CMFD method and its limitations. Section 3 presents the proposed CMFD algorithm. In Section 4, the performance of the proposed method is compared with those of existing methods using the experimental results. Section 5 presents the discussion, and the conclusion is presented in Section 6.

SIFT Descriptor
Figure 1 illustrates the basic workflow for the SIFT-based CMFD. For a given image $I$, SIFT features are extracted at different scales using a scale-space representation by an image pyramid. Potential keypoints are selected using scale-space extrema. All the potential keypoints are further refined according to a contrast threshold and an edge threshold. This process is used to eliminate unstable keypoints in the SIFT algorithm. In the next step, an orientation is assigned to each keypoint to achieve invariance to image rotation. The gradient magnitude and direction are calculated in a local window centered at the SIFT keypoint. An orientation histogram with 36 bins covering 360 degrees is created. The highest peak in the histogram is considered as the dominant orientation. Furthermore, any peak above 80% of the highest peak is also considered while calculating the main orientation.
To generate a 128-dimensional descriptor for a keypoint, a 16 × 16 local window centered at the keypoint is considered. This window is divided into 16 sub-blocks, each of size 4 × 4 pixels. For each sub-block, an 8-bin orientation histogram is created. A total of 128 bin values are obtained, and they are represented as a vector to form the keypoint descriptor. Let $\{k_1, k_2, \ldots, k_m\}$ be the $m$ keypoints extracted using the SIFT for the given image $I$. Keypoint descriptors $\{f_1, f_2, \ldots, f_m\}$ corresponding to $\{k_1, k_2, \ldots, k_m\}$ are generated using the above procedures. Figure 2 shows the generation method of a descriptor $f_i$ for a keypoint $k_i$ at the image location $(x_i, y_i)$.
As shown in Figure 2, the SIFT descriptor is represented as a list of gradients, relative to the main direction, in the sixteen 4 × 4 sub-blocks around the keypoint. However, simply listing local gradients in this manner tends to degrade the matching performance when the pixel values around the keypoints change.
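The descriptor layout described above (16 sub-blocks, 8 orientation bins each) can be illustrated with a deliberately simplified sketch. It is not the full SIFT descriptor: the function name `simple_descriptor` is hypothetical, and Gaussian weighting, rotation to the dominant orientation, and bin interpolation are all omitted as simplifying assumptions.

```python
import math

def simple_descriptor(patch, cell=4, bins=8):
    """Simplified gradient-histogram descriptor for a square grayscale patch.

    Assumptions (not from the paper): no Gaussian weighting, no rotation to
    the dominant orientation, no interpolation between bins, and clamped
    indices for gradients at the patch border.
    """
    size = len(patch)
    grid = size // cell                      # sub-blocks per row (4 for 16 x 16)
    hist = [0.0] * (grid * grid * bins)      # 16 sub-blocks x 8 bins = 128
    for y in range(size):
        for x in range(size):
            # Central differences, clamped at the border.
            dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            block = (y // cell) * grid + (x // cell)
            hist[block * bins + b] += magnitude
    # Normalize to unit length so that descriptors are comparable.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

For a 16 × 16 patch the result has 16 × 8 = 128 entries, which matches the dimensionality stated in the text.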

Matching
If a region of an image is copied and moved, the keypoint descriptor at a given location, $f_i$, needs to be compared with all the other descriptors $f_j$ ($j \neq i$). This process is called matching. Let $\{d_{1,i}, d_{2,i}, \ldots, d_{m-1,i}\}$ be the set of distances between $f_i$ and the $m-1$ other descriptors, sorted such that
$$d_{1,i} \le d_{2,i} \le \cdots \le d_{m-1,i}. \qquad (1)$$
The keypoint $k_i$ is determined to be matched with the keypoint at distance $d_{1,i}$ if the following relation is satisfied:
$$\frac{d_{1,i}}{d_{2,i}} < t, \qquad (2)$$
where $t$ is a threshold with $t \in (0,1)$. In this paper, $t$ is set to 0.65.
However, false matches almost always remain after the completion of the matching process. Therefore, a range of methods has been studied with the aim of eliminating false matches. Mismatched keypoint pairs have been eliminated using various clustering algorithms, such as J-linkage clustering [31], distance-based clustering [38], and hierarchical clustering [39,40]. In particular, the RANSAC algorithm is the most frequently used to eliminate false matches [15,31,35,36,[39][40][41]. However, these algorithms can only eliminate false matches; they cannot recover correct keypoint pairs that the matching step failed to produce.
Figure 3 illustrates an example of a failed match based on the conventional SIFT descriptor [36], wherein a keypoint exists close to the boundary between the copy-moved portion and the authentic region. In Figure 3, a keypoint $k_A$ has been moved to $k_B$. For a correct match, $k_A$ and $k_B$ should be matched by Equation (2). The keypoint $k_B$ is indeed detected as the potentially matched keypoint for $k_A$ because $k_B$ has the closest distance ($d_{1,A} = 0.2684$) to $k_A$. However, the distance between $k_A$ and $k_C$ is calculated as $d_{2,A} = 0.2721$, and no matching occurs because $d_{1,A}/d_{2,A} \approx 0.986$ exceeds the threshold $t$. This is because the background (16 × 16 window centered at $k_A$) of $k_A$ is different from that of $k_B$. This situation may also occur if the image is compressed after CMF.
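The matching step can be sketched as follows, assuming the usual nearest-neighbour ratio test in which a match is accepted when the ratio of the smallest to the second-smallest distance is below $t$. The function name `ratio_test_matches` and the use of the Euclidean distance are assumptions made for illustration; the paper does not specify an implementation.

```python
import math

def ratio_test_matches(descriptors, t=0.65):
    """Return index pairs (i, j) where j is the nearest neighbour of i and
    the ratio d1/d2 of the two smallest distances is below t.

    Assumption: Euclidean distance between descriptor vectors.
    """
    matches = []
    for i, fi in enumerate(descriptors):
        # Distances from f_i to every other descriptor, in ascending order.
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(descriptors) if j != i
        )
        if len(dists) < 2:
            continue
        (d1, j), (d2, _) = dists[0], dists[1]
        if d2 > 0 and d1 / d2 < t:
            matches.append((i, j))
    return matches
```

With the distances reported for Figure 3, the ratio is 0.2684 / 0.2721 ≈ 0.986, which is well above $t = 0.65$, so this test rejects the pair.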


Proposed Method
In this paper, we propose an improved CMFD method by adding an LBP histogram-based descriptor. The LBP value is obtained for every pixel in the 16 × 16 window centered on the keypoint. The 256 LBP levels are reduced to 10 levels, and their histogram is used as a new descriptor. The main contribution of the proposed method is that the histogram of the reduced LBP values corresponding to a keypoint is used as an additional descriptor to increase the matched keypoint pairs.

Reduced LBP Feature
The LBP is one of the general features used to extract the texture information of an image. Owing to its discriminative performance and computational simplicity, the LBP feature is widely used in various applications. For a pixel in an image located at (p,q), the LBP value is calculated as follows.
$$L(p,q) = \sum_{n=0}^{N-1} s\big(I_n - I(p,q)\big)\, 2^n, \qquad (3)$$
where $L(p,q)$ is the LBP value of the center pixel located at $(p,q)$, $I_n$ is the intensity of the $n$-th neighboring pixel, $I(p,q)$ is the intensity of the pixel at $(p,q)$, and $N$ is the number of neighboring pixels chosen at a given radius. In this paper, we use $N = 8$ (3 × 3 local window centered at $(p,q)$), which generates an 8-bit LBP value. In Equation (3), $s(x)$ is defined as
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise.} \end{cases} \qquad (4)$$
Let $\Omega(x_i,y_i)$ be the set of pixels that exist in a 16 × 16 local window centered at the location $(x_i,y_i)$ of the keypoint $k_i$. Using all the pixels at $(p,q) \in \Omega(x_i,y_i)$, LBP values are calculated as depicted in Figure 4.

The SIFT feature already contains a considerable amount of information for detecting CMFs. Therefore, it is not necessary to use all the LBP information. An LBP can be classified into two categories. If the binary pattern, read circularly, contains at most two 0→1 or 1→0 transitions, it is called a uniform pattern. For example, 00110000 (2 transitions) is a uniform pattern, but 01010100 (6 transitions) is a non-uniform pattern. Among the 256 LBP values, 58 are uniform, and the remaining 198 are non-uniform. A uniform pattern is characterized by a single run of consecutive 1's. Let $L_c(p,q)$ ($c = 0, 1, 2, \ldots, 8$) be the LBP with $c$ consecutive 1's. Obviously, $L_0(p,q) = 00000000$ and $L_8(p,q) = 11111111$. For $c = 1, 2, \ldots, 7$, $L_c(p,q)$ has 8 binary patterns that can all be viewed as rotationally shifted versions of a single pattern. Figure 5 shows various uniform patterns. Let $L_{non}(p,q)$ be any non-uniform pattern, that is, any pattern whose 1's do not form a single consecutive run. In conclusion, the 256-level LBP values can be divided into 10 groups, that is, nine types of $L_c(p,q)$ and one $L_{non}(p,q)$.
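The grouping of the 256 LBP values into 10 levels can be sketched as follows. The function names, the neighbour ordering in `lbp_code`, and the use of index 9 for the non-uniform group are assumptions for illustration; only the 3 × 3 neighbourhood, the circular transition rule, and the ten groups come from the text.

```python
# Neighbour offsets (row, column), clockwise from the top-left pixel.
# The ordering is an assumption; it does not affect the 10-level grouping.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
NON_UNIFORM = 9  # hypothetical index chosen for the single non-uniform group

def lbp_code(img, p, q):
    """8-bit LBP of the pixel at row p, column q (img is a list of rows)."""
    center = img[p][q]
    code = 0
    for n, (dp, dq) in enumerate(OFFSETS):
        if img[p + dp][q + dq] >= center:    # s(x) = 1 when x >= 0
            code |= 1 << n
    return code

def transitions(code):
    """Number of circular 0->1 and 1->0 transitions in an 8-bit pattern."""
    rotated = ((code >> 1) | ((code & 1) << 7)) & 0xFF
    return bin(code ^ rotated).count("1")

def reduce_lbp(code):
    """Map an 8-bit LBP code to one of 10 groups: 0..8 is the number of
    consecutive 1's in a uniform pattern, 9 collects all non-uniform ones."""
    if transitions(code) <= 2:
        return bin(code).count("1")
    return NON_UNIFORM
```

Enumerating all 256 codes with `reduce_lbp` gives group sizes of 1, 8, 8, 8, 8, 8, 8, 8, 1 and 198, which agrees with the 58 uniform patterns stated above.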

Proposed Descriptor
In this paper, we use the probabilities of $L_c(p,q)$ and $L_{non}(p,q)$ as a new descriptor for CMFD. To maintain the rotation-invariant characteristic of the new descriptor, we count only the occurrences of each $L_c(p,q)$, irrespective of where the run of 1's starts. $L_{non}(p,q)$ can reflect a frequent variation in a small window, which may occur because of noise, quantization errors, or small background changes. To reduce the effect of these variations, all non-uniform patterns are counted together in a single group. The proposed descriptor $r_i$ corresponding to keypoint $k_i$ is obtained as
$$r_i = [R_0(x_i,y_i), R_1(x_i,y_i), \ldots, R_8(x_i,y_i), R_{non}(x_i,y_i)], \qquad (5)$$
where $R_c(x_i,y_i)$ and $R_{non}(x_i,y_i)$ are the normalized numbers of occurrences of $L_c(p,q)$ and $L_{non}(p,q)$, respectively, in $\Omega(x_i,y_i)$. $R_c(x_i,y_i)$ is calculated by
$$R_c(x_i,y_i) = \frac{\#[L_c(p,q)]}{|\Omega(x_i,y_i)|}, \quad \text{for all } (p,q) \in \Omega(x_i,y_i), \qquad (6)$$
where $\#[L_c(p,q)]$ is the number of occurrences of the $L_c(p,q)$ pattern in $\Omega(x_i,y_i)$, and $|\Omega(x_i,y_i)|$ is the cardinality of $\Omega(x_i,y_i)$. $R_{non}(x_i,y_i)$ can be obtained in a similar manner. $r_i$ is composed of the histogram of $L_c(p,q)$ and $L_{non}(p,q)$. It is a 10-dimensional feature vector. Figure 6 illustrates the $r_i$ generation method using the reduced LBP histogram. The proposed descriptor for a keypoint, based on SIFT and the histogram of the reduced LBP, is obtained as
$$g_i = [f_i, r_i], \qquad (7)$$
where $g_i$ is the proposed descriptor for CMFD and has 138 dimensions. Because the descriptor $r_i$ is generated for a relatively large area, unlike $f_i$, it may be sufficiently robust to handle small pixel changes and quantization errors caused by image compression.
In Figure 3, we investigated a no-match case using the conventional SIFT descriptor. We apply the proposed descriptor to this case, and Figure 7 depicts the result. As shown in Figure 7, the failed match is transformed into a successful match by using the proposed descriptor. The keypoint $k_B$ is detected as the potentially matched keypoint for $k_A$ because $k_B$ has the closest distance ($d_{1,A} = 0.1426$) to $k_A$. The distance between $k_A$ and $k_C$ is calculated as $d_{2,A} = 0.2380$. Because $d_{1,A}/d_{2,A} \approx 0.599$ does not exceed the threshold $t$, we can determine that $k_A$ and $k_B$ are a matched pair. Based on this result, we conclude that the proposed LBP-based descriptor plays an important role in removing the effect of the small fluctuation that occurs when a keypoint lies close to the boundary between the copy-moved portion and the authentic region.
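The construction of the additional descriptor and the combined 138-dimensional descriptor can be sketched as below. The sketch assumes that every pixel of the 16 × 16 window has already been assigned a group label in 0..9 (0 to 8 for the uniform groups, 9 for the non-uniform group; this indexing is an assumption) and that the combination is a plain concatenation of the two vectors. The function names are hypothetical.

```python
def reduced_lbp_histogram(labels, groups=10):
    """Normalized histogram of reduced LBP labels over one keypoint window.

    `labels` holds one group index (0..9) per pixel of the window; each bin
    is the occurrence count divided by the number of pixels in the window.
    """
    counts = [0] * groups
    for label in labels:
        counts[label] += 1
    return [c / len(labels) for c in counts]

def combined_descriptor(f, labels):
    """Concatenate a 128-D SIFT descriptor f with the 10-D histogram.

    Assumption: the combination is a simple concatenation, with no
    additional weighting between the two parts.
    """
    return list(f) + reduced_lbp_histogram(labels)
```

For a 16 × 16 window, `labels` has 256 entries, the histogram sums to 1, and the combined vector has 128 + 10 = 138 entries.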

Estimation of Affine Transform and False Match Removal
For the putative matched keypoint pairs, the geometric distortions, such as the rotation, scaling, and shearing of the duplicated regions, can be estimated. Let $(x_i,y_i)$ and $(x'_i,y'_i)$ be the pixel locations from a region and its duplicate, respectively. These two locations are related by an affine transform as follows.
$$\begin{bmatrix} x'_i \\ y'_i \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}, \qquad (8)$$
where $t_{11}$, $t_{12}$, $t_{21}$, $t_{22}$, $x_0$, and $y_0$ are the transform parameters. To estimate these parameters, at least three pairs of corresponding keypoints that are not collinear are required. However, parameters estimated from all putative pairs can be inaccurate because of the mismatched keypoints. To eliminate unreliable keypoint matches, the widely used RANSAC scheme is employed. The parameters obtained using the RANSAC algorithm exhibit a high degree of accuracy. Furthermore, the affine transform parameters can also be used to determine the correlation map of the duplicated region.
After executing the RANSAC algorithm, the degree of rotation and scaling are estimated using the conventional SIFT and the proposed method. Table 1 presents the estimation results of the 10 different combinations of geometric transforms applied to the MICC-F220 dataset [36]. The MICC-F220 dataset comprises 110 tampered images and their corresponding 110 originals. The image resolution varies from 722 × 480 to 800 × 600 pixels, and the size of the forged portion covers, on average, 1.2% of the entire image. As shown in Table 1, the transform parameters estimated using the proposed method are more accurate than those of the conventional SIFT algorithm. Thus, we conclude that the proposed descriptor based on the histogram of the reduced LBP plays a positive role in the detection of CMF.
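A minimal RANSAC sketch for the affine estimation is given below. It fits the six parameters exactly from three sampled pairs, counts the pairs that agree within a pixel tolerance, and keeps the largest consensus set. The function names, the iteration count, the tolerance, and the fixed seed are illustrative assumptions, and the rotation and scale read-out is only meaningful when the transform is close to a rotation plus uniform scaling.

```python
import math
import random

def fit_affine(src, dst):
    """Exact affine fit from three point pairs (Cramer's rule).

    Returns (t11, t12, x0, t21, t22, y0), or None if the source points
    are collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None
    params = []
    for k in (0, 1):  # k = 0 solves the x' row, k = 1 the y' row
        u1, u2, u3 = dst[0][k], dst[1][k], dst[2][k]
        a = (u1 * (y2 - y3) - y1 * (u2 - u3) + (u2 * y3 - u3 * y2)) / det
        b = (x1 * (u2 - u3) - u1 * (x2 - x3) + (x2 * u3 - x3 * u2)) / det
        c = (x1 * (y2 * u3 - y3 * u2) - y1 * (x2 * u3 - x3 * u2)
             + u1 * (x2 * y3 - x3 * y2)) / det
        params += [a, b, c]
    return tuple(params)

def apply_affine(params, pt):
    t11, t12, x0, t21, t22, y0 = params
    x, y = pt
    return (t11 * x + t12 * y + x0, t21 * x + t22 * y + y0)

def ransac_affine(pairs, iters=200, tol=3.0, seed=0):
    """Return (params, inliers) for the model with the most inliers.

    `pairs` is a list of ((x, y), (x', y')) tuples; `tol` is in pixels.
    """
    rng = random.Random(seed)
    best_params, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(pairs, 3)
        params = fit_affine([s for s, _ in sample], [d for _, d in sample])
        if params is None:
            continue
        inliers = []
        for s, d in pairs:
            px, py = apply_affine(params, s)
            if (px - d[0]) ** 2 + (py - d[1]) ** 2 <= tol ** 2:
                inliers.append((s, d))
        if len(inliers) > len(best_inliers):
            best_params, best_inliers = params, inliers
    return best_params, best_inliers

def rotation_and_scale(params):
    """Rotation (degrees) and scale, assuming a near-similarity transform."""
    t11, _, _, t21, _, _ = params
    return math.degrees(math.atan2(t21, t11)), math.hypot(t11, t21)
```

In practice, the model would typically be refitted by least squares on all inliers of the best sample; that step is left out here to keep the sketch short.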

Localization
To localize the duplicated regions affected by CMF, the correlation map between the original image and the warped image is frequently used. Let $W$ be the warped image obtained by transforming the image $I$ according to the affine transform. The correlation coefficient at a pixel location $(x,y)$, $\rho(x,y)$, is computed as
$$\rho(x,y) = \frac{\sum_{(u,v)\in\Lambda(x,y)} \big(I(u,v) - I_\mu\big)\big(W(u,v) - W_\mu\big)}{\sqrt{\sum_{(u,v)\in\Lambda(x,y)} \big(I(u,v) - I_\mu\big)^2 \sum_{(u,v)\in\Lambda(x,y)} \big(W(u,v) - W_\mu\big)^2}}, \qquad (9)$$
where $\Lambda(x,y)$ is a 5 × 5 local window centered at $(x,y)$, and $I_\mu$ and $W_\mu$ are the average values of $I$ and $W$ in the area $\Lambda(x,y)$. A Gaussian filter is applied to the correlation map to reduce the noisy pixels, and a binary correlation map is obtained by thresholding. If the $\rho(x,y)$ value for point $(x,y)$ is greater than the threshold, the point is assigned a value of true; otherwise, it is assigned a value of false.
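The correlation map can be sketched as follows, assuming the standard local correlation coefficient over a 5 × 5 window. The function names are hypothetical, border pixels without a full window are simply left at 0.0 (an assumption), the Gaussian smoothing step is omitted, and the threshold is left as a parameter because its value is not given in this section.

```python
import math

def correlation_map(I, W, half=2):
    """Local correlation coefficient between two equally sized grayscale
    images (lists of rows) over (2 * half + 1)^2 windows.

    Assumption: border pixels without a full window keep the value 0.0.
    """
    h, w = len(I), len(I[0])
    rho = [[0.0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            window = [(v, u)
                      for v in range(y - half, y + half + 1)
                      for u in range(x - half, x + half + 1)]
            a = [I[v][u] for v, u in window]
            b = [W[v][u] for v, u in window]
            mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
            num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
            den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                            * sum((q - mean_b) ** 2 for q in b))
            # A flat window has zero variance; treat it as uncorrelated.
            rho[y][x] = num / den if den > 0 else 0.0
    return rho

def binarize(rho, threshold):
    """Binary correlation map: True where rho exceeds the threshold."""
    return [[value > threshold for value in row] for row in rho]
```

Identical windows give a coefficient of 1, and a window compared with its negative gives -1, so the thresholded map marks regions where the image and its warped version agree.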

Summary of Proposed Method
The overall algorithm of the proposed CMFD method is presented in Figure 8. SIFT is performed on the image suspected to have undergone CMF to extract keypoints. For a keypoint $k_i$ at the image location $(x_i,y_i)$, the conventional 128-dimensional descriptor $f_i$ is generated. For all the pixels in a 16 × 16 window centered at that keypoint location, pixel-wise LBP values are calculated. Next, the 256-level LBP values are reduced to 10 types of values. A histogram of the reduced LBP values is generated, and we let this 10-dimensional histogram be the additional descriptor $r_i$. Next, $g_i$, which is the combination of $f_i$ and $r_i$, becomes the new descriptor for CMFD. For the final output, the false match removal step using the RANSAC algorithm is performed, followed by localization.

Experimental Results
To evaluate the performance of the proposed method, we first define three measures at the image level and the pixel level. At the image level, the ability to correctly verify whether an image is forged is measured, and the forgery localization accuracy is analyzed at the pixel level. In this paper, we use three measures, namely, true positive rate (TPR), false positive rate (FPR), and accuracy (ACC). TPR measures the percentage of actual positives that are correctly identified, and it is defined as
$$\mathrm{TPR} = \frac{\#\,\text{images (pixels) detected as forged being forged}}{\#\,\text{forged images (pixels)}}. \qquad (10)$$
FPR is the fraction of original images (pixels) that are not correctly identified, and it is obtained as
$$\mathrm{FPR} = \frac{\#\,\text{images (pixels) detected as forged being genuine}}{\#\,\text{genuine images (pixels)}}. \qquad (11)$$
A well-designed CMF detector must simultaneously maintain a high TPR and a low FPR. ACC is the percentage of correct decisions and is defined as
$$\mathrm{ACC} = \frac{\#\,\text{correctly detected images (pixels)}}{\#\,\text{total images (pixels)}}. \qquad (12)$$
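The three measures can be computed from per-image (or per-pixel) decisions as in the sketch below. The function name and the convention that `True` means forged are assumptions; the values are returned as fractions rather than percentages.

```python
def detection_metrics(predicted, actual):
    """Return (TPR, FPR, ACC) as fractions.

    `predicted` and `actual` are parallel lists of booleans in which True
    means "forged"; they may describe images or pixels.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))          # forged, detected
    fp = sum(p and not a for p, a in zip(predicted, actual))      # genuine, flagged
    tn = sum(not p and not a for p, a in zip(predicted, actual))  # genuine, passed
    fn = sum(not p and a for p, a in zip(predicted, actual))      # forged, missed
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    acc = (tp + tn) / len(actual) if actual else 0.0
    return tpr, fpr, acc
```

Here TPR is normalized by the number of forged items and FPR by the number of genuine items, so the two rates are independent of the class balance of the dataset.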


Dataset
In our simulation, we use four test datasets, namely, MICC-F220 [36], CMH [8], D [43], and COVERAGE [44]. The detailed information regarding these four datasets is summarized in Table 2. In this paper, we use the TPR and FPR values presented in the published papers for each test dataset.

Results Obtained on MICC-F220 Dataset
Table 3 presents the detection performance using the MICC-F220 dataset. Because this dataset has no ground truth, the three performance measures are obtained at the image level. The performance of the proposed method is compared with that of various state-of-the-art CMFD methods, such as the improved DAISY descriptor (DAISY) [45], local invariant symmetry features (LISF) [46], the 2-levels clustering strategy (Clustering) [47], the dense field-based method (DF) [48], the Markov random field-based method (MRF) [49], iterative copy-move forgery detection (ICMFD) [35], going deeper into copy-move forgery detection (GDCMFD) [8], and the hierarchical feature point matching-based method (HFPM) [42]. As presented in Table 3, HFPM exhibits the highest ACC of 99.10. The proposed CMFD algorithm achieves the second rank with an ACC of 96.82. For the MICC-F220 dataset, keypoint-based methods [42,47], including the proposed algorithm, demonstrate relatively better performance than block-based approaches.
Table 3. True positive rate (TPR) (%), false positive rate (FPR) (%), and accuracy (%) on the MICC-F220 dataset at the image level. The number in bold indicates the highest performance, and the number in italics represents the second place.

Results Obtained on CMH Dataset
The CMH dataset comprises 108 uncompressed, realistic cloning images that were maliciously tampered with in four ways: 23 plain copy-moved images (CMH1), 25 rotated images (CMH2), 25 resized images (CMH3), and 35 images that are both rotated and resized (CMH4). Additionally, to address compression, we compressed each full image into the JPEG format using quality factors of 70%, 80%, and 90% (CMH5). Every image has its own ground truth, which indicates the original and cloned regions in white. Table 4 presents the detection performance on this dataset.

The detection performance of most CMFD approaches tends to degrade on compressed images. However, our method is considerably robust in handling compressed images during detection. To evaluate the CMFD performance of the various forgery detection algorithms under compression, the three performance measures are computed on the CMH5 test dataset. Table 5 shows the detection performance on the compressed CMH5 dataset at the pixel level. As shown in Table 5, the performance of many CMFD approaches, except those of LISF, SIFT, and the proposed method, is degraded. In particular, the performance degradation of the block-based algorithms, such as HT, ZM, and GDCMFD, is considerable. The TPR, FPR, and ACC values of our method are almost the same as those obtained on the uncompressed CMH datasets. This advantage of the proposed CMFD method can be attributed to the addition of the reduced LBP descriptor.

Figure 9 depicts CMF region localization examples for the CMH dataset. As depicted in Figure 9, the proposed method exhibits the best localization performance. HT also demonstrates good localization results; however, it does not reveal the details of the copied and moved regions. SIFTJL and AO demonstrate limited localization performance, whereas GDCMFD, based on statistical moments, demonstrates extremely poor localization performance.

Figure 9. Copy-move forgery localization results on the CMH dataset.

Results Obtained on D Dataset
The D dataset comprises medium-sized images (almost all of which are 1000 × 700 or 700 × 1000) and is subdivided into four subsets (D0, D1, D2, and D3). The first subset, D0, consists of 50 uncompressed images with simply translated copies. D1 is created by copy-pasting objects after rotation, and D2 is derived by applying scaling to the copies. The subset D3 comprises 50 original images without tampering. For comparison, HT [50], SIFTJL [31], GDCMFD [8], and AO [51] are used. Table 6 presents the detection performance on the D dataset. The proposed CMFD approach achieves the best detection performance, and HT ranks second with respect to ACC for the D0, D1, and D2 subsets. Because the D3 subset contains only authentic images, only FPRs are compared for it in Table 6. The proposed method does not flag any D3 image as a copy-move forgery and therefore has an FPR of zero. Figure 10 illustrates forgery localization examples for the D dataset. As shown in Figure 10, the proposed CMFD algorithm exhibits the best localization performance.


Results Obtained on COVERAGE Dataset
The COVERAGE dataset has 100 original and forged image pairs, with an average resolution of 400 × 486. For comparison, DF [46], SCMFD [15], ZM [34], GDCMFD [8], ICMFD [35], and HFPM [42] are used. Table 7 shows the image-level detection performance on the COVERAGE dataset. Most algorithms perform poorly on this dataset because each image contains similar-but-genuine objects. HFPM exhibits the best ACC value, whereas the ACC of our method ranks third. However, the difference in ACC between the first-ranked method and the proposed method is only 1.73 percentage points.

Discussion
In this paper, we compared the performance of our method with that of various CMFD methods on four datasets. Table 8 presents the ACC ranks of the CMFD methods for all the test datasets. For the CMH and D datasets, the proposed algorithm exhibited the highest CMFD performance. Our method achieved the second highest performance for the MICC-F220 dataset and ranked third for the COVERAGE dataset. Overall, the proposed method demonstrated high performance on average across the various datasets. Many conventional CMFD methods exhibited performance variations depending on whether the image was compressed and on the type of test dataset. For compressed image datasets, such as CMH5, our approach exhibited almost the same detection accuracy as that for uncompressed image datasets. Our algorithm also achieved the highest rank for the datasets in which the copied regions were geometrically transformed. HT also exhibited a fairly good ranking, but fell short of that of the proposed method. Based on the results in Table 8, we conclude that the proposed method yields a more uniform and consistent detection performance, regardless of the type of dataset.

Conclusions
In this paper, we introduced a new CMFD algorithm that augments the SIFT descriptor with a reduced LBP histogram-based descriptor. First, 256-level LBP features were generated for a 16 × 16 window centered at each keypoint. Next, the 256-level LBP values were reduced to 10 levels to limit the increase in the descriptor dimension. The histogram of the reduced LBP features was used as the additional descriptor to detect the CMF. In total, the proposed descriptor for a keypoint had 138 dimensions. We evaluated the proposed method on four test datasets and compared its performance with that of existing CMFD algorithms. The simulation results showed that the proposed CMFD scheme achieved the highest accuracy on the CMH and D datasets and ranked second and third on the MICC-F220 and COVERAGE datasets, respectively. The proposed method also demonstrated a relatively uniform and consistent detection performance regardless of the type of test dataset.
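As a rough illustration of the descriptor summarized above, the following Python sketch builds a 10-bin histogram from 8-neighbour LBP codes in a 16 × 16 window and appends it to a 128-dimensional SIFT vector, giving 138 dimensions. Several details are assumptions and not taken from this section: the reduction from 256 to 10 levels uses the standard rotation-invariant uniform LBP mapping, border pixels of the window are skipped, the histogram is normalized to sum to one, and all function names are hypothetical.

```python
# Offsets of the 8 neighbours in circular order around the centre pixel.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(window, r, c):
    """Return the 8-bit LBP code (0..255) of the pixel at (r, c)."""
    center = window[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        if window[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def reduce_to_10(code):
    """Map a 256-level LBP code to one of 10 levels (assumed mapping).

    Uniform patterns (at most two circular 0/1 transitions) map to their
    number of set bits (0..8); every other pattern maps to level 9.
    """
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return sum(bits) if transitions <= 2 else 9


def reduced_lbp_histogram(window):
    """Return the normalized 10-bin histogram of reduced LBP levels.

    Border pixels are skipped because they lack a full 8-neighbourhood
    (an assumption; a 16 x 16 window then contributes 14 x 14 codes).
    """
    hist = [0] * 10
    for r in range(1, len(window) - 1):
        for c in range(1, len(window[0]) - 1):
            hist[reduce_to_10(lbp_code(window, r, c))] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 10


def extend_descriptor(sift_descriptor, window):
    """Append the reduced LBP histogram to a SIFT descriptor (128 + 10)."""
    return list(sift_descriptor) + reduced_lbp_histogram(window)


# Usage with placeholder inputs: a flat 16 x 16 window and a zero SIFT vector.
flat_window = [[7] * 16 for _ in range(16)]
descriptor = extend_descriptor([0.0] * 128, flat_window)
```

In practice, the SIFT vector and the window would come from a keypoint detector; the placeholder inputs here only demonstrate the resulting dimensionality.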