# Convolutional Neural Network for Copy-Move Forgery Detection

## Abstract


## 1. Introduction

## 2. Related Work

#### 2.1. Feature Extraction

#### 2.2. Using CNNs for Feature Extraction

#### 2.3. Classifying Feature Selections

## 3. The Proposed CNN Model

1. In a convolutional layer, the convolutions are performed with strides $S_h$ and $S_w$ along the first two axes of the input feature maps, using $P_{i+1}$ filters of size $K_h \times K_w \times P_i$ [37]: $$H_{i+1}=\left\lfloor \frac{H_i-K_h+1}{S_h}\right\rfloor,\qquad W_{i+1}=\left\lfloor \frac{W_i-K_w+1}{S_w}\right\rfloor,\qquad P_{i+1}=\text{number of filters}$$
2. In a pooling layer, which follows a convolution, the layer selects pixel values by a given statistic (e.g., average pooling or maximum pooling) within each region. A max-pooling layer extracts the maximum element of each $K_h \times K_w$ neighborhood in every two-dimensional slice of the input feature map, again with strides $S_h$ and $S_w$ along the first two axes, so the maximum value of each input block is returned [37]. This approach is widely used in deep learning networks. In our proposed strategy, the max-pooling layer decreases the input image patch resolution and also makes the network more robust to value changes in the motion residuals of the frame's absolute-difference image [38]: $$H_{i+1}=\left\lfloor \frac{H_i-K_h+1}{S_h}\right\rfloor,\qquad W_{i+1}=\left\lfloor \frac{W_i-K_w+1}{S_w}\right\rfloor,\qquad P_{i+1}=P_i$$ Input image patches for the CNN model are two-dimensional image blocks of size $3\times(64\times 64)$, where 3 is the number of RGB channels. Thus, with a $3\times 3$ window and a stride of 3, the image patch resolution decreases by half, from $64\times 64$ to $32\times 32$, after the initial max-pooling layer [37].
3. A ReLU layer performs element-wise nonlinear activation. A single neuron $x$ is transformed into a single neuron $y$ with: $$y=\max\left(0,x\right)$$
4. A softmax layer turns an input feature vector into a vector of the same length whose elements sum to 1. Given an input vector $x$ with $P_i$ neurons $x_j$, $j \in \left[1, P_i\right]$, each input neuron produces a corresponding output neuron: $$y_j=\frac{e^{x_j}}{\sum_{k=1}^{P_i} e^{x_k}}$$
5. In a fully connected (FC) layer, a dot product is computed between the flattened feature maps (i.e., the input feature vector) and a weight matrix with $P_{i+1}$ rows and $P_i$ (or $H_i \cdot W_i \cdot P_i$) columns [37]; the output feature vector has $P_{i+1}$ elements [37]. A trained CNN can also extract meaningful information from images that were not used to train the network, which enables the exposure of forgeries in previously unseen images as well [37].
6. In a batch normalization layer, every input channel is normalized over a mini-batch. The layer first normalizes each channel's activations by subtracting the mini-batch mean and dividing by the mini-batch standard deviation [37]. It then shifts the input by a learnable offset $\beta$ and scales it by a learnable scale factor $\gamma$ [37]. Batch normalization layers can be placed between convolutional layers and nonlinearities (e.g., ReLU layers) to speed up CNN training and reduce sensitivity to network initialization. Batch normalization normalizes the inputs $x_i$ by computing the mean $\mu_B$ and variance $\sigma_B^2$ over a mini-batch and input channel, and then computing the normalized activations [37]: $$\widehat{x_i}=\frac{x_i-\mu_B}{\sqrt{\sigma_B^2+\epsilon}}$$
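
The convolution and pooling output-size rules above can be sketched numerically. The helper below is illustrative (not from the paper) and uses the common convention $\lfloor (H + 2\,\mathrm{pad} - K)/S \rfloor + 1$, which agrees with the equations above for stride 1 with "valid" padding and reproduces the halving from $64\times 64$ to $32\times 32$ when a $2\times 2$ pooling window with stride 2 is used:

```python
def out_size(h_in, k, s, pad=0):
    """Spatial output extent along one axis for a convolution or pooling
    window of size k, stride s, and symmetric zero-padding pad."""
    return (h_in + 2 * pad - k) // s + 1

# Convolution: 64x64 input, 5x5 kernel, stride 1, padding 2 -> 64x64 output
assert out_size(64, k=5, s=1, pad=2) == 64
# Max pooling: 2x2 window, stride 2 -> resolution halves to 32x32
assert out_size(64, k=2, s=2) == 32
# After a convolution the channel count P_{i+1} equals the number of filters;
# after pooling it is unchanged (P_{i+1} = P_i), since pooling acts per channel.
```

The same helper applies per axis, so a $K_h \times K_w$ window is handled by calling it once for the height and once for the width.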
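
The ReLU and softmax activations above are one-liners; the sketch below (NumPy, illustrative names) also applies the standard max-subtraction trick so the softmax stays numerically stable for large inputs:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: y = max(0, x)."""
    return np.maximum(0.0, x)

def softmax(x):
    """Softmax: y_j = exp(x_j) / sum_k exp(x_k); outputs sum to 1.
    Subtracting max(x) first avoids overflow without changing the result."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

x = np.array([-1.0, 0.0, 2.0])
print(relu(x))    # negative inputs are clamped to zero
y = softmax(x)
print(y.sum())    # 1.0 (within floating-point rounding)
```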
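
The batch-normalization step above can likewise be sketched directly (NumPy, illustrative; `gamma` and `beta` stand in for the learnable $\gamma$ and $\beta$):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Per-channel batch normalization over a mini-batch:
    x_hat = (x - mu_B) / sqrt(sigma_B^2 + eps), then y = gamma * x_hat + beta.
    x has shape (batch, channels); gamma and beta are per-channel parameters."""
    mu = x.mean(axis=0)    # mini-batch mean, one value per channel
    var = x.var(axis=0)    # mini-batch variance, one value per channel
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(256, 4))   # mini-batch of 256, 4 channels
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0))   # ~0 per channel
print(y.std(axis=0))    # ~1 per channel (slightly below 1 because of eps)
```

With `gamma = 1` and `beta = 0` the layer is a pure whitening step; during training both are learned, letting the network undo the normalization wherever that helps.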

#### 3.1. The Proposed CNN Architecture

#### Batch Normalization of the Proposed CNN

## 4. Experimental Results

#### 4.1. Environment Analysis

#### 4.2. Training

#### 4.3. Results and Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Jing, W.; Hongbin, Z. Exposing digital forgeries by detecting traces of image splicing. In Proceedings of the 8th IEEE International Conference on Signal Processing, Guilin, China, 16–20 November 2006; Volume 2. [Google Scholar]
- Christlein, V.; Riess, C.; Jordan, J.; Riess, C.; Angelopoulou, E. An evaluation of popular copy-move forgery detection approaches. IEEE Trans. Inf. Forensics Secur.
**2012**, 7, 1841–1854. [Google Scholar] [CrossRef] - Ryu, S.J.; Kirchner, M.; Lee, M.J.; Lee, H.K. Rotation invariant localization of duplicated image regions based on Zernike moments. IEEE Trans. Inf. Forensics Secur.
**2013**, 1355–1370. [Google Scholar] [CrossRef] - Li, H.; Luo, W.; Qiu, X.; Huang, J. IEEE Trans. Inf. Forensics Secur.
**2017**, 12, 1240–1252. [Google Scholar] [CrossRef] - Korus, P.; Huang, J. Multi-scale analysis strategies in PRNU-based tampering localization. IEEE Trans. Inf. Forensics Secur.
**2017**, 12, 809–824. [Google Scholar] [CrossRef] - Lee, H.; Ekanadham, C.; Ng, A.Y. Sparse deep belief net model for visual area. In Advances in Neural Information Processing Systems; 20 (NIPS); MIT Press: Cambridge, MA, USA, 2008. [Google Scholar]
- Larochelle, H.; Bengio, Y.; Louradour, J.; Lamblin, P. Exploring strategies for training deep neural networks. J. Mach. Learn. Res.
**2009**, 10, 1–40. [Google Scholar] - LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE
**1998**, 2278–2324. [Google Scholar] [CrossRef] - Giacinto, G.; Roli, F. Design of effective neural network ensembles for image classification purposes. Image Vis. Comput.
**2001**, 699–707. [Google Scholar] [CrossRef] - Fukushima, K.; Miyake, S.; Ito, T. Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Trans. Syst. Man Cybern
**1983**, 826–834. [Google Scholar] [CrossRef] - Vedaldi, A.; Lenc, K. MatConvNet: Convolutional Neural Networks for MATLAB.
**2015**, 1–59. [Google Scholar] - He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. arXiv preprint
**2015**, arXiv:1512.03385. [Google Scholar] - Verma, V.; Agarwal, N.; Khanna, N. DCT-domain Deep Convolutional Neural Networks for Multiple JPEG Compression Classification. Image Commun.
**2017**, 67, 1–12. [Google Scholar] [CrossRef] - Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 91–99. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. arXiv preprint
**2016**, arXiv:1603.05027. [Google Scholar] - Smeulders, A.W.; Worring, M.; Santini, S.; Gupta, A.; Jain, R. Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell.
**2000**, 22, 1349–1380. [Google Scholar] [CrossRef] - Voulodimos, A.; Doulamis, N.; Doulamis, A.; Protopapadakis, E. Deep Learning for Computer Vision: A Brief Review. Comput. Intell. Neurosci.
**2018**, 1–13. [Google Scholar] [CrossRef] [PubMed] - Yang, J.; Yu, K.; Gong, Y.; Huang, T. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami Beach, FL, USA, 20–25 June 2009; pp. 20–25. [Google Scholar]
- Felzenszwalb, P.F.; Girshick, R.B.; McAllester, D.; Ramanan, D. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell.
**2010**, 32, 1627–1645. [Google Scholar] [CrossRef] [PubMed] - Jegou, H.; Perronnin, F.; Douze, M.; Sanchez, J.; Perez, P.; Schmid, C. Aggregating local image descriptors into compact codes. IEEE Trans. Pattern Anal. Mach. Intell.
**2012**, 34, 1704–1716. [Google Scholar] [CrossRef] [PubMed] - Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 818–833. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint
**2015**, arXiv:1409.1556. [Google Scholar] - Xie, S.; Girshick, R.; Dollar, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. arXiv preprint
**2016**, arXiv:1611.05431. [Google Scholar] - Das, S. CNNs Architectures: LeNet, AlexNet, VGG, GoogLeNet, ResNet and more. Available online: https://medium.com/analytics-vidhya/cnns-architectures-lenet-alexnet-vgg-googlenet-resnet-and-more-666091488df5 (accessed on 28 August 2017).
- Paulin, M.; Douze, M.; Harchaoui, Z.; Mairal, J.; Perronin, F.; Schmid, C. Local convolutional features with unsupervised training for image retrieval. In Proceedings of the IEEE International Conference on Computer Vision, Araucano Park, Las Condes, Chile, 11–18 December 2015; pp. 91–99. [Google Scholar]
- Liu, Y.; Guan, Q.; Zhao, X. Copy-move Forgery Detection based on Convolutional Kernel Network. Multimedia Tools Appl.
**2018**, 77, 18269–18293. [Google Scholar] - Younis, A.; Iqbal, T.; Shehata, M. Copy-Move Forgery Detection Based on Enhanced Patch-Match. Int. J. Comput. Sci. Issues
**2017**, 14, 1–7. [Google Scholar] - Soni, B.; Das, P.K.D.; Thounaojam, D. CMFD: A detailed review of block-based and key feature-based techniques in image copy-move forgery detection. IET Image Process.
**2017**. [Google Scholar] [CrossRef] - Birajdar, G.K.; Mankar, V.H. Digital image forgery detection using passive techniques: A survey. Digit. Investig.
**2013**, 10, 226–245. [Google Scholar] [CrossRef] - Asghar, K.; Habib, Z.; Hussain, M. Copy-move and splicing image forgery detection and localization techniques: A review. Aust. J. Forensic Sci.
**2017**, 49, 281–307. [Google Scholar] [CrossRef] - Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In Proceedings of the Computer Vision Conference ICLR, Caribe Hilton, San Juan, Puerto Rico, 2–4 May 2016; pp. 1–16. [Google Scholar]
- Wu, Y.; Abd-Almageed, W.; Natarajan, P. BusterNet: Detecting Copy-Move Image Forgery with Source/Target Localization; Springer: Berlin, Germany, 2018. [Google Scholar]
- Huo, Y.; Zhu, X. High dynamic range image forensics using cnn. arXiv
**2019**, arXiv:1902.10938. [Google Scholar] - Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Bondi, L.; Baroffio, L.; Guera, D.; Bestagini, P.; Delp, E.J.; Tubaro, S. First Steps Toward Camera Model Identification with Convolutional Neural Networks. IEEE Signal. Process. Lett.
**2017**, 24, 259–263. [Google Scholar] [CrossRef] - Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint
**2015**, arXiv:1502.03167. [Google Scholar] - Yao, Y.; Shi, Y.; Weng, S.; Guan, B. Deep Learning for Detection of Object Forgery in Advanced Video. Symmetry
**2017**, 3. [Google Scholar] [CrossRef] - Mahendran, A.; Vedaldi, A. Visualizing deep convolutional neural networks using natural pre-images. Int. J. Comput. Vis.
**2016**, 12, 233–255. [Google Scholar] [CrossRef] - Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 Jun 2015; pp. 1–9. [Google Scholar]
- Bayar, B.; Stamm, M.C. Design principles of convolutional neural networks for multi-media forensics. Soc. Imaging Sci. Technol.
**2017**, 10, 77–86. [Google Scholar] - Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, 6–11 July 2015. [Google Scholar]
- Songtao, W.; Zhong, S.Z.; Yan, L. A novel convolutional neural network for image steganalysis with shared normalization. IEEE Trans. Multimed.
**2017**, 1–12. [Google Scholar] [CrossRef] - Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images; University of Toronto: Toronto, ON, Canada, 2009; pp. 1–58. [Google Scholar]
- James, P.; Relja, A.; Andrew, Z. Available online: http://robots.ox.ac.uk/~vgg/data/oxbuildings/ (accessed on 19 August 2019).
- Wen, B.; Zhu, Y.; Subramanian, R.; Ng, T.; Shen, X.; Winkler, S. COVERAGE—A Novel Database for Copy-Move Forgery Detection. In Proceedings of the IEEE International Conference Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015. [Google Scholar]
- Chen, S.; Tan, S.; Li, B.; Huang, J. Automatic Detection of Object-Based Forgery in Advanced Video. IEEE Trans. Circuits Syst. Video Technol.
**2016**, 26, 2138–2151. [Google Scholar] [CrossRef] - Li, J.; Li, X.; Yang, B.; Sun, X. Segmentation-based image copy-move forgery detection scheme. IEEE Trans. Inf. Forensics Secur.
**2015**, 10, 507–518. [Google Scholar] - Ulutas, G.; Muzaffer, G. A new copy move forgery detection method resistant to object removal with uniform background forgery. Hindawi Math. Probl. Eng.
**2016**, 1–19. [Google Scholar] [CrossRef] - Silva, E.; Carvalho, T.; Ferreira, A.; Rocha, A. Going deeper into copy-move forgery detection: Exploring image telltales via multi-scale analysis and voting processes. J. Vis. Commun. Image Represent.
**2015**, 29, 16–32. [Google Scholar] [CrossRef] - Ryu, S.J.; Lee, M.J.; Lee, H.K. Detection of copy-move forgery using Zernike moments. In Information Hiding; Springer: Berlin, Germany, 2010; pp. 51–65. [Google Scholar]

**Figure 4.** Random output samples showing correct detection of the pristine-image category. Each output is flagged with the category name "pristine" in green, indicating a correct decision.

**Figure 5.** Three image categories: (**a**) pristine image; (**b**) the same image manipulated with a copy-move forgery; (**c**) the output mask showing the copy-move forgery detection result, including the two similar areas within the same image frame.

**Figure 6.** Random samples of the model output illustrating true passive forgery detection, i.e., none of the images is original (pristine).

**Figure 7.** A false detection: this forged image is labeled as pristine. A red flag in the results indicates the false result.

| Ref. | CNN | Layers No. | Inventor(s) | Year | Place | Parameters No. | Error Rate |
|---|---|---|---|---|---|---|---|
| [25] | LeNet | 8 | Yann LeCun | 1988 | First | 60 T | N/A |
| [14] | AlexNet | 7 | Alex Krizhevsky, Geoffrey Hinton, Ilya Sutskever | 2012 | First | 60 M | 15.3% |
| [22] | ZFNet | 7 | Matthew Zeiler and Rob Fergus | 2013 | First | N/A | 14.8% |
| [25] | GoogLeNet | 9 | | 2014 | First | 4 M | 6.67% |
| [23] | VGG Net | 16 | Simonyan, Zisserman | 2014 | Second | 140 M | 3.6% |
| [12] | ResNet | 152 | Kaiming He | 2015 | First | N/A | 3.75% |

| Layer | Properties | No. |
|---|---|---|
| imageInputLayer | $64\times 64\times 3$ | 1 |
| convolution2dLayer | 64 $5\times 5$ convolutions with stride [1 1] and padding [2 2 2 2] | 3 |
| maxPooling2DLayer | PoolSize: [2 2]; Stride: [2 2]; PaddingMode: 'manual'; PaddingSize: [0 0 0 0]; HasUnpoolingOutputs: 0 | 3 |
| fullyConnectedLayer(x), $x \in \{64, 2\}$ | 64-unit fully connected layer; 2-unit fully connected layer | 2 |
| ReLU | ReLU | 4 |
| Softmax | Softmax | 1 |
| C-Output layer | $64\times 64\times 3$ | 1 |
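
As a sanity check on the layer table above, the patch shape can be traced through the stack. The sketch below is one reading of the table, not code from the paper: it assumes each of the three convolution layers (64 filters, $5\times 5$, stride [1 1], padding [2 2 2 2]) is followed by one of the three max-pooling layers ($2\times 2$, stride [2 2]):

```python
def conv_shape(h, w, c, n_filters, k=5, s=1, pad=2):
    """Shape after a conv layer: floor((H + 2*pad - k)/s) + 1 per axis;
    the channel count becomes the number of filters."""
    return ((h + 2 * pad - k) // s + 1,
            (w + 2 * pad - k) // s + 1,
            n_filters)

def pool_shape(h, w, c, k=2, s=2):
    """Shape after max pooling: same spatial rule, channels unchanged."""
    return ((h - k) // s + 1, (w - k) // s + 1, c)

shape = (64, 64, 3)                    # imageInputLayer: 64x64x3 patch
for _ in range(3):                     # three conv + max-pool stages
    shape = conv_shape(*shape, n_filters=64)   # padding 2 keeps H, W fixed
    shape = pool_shape(*shape)                 # each pool halves H and W
print(shape)                           # (8, 8, 64)
flat = shape[0] * shape[1] * shape[2]  # 4096 inputs into fullyConnectedLayer(64)
print(flat)
```

The flattened 4096-element vector then feeds the 64-unit and 2-unit fully connected layers, consistent with the two-class (pristine vs. forged) output suggested by the figures.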

| Data | Property | Option Value |
|---|---|---|
| Input Size | [64 64 3] | Various |
| Fill Value | 0 | |
| Rand X Reflection | 0 | |
| Rand Y Reflection | 0 | |
| Rand Rotation | [–20 20] | |
| Rand X Scale | [1 1] | |
| Rand Y Scale | [1 1] | |
| Rand X Shear | [0 0] | |
| Rand Y Shear | [0 0] | |
| Rand X Translation | [–3 3] | |
| Rand Y Translation | [–3 3] | |
| Initial Learn Rate | 0.01 | 0.001 |
| Mini Batch Size | 256 | 100, 64 |
| Lower Threshold | 10 | 8 |
| Validation Frequency | 50 | 30 |
| Base Learning Rate | 0.0001 | |

| Epoch | Iteration | Time Elapsed (s) | Mini-Batch Accuracy | Base Learning Rate |
|---|---|---|---|---|
| 1 | 1 | 31.42 | 88.00% | 0.0010 |
| 4 | 50 | 1587.43 | 90.00% | 0.0010 |
| 7 | 90 | 3120.95 | 91.00% | 0.0010 |

**Table 5.** Comparison of copy-move forgery detection F-measure, precision, and recall across different algorithms.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Abdalla, Y.; Iqbal, M.T.; Shehata, M.
Convolutional Neural Network for Copy-Move Forgery Detection. *Symmetry* **2019**, *11*, 1280.
https://doi.org/10.3390/sym11101280
