Due to the great success of convolutional neural networks (CNNs) in the area of computer vision, the existing methods tend to match the global or local CNN features between images for near-duplicate image detection. However, global CNN features are not robust enough to combat background clutter and partial occlusion, while local CNN features lead to high computational complexity in the step of feature matching. To achieve high efficiency while maintaining good accuracy, we propose a coarse-to-fine feature matching scheme using both global and local CNN features for real-time near-duplicate image detection. In the coarse matching stage, we implement the sum-pooling operation on convolutional feature maps (CFMs) to generate the global CNN features, and match these global CNN features between a given query image and database images to efficiently filter most of irrelevant images of the query. In the fine matching stage, the local CNN features are extracted by using maximum values of the CFMs and the saliency map generated by the graph-based visual saliency detection (GBVS) algorithm. These local CNN features are then matched between images to detect the near-duplicate versions of the query. Experimental results demonstrate that our proposed method not only achieves a real-time detection, but also provides higher accuracy than the state-of-the-art methods.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited