Recently, deep learning-based techniques have shown great power in image inpainting, especially for square holes. However, they fail to generate plausible results inside irregular and large holes because they do not adequately model the relationship between the missing regions and the existing content. To overcome this limitation, we combine two non-local mechanisms, a contextual attention module (CAM) and an implicit diversified Markov random fields (ID-MRF) loss, with a multi-scale architecture that uses several dense fusion blocks (DFB), each built from a dense combination of dilated convolutions, to guide the generative network in restoring both discontinuous and continuous large masked areas. To prevent color discrepancies and grid-like artifacts, we apply the ID-MRF loss, which improves visual appearance by comparing the similarities of long-distance feature patches. To further capture long-range relationships between different regions in large missing areas, we introduce the CAM. Although the CAM can create plausible results by reconstructing refined features, it depends on the initial predicted results. Hence, we employ the DFB to obtain larger and more effective receptive fields, which helps predict more precise and fine-grained information for the CAM. Extensive experiments on two widely used datasets demonstrate that our proposed framework significantly outperforms state-of-the-art approaches both quantitatively and qualitatively.
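
To illustrate why a dense combination of dilated convolutions enlarges the receptive field, the following is a minimal sketch rather than the paper's implementation. It assumes stride-1 3x3 convolutions and a hypothetical set of dilation rates (1, 2, 4, 8), none of which are specified in the abstract, and it assumes that dense connections let features travel along any ordered subset of the layers. The function names are illustrative only.

```python
from itertools import combinations


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field (pixels along one axis) of stride-1 convolutions applied in sequence."""
    rf = 1
    for d in dilations:
        # Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def dense_receptive_fields(kernel_size, dilations):
    """Receptive fields of every path through a densely connected stack.

    Assumption: each layer sees the concatenated outputs of all earlier
    layers, so a feature may pass through any ordered subset of the layers.
    """
    sizes = set()
    for r in range(len(dilations) + 1):
        for subset in combinations(dilations, r):
            sizes.add(stacked_receptive_field(kernel_size, subset))
    return sorted(sizes)


# Hypothetical dilation rates, chosen only for illustration.
HYPOTHETICAL_DILATIONS = [1, 2, 4, 8]

plain = stacked_receptive_field(3, [1, 1, 1, 1])               # four ordinary 3x3 layers
dilated = stacked_receptive_field(3, HYPOTHETICAL_DILATIONS)   # four dilated 3x3 layers
scales = dense_receptive_fields(3, HYPOTHETICAL_DILATIONS)     # scales mixed by dense fusion

print(plain, dilated, len(scales))
```

Under these assumptions, four ordinary layers cover 9 pixels per axis while the dilated stack covers 31, and the dense connections expose every odd scale in between, which is one way to read the claim that the DFB yields larger and more effective receptive fields.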
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.