MDPI - Publisher of Open Access Journals

25 pages, 6911 KiB

Open Access Article

Image Inpainting Algorithm Based on Structure-Guided Generative Adversarial Network

by Li Zhao, Tongyang Zhu, Chuang Wang, Feng Tian and Hongge Yao

Mathematics 2025, 13(15), 2370; https://doi.org/10.3390/math13152370 - 24 Jul 2025

Viewed by 331

To address the challenges of image inpainting in scenarios with extensive or irregular missing regions (particularly detail oversmoothing, structural ambiguity, and textural incoherence), this paper proposes an Image Structure-Guided (ISG) framework that hierarchically integrates structural priors with semantic-aware texture synthesis. The methodology follows a two-stage restoration paradigm: (1) Structural Prior Extraction, where adaptive edge detection algorithms identify residual contours in corrupted regions, and a transformer-enhanced network reconstructs globally consistent structural maps through contextual feature propagation; (2) Structure-Constrained Texture Synthesis, wherein a multi-scale generator with hybrid dilated convolutions and channel attention mechanisms iteratively refines high-fidelity textures under explicit structural guidance. The framework introduces three innovations: (1) a hierarchical feature fusion architecture that combines multi-scale receptive fields with spatial-channel attention to preserve long-range dependencies and local details simultaneously; (2) a spectral-normalized Markovian discriminator with gradient-penalty regularization, which stabilizes adversarial training while enforcing patch-level structural consistency; and (3) a dual-branch loss formulation combining perceptual similarity metrics with edge-aware constraints to align synthesized content with both semantic coherence and geometric fidelity. Experiments on two benchmark datasets (Places2 and CelebA) demonstrate that the framework achieves more unified textures and structures, bringing the restored images closer to their original semantic content. Full article

16 pages, 1610 KiB

Open Access Article

Cascaded Dual-Inpainting Network for Scene Text

by Chunmei Liu

Appl. Sci. 2025, 15(14), 7742; https://doi.org/10.3390/app15147742 - 10 Jul 2025

Viewed by 207

Abstract

Scene text inpainting is a significant research challenge in visual text processing, with critical applications spanning incomplete traffic sign comprehension, degraded container-code recognition, occluded vehicle license plate processing, and other incomplete scene text processing systems. In this paper, a cascaded dual-inpainting network for scene text (CDINST) is proposed. The architecture integrates two scene text inpainting models to reconstruct the text foreground: the Structure Generation Module (SGM) and Structure Reconstruction Module (SRM). The SGM primarily performs preliminary foreground text reconstruction and extracts text structures. Building upon the SGM’s guidance, the SRM subsequently enhances the foreground structure reconstruction through structure-guided refinement. The experimental results demonstrate compelling performance on the benchmark dataset, showcasing both the effectiveness of the proposed dual-inpainting network and its accuracy in incomplete scene text recognition. The proposed network achieves an average recognition accuracy improvement of 11.94% compared to baseline methods for incomplete scene text recognition tasks. Full article

37 pages, 20758 KiB

Open Access Review

A Comprehensive Review of Image Restoration Research Based on Diffusion Models

by Jun Li, Heran Wang, Yingjie Li and Haochuan Zhang

Mathematics 2025, 13(13), 2079; https://doi.org/10.3390/math13132079 - 24 Jun 2025

Viewed by 1848

Abstract

Image restoration is an indispensable and challenging task in computer vision, aiming to enhance the quality of images degraded by various forms of degradation. Diffusion models have achieved remarkable progress in AIGC (Artificial Intelligence Generated Content) image generation, and numerous studies have explored their application in image restoration, achieving performance surpassing that of other methods. This paper provides a comprehensive overview of diffusion models for image restoration, starting with an introduction to the background of diffusion models. It summarizes relevant theories and research in utilizing diffusion models for image restoration in recent years, elaborating on six commonly used methods and their unified paradigm. Based on these six categories, this paper classifies restoration tasks into two main areas: image super-resolution reconstruction and frequency-selective image restoration. The frequency-selective image restoration category includes image deblurring, image inpainting, image deraining, image desnowing, image dehazing, image denoising, and low-light enhancement. For each area, this paper delves into the technical principles and modeling strategies. Furthermore, it analyzes the specific characteristics and contributions of the diffusion models employed in each application category. This paper summarizes commonly used datasets and evaluation metrics for these six applications to facilitate comprehensive evaluation of existing methods. Finally, it concludes by identifying the limitations of current research, outlining challenges, and offering perspectives on future applications. Full article

22 pages, 4021 KiB

Open Access Article

Image Characteristic-Guided Learning Method for Remote-Sensing Image Inpainting

by Ying Zhou, Xiang Gao, Xinrong Wu, Fan Wang, Weipeng Jing and Xiaopeng Hu

Remote Sens. 2025, 17(13), 2132; https://doi.org/10.3390/rs17132132 - 21 Jun 2025

Viewed by 441

Abstract

Inpainting noisy remote-sensing images can reduce the cost of acquiring remote-sensing images (RSIs). Since RSIs contain complex land structure features and concentrated obscured areas, existing inpainting methods often produce color inconsistency and structural smoothing when applied to RSIs with a high missing ratio. To address these problems, inspired by tensor recovery, a lightweight image Inpainting Generative Adversarial Network (GAN) method combining low-rankness and local-smoothness (IGLL) is proposed. IGLL utilizes the low-rankness and local-smoothness characteristics of RSIs to guide the deep-learning inpainting. Based on the strong low rankness characteristic of the RSIs, IGLL fully utilizes the background information for foreground inpainting and constrains the consistency of the key ranks. Based on the low smoothness characteristic of the RSIs, learnable edges and structure priors are designed to enhance the non-smoothness of the results. Specifically, the generator of IGLL consists of a pixel-level reconstruction net (PIRN) and a perception-level reconstruction net (PERN). In PIRN, the proposed global attention module (GAM) establishes long-range pixel dependencies. GAM performs precise normalization and avoids overfitting. In PERN, the proposed flexible feature similarity module (FFSM) computes the similarity between background and foreground features and selects a reasonable feature for recovery. Compared with existing works, FFSM improves the fineness of feature matching. To avoid the problem of local-smoothness in the results, both the generator and discriminator utilize the structure priors and learnable edges to regularize large concentrated missing regions. Additionally, IGLL incorporates mathematical constraints into deep-learning models. A singular value decomposition (SVD) loss item is proposed to model the low-rankness characteristic, and it constrains feature consistency. 
Extensive experiments demonstrate that the proposed IGLL performs favorably against state-of-the-art methods in terms of the reconstruction quality and computation costs, especially on RSIs with high mask ratios. Moreover, our ablation studies reveal the effectiveness of GAM, FFSM, and SVD loss. Source code is publicly available on GitHub. Full article

(This article belongs to the Special Issue 3D Information Recovery and 2D Image Processing for Remotely Sensed Optical Images (Third Edition))

24 pages, 7475 KiB

Open Access Article

Application of a Dual-Stream Network Collaboratively Based on Wavelet and Spatial-Channel Convolution in the Inpainting of Blank Strips in Marine Electrical Imaging Logging Images: A Case Study in the South China Sea

by Guilan Lin, Sinan Fang, Manxin Li, Hongtao Wu, Chenxi Xue and Zeyu Zhang

J. Mar. Sci. Eng. 2025, 13(5), 997; https://doi.org/10.3390/jmse13050997 - 21 May 2025

Cited by 1 | Viewed by 492

Abstract

Electrical imaging logging technology precisely characterizes the features of the formation on the borehole wall through high-resolution resistivity images. However, the problem of blank strips caused by the mismatch between the instrument pads and the borehole diameter seriously affects the accuracy of fracture identification and formation continuity interpretation in marine oil and gas reservoirs. Existing inpainting methods struggle to reconstruct complex geological textures while maintaining structural continuity, particularly in balancing low-frequency formation morphology with high-frequency fracture details. To address this issue, this paper proposes an inpainting method using a dual-stream network based on the collaborative optimization of wavelet and spatial-channel convolution. By designing a texture-aware data prior algorithm, a high-quality training dataset with geological rationality is generated. A dual-stream encoder–decoder network architecture is adopted, and the wavelet transform convolution (WTConv) module is utilized to enhance the multi-scale perception ability of the generator, achieving a collaborative analysis of the low-frequency formation structure and high-frequency fracture details. Combined with the spatial channel convolution (SCConv) to enhance the feature fusion module, the cross-modal interaction between texture and structural features is optimized through a dynamic gating mechanism. Furthermore, a multi-objective loss function is introduced to constrain the semantic coherence and visual authenticity of image reconstruction. Experiments show that, in the inpainting indexes for Block X in the South China Sea, the mean absolute error (MAE), structural similarity index (SSIM), and peak signal-to-noise ratio (PSNR) of this method are 6.893, 0.779, and 19.087, respectively, which are significantly better than the improved filtersim, U-Net, and AOT-GAN methods. 
The correlation degree of the pixel distribution between the inpainted area and the original image reaches 0.921~0.997, verifying the precise matching of the low-frequency morphology and high-frequency details. In the inpainting of electrical imaging logging images across blocks, the applicability of the method is confirmed, effectively solving the interference of blank strips on the interpretation accuracy of marine oil and gas reservoirs. It provides an intelligent inpainting tool with geological interpretability for the electrical imaging logging interpretation of complex reservoirs, and has important engineering value for improving the efficiency of oil and gas exploration and development. Full article
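The MAE and PSNR figures quoted in this abstract follow standard definitions, which can be sketched in a few lines. This is a minimal illustration only, assuming 8-bit grayscale images stored as nested lists; the function names `mae` and `psnr` and the toy pixel values are hypothetical and are not taken from the paper, whose exact evaluation protocol (value range, masking, averaging) is not stated here.

```python
import math

def mae(a, b):
    """Mean absolute error between two equally sized images (nested lists)."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    mse = sum(sq) / len(sq)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 2x2 example (hypothetical values, not data from the paper).
original = [[10, 20], [30, 40]]
restored = [[12, 18], [30, 44]]
print(mae(original, restored))
print(psnr(original, restored))
```

Lower MAE and higher PSNR both indicate a reconstruction closer to the reference; SSIM additionally compares local luminance, contrast, and structure and is omitted here for brevity.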

(This article belongs to the Special Issue Research on Offshore Oil and Gas Numerical Simulation)

36 pages, 8348 KiB

Open Access Article

Classical vs. Machine Learning-Based Inpainting for Enhanced Classification of Remote Sensing Image

by Aleksandra Sekrecka and Kinga Karwowska

Remote Sens. 2025, 17(7), 1305; https://doi.org/10.3390/rs17071305 - 5 Apr 2025

Cited by 1 | Viewed by 1221

Abstract

Inpainting is a technique that allows for the reconstruction of images and the removal of unnecessary elements. In our research, we employed inpainting to eliminate erroneous lines in the images and examined its abilities in improving classification quality. To reduce the erroneous lines, we designed ResGMCNN, whose multi-column generator model uses residual blocks. For our studies, we used data from the COWC and DOTA datasets. The GMCNN model with residual connections outperformed most classical inpainting methods, including the Telea and Navier–Stokes methods, achieving a maximum structural similarity index measure (SSIM) of 0.93. However, despite the improvement in filling quality, these results still lag behind the Criminisi method, which achieved the highest SSIM values (up to 0.99). We investigated the improvement in classification quality by removing vehicles from the road class in images acquired by UAVs. For vehicle removal, we used Criminisi inpainting, as well as Navier–Stokes and Telea for comparison. Classification was performed using eight classifiers, six of which were based on machine learning, where we proposed our solutions. The results showed that classification quality could be improved by several to over a dozen percent, depending on the metric, image, and classification method. The F1-score and Cohen Kappa metrics indicated an improvement in classification quality of up to 13% in comparison to the classification of the original image. Nevertheless, each of the classical inpainting methods examined improved the road classification. Full article
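The two classification metrics named in this abstract, F1-score and Cohen's Kappa, can be computed from binary labels as sketched below. This is an illustrative sketch under stated assumptions: a two-class problem (road vs. non-road), a hypothetical helper name `f1_and_kappa`, and made-up labels; the paper's own classifiers and data are not reproduced.

```python
def f1_and_kappa(y_true, y_pred):
    """Return (F1-score, Cohen's kappa) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    n = tp + fp + fn + tn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Observed agreement vs. agreement expected by chance.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = ((p_observed - p_expected) / (1 - p_expected)
             if p_expected != 1 else 0.0)
    return f1, kappa

# Hypothetical labels: 1 = road pixel, 0 = anything else.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
f1, kappa = f1_and_kappa(y_true, y_pred)
print(f1, kappa)
```

Kappa discounts the agreement a classifier would reach by chance, which is why it can differ noticeably from F1 on imbalanced classes.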

24 pages, 4262 KiB

Open Access Article

PigFRIS: A Three-Stage Pipeline for Fence Occlusion Segmentation, GAN-Based Pig Face Inpainting, and Efficient Pig Face Recognition

by Ruihan Ma, Seyeon Chung, Sangcheol Kim and Hyongsuk Kim

Animals 2025, 15(7), 978; https://doi.org/10.3390/ani15070978 - 28 Mar 2025

Viewed by 637

Abstract

Accurate animal face recognition is essential for effective health monitoring, behavior analysis, and productivity management in smart farming. However, environmental obstructions and animal behaviors complicate identification tasks. In pig farming, fences and frequent movements often occlude essential facial features, while high inter-class similarity makes distinguishing individuals even more challenging. To address these issues, we introduce the Pig Face Recognition and Inpainting System (PigFRIS). This integrated framework enhances recognition accuracy by removing occlusions and restoring missing facial features. PigFRIS employs state-of-the-art occlusion detection with the YOLOv11 segmentation model, a GAN-based inpainting reconstruction module using AOT-GAN, and a lightweight recognition module tailored for pig face classification. In doing so, our system detects occlusions, reconstructs obscured regions, and emphasizes key facial features, thereby improving overall performance. Experimental results validate the effectiveness of PigFRIS. For instance, YOLO11l achieves a recall of 94.92% and an $\mathrm{AP}_{50}$ of 96.28% for occlusion detection, AOT-GAN records a FID of 51.48 and an SSIM of 91.50% for image restoration, and EfficientNet-B2 attains an accuracy of 91.62% with an F1 Score of 91.44% in classification. Additionally, heatmap analysis reveals that the system successfully focuses on relevant facial features rather than irrelevant occlusions, enhancing classification reliability. This work offers a novel and practical solution for animal face recognition in smart farming. It overcomes the limitations of existing methods and contributes to more effective livestock management and advancements in agricultural technology. Full article

(This article belongs to the Special Issue Precision Livestock Farming: New Techniques for Monitoring the Behaviour and Welfare of Farm Animal)

18 pages, 2639 KiB

Open Access Article

Privacy-Preserved Visual Simultaneous Localization and Mapping Based on a Dual-Component Approach

by Mingxu Yang, Chuhua Huang, Xin Huang and Shengjin Hou

Appl. Sci. 2025, 15(5), 2583; https://doi.org/10.3390/app15052583 - 27 Feb 2025

Viewed by 643

Abstract

Edge-assisted visual simultaneous localization and mapping (SLAM) is widely used in autonomous driving, robot navigation, and augmented reality for environmental perception, map construction, and real-time positioning. However, it poses significant privacy risks, as input images may contain sensitive information, and generated 3D point clouds can reconstruct original scenes. To address these concerns, this paper proposes a dual-component privacy-preserving approach for visual SLAM. First, a privacy protection method for images is proposed, which combines object detection and image inpainting to protect privacy-sensitive information in images. Second, an encryption algorithm is introduced to convert 3D point cloud data into a 3D line cloud through dimensionality enhancement. Integrated with ORB-SLAM3, the proposed method is evaluated on the Oxford Robotcar and KITTI datasets. Results demonstrate that it effectively safeguards privacy-sensitive information while ORB-SLAM3 maintains accurate pose estimation in dynamic outdoor scenes. Furthermore, the encrypted line cloud prevents unauthorized attacks on recovering the original point cloud. This approach enhances privacy protection in visual SLAM and is expected to expand its potential applications. Full article
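The abstract does not spell out how points are converted to lines, so the following is only a generic sketch of the line-cloud idea: each 3D point is replaced by a randomly oriented line that passes through it, and the stored anchor is shifted along that line so the original position is not kept. The helper names `lift_to_lines` and `distance_to_line`, the `max_shift` parameter, and the sample points are all hypothetical and should not be read as the paper's encryption algorithm.

```python
import math
import random

def lift_to_lines(points, max_shift=5.0, seed=0):
    """Replace each 3D point by a line (anchor, unit direction) through it."""
    rng = random.Random(seed)
    lines = []
    for p in points:
        # Random direction: normalized Gaussian sample.
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Slide the anchor along the line so the point itself is not stored.
        t = rng.uniform(-max_shift, max_shift)
        anchor = [pc + t * dc for pc, dc in zip(p, d)]
        lines.append((anchor, d))
    return lines

def distance_to_line(point, line):
    """Perpendicular distance from a point to a line with unit direction."""
    anchor, d = line
    v = [pc - ac for pc, ac in zip(point, anchor)]
    cross = [v[1] * d[2] - v[2] * d[1],
             v[2] * d[0] - v[0] * d[2],
             v[0] * d[1] - v[1] * d[0]]
    return math.sqrt(sum(c * c for c in cross))

points = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [-4.0, 0.5, 2.0]]
lines = lift_to_lines(points)
print([distance_to_line(p, l) for p, l in zip(points, lines)])
```

Each original point still lies on its line, which is the geometric constraint a localization pipeline can use, while the position along the line is no longer recorded.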

(This article belongs to the Special Issue Advanced Technologies in Data and Information Security, Fourth Edition)

17 pages, 3986 KiB

Open Access Article

Efficient Image Inpainting for Handwritten Text Removal Using CycleGAN Framework

by Somanka Maiti, Shabari Nath Panuganti, Gaurav Bhatnagar and Jonathan Wu

Mathematics 2025, 13(1), 176; https://doi.org/10.3390/math13010176 - 6 Jan 2025

Viewed by 2040

Abstract

With the recent rise in the development of deep learning techniques, image inpainting—the process of restoring missing or corrupted regions in images—has witnessed significant advancements. Although state-of-the-art models are effective, they often fail to inpaint complex missing areas, especially when handwritten occlusions are present in the image. To address this issue, an image inpainting model based on a residual CycleGAN is proposed. The generator takes as input the image occluded by handwritten missing patches and generates a restored image, which the discriminator then compares with the original ground truth image to determine whether it is real or fake. An adversarial trade-off between the generator and discriminator motivates the model to improve its training and produce a superior reconstructed image. Extensive experiments and analyses confirm that the proposed method generates inpainted images with superior visual quality and outperforms state-of-the-art deep learning approaches. Full article

(This article belongs to the Special Issue New Trends in Computer Vision, Pattern Recognition and Machine Learning)

21 pages, 66390 KiB

Open Access Article

Photorealistic Texture Contextual Fill-In

by Radek Richtr

Heritage 2025, 8(1), 9; https://doi.org/10.3390/heritage8010009 - 27 Dec 2024

Cited by 1 | Viewed by 1438

Abstract

This paper presents a comprehensive study of the application of AI-driven inpainting techniques to the restoration of historical photographs of the Czech city of Most, with a focus on restoring and reconstructing its lost architectural heritage. The project combines state-of-the-art methods, including generative adversarial networks (GANs), patch-based inpainting, and manual retouching, to restore and enhance severely degraded images. The restored photographs offer an invaluable visual representation of a city that was largely destroyed for industrial purposes in the 20th century. Through a series of blind and informed user tests, we assess the subjective quality of the restored images and examine how knowledge of edited areas influences user perception. Additionally, this study addresses the technical challenges of inpainting, including computational demands, interpretability, and bias in AI models. Ethical considerations, particularly regarding historical authenticity and speculative reconstruction, are also discussed. The findings demonstrate that AI techniques can significantly contribute to the preservation of cultural heritage, but must be applied with careful oversight to maintain transparency and cultural integrity. Future work will focus on improving the interpretability and efficiency of these methods, while ensuring that reconstructions remain historically and culturally sensitive. Full article

(This article belongs to the Section Cultural Heritage)

19 pages, 8686 KiB

Open Access Article

Prior-FOVNet: A Multimodal Deep Learning Framework for Megavoltage Computed Tomography Truncation Artifact Correction and Field-of-View Extension

by Long Tang, Mengxun Zheng, Peiwen Liang, Zifeng Li, Yongqi Zhu and Hua Zhang

Sensors 2025, 25(1), 39; https://doi.org/10.3390/s25010039 - 25 Dec 2024

Viewed by 972

Abstract

Megavoltage computed tomography (MVCT) plays a crucial role in patient positioning and dose reconstruction during tomotherapy. However, due to the limited scan field of view (sFOV), the entire cross-section of certain patients may not be fully covered, resulting in projection data truncation. Truncation artifacts in MVCT can compromise registration accuracy with the planned kilovoltage computed tomography (KVCT) and hinder subsequent MVCT-based adaptive planning. To address this issue, we propose a Prior-FOVNet to correct the truncation artifacts and extend the field of view (eFOV) by leveraging material and shape priors learned from the KVCT of the same patient. Specifically, to address the intensity discrepancies between different imaging modalities, we employ a contrastive learning-based GAN, named TransNet, to transform KVCT images into synthesized MVCT (sMVCT) images. The sMVCT images, along with pre-corrected MVCT images obtained via sinogram extrapolation, are then input into a Swin Transformer-based image inpainting network for artifact correction and FOV extension. Experimental results using both simulated and real patient data demonstrate that our method outperforms existing truncation correction techniques in reducing truncation artifacts and reconstructing anatomical structures beyond the sFOV. It achieves the lowest MAE of 23.8 ± 5.6 HU and the highest SSIM of 97.8 ± 0.6 across the test dataset, thereby enhancing the reliability and clinical applicability of MVCT in adaptive radiotherapy. Full article

(This article belongs to the Special Issue Recent Advances in the Acquisition and Processing of Biomedical Signals and Images)

15 pages, 460 KiB

Open Access Article

Unified Domain Adaptation for Specialized Indoor Scene Inpainting Using a Pre-Trained Model

by Asrafi Akter and Myungho Lee

Electronics 2024, 13(24), 4970; https://doi.org/10.3390/electronics13244970 - 17 Dec 2024

Viewed by 1063

Abstract

Image inpainting for indoor environments presents unique challenges due to complex spatial relationships, diverse lighting conditions, and domain-specific object configurations. This paper introduces a resource-efficient post-processing framework that enhances domain-specific image inpainting through an adaptation mechanism. Our architecture integrates a convolutional neural network with residual connections optimized via a multi-term objective function combining perceptual losses and adaptive loss weighting. Experiments on our curated dataset of 4000 indoor household scenes demonstrate improved performance, with training completed in 20 min on commodity GPU hardware with 0.14 s of inference latency per image. The framework exhibits enhanced results across standard metrics (FID, SSIM, LPIPS, MAE, and PSNR), showing improvements in structural coherence and perceptual quality while preserving cross-domain generalization abilities. Our methodology offers a novel approach for efficient domain adaptation in image inpainting, particularly suitable for real-world applications under computational constraints. This work advances the development of domain-aware image restoration systems and provides architectural insights for specialized image processing frameworks. Full article

(This article belongs to the Special Issue AI Synergy: Vision, Language, and Modality)

28 pages, 8980 KiB

Open Access Article

AI-Assisted Restoration of Yangshao Painted Pottery Using LoRA and Stable Diffusion

by Xinyi Zhang

Heritage 2024, 7(11), 6282-6309; https://doi.org/10.3390/heritage7110295 - 8 Nov 2024

Cited by 7 | Viewed by 3280

Abstract

This study is concerned with the restoration of painted pottery images from the Yangshao period. The objective is to enhance the efficiency and accuracy of the restoration process for complex pottery patterns. Conventional restoration techniques encounter difficulties in accurately and efficiently reconstructing intricate designs. To address this issue, the study proposes an AI-assisted restoration workflow that combines Stable Diffusion models (SD) with Low-Rank Adaptation (LoRA) technology. By training a LoRA model on a dataset of typical Yangshao painted pottery patterns and integrating image inpainting techniques, the accuracy and efficiency of the restoration process are enhanced. The results demonstrate that this method provides an effective restoration tool while maintaining consistency with the original artistic style, supporting the digital preservation of cultural heritage. This approach also offers archaeologists flexible restoration options, promoting the broader application and preservation of cultural heritage. Full article
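Low-Rank Adaptation leaves a pre-trained weight matrix W frozen and learns a low-rank correction, so the adapted weight is W + (alpha / r) · B · A, where B is d×r and A is r×k. The sketch below illustrates only that arithmetic with plain Python lists; the helper names `matmul` and `lora_merge`, the tiny matrices, and the value of `alpha` are illustrative assumptions and say nothing about the specific training setup used in the study.

```python
def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def lora_merge(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)                      # A has shape r x k
    scale = alpha / r
    delta = matmul(b, a)            # d x k low-rank update
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen 3x3 weight and a rank-1 adapter (hypothetical values).
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [[1.0], [0.0], [2.0]]           # d x r = 3 x 1
a = [[0.5, 0.0, 1.0]]               # r x k = 1 x 3
merged = lora_merge(w, b, a, alpha=2.0)
print(merged)

# Trainable parameters: full fine-tuning vs. the adapter.
d, k, r = 3, 3, 1
print(d * k, r * (d + k))
```

The parameter saving is negligible at this toy size but grows quickly: for a 1024×1024 layer, a rank-8 adapter trains 16,384 values instead of 1,048,576.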

17 pages, 18662 KiB

Open Access Article

Symmetric Connected U-Net with Multi-Head Self Attention (MHSA) and WGAN for Image Inpainting

by Yanyang Hou, Xiaopeng Ma, Junjun Zhang and Chenxian Guo

Symmetry 2024, 16(11), 1423; https://doi.org/10.3390/sym16111423 - 25 Oct 2024

Cited by 1 | Viewed by 1819

Abstract

This study presents a new image inpainting model based on U-Net and incorporating the Wasserstein Generative Adversarial Network (WGAN). The model uses skip connections to connect every encoder block to the corresponding decoder block, resulting in a strictly symmetrical architecture referred to as Symmetric Connected U-Net (SC-Unet). By combining SC-Unet with a GAN, the study aims to reconstruct images more effectively and seamlessly. The traditional discriminators only differentiate the entire image as true or false. In this study, the discriminator calculated the probability of each pixel belonging to the hole and non-hole regions, which provided the generator with more gradient loss information for image inpainting. Additionally, every block of SC-Unet incorporated a Dilated Convolutional Neural Network (DCNN) to increase the receptive field of the convolutional layers. Our model also integrated Multi-Head Self-Attention (MHSA) into selected blocks to enable it to efficiently search the entire image for suitable content to fill the missing areas. This study adopts the publicly available datasets CelebA-HQ and ImageNet for evaluation. Our proposed algorithm demonstrates a 10% improvement in PSNR and a 2.94% improvement in SSIM compared to existing representative image inpainting methods in the experiment. Full article
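The claim that dilated convolutions enlarge the receptive field can be checked with simple arithmetic: for stride-1 layers, each layer with kernel size k and dilation d adds (k - 1) · d pixels to the receptive field. The sketch below assumes stride 1 throughout and uses a hypothetical helper `receptive_field`; the dilation rates shown are examples, not the configuration reported in the paper.

```python
def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

plain = [(3, 1), (3, 1), (3, 1)]        # three ordinary 3x3 convolutions
dilated = [(3, 1), (3, 2), (3, 4)]      # same depth, growing dilation
print(receptive_field(plain))
print(receptive_field(dilated))
```

With the same number of layers and parameters, the dilated stack in this example sees a 15-pixel-wide context instead of 7, which is the property that helps an inpainting network draw on content far from the hole.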

(This article belongs to the Section Computer)

18 pages, 10444 KiB

Open Access Article

Ancient Painting Inpainting Based on Multi-Layer Feature Enhancement and Frequency Perception

by Xiaotong Liu, Jin Wan, Nan Wang and Yuting Wang

Electronics 2024, 13(16), 3309; https://doi.org/10.3390/electronics13163309 - 21 Aug 2024

Viewed by 1108

Abstract

Image inpainting aims to restore the damaged information in images, enhancing their readability and usability. Ancient paintings, as a vital component of traditional art, convey profound cultural and artistic value, yet often suffer from various forms of damage over time. Existing ancient painting inpainting methods are insufficient in extracting deep semantic information, resulting in the loss of high-frequency detail features of the reconstructed image and inconsistency between global and local semantic information. To address these issues, this paper proposes a Generative Adversarial Network (GAN)-based ancient painting inpainting method using multi-layer feature enhancement and frequency perception, named MFGAN. Firstly, we design a Residual Pyramid Encoder (RPE), which fully extracts the deep semantic features of ancient painting images and strengthens the processing of image details by effectively combining the deep feature extraction module and channel attention. Secondly, we propose a Frequency-Aware Mechanism (FAM) to obtain the high-frequency perceptual features by using the frequency attention module, which captures the high-frequency details and texture features of the ancient paintings by increasing the skip connections between the low-frequency and the high-frequency features, and provides more frequency perception information. Thirdly, a Dual Discriminator (DD) is designed to ensure the consistency of semantic information between global and local region images, while reducing the discontinuity and blurring differences at the boundary during image inpainting. Finally, extensive experiments on the proposed ancient painting and Huaniao datasets show that our proposed method outperforms competitive image inpainting methods and exhibits robust generalization capabilities. Full article

(This article belongs to the Special Issue AI Synergy: Vision, Language, and Modality)

Search Results (74)
