Search Results (72)

Search Parameters:
Keywords = perceptual similarity loss

25 pages, 6911 KiB  
Article
Image Inpainting Algorithm Based on Structure-Guided Generative Adversarial Network
by Li Zhao, Tongyang Zhu, Chuang Wang, Feng Tian and Hongge Yao
Mathematics 2025, 13(15), 2370; https://doi.org/10.3390/math13152370 - 24 Jul 2025
Abstract
To address the challenges of image inpainting in scenarios with extensive or irregular missing regions—particularly detail oversmoothing, structural ambiguity, and textural incoherence—this paper proposes an Image Structure-Guided (ISG) framework that hierarchically integrates structural priors with semantic-aware texture synthesis. The proposed methodology advances a two-stage restoration paradigm: (1) Structural Prior Extraction, where adaptive edge detection algorithms identify residual contours in corrupted regions, and a transformer-enhanced network reconstructs globally consistent structural maps through contextual feature propagation; (2) Structure-Constrained Texture Synthesis, wherein a multi-scale generator with hybrid dilated convolutions and channel attention mechanisms iteratively refines high-fidelity textures under explicit structural guidance. The framework introduces three innovations: (1) a hierarchical feature fusion architecture that synergizes multi-scale receptive fields with spatial-channel attention to preserve long-range dependencies and local details simultaneously; (2) spectral-normalized Markovian discriminator with gradient-penalty regularization, enabling adversarial training stability while enforcing patch-level structural consistency; and (3) dual-branch loss formulation combining perceptual similarity metrics with edge-aware constraints to align synthesized content with both semantic coherence and geometric fidelity. Our experiments on the two benchmark datasets (Places2 and CelebA) have demonstrated that our framework achieves more unified textures and structures, bringing the restored images closer to their original semantic content. Full article
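The dual-branch loss described above couples a perceptual similarity term with an edge-aware constraint. As a rough illustration of that combination (not the authors' implementation; the VGG-16 feature cut-off, the Sobel-based edge term, and the weights are assumptions), a PyTorch sketch could look like:

import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class PerceptualEdgeLoss(torch.nn.Module):
    # Perceptual (VGG feature) distance plus an edge-aware (Sobel gradient) distance.
    # Inputs are assumed to be (N, 3, H, W) tensors already normalized for VGG.
    def __init__(self, w_perc=1.0, w_edge=0.5):
        super().__init__()
        self.vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        sobel = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer("kx", sobel.view(1, 1, 3, 3))
        self.register_buffer("ky", sobel.t().contiguous().view(1, 1, 3, 3))
        self.w_perc, self.w_edge = w_perc, w_edge

    def edges(self, img):
        gray = img.mean(dim=1, keepdim=True)
        gx = F.conv2d(gray, self.kx, padding=1)
        gy = F.conv2d(gray, self.ky, padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

    def forward(self, pred, target):
        perc = F.l1_loss(self.vgg(pred), self.vgg(target))
        edge = F.l1_loss(self.edges(pred), self.edges(target))
        return self.w_perc * perc + self.w_edge * edge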

27 pages, 8957 KiB  
Article
DFAN: Single Image Super-Resolution Using Stationary Wavelet-Based Dual Frequency Adaptation Network
by Gyu-Il Kim and Jaesung Lee
Symmetry 2025, 17(8), 1175; https://doi.org/10.3390/sym17081175 - 23 Jul 2025
Abstract
Single image super-resolution is the inverse problem of reconstructing a high-resolution image from its low-resolution counterpart. Although recent Transformer-based architectures leverage global context integration to improve reconstruction quality, they often overlook frequency-specific characteristics, resulting in the loss of high-frequency information. To address this limitation, we propose the Dual Frequency Adaptive Network (DFAN). DFAN first decomposes the input into low- and high-frequency components via Stationary Wavelet Transform. In the low-frequency branch, Swin Transformer layers restore global structures and color consistency. In contrast, the high-frequency branch features a dedicated module that combines Directional Convolution with Residual Dense Blocks, precisely reinforcing edges and textures. A frequency fusion module then adaptively merges these complementary features using depthwise and pointwise convolutions, achieving a balanced reconstruction. During training, we introduce a frequency-aware multi-term loss alongside the standard pixel-wise loss to explicitly encourage high-frequency preservation. Extensive experiments on the Set5, Set14, BSD100, Urban100, and Manga109 benchmarks show that DFAN achieves up to +0.64 dB peak signal-to-noise ratio, +0.01 structural similarity index measure, and −0.01 learned perceptual image patch similarity over the strongest frequency-domain baselines, while also delivering visibly sharper textures and cleaner edges. By unifying spatial and frequency-domain advantages, DFAN effectively mitigates high-frequency degradation and enhances SISR performance. Full article
(This article belongs to the Section Computer)
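DFAN's frequency-aware training signal rests on a stationary wavelet decomposition of the prediction and the target. A minimal sketch of that idea (not DFAN's actual multi-term loss; the Haar wavelet, the single decomposition level, and the high-frequency weight are assumptions) could be:

import numpy as np
import pywt

def frequency_aware_l1(pred, target, wavelet="haar", hf_weight=2.0):
    # Pixel-wise L1 plus an extra penalty on the high-frequency SWT sub-bands.
    # pred/target: 2-D float arrays whose sides are divisible by 2 (level-1 SWT).
    pixel = np.abs(pred - target).mean()
    (_, (ph, pv, pd)), = pywt.swt2(pred, wavelet, level=1)
    (_, (th, tv, td)), = pywt.swt2(target, wavelet, level=1)
    high = np.mean([np.abs(a - b).mean() for a, b in ((ph, th), (pv, tv), (pd, td))])
    return pixel + hf_weight * high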

26 pages, 7178 KiB  
Article
Super-Resolution Reconstruction of Formation MicroScanner Images Based on the SRGAN Algorithm
by Changqiang Ma, Xinghua Qi, Liangyu Chen, Yonggui Li, Jianwei Fu and Zejun Liu
Processes 2025, 13(7), 2284; https://doi.org/10.3390/pr13072284 - 17 Jul 2025
Abstract
Formation MicroScanner Image (FMI) technology is a key method for identifying fractured reservoirs and optimizing oil and gas exploration, but its inherent insufficient resolution severely constrains the fine characterization of geological features. This study innovatively applies a Super-Resolution Generative Adversarial Network (SRGAN) to the super-resolution reconstruction of FMI logging images to address this bottleneck problem. By collecting FMI logging images of glutenite from a well in Xinjiang, a training set containing 24,275 images was constructed, and preprocessing strategies such as grayscale conversion and binarization were employed to optimize input features. Leveraging SRGAN’s generator-discriminator adversarial mechanism and perceptual loss function, high-quality mapping from low-resolution FMI logging images to high-resolution images was achieved. This study yields significant results: in RGB image reconstruction, SRGAN achieved a Peak Signal-to-Noise Ratio (PSNR) of 41.39 dB, surpassing the optimal traditional method (bicubic interpolation) by 61.6%; its Structural Similarity Index (SSIM) reached 0.992, representing a 34.1% improvement; in grayscale image processing, SRGAN effectively eliminated edge blurring, with the PSNR (40.15 dB) and SSIM (0.990) exceeding the suboptimal method (bilinear interpolation) by 36.6% and 9.9%, respectively. These results fully confirm that SRGAN can significantly restore edge contours and structural details in FMI logging images, with performance far exceeding traditional interpolation methods. This study not only systematically verifies, for the first time, SRGAN’s exceptional capability in enhancing FMI resolution, but also provides a high-precision data foundation for reservoir parameter inversion and geological modeling, holding significant application value for advancing the intelligent exploration of complex hydrocarbon reservoirs. Full article
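The PSNR and SSIM figures quoted above are standard full-reference metrics; a small evaluation helper (assuming a recent scikit-image with the channel_axis argument) might be:

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_reconstruction(hr, sr):
    # Compare a super-resolved image against its high-resolution reference.
    # hr, sr: uint8 RGB arrays of identical shape (H, W, 3).
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, data_range=255, channel_axis=-1)
    return psnr, ssim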

16 pages, 2376 KiB  
Article
Nested U-Net-Based GAN Model for Super-Resolution of Stained Light Microscopy Images
by Seong-Hyeon Kang and Ji-Youn Kim
Photonics 2025, 12(7), 665; https://doi.org/10.3390/photonics12070665 - 1 Jul 2025
Abstract
The purpose of this study was to propose a deep learning-based model for the super-resolution reconstruction of stained light microscopy images. To achieve this, perceptual loss was applied to the generator to reflect multichannel signal intensity, distribution, and structural similarity. A nested U-Net architecture was employed to address the representational limitations of the conventional U-Net. For quantitative evaluation, the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and correlation coefficient (CC) were calculated. In addition, intensity profile analysis was performed to assess the model’s ability to restore the boundary signals more precisely. The experimental results demonstrated that the proposed model outperformed both the single U-Net and U-Net-based generative adversarial network (GAN) models in signal and structural restoration. Consequently, the PSNR, SSIM, and CC values demonstrated relative improvements of approximately 1.017, 1.023, and 1.010 times, respectively, compared to the input images. In particular, the intensity profile analysis confirmed the effectiveness of the nested U-Net-based generator in restoring cellular boundaries and structures in the stained microscopy images. In conclusion, the proposed model effectively enhanced the resolution of stained light microscopy images acquired in a multichannel format. Full article
(This article belongs to the Special Issue Recent Advances in Biomedical Optics and Biophotonics)
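The correlation coefficient (CC) reported alongside PSNR and SSIM is the Pearson correlation of pixel values; an illustrative helper (not the authors' code) is:

import numpy as np

def correlation_coefficient(reference, restored):
    # Pearson correlation coefficient between two images, flattened to 1-D.
    a = np.asarray(reference, dtype=np.float64).ravel()
    b = np.asarray(restored, dtype=np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))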

14 pages, 13345 KiB  
Article
Synthetic Fog Generation Using High-Performance Dehazing Networks for Surveillance Applications
by Heekwon Lee, Byeongseon Park, Yong-Kab Kim and Sungkwan Youm
Appl. Sci. 2025, 15(12), 6503; https://doi.org/10.3390/app15126503 - 9 Jun 2025
Abstract
This research addresses visibility challenges in surveillance systems under foggy conditions through a novel synthetic fog generation method leveraging the GridNet dehazing architecture. Our approach uniquely reverses GridNet, originally developed for fog removal, to synthesize realistic foggy images. The proposed Fog Generator Model incorporates perceptual and dark channel consistency losses to enhance fog realism and structural consistency. Comparative experiments on the O-HAZY dataset demonstrate that dehazing models trained on our synthetic fog outperform those trained on conventional methods, achieving superior Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) scores. These findings confirm that integrating high-performance dehazing networks into fog synthesis improves the realism and effectiveness of fog removal solutions, offering significant benefits for real-world surveillance applications. Full article
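The dark channel consistency idea builds on the dark channel prior; the abstract does not give the exact formulation, but a plausible sketch (the patch size and the L1 comparison are assumptions) is:

import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    # Dark channel of an RGB image in [0, 1]: per-pixel channel minimum,
    # then a local minimum filter over a patch x patch window.
    return minimum_filter(img.min(axis=2), size=patch)

def dark_channel_consistency(fogged, clear, patch=15):
    # One way a fog generator could be penalized when the synthesized haze
    # distorts the scene's dark-channel structure too strongly.
    return float(np.abs(dark_channel(fogged, patch) - dark_channel(clear, patch)).mean())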

19 pages, 3331 KiB  
Article
Low-Light Image Enhancement Using Deep Learning: A Lightweight Network with Synthetic and Benchmark Dataset Evaluation
by Manuel J. C. S. Reis
Appl. Sci. 2025, 15(11), 6330; https://doi.org/10.3390/app15116330 - 4 Jun 2025
Abstract
Low-light conditions often lead to severe degradation in image quality, impairing critical computer vision tasks in applications such as surveillance and mobile imaging. In this paper, we propose a lightweight deep learning framework for low-light image enhancement, designed to balance visual quality with computational efficiency, with potential for deployment in latency-sensitive and resource-constrained environments. The architecture builds upon a UNet-inspired encoder–decoder structure, enhanced with attention modules and trained using a combination of perceptual and structural loss functions. Our training strategy utilizes a hybrid dataset composed of both real low-light images and synthetically generated image pairs created through controlled exposure adjustment and noise modeling. Experimental results on benchmark datasets such as LOL and SID demonstrate that our model achieves a Peak Signal-to-Noise Ratio (PSNR) of up to 28.4 dB and a Structural Similarity Index (SSIM) of 0.88 while maintaining a small parameter footprint (~1.3 M) and low inference latency (~6 FPS on Jetson Nano). The proposed approach offers a promising solution for industrial applications such as real-time surveillance, mobile photography, and embedded vision systems. Full article
(This article belongs to the Special Issue Image Processing: Technologies, Methods, Apparatus)
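The combination of perceptual and structural loss terms mentioned above can be sketched as follows (the uniform-window SSIM, the stand-in perceptual_fn, and the weights are assumptions, not the paper's settings):

import torch
import torch.nn.functional as F

def ssim_index(x, y, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    # Mean SSIM with a uniform window; x, y are (N, C, H, W) tensors in [0, 1].
    pad = window // 2
    mu_x, mu_y = F.avg_pool2d(x, window, 1, pad), F.avg_pool2d(y, window, 1, pad)
    var_x = F.avg_pool2d(x * x, window, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, window, 1, pad) - mu_y ** 2
    cov = F.avg_pool2d(x * y, window, 1, pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def enhancement_loss(pred, target, perceptual_fn, w_struct=0.2, w_perc=0.1):
    # Pixel L1 + structural (1 - SSIM) + perceptual terms; perceptual_fn is any
    # pretrained-feature distance, e.g. a VGG-based term.
    return (F.l1_loss(pred, target)
            + w_struct * (1.0 - ssim_index(pred, target))
            + w_perc * perceptual_fn(pred, target))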

21 pages, 80544 KiB  
Article
An LCD Defect Image Generation Model Integrating Attention Mechanism and Perceptual Loss
by Sheng Zheng, Yuxin Zhao, Xiaoyue Chen and Shi Luo
Symmetry 2025, 17(6), 833; https://doi.org/10.3390/sym17060833 - 27 May 2025
Abstract
With the rise of smart manufacturing, defect detection in small-size liquid crystal display (LCD) screens has become essential for ensuring product quality. Traditional manual inspection is inefficient and labor-intensive, making it unsuitable for modern automated production. Although machine vision techniques offer improved efficiency, the lack of high-quality defect datasets limits their performance. To overcome this, we propose a symmetry-aware generative framework, the Squeeze-and-Excitation Wasserstein GAN with Gradient Penalty and Visual Geometry Group (VGG)-based perceptual loss (SWG-VGG), for realistic defect image synthesis. By leveraging the symmetry of feature channels through attention mechanisms and perceptual consistency, the model generates high-fidelity defect images that align with real-world structural patterns. Evaluation using the You Only Look Once version 8 (YOLOv8) detection model shows that the synthetic dataset improves mAP@0.5 to 0.976—an increase of 10.5% over real-data-only training. Further assessment using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Root Mean Square Error (RMSE), and Content Similarity (CS) confirms the visual and structural quality of the generated images. This symmetry-guided method provides an effective solution for defect data augmentation and aligns closely with Symmetry’s emphasis on structured pattern generation in intelligent vision systems. Full article
(This article belongs to the Section Engineering and Materials)
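The gradient-penalty component of SWG-VGG follows the usual WGAN-GP recipe; a generic sketch (not the paper's code) is:

import torch

def gradient_penalty(discriminator, real, fake, device):
    # WGAN-GP: push the discriminator's gradient norm toward 1 at points
    # interpolated between real and generated samples.
    eps = torch.rand(real.size(0), 1, 1, 1, device=device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = discriminator(mixed)
    grads = torch.autograd.grad(outputs=scores.sum(), inputs=mixed, create_graph=True)[0]
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()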

21 pages, 7233 KiB  
Article
Advancing Traditional Dunhuang Regional Pattern Design with Diffusion Adapter Networks and Cross-Entropy
by Yihuan Tian, Tao Yu, Zuling Cheng and Sunjung Lee
Entropy 2025, 27(5), 546; https://doi.org/10.3390/e27050546 - 21 May 2025
Abstract
To promote the inheritance of traditional culture, a variety of emerging methods rooted in machine learning and deep learning have been introduced. Dunhuang patterns, an important part of traditional Chinese culture, are difficult to collect in large numbers due to their limited availability. However, existing text-to-image methods are computationally intensive and struggle to capture fine details and complex semantic relationships in text and images. To address these challenges, this paper proposes the Diffusion Adapter Network (DANet). It employs a lightweight adapter module to extract visual structural information, enabling the diffusion model to generate Dunhuang patterns with high accuracy, while eliminating the need for expensive fine-tuning of the original model. The attention adapter incorporates a multihead attention module (MHAM) to enhance image modality cues, allowing the model to focus more effectively on key information. A multiscale attention module (MSAM) is employed to capture features at different scales, thereby providing more precise generative guidance. In addition, an adaptive control mechanism (ACM) dynamically adjusts the guidance coefficients across feature layers to further enhance generation quality. Moreover, incorporating a cross-entropy loss function enhances the model’s capability in semantic understanding and the classification of Dunhuang patterns. The DANet achieves state-of-the-art (SOTA) performance on the proposed Diversified Dunhuang Patterns Dataset (DDHP). Specifically, it attains a perceptual similarity score (LPIPS) of 0.498, a graph matching score (CLIP score) of 0.533, and a feature similarity score (CLIP-I) of 0.772. Full article
(This article belongs to the Special Issue Entropy in Machine Learning Applications, 2nd Edition)
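An LPIPS score like the one reported above can be computed with the reference implementation; a minimal sketch (the net='alex' backbone is an assumption, and inputs are expected to be (N, 3, H, W) tensors scaled to [-1, 1]):

import torch
import lpips

loss_fn = lpips.LPIPS(net="alex")  # learned perceptual image patch similarity

def lpips_score(generated, reference):
    with torch.no_grad():
        return loss_fn(generated, reference).mean().item()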

16 pages, 6927 KiB  
Article
Estimation of Missing DICOM Windowing Parameters in High-Dynamic-Range Radiographs Using Deep Learning
by Mateja Napravnik, Natali Bakotić, Franko Hržić, Damir Miletić and Ivan Štajduhar
Mathematics 2025, 13(10), 1596; https://doi.org/10.3390/math13101596 - 13 May 2025
Abstract
Digital Imaging and Communication in Medicine (DICOM) is a standard format for storing medical images, which are typically represented in higher bit depths (10–16 bits), enabling detailed representation but exceeding the display capabilities of standard displays and human visual perception. To address this, DICOM images are often accompanied by windowing parameters, analogous to tone mapping in High-Dynamic-Range image processing, which compress the intensity range to enhance diagnostically relevant regions. This study evaluates traditional histogram-based methods and explores the potential of deep learning for predicting window parameters in radiographs where such information is missing. A range of architectures, including MobileNetV3Small, VGG16, ResNet50, and ViT-B/16, were trained on high-bit-depth computed radiography images using various combinations of loss functions, including structural similarity (SSIM), perceptual loss (LPIPS), and an edge preservation loss. Models were evaluated based on multiple criteria, including pixel entropy preservation, Hellinger distance of pixel value distributions, and peak signal-to-noise ratio after 8-bit conversion. The tested approaches were further validated on the publicly available GRAZPEDWRI-DX dataset. Although histogram-based methods showed satisfactory performance, especially scaling based on peaks identified in the pixel value histogram, deep learning-based methods were better at selectively preserving clinically relevant image areas while removing background noise. Full article
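The windowing parameters being estimated are applied with the usual linear mapping to 8-bit; a simplified sketch (omitting the DICOM standard's half-pixel offsets and non-linear VOI LUTs) is:

import numpy as np

def apply_window(pixels, center, width):
    # Map high-bit-depth pixel values to 8-bit using linear windowing.
    lo, hi = center - width / 2.0, center + width / 2.0
    out = (np.clip(pixels.astype(np.float64), lo, hi) - lo) / max(hi - lo, 1e-6)
    return (out * 255.0).round().astype(np.uint8)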

18 pages, 1672 KiB  
Article
Zero-Reference Depth Curve Estimation-Based Low-Light Image Enhancement Method for Coating Workshop Inspection
by Jiaqi Liu, Shanhui Liu, Wuyang Zhou, Huiran Ren, Wanqiu Zhao and Zheng Li
Coatings 2025, 15(4), 478; https://doi.org/10.3390/coatings15040478 - 17 Apr 2025
Abstract
To address the challenges of poor image quality and low detection accuracy in low-light environments during coating workshop inspections, this paper proposes a low-light image enhancement method based on zero-reference depth curve estimation, termed Zero-PTDCE. A low-light image dataset, PT-LLIE, tailored for coating workshop scenarios is constructed, encompassing various industrial inspection conditions under different lighting environments to enhance model adaptability. Furthermore, an enhancement network integrating a lightweight denoising module and depthwise separable dilated convolution is designed to reduce noise interference, expand the receptive field, and improve image detail restoration. The network training process employs a multi-constraint strategy by incorporating perceptual loss (Lp), color loss (Lc), spatial consistency loss (Ls), exposure loss (Le), and total variation smoothness loss (Ltv) to ensure balanced brightness, natural color reproduction, and structural integrity in the enhanced images. Experimental results demonstrate that, compared to existing low-light image enhancement methods, the proposed approach achieves superior performance in terms of peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and mean absolute error (MAE), while maintaining high computational efficiency. Beyond general visual enhancement, Zero-PTDCE significantly improves the visibility of fine surface features and defect patterns under low-light conditions, which is crucial for the accurate assessment of coating quality, including defect identification such as uneven thickness, delamination, and surface abrasion. This work provides a reliable image enhancement solution for intelligent inspection systems and supports both the automated operation and material quality evaluation in modern coating workshops, contributing to the broader goals of intelligent manufacturing and material characterization. Full article
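Two of the constraints listed above, the exposure loss (Le) and the total-variation smoothness loss (Ltv), are commonly formulated as below in zero-reference curve-estimation methods; the patch size and target exposure level are assumptions rather than Zero-PTDCE's exact settings:

import torch
import torch.nn.functional as F

def exposure_loss(enhanced, patch=16, level=0.6):
    # Pull the mean brightness of non-overlapping patches toward a target level.
    pooled = F.avg_pool2d(enhanced.mean(dim=1, keepdim=True), patch)
    return ((pooled - level) ** 2).mean()

def tv_smoothness(curve_maps):
    # Total-variation smoothness on the predicted enhancement-curve maps.
    dh = (curve_maps[:, :, 1:, :] - curve_maps[:, :, :-1, :]) ** 2
    dw = (curve_maps[:, :, :, 1:] - curve_maps[:, :, :, :-1]) ** 2
    return dh.mean() + dw.mean()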

21 pages, 8405 KiB  
Article
YOLOv11-BSS: Damaged Region Recognition Based on Spatial and Channel Synergistic Attention and Bi-Deformable Convolution in Sanding Scenarios
by Yinjiang Li, Zhifeng Zhou and Ying Pan
Electronics 2025, 14(7), 1469; https://doi.org/10.3390/electronics14071469 - 5 Apr 2025
Abstract
To address the problem that the paint surface of a damaged body region shares the color and texture characteristics of the surrounding undamaged paint, which leads to missed or false detections during inspection, an algorithm for detecting damaged body regions based on an improved YOLOv11 is proposed. Firstly, bi-deformable convolution is proposed to optimize the offset direction of the convolution kernel shape, which effectively improves the feature representation power of the backbone network; secondly, the C2PSA-SCSA module is designed to construct the coupling between spatial attention and channel attention, which enhances the perceptual power of the backbone network and makes the model attend more closely to damaged-region features. Then, a slim-neck feature fusion network is built on the GSConv and DWConv modules, which effectively fuses local and global features to enrich the semantic features; finally, the Focaler-CIoU border loss function is designed, which uses the principle of Focaler-IoU segmented linear mapping to adjust the loss function’s attention to different samples and improve the model’s convergence when learning features at various scales. The experimental results show that the enhanced YOLOv11-BSS network improves the precision rate by 7.9%, the recall rate by 1.4%, and the mAP@50 by 3.7% over the baseline network, effectively reducing missed and false detections of damaged car body areas. Full article
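The Focaler-IoU segmented linear mapping mentioned above remaps the box IoU before it enters the regression loss; a sketch of that remapping (the d and u thresholds here are illustrative, not the paper's values) is:

import torch

def focaler_iou(iou, d=0.0, u=0.95):
    # Piecewise-linear remapping: IoU below d counts as 0, above u as 1,
    # and is rescaled linearly in between, re-weighting easy and hard samples.
    return torch.clamp((iou - d) / (u - d), min=0.0, max=1.0)

# A Focaler-style box loss then uses 1 - focaler_iou(iou); the full Focaler-CIoU
# variant additionally keeps CIoU's center-distance and aspect-ratio penalties.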

17 pages, 19409 KiB  
Article
Wavelet-Based Topological Loss for Low-Light Image Denoising
by Alexandra Malyugina, Nantheera Anantrasirichai and David Bull
Sensors 2025, 25(7), 2047; https://doi.org/10.3390/s25072047 - 25 Mar 2025
Abstract
Despite significant advances in image denoising, most algorithms rely on supervised learning, with their performance largely dependent on the quality and diversity of training data. It is widely assumed that digital image distortions are caused by spatially invariant Additive White Gaussian Noise (AWGN). However, the analysis of real-world data suggests that this assumption is invalid. Therefore, this paper tackles image corruption by real noise, providing a framework to capture and utilise the underlying structural information of an image along with the spatial information conventionally used for deep learning tasks. We propose a novel denoising loss function that incorporates topological invariants and is informed by textural information extracted from the image wavelet domain. The effectiveness of this proposed method was evaluated by training state-of-the-art denoising models on the BVI-Lowlight dataset, which features a wide range of real noise distortions. Adding a topological term to common loss functions leads to a significant increase in the LPIPS (Learned Perceptual Image Patch Similarity) metric, with the improvement reaching up to 25%. The results indicate that the proposed loss function enables neural networks to learn noise characteristics better. We demonstrate that they can consequently extract the topological features of noise-free images, resulting in enhanced contrast and preserved textural information. Full article
(This article belongs to the Special Issue Machine Learning in Image/Video Processing and Sensing)
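The topological term is built from persistent-homology invariants. As a rough, non-differentiable illustration of comparing image topology (assuming the gudhi package; the paper's actual loss is a trainable term informed by wavelet-domain texture, not this helper):

import numpy as np
import gudhi

def persistence_diagram(img, dim=0):
    # 0-dimensional persistence diagram of the image's sublevel-set filtration.
    cc = gudhi.CubicalComplex(top_dimensional_cells=np.asarray(img, dtype=np.float64))
    cc.persistence()
    diag = cc.persistence_intervals_in_dimension(dim)
    return diag[np.isfinite(diag).all(axis=1)]  # drop the essential (infinite) bar

def topology_gap(denoised, reference):
    # Bottleneck distance between the two persistence diagrams.
    return gudhi.bottleneck_distance(persistence_diagram(denoised),
                                     persistence_diagram(reference))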

18 pages, 117603 KiB  
Article
A Novel Framework for Remote Sensing Image Synthesis with Optimal Transport
by Jinlong He, Xia Yuan, Yong Kou and Yanci Zhang
Sensors 2025, 25(6), 1792; https://doi.org/10.3390/s25061792 - 13 Mar 2025
Abstract
We propose a Generative Adversarial Network (GAN)-based method for image synthesis from remote sensing data. Remote sensing images (RSIs) are characterized by large intraclass variance and small interclass variance, which pose significant challenges for image synthesis. To address these issues, we design and incorporate two distinct attention modules into our GAN framework. The first attention module is designed to enhance similarity measurements within label groups, effectively handling the large intraclass variance by reinforcing consistency within the same class. The second module addresses the small interclass variance by promoting diversity between adjacent label groups, ensuring that different classes are distinguishable in the generated images. These attention mechanisms play a critical role in generating more realistic and visually coherent images. Our GAN-based framework consists of an advanced image encoder and a generator, which are both enhanced by these attention modules. Furthermore, we integrate optimal transport (OT) to approximate human perceptual loss, further improving the visual quality of the synthesized images. Experimental results demonstrate the effectiveness of our approach, highlighting its advantages in the remote sensing field by significantly enhancing the quality of generated RSIs. Full article
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems)
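Optimal transport is used above to approximate a perceptual loss; a common, generic approximation is the entropy-regularized (Sinkhorn) transport cost between two sets of feature vectors, sketched below (epsilon and the iteration count are assumptions, and this is not the authors' formulation):

import torch

def sinkhorn_cost(x, y, epsilon=0.05, iters=100):
    # Entropy-regularized OT cost between point clouds x: (n, d) and y: (m, d).
    cost = torch.cdist(x, y, p=2) ** 2          # pairwise squared distances
    k = torch.exp(-cost / epsilon)              # Gibbs kernel
    a = torch.full((x.size(0),), 1.0 / x.size(0), device=x.device)
    b = torch.full((y.size(0),), 1.0 / y.size(0), device=y.device)
    u = torch.ones_like(a)
    for _ in range(iters):                      # alternating scaling updates
        u = a / (k @ (b / (k.t() @ u)))
    v = b / (k.t() @ u)
    plan = u.unsqueeze(1) * k * v.unsqueeze(0)  # approximate transport plan
    return (plan * cost).sum()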

16 pages, 861 KiB  
Article
PixMed-Enhancer: An Efficient Approach for Medical Image Augmentation
by M. J. Aashik Rasool, Akmalbek Abdusalomov, Alpamis Kutlimuratov, M. J. Akeel Ahamed, Sanjar Mirzakhalilov, Abror Shavkatovich Buriboev and Heung Seok Jeon
Bioengineering 2025, 12(3), 235; https://doi.org/10.3390/bioengineering12030235 - 26 Feb 2025
Abstract
AI-powered medical imaging faces persistent challenges, such as limited datasets, class imbalances, and high computational costs. To overcome these barriers, we introduce PixMed-Enhancer, a novel conditional GAN that integrates the ghost module into its encoder—a pioneering approach that achieves efficient feature extraction while significantly reducing the computational complexity without compromising the performance. Our method features a hybrid loss function, uniquely combining binary cross-entropy (BCE) and a Structural Similarity Index Measure (SSIM), to ensure pixel-level precision while enhancing the perceptual realism. Additionally, the use of conditional input masks offers unparalleled control over the generation of tumor features, marking a breakthrough in fine-grained dataset augmentation for segmentation and diagnostic tasks. Rigorous testing on diverse datasets establishes PixMed-Enhancer as a state-of-the-art solution, excelling in its realism, structural fidelity, and computational efficiency. PixMed-Enhancer establishes a robust foundation for real-world clinical applications in AI-driven medical imaging. Full article
(This article belongs to the Special Issue Recent Advances in Biomedical Imaging: 2nd Edition)
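The ghost module mentioned above replaces part of a standard convolution with cheap depthwise operations; a generic sketch of that building block (not PixMed-Enhancer's exact encoder layer) is:

import torch
import torch.nn as nn

class GhostModule(nn.Module):
    # A small primary convolution plus cheap depthwise ops whose outputs are
    # concatenated, approximating a full convolution at lower cost.
    def __init__(self, in_ch, out_ch, ratio=2, kernel=1, cheap_kernel=3):
        super().__init__()
        primary_ch = -(-out_ch // ratio)        # ceil(out_ch / ratio)
        cheap_ch = primary_ch * (ratio - 1)
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(primary_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, cheap_ch, cheap_kernel, padding=cheap_kernel // 2,
                      groups=primary_ch, bias=False),
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))
        self.out_ch = out_ch

    def forward(self, x):
        p = self.primary(x)
        return torch.cat([p, self.cheap(p)], dim=1)[:, :self.out_ch]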

20 pages, 3137 KiB  
Article
Image Super-Resolution Reconstruction Algorithm Based on SRGAN and Swin Transformer
by Chuilian Sun, Chunmeng Wang and Chen He
Symmetry 2025, 17(3), 337; https://doi.org/10.3390/sym17030337 - 24 Feb 2025
Abstract
Existing methods suffer from detail loss and insufficient reconstruction quality when processing complex images. To improve the quality and efficiency of image super-resolution reconstruction, this study proposes an improved algorithm based on the super-resolution generative adversarial network and the Swin Transformer. Firstly, building on the traditional super-resolution generative adversarial network and incorporating the global feature extraction capability of the Swin Transformer, the model’s capacity to capture multi-scale features and restore details is enhanced. Subsequently, adversarial loss and perceptual loss are used to further optimize the training process and improve the image’s visual quality. The results show that the optimized algorithm achieved high PSNR and structural similarity index values on multiple benchmark test datasets, with the highest reaching 43.81 and 0.94, respectively, significantly better than the comparison algorithms. In practical applications, this algorithm demonstrated higher reconstruction accuracy and efficiency when reconstructing images with complex textures and rich edge details. The highest reconstruction accuracy could reach 98.03%, and the reconstruction time was as low as 0.2 s or less. In summary, this model can greatly improve the visual quality of image super-resolution reconstruction, better restore details, reduce detail loss, and provide an efficient and reliable solution for image super-resolution reconstruction tasks. Full article
(This article belongs to the Section Computer)
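The Swin Transformer's global feature extraction relies on window-based self-attention; the window partitioning it uses is a standard operation, sketched here under the assumption that H and W are divisible by the window size (this is the common formulation, not this paper's code):

import torch

def window_partition(x, window):
    # Split a (B, H, W, C) feature map into (num_windows * B, window, window, C).
    b, h, w, c = x.shape
    x = x.view(b, h // window, window, w // window, window, c)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window, window, c)

def window_reverse(windows, window, h, w):
    # Inverse of window_partition: reassemble windows back into (B, H, W, C).
    b = windows.shape[0] // ((h // window) * (w // window))
    x = windows.view(b, h // window, w // window, window, window, -1)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(b, h, w, -1)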
