Search Results (151)

Search Parameters:
Keywords = structural similarity loss (SSIM Loss)

14 pages, 21956 KiB  
Article
Evaluating Image Quality Metrics as Loss Functions for Image Dehazing
by Rareș Dobre-Baron, Adrian Savu-Jivanov and Cosmin Ancuți
Sensors 2025, 25(15), 4755; https://doi.org/10.3390/s25154755 - 1 Aug 2025
Viewed by 209
Abstract
The difficulty and manual nature of procuring human evaluators for ranking the quality of images affected by various types of degradations, and of those cleaned up by developed algorithms, has led to the widespread adoption of automated metrics, like the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index Metric (SSIM). However, disparities between rankings given by these metrics and those given by human evaluators have encouraged the development of improved image quality assessment (IQA) metrics that are a better fit for this purpose. These methods have previously been used solely for quality assessment and not as objectives in the training of neural networks for high-level vision tasks, despite the potential improvements that may come from directly optimizing for the desired metrics. This paper examines the adequacy of ten recent IQA metrics, compared with standard loss functions, within two trained dehazing neural networks, observing broad improvements in their performance. Full article
(This article belongs to the Special Issue Sensing and Imaging in Computer Vision)
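For readers who want to experiment with an IQA metric as a training objective, the sketch below shows a simplified SSIM loss (1 − mean SSIM, using a uniform window instead of the usual Gaussian one) in PyTorch. It is an illustrative stand-in, not the specific metrics or networks evaluated in the paper.

```python
import torch
import torch.nn.functional as F

def ssim_loss(x, y, window_size=11, data_range=1.0):
    """Simplified SSIM loss (1 - mean SSIM) using a uniform averaging window.

    x, y: (N, C, H, W) tensors with values in [0, data_range].
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    channels = x.shape[1]
    pad = window_size // 2
    # Box filter applied per channel (the reference SSIM uses a Gaussian window).
    kernel = torch.ones(channels, 1, window_size, window_size,
                        device=x.device, dtype=x.dtype) / window_size ** 2
    mu_x = F.conv2d(x, kernel, padding=pad, groups=channels)
    mu_y = F.conv2d(y, kernel, padding=pad, groups=channels)
    var_x = F.conv2d(x * x, kernel, padding=pad, groups=channels) - mu_x ** 2
    var_y = F.conv2d(y * y, kernel, padding=pad, groups=channels) - mu_y ** 2
    cov_xy = F.conv2d(x * y, kernel, padding=pad, groups=channels) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1.0 - ssim_map.mean()
```
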
21 pages, 97817 KiB  
Article
Compression of 3D Optical Encryption Using Singular Value Decomposition
by Kyungtae Park, Min-Chul Lee and Myungjin Cho
Sensors 2025, 25(15), 4742; https://doi.org/10.3390/s25154742 - 1 Aug 2025
Viewed by 231
Abstract
In this paper, we propose a compression method for optical encryption using singular value decomposition (SVD). Double random phase encryption (DRPE), which employs two distinct random phase masks, is adopted as the optical encryption technique. Since the encrypted data in DRPE have the same size as the input data and consist of complex values, a compression technique is required to improve data efficiency. To address this issue, we introduce SVD as a compression method. SVD decomposes any matrix into simpler components, such as a unitary matrix, a rectangular diagonal matrix, and a complex unitary matrix. By leveraging this property, the encrypted data generated by DRPE can be effectively compressed. However, this compression may lead to some loss of information in the decrypted data. To mitigate this loss, we employ volumetric computational reconstruction based on integral imaging. As a result, the proposed method enhances the visual quality, compression ratio, and security of DRPE simultaneously. To validate the effectiveness of the proposed method, we conduct both computer simulations and optical experiments. The performance is evaluated quantitatively using peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and peak sidelobe ratio (PSR) as evaluation metrics. Full article
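As a rough illustration of the compression idea only (not the authors' exact pipeline, and omitting the DRPE and integral-imaging stages), a rank-k truncated SVD of a complex-valued encrypted image can be written as:

```python
import numpy as np

def svd_compress(encrypted: np.ndarray, k: int) -> np.ndarray:
    """Rank-k approximation of a (possibly complex) encrypted image via truncated SVD."""
    u, s, vh = np.linalg.svd(encrypted, full_matrices=False)
    # Keep only the k largest singular values/vectors;
    # storage drops from H*W values to roughly k*(H + W + 1).
    return (u[:, :k] * s[:k]) @ vh[:k, :]
```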

21 pages, 4388 KiB  
Article
An Omni-Dimensional Dynamic Convolutional Network for Single-Image Super-Resolution Tasks
by Xi Chen, Ziang Wu, Weiping Zhang, Tingting Bi and Chunwei Tian
Mathematics 2025, 13(15), 2388; https://doi.org/10.3390/math13152388 - 25 Jul 2025
Viewed by 286
Abstract
The goal of single-image super-resolution (SISR) tasks is to generate high-definition images from low-quality inputs, with practical uses spanning healthcare diagnostics, aerial imaging, and surveillance systems. Although CNNs have considerably improved image reconstruction quality, existing methods still face limitations, including inadequate restoration of high-frequency details, high computational complexity, and insufficient adaptability to complex scenes. To address these challenges, we propose an Omni-dimensional Dynamic Convolutional Network (ODConvNet) tailored for SISR tasks. Specifically, ODConvNet comprises four key components: a Feature Extraction Block (FEB) that captures low-level spatial features; an Omni-dimensional Dynamic Convolution Block (DCB), which utilizes a multidimensional attention mechanism to dynamically reweight convolution kernels across spatial, channel, and kernel dimensions, thereby enhancing feature expressiveness and context modeling; a Deep Feature Extraction Block (DFEB) that stacks multiple convolutional layers with residual connections to progressively extract and fuse high-level features; and a Reconstruction Block (RB) that employs subpixel convolution to upscale features and refine the final high-resolution (HR) output. This mechanism significantly enhances feature extraction and effectively captures rich contextual information. Additionally, we employ an improved residual network structure combined with a refined Charbonnier loss function to alleviate gradient vanishing and exploding and to enhance the robustness of model training. Extensive experiments conducted on widely used benchmark datasets, including DIV2K, Set5, Set14, B100, and Urban100, demonstrate that, compared with existing deep learning-based SR methods, our ODConvNet improves Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM), and the visual quality of SR images is also improved. Ablation studies further validate the effectiveness and contribution of each component in our network. The proposed ODConvNet offers an effective, flexible, and efficient solution for the SISR task and provides promising directions for future research. Full article
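The Charbonnier loss mentioned above is a smooth, outlier-tolerant variant of L1. A minimal PyTorch version is shown below; the epsilon value is an assumed default, not taken from the paper.

```python
import torch

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: sqrt(diff^2 + eps^2), a differentiable approximation of L1."""
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()
```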

26 pages, 7178 KiB  
Article
Super-Resolution Reconstruction of Formation MicroScanner Images Based on the SRGAN Algorithm
by Changqiang Ma, Xinghua Qi, Liangyu Chen, Yonggui Li, Jianwei Fu and Zejun Liu
Processes 2025, 13(7), 2284; https://doi.org/10.3390/pr13072284 - 17 Jul 2025
Viewed by 337
Abstract
Formation MicroScanner Image (FMI) technology is a key method for identifying fractured reservoirs and optimizing oil and gas exploration, but its inherent insufficient resolution severely constrains the fine characterization of geological features. This study innovatively applies a Super-Resolution Generative Adversarial Network (SRGAN) to the super-resolution reconstruction of FMI logging images to address this bottleneck problem. By collecting FMI logging images of glutenite from a well in Xinjiang, a training set containing 24,275 images was constructed, and preprocessing strategies such as grayscale conversion and binarization were employed to optimize input features. Leveraging SRGAN’s generator-discriminator adversarial mechanism and perceptual loss function, high-quality mapping from low-resolution FMI logging images to high-resolution images was achieved. This study yields significant results: in RGB image reconstruction, SRGAN achieved a Peak Signal-to-Noise Ratio (PSNR) of 41.39 dB, surpassing the optimal traditional method (bicubic interpolation) by 61.6%; its Structural Similarity Index (SSIM) reached 0.992, representing a 34.1% improvement; in grayscale image processing, SRGAN effectively eliminated edge blurring, with the PSNR (40.15 dB) and SSIM (0.990) exceeding the suboptimal method (bilinear interpolation) by 36.6% and 9.9%, respectively. These results fully confirm that SRGAN can significantly restore edge contours and structural details in FMI logging images, with performance far exceeding traditional interpolation methods. This study not only systematically verifies, for the first time, SRGAN’s exceptional capability in enhancing FMI resolution, but also provides a high-precision data foundation for reservoir parameter inversion and geological modeling, holding significant application value for advancing the intelligent exploration of complex hydrocarbon reservoirs. Full article
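The PSNR figures quoted above follow the standard definition. For reference, a minimal NumPy implementation (assuming 8-bit images, i.e. a data range of 255) looks like this:

```python
import numpy as np

def psnr(reference, reconstructed, data_range=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images of identical shape."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```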

23 pages, 6440 KiB  
Article
A Gravity Data Denoising Method Based on Multi-Scale Attention Mechanism and Physical Constraints Using U-Net
by Bing Liu, Houpu Li, Shaofeng Bian, Chaoliang Zhang, Bing Ji and Yujie Zhang
Appl. Sci. 2025, 15(14), 7956; https://doi.org/10.3390/app15147956 - 17 Jul 2025
Viewed by 282
Abstract
Gravity and gravity gradient data serve as fundamental inputs for geophysical resource exploration and geological structure analysis. However, traditional denoising methods—including wavelet transforms, moving averages, and low-pass filtering—exhibit signal loss and limited adaptability under complex, non-stationary noise conditions. To address these challenges, this study proposes an improved U-Net deep learning framework that integrates multi-scale feature extraction and attention mechanisms. Furthermore, a Laplace consistency constraint is introduced into the loss function to enhance denoising performance and physical interpretability. Notably, the datasets used in this study are generated by the authors, involving simulations of subsurface prism distributions with realistic density perturbations (±20% of typical rock densities) and the addition of controlled Gaussian noise (5%, 10%, 15%, and 30%) to simulate field-like conditions, ensuring the diversity and physical relevance of training samples. Experimental validation on these synthetic datasets and real field datasets demonstrates the superiority of the proposed method over conventional techniques. For noise levels of 5%, 10%, 15%, and 30% in test sets, the improved U-Net achieves Peak Signal-to-Noise Ratios (PSNR) of 59.13 dB, 52.03 dB, 48.62 dB, and 48.81 dB, respectively, outperforming wavelet transforms, moving averages, and low-pass filtering by 10–30 dB. In multi-component gravity gradient denoising, our method excels in detail preservation and noise suppression, improving Structural Similarity Index (SSIM) by 15–25%. Field data tests further confirm enhanced identification of key geological anomalies and overall data quality improvement. In summary, the improved U-Net not only delivers quantitative advancements in gravity data denoising but also provides a novel approach for high-precision geophysical data preprocessing. Full article
(This article belongs to the Special Issue Applications of Machine Learning in Earth Sciences—2nd Edition)
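The exact form of the Laplace consistency constraint is not given in the abstract; one plausible reading, sketched below for single-channel fields, adds a penalty on the mismatch between discrete Laplacians of the denoised and reference grids. The weighting factor is an assumption.

```python
import torch
import torch.nn.functional as F

# 5-point discrete Laplacian stencil; single-channel (N, 1, H, W) inputs are assumed.
LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def laplace_consistency_loss(denoised, clean, weight=0.1):
    """MSE data term plus a weighted penalty on the difference of discrete Laplacians."""
    lap = LAPLACIAN.to(denoised.device, denoised.dtype)
    lap_denoised = F.conv2d(denoised, lap, padding=1)
    lap_clean = F.conv2d(clean, lap, padding=1)
    return F.mse_loss(denoised, clean) + weight * F.mse_loss(lap_denoised, lap_clean)
```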

24 pages, 7849 KiB  
Article
Face Desensitization for Autonomous Driving Based on Identity De-Identification of Generative Adversarial Networks
by Haojie Ji, Liangliang Tian, Jingyan Wang, Yuchi Yao and Jiangyue Wang
Electronics 2025, 14(14), 2843; https://doi.org/10.3390/electronics14142843 - 15 Jul 2025
Viewed by 278
Abstract
Automotive intelligent agents are increasingly collecting facial data for applications such as driver behavior monitoring and identity verification. These excessive collections of facial data bring serious risks of sensitive information leakage to autonomous driving. Facial information has been explicitly required to be anonymized, but the usability of most desensitized facial data is poor, which greatly limits its application in autonomous driving. This paper proposes an automotive sensitive information anonymization method with high-quality generated facial images by considering data availability under privacy protection. By comparing K-Same and Generative Adversarial Networks (GANs), this paper proposes a hierarchical self-attention mechanism in StyleGAN3 to enhance the feature perception of face images. Synchronous regularization of sample data is applied to optimize the loss function of the StyleGAN3 discriminator, thereby improving the convergence stability of the model. The experimental results demonstrate that the proposed facial desensitization model reduces the Fréchet inception distance (FID) and structural similarity index measure (SSIM) by 95.8% and 24.3%, respectively. The image quality and privacy desensitization of the facial data generated by the StyleGAN3 model are fully verified in this work. This research provides an efficient and robust facial privacy protection solution for autonomous driving, which helps strengthen the security of automotive data. Full article
(This article belongs to the Special Issue Development and Advances in Autonomous Driving Technology)
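The FID reported above is, at its core, the Fréchet distance between two Gaussians fitted to Inception feature activations of real and generated images. A sketch of that final computation (the feature extraction itself is omitted) might look like this:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, cov1, mu2, cov2):
    """Frechet distance between two Gaussians N(mu1, cov1) and N(mu2, cov2)."""
    covmean = sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from numerical error
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))
```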

24 pages, 5976 KiB  
Article
Spatial Downscaling of Sea Level Anomaly Using a Deep Separable Distillation Network
by Senmin Shi, Yineng Li, Yuhang Zhu, Tao Song and Shiqiu Peng
Remote Sens. 2025, 17(14), 2428; https://doi.org/10.3390/rs17142428 - 13 Jul 2025
Viewed by 428
Abstract
The use of high-resolution sea level anomaly (SLA) data in climate change research and ocean forecasting has become increasingly important. However, existing datasets often lack the fine spatial resolution required for capturing mesoscale ocean processes accurately. This has led to the development of conventional deep learning models for SLA spatial downscaling, but these models often overlook spatial disparities between land and ocean regions and do not adequately address the spatial structures of SLA data. As a result, their accuracy and structural consistency are suboptimal. To address these issues, we propose a Deep Separable Distillation Network (DSDN) that integrates Depthwise Separable Distillation Blocks (DSDB) and a Landmask Contextual Attention Mechanism (M_CAMB) to achieve efficient and accurate spatial downscaling. The M_CAMB employs geographically-informed land masks to enhance the attention mechanism, prioritizing ocean regions. Additionally, we introduce a novel Pixel-Structure Loss (PSLoss) to enforce spatial structure constraints, significantly improving the structural fidelity of the SLA downscaling results. Experimental results demonstrate that DSDN achieves a root mean square error (RMSE) of 0.062 cm, a peak signal-to-noise ratio (PSNR) of 42.22 dB, and a structural similarity index (SSIM) of 0.976 in SLA downscaling. These results surpass those of baseline models and highlight the superior precision and structural consistency of DSDN. Full article
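The paper's Pixel-Structure Loss and landmask attention are not specified in detail in this abstract; the sketch below is only a guess at the general idea, restricting the pixel term to ocean cells via a land mask and adding a finite-difference structure term. Function names and the weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def masked_pixel_structure_loss(pred, target, ocean_mask, alpha=0.5):
    """Illustrative pixel + structure loss computed over ocean pixels only.

    pred, target: (N, 1, H, W) SLA fields; ocean_mask: 1 over ocean, 0 over land.
    """
    diff = (pred - target) * ocean_mask
    pixel_term = diff.abs().sum() / ocean_mask.sum().clamp(min=1.0)

    # Structure term: match horizontal and vertical finite differences of the masked fields.
    def grad_xy(t):
        return t[..., :, 1:] - t[..., :, :-1], t[..., 1:, :] - t[..., :-1, :]

    gx_p, gy_p = grad_xy(pred * ocean_mask)
    gx_t, gy_t = grad_xy(target * ocean_mask)
    structure_term = F.l1_loss(gx_p, gx_t) + F.l1_loss(gy_p, gy_t)
    return pixel_term + alpha * structure_term
```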

23 pages, 8011 KiB  
Article
Efficient Prediction of Shallow-Water Acoustic Transmission Loss Using a Hybrid Variational Autoencoder–Flow Framework
by Bolin Su, Haozhong Wang, Xingyu Zhu, Penghua Song and Xiaolei Li
J. Mar. Sci. Eng. 2025, 13(7), 1325; https://doi.org/10.3390/jmse13071325 - 10 Jul 2025
Viewed by 241
Abstract
Efficient prediction of shallow-water acoustic transmission loss (TL) is crucial for underwater detection, recognition, and communication systems. Traditional physical modeling methods require repeated calculations for each new scenario in practical waveguide environments, leading to low computational efficiency. Deep learning approaches, based on data-driven principles, enable accurate input–output approximation and batch processing of large-scale datasets, significantly reducing computation time and cost. To establish a rapid prediction model mapping sound speed profiles (SSPs) to acoustic TL through controllable generation, this study proposes a hybrid framework that integrates a variational autoencoder (VAE) and a normalizing flow (Flow) through a two-stage training strategy. The VAE network is employed to learn latent representations of TL data on a low-dimensional manifold, while the Flow network is additionally used to establish a bijective mapping between the latent variables and underwater physical parameters, thereby enhancing the controllability of the generation process. Combining the trained normalizing flow with the VAE decoder could establish an end-to-end mapping from SSPs to TL. The results demonstrated that the VAE–Flow network achieved higher computational efficiency, with a computation time of 4 s for generating 1000 acoustic TL samples, versus the over 500 s required by the KRAKEN model, while preserving accuracy, with median structural similarity index measure (SSIM) values over 0.90. Full article
(This article belongs to the Special Issue Data-Driven Methods for Marine Structures)
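As a reminder of the VAE mechanics this framework builds on (textbook VAE code, not the authors' TL-specific architecture), latent samples are drawn with the reparameterization trick:

```python
import torch

def reparameterize(mu, logvar):
    """VAE reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)
```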

21 pages, 3406 KiB  
Article
ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification
by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu
Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025
Viewed by 584
Abstract
Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolution Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as parametric YOLO-series models, in defect detection, achieving higher accuracy and lower miss rates, highlighting its potential for high-reliability industrial deployment. Full article
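The triplet objective used for embedding learning is standard; a minimal PyTorch version is sketched below, with the margin value assumed rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull anchor-positive embeddings together and push anchor-negative pairs apart."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```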

20 pages, 4254 KiB  
Article
Positional Component-Guided Hangul Font Image Generation via Deep Semantic Segmentation and Adversarial Style Transfer
by Avinash Kumar, Irfanullah Memon, Abdul Sami, Youngwon Jo and Jaeyoung Choi
Electronics 2025, 14(13), 2699; https://doi.org/10.3390/electronics14132699 - 4 Jul 2025
Viewed by 412
Abstract
Automated font generation for complex, compositional scripts like Korean Hangul presents a significant challenge due to the 11,172 characters and their complicated component-based structure. While existing component-based methods for font image generation acknowledge the compositional nature of Hangul, they often fail to explicitly leverage the crucial positional semantics of its basic elements as initial, middle, and final components, known as Jamo. This oversight can lead to structural inconsistencies and artifacts in the generated glyphs. This paper introduces a novel two-stage framework that directly addresses this gap by imposing a strong, linguistically informed structural principle on the font image generation process. In the first stage, we employ a You Only Look Once version 8 for Segmentation (YOLOv8-Seg) model, a state-of-the-art instance segmentation network, to decompose Hangul characters into their basic components. Notably, this process generates a dataset of position-aware semantic components, categorizing each Jamo according to its structural role within the syllabic block. In the second stage, a conditional Generative Adversarial Network (cGAN) is explicitly conditioned on these extracted positional components to perform style transfer with high structural fidelity. The generator learns to synthesize a character’s appearance by referencing the style of the target components while preserving the content structure of a source character. Our model achieves state-of-the-art performance, reducing L1 loss to 0.2991 and improving the Structural Similarity Index (SSIM) to 0.9798, quantitatively outperforming existing methods like MX-Font and CKFont. This position-guided approach demonstrates significant quantitative and qualitative improvements over existing methods in structured script generation, offering enhanced control over glyph structure and a promising approach for generating font images for other complex, structured scripts. Full article
(This article belongs to the Special Issue Applications of Computer Vision, 3rd Edition)

16 pages, 2376 KiB  
Article
Nested U-Net-Based GAN Model for Super-Resolution of Stained Light Microscopy Images
by Seong-Hyeon Kang and Ji-Youn Kim
Photonics 2025, 12(7), 665; https://doi.org/10.3390/photonics12070665 - 1 Jul 2025
Viewed by 390
Abstract
The purpose of this study was to propose a deep learning-based model for the super-resolution reconstruction of stained light microscopy images. To achieve this, perceptual loss was applied to the generator to reflect multichannel signal intensity, distribution, and structural similarity. A nested U-Net architecture was employed to address the representational limitations of the conventional U-Net. For quantitative evaluation, the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and correlation coefficient (CC) were calculated. In addition, intensity profile analysis was performed to assess the model’s ability to restore boundary signals more precisely. The experimental results demonstrated that the proposed model outperformed both the single U-Net and the U-Net-based generative adversarial network (GAN) models in signal and structural restoration. Consequently, the PSNR, SSIM, and CC values demonstrated relative improvements of approximately 1.017, 1.023, and 1.010 times, respectively, compared to the input images. In particular, the intensity profile analysis confirmed the effectiveness of the nested U-Net-based generator in restoring cellular boundaries and structures in the stained microscopy images. In conclusion, the proposed model effectively enhanced the resolution of stained light microscopy images acquired in a multichannel format. Full article
(This article belongs to the Special Issue Recent Advances in Biomedical Optics and Biophotonics)
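The perceptual loss mentioned above compares images in the feature space of a pretrained network rather than in pixel space. A common VGG-16-based sketch follows; the layer cut-off and weighting are assumptions (the paper's exact choices are not stated), and torchvision ≥ 0.13 is assumed for the weights API.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(torch.nn.Module):
    """Feature-space loss using early VGG-16 layers as a frozen feature extractor."""

    def __init__(self, num_layers=16):
        super().__init__()
        self.features = vgg16(weights=VGG16_Weights.DEFAULT).features[:num_layers].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, pred, target):
        # Inputs are expected as 3-channel images normalized with ImageNet statistics.
        return F.mse_loss(self.features(pred), self.features(target))
```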

20 pages, 3406 KiB  
Article
Single-Image Super-Resolution via Cascaded Non-Local Mean Network and Dual-Path Multi-Branch Fusion
by Yu Xu and Yi Wang
Sensors 2025, 25(13), 4044; https://doi.org/10.3390/s25134044 - 28 Jun 2025
Viewed by 568
Abstract
Image super-resolution (SR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs. It plays a crucial role in applications such as medical imaging, surveillance, and remote sensing. However, due to the ill-posed nature of the task and the inherent limitations of imaging sensors, obtaining accurate HR images remains challenging. While numerous methods have been proposed, traditional approaches suffer from oversmoothing and limited generalization; CNN-based models lack the ability to capture long-range dependencies; and Transformer-based solutions, although effective in modeling global context, are computationally intensive and prone to texture loss. To address these issues, we propose a hybrid CNN–Transformer architecture that cascades a pixel-wise self-attention non-local means module (PSNLM) and an adaptive dual-path multi-scale fusion block (ADMFB). The PSNLM is inspired by the non-local means (NLM) algorithm: we use weighted patches to estimate the similarity between pixels at the centers of those patches, while limiting the search region and constructing a cross-range communication mechanism. The ADMFB enhances texture reconstruction by adaptively aggregating multi-scale features through dual attention paths. The experimental results demonstrate that our method achieves superior performance on multiple benchmarks. For instance, in challenging ×4 super-resolution, our method outperforms the second-best method by 0.0201 in terms of the Structural Similarity Index (SSIM) on the BSD100 dataset. On the texture-rich Urban100 dataset, our method achieves a 26.56 dB Peak Signal-to-Noise Ratio (PSNR) and 0.8133 SSIM. Full article
(This article belongs to the Section Sensing and Imaging)
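At its heart, the non-local means idea weights pixels by patch similarity. A tiny sketch of that weighting is shown below (flattened patches, a single smoothing parameter h, and no search-window restriction, unlike the PSNLM described in the paper):

```python
import torch

def nlm_weights(patches, h=0.1):
    """Non-local-means style affinities.

    patches: (N, P) tensor of flattened patches; returns an (N, N) row-normalized
    similarity matrix where similar patches receive larger weights.
    """
    d2 = torch.cdist(patches, patches) ** 2 / patches.shape[1]  # mean squared patch distance
    w = torch.exp(-d2 / (h ** 2))
    return w / w.sum(dim=1, keepdim=True)
```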

22 pages, 4478 KiB  
Article
Welding Image Data Augmentation Method Based on LRGAN Model
by Ying Wang, Zhe Dai, Qiang Zhang and Zihao Han
Appl. Sci. 2025, 15(12), 6923; https://doi.org/10.3390/app15126923 - 19 Jun 2025
Viewed by 373
Abstract
This study focuses on the data bottleneck issue in the training of deep learning models during the intelligent welding control process and proposes an improved model called LRGAN (loss reconstruction generative adversarial networks). First, a five-layer spectral normalization neural network was designed as the discriminator of the model. By incorporating the least squares loss function, the gradients of the model parameters were constrained within a reasonable range, which not only accelerated the convergence process but also effectively limited drastic changes in model parameters, alleviating the vanishing gradient problem. Next, a nine-layer residual structure was introduced in the generator to optimize the training of deep networks, preventing the mode collapse issue caused by the increase in the number of layers. The final experimental results show that the proposed LRGAN model outperforms other generative models in terms of evaluation metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Fréchet inception distance (FID). It provides an effective solution to the small sample problem in the intelligent welding control process. Full article
(This article belongs to the Section Robotics and Automation)
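The least-squares GAN objective and spectral normalization described above can be sketched as follows; the layer sizes are placeholders, not the paper's five-layer discriminator.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# A spectrally normalized conv layer, as commonly used to stabilize GAN discriminators.
sn_conv = spectral_norm(nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1))

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push real scores toward 1 and fake scores toward 0."""
    return 0.5 * ((d_real - 1.0) ** 2).mean() + 0.5 * (d_fake ** 2).mean()

def lsgan_g_loss(d_fake):
    """Least-squares generator loss: push fake scores toward 1."""
    return 0.5 * ((d_fake - 1.0) ** 2).mean()
```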

14 pages, 13345 KiB  
Article
Synthetic Fog Generation Using High-Performance Dehazing Networks for Surveillance Applications
by Heekwon Lee, Byeongseon Park, Yong-Kab Kim and Sungkwan Youm
Appl. Sci. 2025, 15(12), 6503; https://doi.org/10.3390/app15126503 - 9 Jun 2025
Viewed by 394
Abstract
This research addresses visibility challenges in surveillance systems under foggy conditions through a novel synthetic fog generation method leveraging the GridNet dehazing architecture. Our approach uniquely reverses GridNet, originally developed for fog removal, to synthesize realistic foggy images. The proposed Fog Generator Model incorporates perceptual and dark channel consistency losses to enhance fog realism and structural consistency. Comparative experiments on the O-HAZY dataset demonstrate that dehazing models trained on our synthetic fog outperform those trained on conventional methods, achieving superior Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) scores. These findings confirm that integrating high-performance dehazing networks into fog synthesis improves the realism and effectiveness of fog removal solutions, offering significant benefits for real-world surveillance applications. Full article
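The dark channel consistency loss builds on the dark channel prior; a plausible minimal form is sketched below, with the patch size assumed and the paper's exact loss definition not given in the abstract.

```python
import torch
import torch.nn.functional as F

def dark_channel(img, patch=15):
    """Dark channel of an (N, 3, H, W) image: per-pixel channel minimum, then a local min filter."""
    min_channel = img.min(dim=1, keepdim=True).values
    pad = patch // 2
    # Local minimum implemented as max-pooling of the negated map.
    return -F.max_pool2d(-min_channel, kernel_size=patch, stride=1, padding=pad)

def dark_channel_consistency_loss(generated_fog, reference_fog):
    """Penalize mismatch between the dark channels of generated and reference foggy images."""
    return F.l1_loss(dark_channel(generated_fog), dark_channel(reference_fog))
```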

19 pages, 3331 KiB  
Article
Low-Light Image Enhancement Using Deep Learning: A Lightweight Network with Synthetic and Benchmark Dataset Evaluation
by Manuel J. C. S. Reis
Appl. Sci. 2025, 15(11), 6330; https://doi.org/10.3390/app15116330 - 4 Jun 2025
Viewed by 1472
Abstract
Low-light conditions often lead to severe degradation in image quality, impairing critical computer vision tasks in applications such as surveillance and mobile imaging. In this paper, we propose a lightweight deep learning framework for low-light image enhancement, designed to balance visual quality with computational efficiency, with potential for deployment in latency-sensitive and resource-constrained environments. The architecture builds upon a UNet-inspired encoder–decoder structure, enhanced with attention modules and trained using a combination of perceptual and structural loss functions. Our training strategy utilizes a hybrid dataset composed of both real low-light images and synthetically generated image pairs created through controlled exposure adjustment and noise modeling. Experimental results on benchmark datasets such as LOL and SID demonstrate that our model achieves a Peak Signal-to-Noise Ratio (PSNR) of up to 28.4 dB and a Structural Similarity Index (SSIM) of 0.88 while maintaining a small parameter footprint (~1.3 M) and low inference latency (~6 FPS on Jetson Nano). The proposed approach offers a promising solution for industrial applications such as real-time surveillance, mobile photography, and embedded vision systems. Full article
(This article belongs to the Special Issue Image Processing: Technologies, Methods, Apparatus)
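The synthetic training pairs described above (controlled exposure adjustment plus noise modeling) can be mimicked with a simple recipe such as the one below; the gamma, exposure, and noise parameters are illustrative guesses, not the paper's settings.

```python
import numpy as np

def synthesize_low_light(img, gamma=2.5, exposure=0.3, photon_scale=0.05, read_noise=0.01):
    """Create a synthetic low-light counterpart of a clean image with values in [0, 1]."""
    rng = np.random.default_rng()
    dark = exposure * np.clip(img, 0.0, 1.0) ** gamma              # exposure/gamma reduction
    shot = rng.poisson(dark / photon_scale) * photon_scale         # photon (shot) noise
    noisy = shot + rng.normal(0.0, read_noise, size=img.shape)     # sensor read noise
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)
```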