Application of Machine Learning in Graphics and Images, 2nd Edition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 September 2025 | Viewed by 8019

Special Issue Editors

School of Computer Science, China University of Geosciences, Wuhan 430074, China
Interests: computer graphics; computer-aided design; computer vision and computer-supported cooperative work
School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Interests: intelligent optimization; medical image processing

School of Computer Science & Engineering, Artificial Intelligence, Wuhan Institute of Technology, Wuhan 430205, China
Interests: object pose estimation; human activity analysis; 3D object detection; point cloud processing

Special Issue Information

Dear Colleagues,

Computer graphics and image processing technologies are now widely used in industrial production and in many aspects of daily life, offering solutions with greatly improved efficiency and quality. Meanwhile, the last few decades have witnessed machine learning models becoming effective and ubiquitous tools for a variety of challenging real-world and virtual tasks. Image processing and computer graphics are both important application scenarios for machine learning, stimulating strong research interest and giving rise to a series of popular research directions.

In this Special Issue, we look forward to receiving novel research papers and comprehensive surveys of state-of-the-art work that contribute innovative machine learning application models, improvements to classical computer graphics and image processing tasks, or new and interesting applications. Topics of interest include all aspects of the application of machine learning to graphics and images, including, but not limited to, the following:

  • Computer graphics;
  • Image processing;
  • Computer vision;
  • Machine learning and deep learning;
  • Pattern recognition;
  • Object detection, recognition, and tracking;
  • Part and semantic segmentation;
  • Rigid and non-rigid registration;
  • 3D reconstruction;
  • Virtual reality/augmented reality/mixed reality;
  • Computer-aided design/engineering;
  • Human pose and behavior understanding;
  • Autonomous driving.

Dr. Yiqi Wu
Dr. Yilin Chen
Dr. Lu Zou
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, you can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer graphics
  • image processing
  • computer vision
  • machine learning
  • deep learning
  • pattern recognition

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.


Published Papers (6 papers)


Research

22 pages, 9103 KiB  
Article
IRST-CGSeg: Infrared Small Target Detection Based on Clustering-Guided Graph Learning and Hierarchical Features
by Guimin Jia, Tao Chen, Yu Cheng and Pengyu Lu
Electronics 2025, 14(5), 858; https://doi.org/10.3390/electronics14050858 - 21 Feb 2025
Viewed by 455
Abstract
Infrared small target detection (IRSTD) aims to segment small targets from an infrared clutter background. However, the long imaging distance, complex background, and extremely limited number of target pixels pose great challenges for IRSTD. In this paper, we propose a new IRSTD method based on a deep graph neural network to fully extract and fuse the texture and structural information of images. Firstly, a clustering algorithm is designed to divide the image into several subgraphs as prior knowledge to guide the initialization of the graph structure of the infrared image, and the image texture features are integrated into graph construction. Then, a graph feature extraction module is designed, which guides nodes to interact with features within their subgraph via the adjacency matrix. Finally, a hierarchical graph texture feature fusion module is designed to concatenate and stack the structure and texture information at different levels to realize IRSTD. Extensive experiments have been conducted, and the experimental results demonstrate that the proposed method achieves high intersection over union (IoU) and probability of detection (Pd) on public datasets and the self-constructed dataset, indicating that it provides fine shape segmentation and accurate positioning for infrared small targets.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
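For readers who want the gist of the clustering-guided graph construction in code, the following is a minimal numpy sketch under assumed details (plain k-means on pixel intensities, 4-neighbour adjacency restricted to each cluster, one mean-aggregation step); the paper's actual modules are learned networks, so treat this only as an illustration of the idea:

```python
import numpy as np

def cluster_pixels(img, k=2, iters=10):
    """Toy k-means over pixel intensities, standing in for the paper's
    clustering step that partitions the image into subgraphs."""
    x = img.reshape(-1, 1).astype(float)
    centers = np.linspace(x.min(), x.max(), k).reshape(-1, 1)  # deterministic init
    for _ in range(iters):
        labels = np.argmin(np.abs(x - centers.T), axis=1)      # nearest center
        for c in range(k):
            if np.any(labels == c):
                centers[c] = x[labels == c].mean()             # recompute means
    return labels.reshape(img.shape)

def subgraph_adjacency(labels):
    """4-neighbour adjacency kept only between pixels of the same cluster,
    so feature propagation stays within each subgraph."""
    h, w = labels.shape
    n = h * w
    A = np.zeros((n, n))
    for i in range(h):
        for j in range(w):
            u = i * w + j
            for di, dj in ((0, 1), (1, 0)):   # right and down neighbours
                ni, nj = i + di, j + dj
                if ni < h and nj < w and labels[ni, nj] == labels[i, j]:
                    v = ni * w + nj
                    A[u, v] = A[v, u] = 1.0
    return A

def propagate(A, X):
    """One mean-aggregation step over the graph (self-loop included)."""
    deg = A.sum(1, keepdims=True) + 1.0
    return (A @ X + X) / deg

img = np.array([[0, 0, 9], [0, 0, 9], [9, 9, 9]], dtype=float)
labels = cluster_pixels(img, k=2)
A = subgraph_adjacency(labels)
X = img.reshape(-1, 1)          # intensity as the node feature
H = propagate(A, X)
```

Because the toy image's clusters are homogeneous, one propagation step leaves the features unchanged; on real features the aggregation mixes information within each subgraph only.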

18 pages, 1869 KiB  
Article
A Deepfake Image Detection Method Based on a Multi-Graph Attention Network
by Guorong Chen, Chongling Du, Yuan Yu, Hong Hu, Hongjun Duan and Huazheng Zhu
Electronics 2025, 14(3), 482; https://doi.org/10.3390/electronics14030482 - 24 Jan 2025
Viewed by 1359
Abstract
Deep forgery detection plays a crucial role in addressing the challenges posed by the rapid spread of deeply generated content that significantly erodes public trust in online information and media. Deeply forged images typically present subtle but significant artifacts in multiple regions, such as in the background, lighting, and localized details. These artifacts manifest as unnatural visual distortions, inconsistent lighting, or irregularities in subtle features that break the natural coherence of the real image. To address these features of forged images, we propose a novel and efficient deep image forgery detection method that utilizes Multi-Graph Attention (MGA) techniques to extract global and local features and minimize accuracy loss. Specifically, our method introduces an interactive dual-channel encoder (DIRM), which aims to extract global and channel-specific features and facilitate complex interactions between these feature sets. In the decoding phase, one of the channels is processed as a block and combined with a Dynamic Graph Attention Network (PDGAN), which is capable of recognizing and amplifying forged traces in local information. To further enhance the model’s ability to capture global context, we propose a global Height–Width Graph Attention Module (HWGAN), which effectively extracts and associates global spatial features. Experimental results show that the classification accuracy of our method for forged images in the GenImage and CIFAKE datasets is comparable to that of the optimal benchmark method. Notably, our model achieves 97.89% accuracy on the CIFAKE dataset and has the lowest number of model parameters and lowest computational overhead. These results highlight the potential of our method for deep forgery image detection.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
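The graph-attention aggregation underlying modules like these can be sketched as a single-head update in the general spirit of GAT (no LeakyReLU, one head); the paper's MGA, PDGAN, and HWGAN modules are far more elaborate, and all parameter names below are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def graph_attention(X, A, W, a_src, a_dst):
    """Single-head graph-attention aggregation.
    X: (n, d) node features; A: (n, n) adjacency with self-loops;
    W: (d, d') projection; a_src/a_dst: (d',) attention vectors."""
    H = X @ W                                             # project features
    scores = H @ a_src[:, None] + (H @ a_dst[:, None]).T  # pairwise logits (n, n)
    scores = np.where(A > 0, scores, -1e9)                # mask non-edges
    alpha = softmax(scores)                               # normalise over neighbours
    return alpha @ H                                      # weighted aggregation
```

Each output row is a convex combination of its neighbours' projected features, which is what lets an attention graph amplify locally inconsistent (e.g., forged) regions relative to their context.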

26 pages, 23622 KiB  
Article
CPS-RAUnet++: A Jet Axis Detection Method Based on Cross-Pseudo Supervision and Extended Unet++ Model
by Jianhong Gan, Kun Cai, Changyuan Fan, Xun Deng, Wendong Hu, Zhibin Li, Peiyang Wei, Tao Liao and Fan Zhang
Electronics 2025, 14(3), 441; https://doi.org/10.3390/electronics14030441 - 22 Jan 2025
Viewed by 660
Abstract
Atmospheric jets are pivotal components of atmospheric circulation, profoundly influencing surface weather patterns and the development of extreme weather events such as storms and cold waves. Accurate detection of the jet stream axis is indispensable for enhancing weather forecasting, monitoring climate change, and mitigating disasters. However, traditional methods for delineating atmospheric jets are plagued by inefficiency, substantial errors, and pronounced subjectivity, limiting their applicability in complex atmospheric scenarios. Current research on semi-supervised methods for extracting atmospheric jets remains scarce, with most approaches dependent on traditional techniques that struggle with stability and generalization. To address these limitations, this study proposes a semi-supervised jet stream axis extraction method leveraging an enhanced U-Net++ model. The approach incorporates improved residual blocks and enhanced attention gate mechanisms, seamlessly integrating these enhanced attention gates into the dense skip connections of U-Net++. Furthermore, it optimizes the consistency learning phase within semi-supervised frameworks, effectively addressing data scarcity challenges while significantly enhancing the precision of jet stream axis detection. Experimental results reveal the following: (1) With only 30% of labeled data, the proposed method achieves a precision exceeding 80% on the test set, surpassing state-of-the-art (SOTA) baselines. Compared to fully supervised U-Net and U-Net++ methods, the precision improves by 17.02% and 9.91%. (2) With labeled data proportions of 10%, 20%, and 30%, the proposed method outperforms the MT semi-supervised method, achieving precision gains of 9.44%, 15.58%, and 19.50%, while surpassing the DCT semi-supervised method with improvements of 10.24%, 16.64%, and 14.15%, respectively. Ablation studies further validate the effectiveness of the proposed method in accurately identifying the jet stream axis. The proposed method exhibits remarkable consistency, stability, and generalization capabilities, producing jet stream axis extractions closely aligned with wind field data.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
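Cross-pseudo supervision (the CPS in the method's name) has a compact core: two differently initialised branches segment the same unlabeled input, and each is penalised against the other's hard pseudo-labels. A minimal numpy sketch of that loss, with per-pixel class logits flattened to rows (a stand-in for the paper's training objective, not its full pipeline):

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under row-wise softmax logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def cps_loss(logits_a, logits_b):
    """Cross-pseudo supervision on unlabeled pixels: each branch is
    trained against the other's hard (argmax) pseudo-labels. In a real
    framework no gradient flows through the argmax."""
    pseudo_a = logits_a.argmax(axis=1)
    pseudo_b = logits_b.argmax(axis=1)
    return cross_entropy(logits_a, pseudo_b) + cross_entropy(logits_b, pseudo_a)
```

When the two branches agree confidently the loss is near zero; when they disagree it grows, pushing both toward consistent predictions on unlabeled data.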

20 pages, 5843 KiB  
Article
DW-MLSR: Unsupervised Deformable Medical Image Registration Based on Dual-Window Attention and Multi-Latent Space
by Yuxuan Huang, Mengxiao Yin, Zhipan Li and Feng Yang
Electronics 2024, 13(24), 4966; https://doi.org/10.3390/electronics13244966 - 17 Dec 2024
Viewed by 745
Abstract
(1) Background: In recent years, the application of Transformers and Vision Transformers (ViTs) in medical image registration has been constrained by sliding attention mechanisms, which struggle to effectively capture non-adjacent but critical structures, such as the hippocampus and ventricles in the brain. Additionally, the lack of labels in unsupervised registration often leads to overfitting. (2) To address these issues, we propose a novel method, DW-MLSR, based on dual-window attention and multi-latent space. The dual-window attention mechanism enhances the transmission of information across non-adjacent structures, while the multi-latent space improves the model’s generalization by learning latent image representations. (3) Experimental results demonstrate that DW-MLSR outperforms mainstream registration models, showcasing significant potential in medical image registration. (4) The DW-MLSR method addresses the limitations of sliding attention in transmitting information between non-adjacent windows, improves the performance of unsupervised registration, and demonstrates broad application prospects in medical image registration.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
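The dual-window idea, letting information cross between tokens that a single fixed window keeps apart, can be caricatured in 1-D: run windowed self-attention at two window sizes and merge the results. This is only a schematic stand-in (the 1-D setting, function names, and the averaging rule are assumptions, not the paper's design):

```python
import numpy as np

def window_attention(X, window):
    """Self-attention restricted to non-overlapping 1-D windows of tokens."""
    n, d = X.shape
    out = np.zeros_like(X)
    for s in range(0, n, window):
        chunk = X[s:s + window]
        scores = chunk @ chunk.T / np.sqrt(d)          # scaled dot-product
        scores = scores - scores.max(axis=1, keepdims=True)
        w = np.exp(scores)
        w /= w.sum(axis=1, keepdims=True)              # softmax over the window
        out[s:s + window] = w @ chunk
    return out

def dual_window_attention(X, small=2, large=4):
    """Merge a fine- and a coarse-window pass, so tokens isolated in
    different small windows still exchange information via the large one."""
    return 0.5 * (window_attention(X, small) + window_attention(X, large))
```

With only the small window, tokens 0-1 and 2-3 never interact; the large-window pass restores that pathway, which is the flavour of cross-window information flow the abstract describes.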

20 pages, 2467 KiB  
Article
RegMamba: An Improved Mamba for Medical Image Registration
by Xin Hu, Jiaqi Chen and Yilin Chen
Electronics 2024, 13(16), 3305; https://doi.org/10.3390/electronics13163305 - 20 Aug 2024
Cited by 5 | Viewed by 2538
Abstract
Deformable medical image registration aims to minimize the differences between fixed and moving images to provide comprehensive physiological or structural information for further medical analysis. Traditional learning-based convolutional network approaches usually suffer from perceptual limitations, and in recent years, the Transformer architecture has gained popularity for its superior long-range relational modeling capabilities, but it still faces severe computational challenges in handling high-resolution medical images. Recently, selective state-space models have shown great potential in the vision domain due to their fast inference and efficient modeling. Inspired by this, in this paper, we propose RegMamba, a novel medical image registration architecture that combines convolutional and state-space models (SSMs), designed to efficiently capture complex correspondences in registration while keeping the computational cost low. First, our model introduces Mamba to efficiently model long-range dependencies in the data and capture large deformations. At the same time, we use a scaled convolutional layer to alleviate the spatial information loss caused by flattening 3D data in Mamba. Then, a deformable convolutional residual module (DCRM) is proposed to adaptively adjust the sampling positions and process deformations to capture more flexible spatial features, while learning fine-grained features of different anatomical structures to construct local correspondences and improve the model's perception. We demonstrate the advanced registration performance of our method on the LPBA40 and IXI public datasets.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
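At the heart of Mamba-style models is a linear state-space recurrence scanned along the sequence. A plain (non-selective) numpy version shows the shape of the computation; RegMamba's selective variant makes the parameters input-dependent and adds the convolutional and deformable modules described in the abstract, so this is only a sketch of the underlying recurrence:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential scan of a discrete linear state-space model:
    h_t = A h_{t-1} + B x_t,  y_t = C h_t.
    x: iterable of scalar inputs; A: (n, n); B, C: (n,) vectors."""
    n_state = A.shape[0]
    h = np.zeros(n_state)
    ys = []
    for x_t in x:
        h = A @ h + B * x_t    # state update carries long-range context
        ys.append(C @ h)       # linear readout of the hidden state
    return np.array(ys)
```

With a decaying state matrix (e.g., A = 0.5·I), an impulse at t = 0 echoes through later outputs with geometric decay, which is how such models propagate long-range information in linear time.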

19 pages, 17496 KiB  
Article
HR-YOLO: A Multi-Branch Network Model for Helmet Detection Combined with High-Resolution Network and YOLOv5
by Yuanfeng Lian, Jing Li, Shaohua Dong and Xingtao Li
Electronics 2024, 13(12), 2271; https://doi.org/10.3390/electronics13122271 - 10 Jun 2024
Cited by 3 | Viewed by 1536
Abstract
Automatic detection of safety helmet wearing is significant in ensuring safe production. However, the accuracy of safety helmet detection can be challenged by various factors, such as complex environments, poor lighting conditions, and small-sized targets. This paper presents a novel and efficient deep learning framework named High-Resolution You Only Look Once (HR-YOLO) for safety helmet wearing detection. The proposed framework synthesizes safety helmet wearing information from the features of helmet objects and human pose. HR-YOLO uses features from two branches to make bounding-box predictions more accurate for small targets. Then, to further improve the iterative efficiency and accuracy of the model, we design an optimized residual network structure using Optimized Powered Stochastic Gradient Descent (OP-SGD). Moreover, a Laplace-Aware Attention Model (LAAM) is designed to make the YOLOv5 decoder pay more attention to feature information from human pose and suppress interference from irrelevant features, which enhances the network representation. Finally, a non-maximum suppression voting scheme (PA-NMS voting) is proposed to improve detection accuracy for occluded targets, using pose information to constrain the confidence of bounding boxes and selecting optimal bounding boxes through a modified voting process. Experimental results demonstrate that the presented safety helmet detection network outperforms other approaches and has practical value in application scenarios. Compared with the other algorithms, the proposed algorithm improves the precision, recall, and mAP by 7.27%, 5.46%, and 7.3%, on average, respectively.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
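The voting flavour of NMS can be sketched generically: instead of simply discarding suppressed boxes, each kept detection becomes a confidence-weighted average of the boxes it suppresses. The pose-based confidence constraint of PA-NMS is omitted here, so treat this as a generic box-voting baseline rather than the paper's method:

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms_voting(boxes, scores, iou_thr=0.5):
    """Greedy NMS with box voting: the highest-scoring box and everything
    it suppresses are fused into one score-weighted average box."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    kept = []
    while len(order):
        i = order[0]
        group = [j for j in order if iou(boxes[i], boxes[j]) >= iou_thr]
        w = scores[group]
        kept.append((w[:, None] * boxes[group]).sum(0) / w.sum())  # vote
        order = np.array([j for j in order if j not in group])
    return np.array(kept)
```

On two heavily overlapping detections plus one distant one, standard NMS would keep the top box unchanged, while voting nudges it toward the weighted consensus of the cluster, which tends to help with occluded targets.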
