This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Enhancing Human Pose Transfer with Convolutional Block Attention Module and Facial Loss Optimization

1 Department of Computer Science and Information Engineering, National Central University, Taoyuan City 320317, Taiwan
2 Department of Information and Computer Engineering, Chung Yuan Christian University, Taoyuan City 320314, Taiwan
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(9), 1855; https://doi.org/10.3390/electronics14091855
Submission received: 31 March 2025 / Revised: 28 April 2025 / Accepted: 30 April 2025 / Published: 1 May 2025
(This article belongs to the Special Issue Machine Learning Techniques for Image Processing)

Abstract

Pose transfer methods often struggle to preserve fine-grained clothing textures and facial details at the same time, especially under large pose variations. To address these limitations, we propose a model based on the Multi-scale Attention Guided Pose Transfer (MAGPT) model. We modify its residual block by integrating the convolutional block attention module and replacing the ReLU activation function with Mish, so that the model captures more features related to clothing and skin color. In addition, because the facial features in the generated images differed from those in the original image, we propose two facial feature loss functions that help the model learn more precise image features. In experiments on the DeepFashion dataset, the proposed method outperforms MAGPT, achieving a 3.41% reduction in FID, a 0.65% improvement in SSIM, a 2% decrease in LPIPS, and a 2.7% decrease in LPIPS. Finally, with the proposed system architecture, a single reference image is sufficient to generate videos of the user performing different actions.
Keywords: generative adversarial network; pose transfer; image generation
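
The abstract mentions replacing ReLU with Mish in the residual block. As a rough illustration of what that swap changes, the sketch below compares the two activations on scalar inputs using the standard definition Mish(x) = x · tanh(softplus(x)). It is only a minimal, standard-library example: the function names are hypothetical, and it is not the authors' implementation, which would presumably operate on feature tensors inside a deep learning framework.

```python
import math


def relu(x: float) -> float:
    """ReLU: passes positive inputs unchanged and zeroes out negatives."""
    return max(0.0, x)


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish: x * tanh(softplus(x)), a smooth, non-monotonic activation."""
    return x * math.tanh(softplus(x))


# Compare the two activations on a few sample inputs (illustrative values only).
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  mish={mish(x):.4f}")
```

Unlike ReLU, Mish returns small non-zero values for negative inputs and is smooth around zero, so information carried by negative pre-activations is attenuated rather than discarded.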

Share and Cite

MDPI and ACS Style

Cheng, H.-Y.; Chiang, C.-C.; Jiang, C.-L.; Yu, C.-C. Enhancing Human Pose Transfer with Convolutional Block Attention Module and Facial Loss Optimization. Electronics 2025, 14, 1855. https://doi.org/10.3390/electronics14091855

AMA Style

Cheng H-Y, Chiang C-C, Jiang C-L, Yu C-C. Enhancing Human Pose Transfer with Convolutional Block Attention Module and Facial Loss Optimization. Electronics. 2025; 14(9):1855. https://doi.org/10.3390/electronics14091855

Chicago/Turabian Style

Cheng, Hsu-Yung, Chun-Chen Chiang, Chi-Lun Jiang, and Chih-Chang Yu. 2025. "Enhancing Human Pose Transfer with Convolutional Block Attention Module and Facial Loss Optimization" Electronics 14, no. 9: 1855. https://doi.org/10.3390/electronics14091855

APA Style

Cheng, H.-Y., Chiang, C.-C., Jiang, C.-L., & Yu, C.-C. (2025). Enhancing Human Pose Transfer with Convolutional Block Attention Module and Facial Loss Optimization. Electronics, 14(9), 1855. https://doi.org/10.3390/electronics14091855

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
