This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
DFST-UNet: Dual-Domain Fusion Swin Transformer U-Net for Image Forgery Localization
by Jianhua Yang †, Anjun Xie †, Tao Mai and Yifang Chen *
Dr. Yifang Chen is currently an Associate Professor at the School of Cyber Security, Guangdong Polytechnic Normal University, Guangzhou, China. She received her B.S. degree from the School of Information, Yunnan University, Kunming, China, and her Ph.D. degree from the School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou, China. She visited the Electrical and Computer Engineering Department, University of British Columbia, from 2019 to 2020. Her main research interests include multimedia information forensics, deep learning, and reinforcement learning.
School of Cyber Security, Guangdong Polytechnic Normal University, Guangzhou 510630, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Entropy 2025, 27(5), 535; https://doi.org/10.3390/e27050535
Submission received: 21 April 2025 / Revised: 13 May 2025 / Accepted: 15 May 2025 / Published: 17 May 2025
Abstract
Image forgery localization is critical in defending against the malicious manipulation of image content, and is attracting increasing attention worldwide. In this paper, we propose a Dual-domain Fusion Swin Transformer U-Net (DFST-UNet) for image forgery localization. DFST-UNet is built on a U-shaped encoder–decoder architecture. Swin Transformer blocks are integrated into the U-Net architecture to capture long-range context information and perceive forged regions at different scales. Since high-frequency information is an essential clue for forgery localization, a dual-stream encoder is proposed to comprehensively expose forgery traces in both the RGB domain and the frequency domain. A novel high-frequency feature extractor module (HFEM) is designed to extract robust high-frequency features, and a hierarchical attention fusion module (HAFM) is designed to effectively fuse the dual-domain features. Extensive experimental results demonstrate the superiority of DFST-UNet over state-of-the-art methods in the task of image forgery localization.
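The abstract's frequency-domain stream rests on a general idea: splicing and other manipulations leave traces that are easier to see in an image's high-frequency residual than in its RGB values. The paper's HFEM learns such features; as a minimal stand-in, the sketch below (an assumption for illustration, not the authors' module) extracts a fixed high-frequency residual with an ideal high-pass filter in the 2-D Fourier domain.

```python
import numpy as np

def highpass_residual(image, cutoff=0.1):
    """Return the high-frequency residual of a 2-D grayscale image.

    An ideal high-pass filter: transform to the Fourier domain, zero out
    a disc of low frequencies around the (shifted) DC component, and
    transform back. `cutoff` is the disc radius as a fraction of the
    shorter image side. Illustrative only; the paper's HFEM is learned.
    """
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist > cutoff * min(h, w)  # keep only high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

# A constant image has no high-frequency content, so its residual is
# (numerically) zero; a sharp pasted-in square leaves a strong residual
# concentrated along the splice boundary.
flat = np.ones((64, 64))
spliced = flat.copy()
spliced[16:32, 16:32] = 5.0
print(np.abs(highpass_residual(flat)).max())     # ~0
print(np.abs(highpass_residual(spliced)).max())  # clearly nonzero
```

A forgery-localization network would feed such a residual (or a learned analogue) into its frequency-domain encoder branch alongside the raw RGB input, which is the role the dual-stream encoder plays in DFST-UNet.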
Share and Cite
MDPI and ACS Style
Yang, J.; Xie, A.; Mai, T.; Chen, Y.
DFST-UNet: Dual-Domain Fusion Swin Transformer U-Net for Image Forgery Localization. Entropy 2025, 27, 535.
https://doi.org/10.3390/e27050535
Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.