USF-Net: Infrared-Visible Image Fusion via Unified Semantics and Context Modulation
Abstract
1. Introduction
2. Related Work
2.1. Purely Visual Models
2.2. Text-Guided Models
2.3. Image Manifold-Based Domain Transform
3. Method
3.1. Construction of Multiple Text Prompts and Contextual Semantic Descriptions
3.2. Model Overview
3.3. Unified Semantic-Guided Shared Feature Alignment Encoder (SFAE)
3.3.1. Semantic Anchor Alignment and TextCube Construction
3.3.2. Text-Guided Domain Transform
3.3.3. Weight Generation and Context Modulation
3.4. Unified Semantic-Guided Specific Feature Reweighting Fusion (SFRF)
3.4.1. Text–Visual Interaction and Edge Vectors
3.4.2. Weight Matrix Construction and Context Modulation
3.5. Visual Feature Extraction and Reconstruction
3.6. Multi-Stage Training and Loss Functions
3.6.1. Stage I: Pure Visual Pretraining
3.6.2. Stage II: Joint Text–Visual Fine-Tuning
4. Experimental Results and Analysis
4.1. Experimental Settings
4.2. Benchmark Settings
4.2.1. Evaluation Metrics
4.2.2. Datasets
4.2.3. Definition of Experimental Settings
4.3. Comparison with State-of-the-Art Methods
4.3.1. VIS-IR Fusion Under the Semantics-Off Setting
4.3.2. VIS-IR Fusion Under the Semantics-On Setting
4.4. Efficiency Comparison
4.5. Ablation Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Ma, J.; Ma, Y.; Li, C. Infrared and Visible Image Fusion Methods and Applications: A Survey. Inf. Fusion 2019, 45, 153–178.
- Zhang, H.; Xu, H.; Tian, X.; Jiang, J.; Ma, J. Image Fusion Meets Deep Learning: A Survey and Perspective. Inf. Fusion 2021, 76, 323–336.
- Tang, L.; Yuan, J.; Ma, J. Image Fusion in the Loop of High-Level Vision Tasks: A Semantic-Aware Real-Time Infrared and Visible Image Fusion Network. Inf. Fusion 2022, 82, 28–42.
- Sun, Y.; Cao, B.; Zhu, P.; Hu, Q. DetFusion: A Detection-Driven Infrared and Visible Image Fusion Network. In Proceedings of the ACM International Conference on Multimedia; ACM: New York, NY, USA, 2022; pp. 4003–4011.
- Tang, L.; Yuan, J.; Zhang, H.; Jiang, X.; Ma, J. PIAFusion: A Progressive Infrared and Visible Image Fusion Network Based on Illumination Aware. Inf. Fusion 2022, 83, 79–92.
- Yi, X.; Xu, H.; Zhang, H.; Tang, L.; Ma, J. Text-IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2024; pp. 27026–27035.
- Li, H.; Wu, X.-J. DenseFuse: A Fusion Approach to Infrared and Visible Images. IEEE Trans. Image Process. 2019, 28, 2614–2623.
- Xu, H.; Ma, J.; Jiang, J.; Guo, X.; Ling, H. U2Fusion: A Unified Unsupervised Image Fusion Network. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 502–518.
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models from Natural Language Supervision. In Proceedings of the International Conference on Machine Learning (ICML); PMLR, 2021; pp. 8748–8763.
- Patashnik, O.; Wu, Z.; Shechtman, E.; Cohen-Or, D.; Lischinski, D. StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2021; pp. 2085–2094.
- Potlapalli, V.; Zamir, S.W.; Khan, S.; Khan, F.S. PromptIR: Prompting for All-in-One Blind Image Restoration. arXiv 2023, arXiv:2306.13090.
- OpenAI. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774.
- Zhao, Z.; Bai, H.; Zhang, J.; Zhang, Y.; Xu, S.; Lin, Z.; Timofte, R.; Van Gool, L. CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2023; pp. 5906–5916.
- Ma, J.; Xu, H.; Jiang, J.; Mei, X.; Zhang, X.-P. DDcGAN: A Dual-Discriminator Conditional Generative Adversarial Network for Multi-Resolution Image Fusion. IEEE Trans. Image Process. 2020, 29, 4980–4995.
- Ma, J.; Yu, W.; Liang, P.; Li, C.; Jiang, J. FusionGAN: A Generative Adversarial Network for Infrared and Visible Image Fusion. Inf. Fusion 2019, 48, 11–26.
- Ma, J.; Zhang, H.; Shao, Z.; Liang, P.; Xu, H. GANMcC: A Generative Adversarial Network with Multiclassification Constraints for Infrared and Visible Image Fusion. IEEE Trans. Instrum. Meas. 2021, 70, 1–14.
- Xu, H.; Yuan, J.; Ma, J. MURF: Mutually Reinforcing Multi-Modal Image Registration and Fusion. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 12148–12166.
- Liu, R.; Liu, Z.; Liu, J.; Fan, X.; Luo, Z. A Task-Guided, Implicitly-Searched and Meta-Initialized Deep Model for Image Fusion. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 6594–6609.
- Liu, J.; Fan, X.; Huang, Z.; Wu, G.; Liu, R.; Zhong, W.; Luo, Z. Target-Aware Dual Adversarial Learning and a Multi-Scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5802–5811.
- Deng, B.; He, Y.; Shen, Z.; Zhang, Y.; Deng, Q.; Nie, Z.; Wang, Y. YCNNet: Road Target Recognition Method by Fusion of LiDAR and Thermal Infrared Camera. IEEE Sens. J. 2026, 26, 3278–3288.
- He, Y.; Hao, Y.; Qian, M.; Gu, Q.; Deng, B.; Wang, Y. SCMF-Net: Sparse Self-Attention Driven Cross-Modal Fusion for Robust Detection in Complex Road Scenes. IEEE Sens. J. 2026, 26, 10721–10730.
- Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 10684–10695.
- Kim, G.; Kwon, T.; Ye, J.C. DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 2426–2435.
- Kimmel, R.; Sochen, N.; Malladi, R. From High Energy Physics to Low Level Vision. In Proceedings of Scale-Space Theory in Computer Vision, Utrecht, The Netherlands, 2–4 July 1997; pp. 236–247.
- Gastal, E.S.L.; Oliveira, M.M. Domain Transform for Edge-Aware Image and Video Processing. In ACM SIGGRAPH 2011 Papers; ACM: New York, NY, USA, 2011; pp. 1–12.
- Wang, Z.; Li, X.; Zhao, L.; Duan, H.; Wang, S.; Liu, H.; Zhang, X. When Multi-Focus Image Fusion Networks Meet Traditional Edge-Preservation Technology. Int. J. Comput. Vis. 2023, 131, 2529–2552.
- Zhang, Y.; Gong, K.; Zhang, K.; Li, H.; Qiao, Y.; Ouyang, W.; Yue, X. Meta-Transformer: A Unified Framework for Multimodal Learning. arXiv 2023, arXiv:2307.10802.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Dinh, L.; Sohl-Dickstein, J.; Bengio, S. Density Estimation Using Real NVP. arXiv 2016, arXiv:1605.08803.
- Zhou, M.; Huang, J.; Fang, Y.; Fu, X.; Liu, A. Pan-Sharpening with Customized Transformer and Invertible Neural Network. In Proceedings of the AAAI Conference on Artificial Intelligence; Association for the Advancement of Artificial Intelligence: Washington, DC, USA, 2022; pp. 3553–3561.
- Sun, Y.; Dong, L.; Huang, S.; Ma, S.; Xia, Y.; Xue, J.; Wang, J.; Wei, F. Retentive Network: A Successor to Transformer for Large Language Models. arXiv 2023, arXiv:2307.08621.
- Aslantas, V.; Bendes, E. A New Image Quality Metric for Image Fusion: The Sum of the Correlations of Differences. AEU-Int. J. Electron. Commun. 2015, 69, 1890–1896.
- Han, Y.; Cai, Y.; Cao, Y.; Xu, X. A New Image Fusion Performance Metric Based on Visual Information Fidelity. Inf. Fusion 2013, 14, 127–135.
- Xydeas, C.S.; Petrovic, V. Objective Image Fusion Performance Measure. Electron. Lett. 2000, 36, 308–309.
- Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “Completely Blind” Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212.
- Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-Reference Image Quality Assessment in the Spatial Domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
- Ke, J.; Wang, Q.; Wang, Y.; Milanfar, P.; Yang, F. MUSIQ: Multi-Scale Image Quality Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2021; pp. 5148–5157.
- Wang, J.; Chan, K.C.K.; Loy, C.C. Exploring CLIP for Assessing the Look and Feel of Images. In Proceedings of the AAAI Conference on Artificial Intelligence; Association for the Advancement of Artificial Intelligence: Washington, DC, USA, 2023; pp. 2555–2563.
- Ha, Q.; Watanabe, K.; Karasawa, T.; Ushiku, Y.; Harada, T. MFNet: Towards Real-Time Semantic Segmentation for Autonomous Vehicles with Multi-Spectral Scenes. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: New York, NY, USA, 2017; pp. 5108–5115.
- Jia, X.; Zhu, C.; Li, M.; Tang, W.; Zhou, W. LLVIP: A Visible-Infrared Paired Dataset for Low-Light Vision. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2021; pp. 3496–3504.
- Huang, Z.; Liu, J.; Fan, X.; Liu, R.; Zhong, W.; Luo, Z. ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-Modality Image Fusion. In Proceedings of the European Conference on Computer Vision (ECCV); Springer: Cham, Switzerland, 2022; pp. 539–555.
- Zhao, W.; Xie, S.; Zhao, F.; He, Y.; Lu, H. MetaFusion: Infrared and Visible Image Fusion via Meta-Feature Embedding from Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2023; pp. 13955–13965.
- Xie, H.; Zhang, Y.; Qiu, J.; Zhai, X.; Liu, X.; Yang, Y.; Zhao, S.; Luo, Y.; Zhong, J. Semantics Lead All: Towards Unified Image Registration and Fusion from a Semantic Perspective. Inf. Fusion 2023, 98, 101835.
- Zhang, H.; Zuo, X.; Jiang, J.; Guo, C.; Ma, J. MRFS: Mutually Reinforcing Image Fusion and Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2024; pp. 26974–26983.
- Wu, G.; Liu, H.; Fu, H.; Peng, Y.; Liu, J.; Fan, X.; Liu, R. Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2025; pp. 17882–17891.
- Liu, J.; Zhang, B.; Mei, Q.; Li, X.; Zou, Y.; Jiang, Z.; Ma, L.; Liu, R.; Fan, X. DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2025; pp. 2226–2235.
- Wu, W.; Weng, J.; Zhang, P.; Wang, X.; Yang, W.; Jiang, J. URetinex-Net: Retinex-Based Deep Unfolding Network for Low-Light Image Enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 5901–5910.
- Li, B.; Liu, X.; Hu, P.; Wu, Z.; Lv, J.; Peng, X. All-In-One Image Restoration for Unknown Corruption. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2022; pp. 17452–17462.
- Chen, H.; Gu, J.; Liu, Y.; Magid, S.A.; Dong, C.; Wang, Q.; Pfister, H.; Zhu, L. Masked Image Training for Generalizable Deep Image Denoising. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2023; pp. 1692–1703.
- Afifi, M.; Derpanis, K.G.; Ommer, B.; Brown, M.S. Learning Multi-Scale Photo Exposure Correction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2021; pp. 9157–9167.

| Dataset | Method | SCD | SD | EN | VIF | Qabf |
|---|---|---|---|---|---|---|
| MSRS | ReCoNet | 1.191 | 44.374 | 5.052 | 0.433 | 0.367 |
| MSRS | PIAFusion | 1.522 | 41.953 | 6.746 | 0.925 | 0.575 |
| MSRS | U2Fusion | 1.182 | 23.541 | 5.246 | 0.506 | 0.372 |
| MSRS | MetaFusion | 1.486 | 39.432 | 6.368 | 0.726 | 0.478 |
| MSRS | SemLA | 1.254 | 30.518 | 5.953 | 0.664 | 0.458 |
| MSRS | MRFS | 1.431 | 39.843 | 6.551 | 0.723 | 0.489 |
| MSRS | SAGE | 1.733 | 44.912 | 6.871 | 1.024 | 0.643 |
| MSRS | DCEvo | 1.756 | 45.632 | 6.901 | 1.101 | 0.675 |
| MSRS | OURS | 1.802 | 46.732 | 6.886 | 1.171 | 0.693 |
| LLVIP | ReCoNet | 1.345 | 41.234 | 5.514 | 0.513 | 0.364 |
| LLVIP | PIAFusion | 1.323 | 44.853 | 6.523 | 0.882 | 0.465 |
| LLVIP | U2Fusion | 0.757 | 23.614 | 5.972 | 0.552 | 0.341 |
| LLVIP | MetaFusion | 1.317 | 42.446 | 6.823 | 0.833 | 0.493 |
| LLVIP | SemLA | 1.036 | 27.984 | 5.981 | 0.631 | 0.364 |
| LLVIP | MRFS | 1.123 | 35.485 | 6.263 | 0.581 | 0.395 |
| LLVIP | SAGE | 1.581 | 47.972 | 7.124 | 0.982 | 0.585 |
| LLVIP | DCEvo | 1.664 | 49.768 | 7.453 | 1.113 | 0.653 |
| LLVIP | OURS | 1.702 | 50.621 | 7.542 | 1.321 | 0.687 |
| RoadScene | ReCoNet | 1.589 | 37.581 | 6.822 | 0.504 | 0.354 |
| RoadScene | PIAFusion | 1.586 | 49.283 | 6.975 | 0.701 | 0.453 |
| RoadScene | U2Fusion | 1.498 | 30.969 | 6.739 | 0.513 | 0.467 |
| RoadScene | MetaFusion | 1.581 | 51.643 | 7.223 | 0.512 | 0.468 |
| RoadScene | SemLA | 1.248 | 31.869 | 6.548 | 0.503 | 0.438 |
| RoadScene | MRFS | 1.399 | 40.874 | 6.947 | 0.501 | 0.431 |
| RoadScene | SAGE | 1.758 | 51.637 | 7.073 | 0.658 | 0.497 |
| RoadScene | DCEvo | 1.642 | 49.833 | 7.468 | 0.801 | 0.611 |
| RoadScene | OURS | 1.651 | 50.816 | 7.476 | 0.853 | 0.621 |
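
For readers reimplementing the evaluation, below is a minimal sketch of the reference-based metrics in the table above: EN (Shannon entropy), SD (standard deviation), and SCD as defined by Aslantas and Bendes. The function names and the 8-bit grayscale input convention are our assumptions, not from the paper; VIF and Qabf need longer reference implementations and are omitted here.

```python
import numpy as np

def entropy(img: np.ndarray) -> float:
    """EN: Shannon entropy of the 8-bit gray-level histogram (higher is better)."""
    hist, _ = np.histogram(img, bins=256, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins before taking the log
    return float(-(p * np.log2(p)).sum())

def std_dev(img: np.ndarray) -> float:
    """SD: standard deviation of pixel intensities, a simple contrast proxy."""
    return float(img.astype(np.float64).std())

def scd(fused: np.ndarray, vis: np.ndarray, ir: np.ndarray) -> float:
    """SCD (Aslantas & Bendes, 2015): sum of correlations of differences,
    corr(F - IR, VIS) + corr(F - VIS, IR), on same-sized grayscale arrays."""
    f, v, r = (x.astype(np.float64) for x in (fused, vis, ir))
    def corr(a, b):
        a, b = a - a.mean(), b - b.mean()
        return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float(corr(f - r, v) + corr(f - v, r))
```

All three operate on aligned grayscale arrays of identical shape; color fused images would first be converted (e.g., to the Y channel), which matches common practice in this literature.
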
| Method | MSRS CLIP-IQA | MSRS EN | MSRS NIQE | LLVIP EN | LLVIP NIQE | LLVIP MUSIQ |
|---|---|---|---|---|---|---|
| eir. + ReCoNet | 0.117 | 7.216 | 5.769 | 7.109 | 4.695 | 44.187 |
| eir. + PIAFusion | 0.123 | 7.082 | 3.781 | 7.332 | 3.986 | 48.255 |
| eir. + U2Fusion | 0.127 | 6.724 | 3.997 | 7.439 | 3.969 | 48.481 |
| eir. + MetaFusion | 0.106 | 7.307 | 3.584 | 7.495 | 3.722 | 49.628 |
| eir. + SemLA | 0.113 | 6.861 | 3.944 | 7.214 | 4.184 | 46.053 |
| eir. + MRFS | 0.121 | 7.051 | 3.822 | 7.281 | 4.061 | 47.241 |
| eir. + SAGE | 0.131 | 7.275 | 3.563 | 7.552 | 3.683 | 50.356 |
| eir. + DCEvo | 0.132 | 7.292 | 3.521 | 7.583 | 3.621 | 50.836 |
| OURS | 0.134 | 7.301 | 3.478 | 7.624 | 3.541 | 51.217 |
| Method | MFNet SD | MFNet EN | MFNet MUSIQ | DN-MSRS SD | DN-MSRS EN | DN-MSRS NIQE | RoadScene SF | RoadScene NIQE | RoadScene BRISQUE |
|---|---|---|---|---|---|---|---|---|---|
| eir. + ReCoNet | 41.654 | 5.161 | 29.299 | 41.525 | 4.463 | 8.631 | 10.312 | 4.785 | 37.775 |
| eir. + PIAFusion | 39.853 | 6.123 | 34.184 | 36.952 | 6.025 | 5.083 | 14.852 | 3.864 | 31.651 |
| eir. + U2Fusion | 33.945 | 5.741 | 34.255 | 28.812 | 4.609 | 7.185 | 18.006 | 4.215 | 34.577 |
| eir. + MetaFusion | 42.026 | 6.665 | 34.764 | 39.956 | 6.398 | 4.337 | 26.653 | 3.473 | 29.521 |
| eir. + SemLA | 32.622 | 5.982 | 33.526 | 30.654 | 5.323 | 5.882 | 12.252 | 3.988 | 32.253 |
| eir. + MRFS | 38.651 | 6.225 | 34.325 | 36.211 | 5.958 | 5.151 | 14.487 | 3.843 | 31.458 |
| eir. + SAGE | 42.882 | 6.568 | 35.253 | 41.957 | 6.453 | 4.525 | 16.543 | 3.625 | 30.053 |
| eir. + DCEvo | 43.553 | 6.755 | 35.525 | 43.055 | 6.554 | 4.254 | 17.726 | 3.522 | 29.254 |
| OURS | 44.025 | 6.689 | 35.852 | 43.522 | 6.661 | 4.053 | 17.701 | 3.381 | 28.958 |
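
The no-reference scores in the two tables above (NIQE, BRISQUE, MUSIQ, CLIP-IQA) can be reproduced with an off-the-shelf IQA toolbox. Below is a minimal sketch assuming the open-source pyiqa package and its registered metric names; the paper does not state which implementations were used, so treat the names and tensor conventions as assumptions to verify against the toolbox documentation.

```python
import torch
import pyiqa  # IQA-PyTorch toolbox; an assumed choice, not named in the paper

device = "cuda" if torch.cuda.is_available() else "cpu"

# Lower NIQE/BRISQUE is better; higher MUSIQ/CLIP-IQA is better.
metrics = {name: pyiqa.create_metric(name, device=device)
           for name in ("niqe", "brisque", "musiq", "clipiqa")}

# Stand-in fused image: (N, 3, H, W) tensor with values in [0, 1].
fused = torch.rand(1, 3, 480, 640, device=device)
scores = {name: m(fused).item() for name, m in metrics.items()}
print(scores)
```
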
| Models | Params (M) | FLOPs (G) | MSRS (FPS) | LLVIP (FPS) | RoadScene (FPS) |
|---|---|---|---|---|---|
| ReCoNet | 0.441 | 10.81 | 12.82 | 12.35 | 11.9 |
| PIAFusion | 0.392 | 9.14 | 13.33 | 12.82 | 12.99 |
| U2Fusion | 1.095 | 28.92 | 2.92 | 2.72 | 2.85 |
| MetaFusion | 0.272 | 5.23 | 17.86 | 16.39 | 17.24 |
| SemLA | 0.793 | 18.74 | 8.77 | 8.26 | 8.47 |
| MRFS | 0.325 | 7.91 | 15.63 | 14.93 | 15.15 |
| SAGE | 1.171 | 31.62 | 6.9 | 6.62 | 6.76 |
| DCEvo | 1.362 | 36.86 | 6.37 | 6.02 | 6.17 |
| OURS | 0.346 | 8.04 | 15.87 | 15.38 | 14.08 |
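
For context on the efficiency columns, the following is a minimal sketch of how such Params/FLOPs/FPS figures are typically collected in PyTorch. The two-input signature `model(vis, ir)`, the 480×640 test resolution, and the third-party `thop` counter are our assumptions; the paper does not specify its measurement protocol.

```python
import time
import torch
from thop import profile  # common FLOPs counter; reports MACs, usually quoted as FLOPs

def benchmark(model: torch.nn.Module, h: int = 480, w: int = 640, runs: int = 100):
    """Return (params in M, FLOPs in G, FPS) for a two-input fusion model."""
    model.eval().cuda()
    vis = torch.rand(1, 3, h, w, device="cuda")  # visible input (assumed RGB)
    ir = torch.rand(1, 1, h, w, device="cuda")   # infrared input (assumed 1-channel)
    params_m = sum(p.numel() for p in model.parameters()) / 1e6
    flops, _ = profile(model, inputs=(vis, ir), verbose=False)
    with torch.no_grad():
        for _ in range(10):                      # warm-up iterations
            model(vis, ir)
        torch.cuda.synchronize()
        t0 = time.time()
        for _ in range(runs):
            model(vis, ir)
        torch.cuda.synchronize()                 # wait for queued kernels before timing
    fps = runs / (time.time() - t0)
    return params_m, flops / 1e9, fps
```

Per-dataset FPS differences in the table above then come down to each benchmark's native image resolution, so a faithful reproduction would time at each dataset's resolution rather than a fixed one.
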
| Setting | EN | SD | SCD | VIF | Qabf |
|---|---|---|---|---|---|
| w/o | 7.567 | 50.285 | 1.724 | 1.353 | 0.687 |
| w/o SFAE | 7.442 | 48.291 | 1.715 | 1.237 | 0.612 |
| w/o SFRF | 7.323 | 46.325 | 1.533 | 1.271 | 0.599 |
| OURS | 7.642 | 51.462 | 1.781 | 1.402 | 0.712 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Fu, D.; Li, Z.; Fan, W.; Wang, Q. USF-Net: Infrared-Visible Image Fusion via Unified Semantics and Context Modulation. Sensors 2026, 26, 2874. https://doi.org/10.3390/s26092874