MEFA-Net: Multilevel Feature Extraction and Fusion Attention Network for Infrared Small-Target Detection
Abstract
1. Introduction
- (1) Collaborative feature extraction strategy: The DDCB module focuses on the intrinsic attributes of the target and explicitly accounts for the Gaussian-like gray-level distribution of infrared small targets. By combining context features with Gaussian salient features, it strengthens the representation of sparse small-target features in the encoding stage.
- (2) Attention-guided hierarchical fusion method: Inspired by medical image segmentation, the EAF module uses attention mechanisms to guide the fusion of output features from adjacent encoder layers and passes the fused features to the corresponding decoder layers through enhanced skip connections. This strengthens the semantic correlation between feature layers and supports more precise segmentation in complex backgrounds.
- (3) Efficient up-sampling mechanism: The EUB module uses PixelShuffle to rearrange part of the feature map’s channel information into spatial information. Up-sampling in this way retains the rich information in the original features and improves flexibility and feature preservation during decoding.
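The channel-to-space rearrangement mentioned in contribution (3) can be illustrated with a minimal pure-Python sketch. It follows the standard sub-pixel convention (the one used by common PixelShuffle layers) and is not the authors' EUB module, whose remaining components are not reproduced here. The function name `pixel_shuffle` and the toy input are assumptions made for illustration.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Illustrative helper (hypothetical name): each group of r*r input
    channels is spread over an r-by-r spatial block of one output channel.
    """
    in_channels = len(x)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    height, width = len(x[0]), len(x[0][0])
    out_channels = in_channels // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):
                    for j in range(r):
                        # Input channel that feeds offset (i, j) of the block.
                        src = c * r * r + i * r + j
                        out[c][h * r + i][w * r + j] = x[src][h][w]
    return out


# Toy input: four channels of size 1x2 become one channel of size 2x4.
features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
upsampled = pixel_shuffle(features, 2)
print(upsampled)  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

No value is interpolated or discarded: every input element appears exactly once in the output, which is the sense in which this kind of up-sampling preserves the original feature information.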
2. Related Works
2.1. Infrared Small-Target Detection
2.2. Image Segmentation Structure Based on U-Net
2.3. Discussion
3. Methods
3.1. Overall Architecture
3.2. Dilated Direction-Sensitive Convolution Block
3.3. Encoder Attention Fusion Module
3.4. Efficient Up-Sampling Block
4. Experiment
4.1. Experimental Settings
4.2. Evaluation Metrics
4.3. Comparison with SOTA Methods
4.3.1. Quantitative Results
4.3.2. Qualitative Comparison
4.4. Ablation Study
4.4.1. The Ablation Study on the MEFA-Net
4.4.2. The Ablation Study on the DDCB Module
4.4.3. The Ablation Study on the EUB Module
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Type | Size (H × W × C) |
---|---|
Input image | 256 × 256 × 3 |
- | 256 × 256 × 16 |
- | 128 × 128 × 32 |
- | 64 × 64 × 64 |
- | 32 × 32 × 128 |
- | 64 × 64 × 64 |
- | 128 × 128 × 32 |
- | 256 × 256 × 16 |
Output image | 256 × 256 × 1 |
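The size progression in the table above follows a regular pattern: each encoder level halves the spatial resolution and doubles the channel count, and the decoder mirrors it. The short sketch below reproduces the seven intermediate sizes under that assumption. The function name `stage_shapes` and its parameters are hypothetical, and the sketch says nothing about which modules produce each stage.

```python
def stage_shapes(size=256, base_channels=16, levels=4):
    """Return (H, W, C) for each intermediate stage of a symmetric U-shaped net.

    Assumes (not stated in the table itself) that every encoder level halves
    H and W and doubles C, and that the decoder mirrors the encoder.
    """
    encoder = [(size >> k, size >> k, base_channels << k) for k in range(levels)]
    decoder = encoder[-2::-1]  # walk back up, skipping the bottleneck
    return encoder + decoder


for h, w, c in stage_shapes():
    print(f"{h} × {w} × {c}")
```

With the defaults, the printed sizes match the rows between the input and output images, from 256 × 256 × 16 down to 32 × 32 × 128 and back up to 256 × 256 × 16.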
Method | FLOPs (G) | Params (M) | NUAA-SIRST IoU (%) | NUAA-SIRST | NUAA-SIRST | IRSTD-1k IoU (%) | IRSTD-1k | IRSTD-1k | FPS |
---|---|---|---|---|---|---|---|---|---|
FKRW | - | - | 18.88 | 87.79 | 1.33 | 24.30 | 68.78 | 6.11 | - |
PSTNN | - | - | 60.91 | 88.86 | 10.75 | 30.60 | 81.34 | 93.03 | - |
UNet | 65.45 | 34.53 | 72.77 | 95.00 | 1.38 | 62.91 | 89.15 | 6.04 | 89.30 |
ACM | 0.40 | 0.40 | 68.75 | 90.83 | 6.77 | 53.49 | 89.15 | 12.46 | 513.83 |
ALCNet | 0.38 | 0.43 | 68.86 | 90.83 | 3.81 | 58.59 | 89.83 | 4.76 | 475.35 |
ISTDU-Net | 7.94 | 2.75 | 71.06 | 94.17 | 2.73 | 65.32 | 90.85 | 4.30 | 206.01 |
UIUNet | 54.43 | 50.54 | 75.57 | 97.50 | 1.94 | 63.18 | 89.15 | 4.45 | 132.15 |
DNANet | 14.26 | 4.50 | 76.73 | 97.50 | 1.49 | 65.84 | 91.19 | 4.05 | 162.24 |
DMFNet | 40.59 | 10.91 | 76.84 | 98.33 | 0.53 | 66.82 | 89.15 | 1.07 | 157.60 |
MEFA-Net | 10.47 | 2.49 | 77.76 | 98.33 | 0.23 | 67.58 | 91.19 | 1.06 | 226.17 |
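For readers unfamiliar with the IoU (%) columns, the sketch below shows the usual pixel-level intersection-over-union between a predicted binary mask and a ground-truth mask. It assumes the standard definition, since the paper's own metric formulas are not reproduced in this text. The function name `iou_percent`, the toy masks, and the convention for two empty masks are illustrative choices.

```python
def iou_percent(pred, target):
    """Pixel-level IoU (in percent) between two binary masks (nested lists)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 100.0 * inter / union if union else 100.0


pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(iou_percent(pred, target))  # 50.0
```

Because small targets cover only a few pixels, a one-pixel error changes this ratio noticeably, which is one reason IoU differences of about one point between methods can still be meaningful.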
Strategy | DDCB | EAF | EUB | FLOPs (G) | Params (M) | IoU (%) | | |
---|---|---|---|---|---|---|---|---|
(a) | - | - | - | 3.82 | 0.53 | 65.33 | 88.14 | 2.15 |
(b) | √ | - | - | 9.53 | 2.39 | 66.67 | 90.17 | 1.30 |
(c) | - | √ | - | 5.18 | 0.68 | 66.66 | 89.15 | 1.45 |
(d) | - | - | √ | 3.40 | 0.49 | 66.66 | 88.81 | 1.28 |
(e) | √ | √ | - | 10.89 | 2.53 | 66.88 | 91.53 | 1.23 |
(f) | √ | √ | √ | 10.47 | 2.49 | 67.58 | 91.19 | 1.06 |
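The ablation rows above can be read more easily as differences from the baseline. The sketch below recomputes the IoU gain of each configuration over strategy (a), and the change in FLOPs when EUB is added to the DDCB + EAF configuration, using only values copied from the table. The variable names are illustrative.

```python
# (FLOPs in G, Params in M, IoU in %) copied from the ablation table.
ablation = {
    "(a)": (3.82, 0.53, 65.33),
    "(b)": (9.53, 2.39, 66.67),
    "(c)": (5.18, 0.68, 66.66),
    "(d)": (3.40, 0.49, 66.66),
    "(e)": (10.89, 2.53, 66.88),
    "(f)": (10.47, 2.49, 67.58),
}

baseline_iou = ablation["(a)"][2]
iou_gain = {name: round(row[2] - baseline_iou, 2) for name, row in ablation.items()}

# Effect of adding EUB on top of DDCB + EAF: compare (f) with (e).
eub_flops_delta = round(ablation["(f)"][0] - ablation["(e)"][0], 2)

for name, gain in iou_gain.items():
    print(f"{name}: +{gain:.2f} IoU points over baseline")
print(f"EUB FLOPs change, (e) to (f): {eub_flops_delta:+.2f} G")
```

Read this way, the full model (f) gains 2.25 IoU points over the baseline, and adding EUB to (e) lowers FLOPs by 0.42 G while raising IoU by 0.70 points.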
Strategy | Conv | Dconv | Pconv | FLOPs (G) | Params (M) | IoU (%) | | |
---|---|---|---|---|---|---|---|---|
(a) | - | - | - | 3.82 | 0.53 | 65.33 | 88.14 | 2.15 |
(b) | √ | - | - | 3.87 | 0.55 | 65.75 | 88.47 | 2.09 |
(c) | √ | √ | - | 6.02 | 1.26 | 65.80 | 89.15 | 1.48 |
(d) | √ | √ | √ | 9.53 | 2.39 | 66.67 | 90.17 | 1.30 |
Strategy | PixelShuffle + Upsample | Dilation | FLOPs (G) | Params (M) | IoU (%) | | |
---|---|---|---|---|---|---|---|
(a) | - | - | 3.82 | 0.53 | 65.33 | 88.14 | 2.15 |
(b) | √ | - | 3.06 | 0.46 | 65.90 | 88.81 | 1.43 |
(c) | √ | √ | 3.40 | 0.49 | 66.66 | 88.81 | 1.28 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ma, J.; Pan, N.; Yin, D.; Wang, D.; Zhou, J. MEFA-Net: Multilevel Feature Extraction and Fusion Attention Network for Infrared Small-Target Detection. Remote Sens. 2025, 17, 2502. https://doi.org/10.3390/rs17142502