An Improved U-Net Infrared Small Target Detection Algorithm Based on Multi-Scale Feature Decomposition and Fusion and Attention Mechanism
Abstract
1. Introduction
- (1) Instead of max pooling, downsampling based on the Haar wavelet transform is used, which reduces the information loss caused by pooling and retains more detail.
- (2) A multi-scale feature fusion module serves as the basic building block of the network. It captures multi-scale contextual information through dilated (atrous) convolutions with different dilation rates, enlarging the receptive field and strengthening feature representation.
- (3) After multilevel feature fusion in the decoder, a triple attention mechanism is applied to make better use of multidimensional information, so that the decoder recovers more feature detail and segments targets more accurately.
- (4) Experiments on the NUDT-SIRST dataset show that the proposed method achieves better segmentation results than the compared algorithms in terms of Intersection over Union (IoU), normalized Intersection over Union (nIoU), probability of detection (Pd), and false-alarm rate (Fa), and that it describes target contours more accurately.
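The idea behind replacing max pooling with Haar wavelet downsampling can be illustrated with a minimal sketch. It assumes a single-channel feature map stored as a nested list with even height and width, and applies one level of the 2D Haar transform with orthonormal scaling. The function names `haar_downsample` and `haar_upsample` are hypothetical, and the learned convolution that a wavelet-downsampling module would normally apply to the stacked sub-bands is omitted. This is not the authors' implementation; it only shows that the four half-resolution sub-bands together retain every input value, whereas max pooling keeps one value per 2×2 block.

```python
def haar_downsample(x):
    """Apply one level of the 2D Haar transform to a 2D list.

    Returns four sub-bands, each half the height and width of x:
    approximation, column-difference, row-difference and diagonal detail.
    """
    height, width = len(x), len(x[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, height, 2):
        rows = ([], [], [], [])
        for j in range(0, width, 2):
            a, b = x[i][j], x[i][j + 1]          # top-left, top-right
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom-left, bottom-right
            rows[0].append((a + b + c + d) / 2)  # scaled local average
            rows[1].append((a - b + c - d) / 2)  # left column minus right column
            rows[2].append((a + b - c - d) / 2)  # top row minus bottom row
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip((approx, col_diff, row_diff, diag), rows):
            band.append(row)
    return approx, col_diff, row_diff, diag


def haar_upsample(approx, col_diff, row_diff, diag):
    """Invert haar_downsample, recovering the original 2D list."""
    out = []
    for r in range(len(approx)):
        top, bottom = [], []
        for k in range(len(approx[0])):
            s, h, v, g = approx[r][k], col_diff[r][k], row_diff[r][k], diag[r][k]
            top += [(s + h + v + g) / 2, (s - h + v - g) / 2]
            bottom += [(s + h - v - g) / 2, (s - h - v + g) / 2]
        out += [top, bottom]
    return out


# A 4x4 map with one bright pixel, standing in for a small target.
feature = [[0, 0, 0, 0],
           [0, 8, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
bands = haar_downsample(feature)
restored = haar_upsample(*bands)
print(bands[0])             # half-resolution approximation band
print(restored == feature)  # True: the four bands lose no information
```

In a network, the four sub-bands would typically be stacked along the channel axis, so spatial resolution halves while the channel count grows fourfold before any further convolution.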
2. Data and Methodology
2.1. Experimental Data
2.2. MST-UNet Overall Structure
2.3. Multi-Scale Residual Block
2.4. Haar Wavelet Transform Downsampling
2.5. Triple Attention Mechanism
2.6. Loss Function
3. Experimental Analysis
3.1. Experimental Environment
3.2. Evaluation Metrics
3.3. Analysis of Ablation Experiments
3.4. Comparative Analysis of Multiple Methods
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Full name |
---|---|
ACM | Asymmetric Contextual Modulation |
DBT | Detect Before Track |
DNA-Net | Dense Nested Attention Network |
HWD | Haar Wavelet Transform Downsampling |
ILNet | Infrared Low-Level Network |
MDvsFA-cGAN | Miss Detection vs. False Alarm conditional generative adversarial network |
MSDC | Multiscale Deep Convolution |
MSRB | Multiscale Residual Block |
TBD | Track Before Detect |
Platform | Configuration |
---|---|
Integrated development environment | PyCharm |
Deep learning framework | PyTorch |
Scripting language | Python 3.9 |
Operating system | Windows 11 |
CPU | Intel Core i5-12400F |
GPU | NVIDIA GeForce RTX 3060 |
Memory | 16 GB |
CUDA | 11.7 |
NUDT-SIRST: enabled modules (HWD / Res / MSRB / TA) | IoU | nIoU | Pd | Fa () |
---|---|---|---|---|
none | 72.31 | 73.92 | 96.21 | 3.4281 |
✓ | 74.35 | 74.63 | 97.02 | 3.3506 |
✓ | 76.18 | 76.1 | 97.29 | 2.7212 |
✓ ✓ | 78.05 | 77.35 | 97.69 | 2.1034 |
✓ ✓ ✓ | 79.23 | 79.64 | 98.37 | 1.3161 |
✓ ✓ ✓ ✓ | 80.09 | 80.19 | 98.51 | 1.2011 |
Method (NUDT-SIRST) | IoU | nIoU | Recall (Pd) | Fa () |
---|---|---|---|---|
U-Net | 72.31 | 73.92 | 96.21 | 3.4281 |
ResUnet | 62.21 | 63.92 | 90.12 | 6.2127 |
ResUnet++ | 48.24 | 49.12 | 83.76 | 6.6609 |
UCTransNet | 71.64 | 72.98 | 95.12 | 1.7327 |
UNet 3+ | 74.60 | 75.3 | 96.34 | 3.9569 |
DeepLabv3+ | 50.15 | 47.68 | 86.87 | 5.2069 |
Ours | 80.09 | 80.19 | 98.51 | 1.2011 |
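For readers unfamiliar with the two overlap metrics reported above, the following sketch shows one common way to compute them. It assumes the formulation widely used in the infrared small-target literature: IoU accumulates intersection and union pixel counts over the whole test set, while nIoU averages the per-image IoU. The paper's exact definitions are not reproduced in this section, so treat these formulas as an assumption. The function names are hypothetical, and masks are taken to be binary nested lists.

```python
def intersection_and_union(pred, target):
    """Count foreground pixels in the intersection and union of two binary masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter, union


def iou_and_niou(preds, targets):
    """Return (IoU, nIoU) for lists of predicted and ground-truth masks.

    IoU pools pixel counts over all images; nIoU averages per-image IoU.
    """
    total_inter = total_union = 0
    per_image = []
    for pred, target in zip(preds, targets):
        inter, union = intersection_and_union(pred, target)
        total_inter += inter
        total_union += union
        # Assumption: an image with no predicted and no labelled pixels scores 1.
        per_image.append(inter / union if union else 1.0)
    iou = total_inter / total_union if total_union else 1.0
    niou = sum(per_image) / len(per_image)
    return iou, niou


# Image 1: a 4-pixel target segmented perfectly.
# Image 2: a 1-pixel target missed, with a 1-pixel false detection elsewhere.
preds = [[[1, 1], [1, 1]], [[1, 0], [0, 0]]]
targets = [[[1, 1], [1, 1]], [[0, 0], [0, 1]]]
iou, niou = iou_and_niou(preds, targets)
print(round(iou, 4), niou)  # 0.6667 0.5
```

The example shows why both numbers are reported: the pooled IoU is dominated by the larger target, while nIoU weights every image equally and so penalizes the missed single-pixel target more heavily.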
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fan, X.; Ding, W.; Li, X.; Li, T.; Hu, B.; Shi, Y. An Improved U-Net Infrared Small Target Detection Algorithm Based on Multi-Scale Feature Decomposition and Fusion and Attention Mechanism. Sensors 2024, 24, 4227. https://doi.org/10.3390/s24134227