A Wide and Shallow Network Tailored for Infrared Small Target Detection
Highlights
- Extremely Lightweight Model: With only 0.054 M parameters and 1.050 G FLOPs, WSNet is, by parameter count, the lightest model reported to date for Infrared Small Target Detection (IRSTD).
- Wide and Shallow Architecture: Unlike conventional deep networks, WSNet adopts a wide and shallow design, which better suits infrared images that lack rich semantic information. In IRSTD, excessive depth degrades performance.
- Superior Performance-Speed Trade-off: WSNet achieves competitive detection accuracy (e.g., the highest IoU on SIRST and the best Pd on NUDT-SIRST) while offering the fastest inference speed (up to 146 FPS on GPU and 30 FPS on CPU).
- Practical Deployment in Resource-Limited Environments: WSNet's lightweight design and real-time CPU inference enable deployment in embedded systems, drones, and portable infrared devices, where computational resources are limited but low-latency detection is critical.
- Paradigm Shift in IRSTD Architecture Design: The success of a wide and shallow network challenges the prevailing "deeper is better" assumption in deep learning for IRSTD and encourages the community to tailor architectures to domain-specific characteristics.
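To put the figures in the highlights into deployment terms, the following minimal sketch converts the reported parameter count into an approximate weight footprint and the reported frame rates into per-frame latency. The 4 bytes per parameter assume float32 storage, which the highlights do not state, and the function names are illustrative.

```python
# Back-of-the-envelope conversions for the figures quoted in the highlights.
# Assumption (not from the paper): weights are stored as 32-bit floats.

BYTES_PER_FLOAT32 = 4


def weight_footprint_bytes(num_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> int:
    """Approximate storage needed for the weights alone."""
    return num_params * bytes_per_param


def latency_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


wsnet_params = 54_000  # 0.054 M parameters, as reported
footprint_kb = weight_footprint_bytes(wsnet_params) / 1000  # kilobytes (10^3 bytes)

gpu_latency = latency_ms(146)  # reported peak GPU frame rate
cpu_latency = latency_ms(30)   # reported CPU frame rate

print(f"weights: ~{footprint_kb:.0f} kB in float32")
print(f"GPU: ~{gpu_latency:.2f} ms/frame, CPU: ~{cpu_latency:.2f} ms/frame")
```

Under the float32 assumption the weights occupy roughly 216 kB, and the reported CPU frame rate corresponds to about 33 ms per frame.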
Abstract
1. Introduction
- (1) We design an extremely simple and efficient architecture, WSNet, which contains only 0.054 M parameters and requires 1.050 G FLOPs. It is the most lightweight model in the field of IRSTD and achieves the fastest inference speed to date. The code is available at https://github.com/CPaul33/WSNet (accessed on 15 January 2025).
- (2) To accommodate the scarcity of hierarchical semantic content in IRSTD tasks, we introduce a Width Extension Module (WEM) that enhances feature representation by expanding the network's width. Furthermore, building on the CBAM module [32], we propose a customized Channel-Spatial Hybrid Attention (CSHA) mechanism. This is the first work to incorporate Lp pooling into such a structure; Lp pooling smooths outliers while preserving the overall trend of the features.
- (3) Extensive experiments on the benchmark datasets SIRST [19], NUDT-SIRST [20], and IRSTD-1K [21] show that WSNet achieves performance comparable to state-of-the-art (SOTA) models while running several times faster than current SOTA approaches. The experiments further show that WSNet can be deployed directly on resource-constrained hardware such as CPUs while maintaining real-time detection, which makes it well suited to practical real-time applications and large-scale deployment in real-world IRSTD scenarios.
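The Lp pooling mentioned in contribution (2) can be illustrated with a minimal sketch. It assumes the normalized power-mean form, (mean |v|^p)^(1/p); the paper's exact formulation, pooling window, and choice of p are not given in this text, and the function name and sample activations are hypothetical.

```python
# Illustrative Lp pooling over one window (or one channel) of activations.
# Assumption: the normalized "power mean" form; the paper's exact
# formulation, window size, and value of p may differ.

def lp_pool(values, p):
    """Return (mean(|v|^p))^(1/p) for a non-empty sequence of numbers."""
    if p <= 0:
        raise ValueError("p must be positive")
    if not values:
        raise ValueError("values must be non-empty")
    mean_power = sum(abs(v) ** p for v in values) / len(values)
    return mean_power ** (1.0 / p)


# Hypothetical activations: mostly dim background plus one bright response.
activations = [0.1, 0.2, 0.1, 0.9]

avg_like = lp_pool(activations, 1)    # equals average pooling for non-negative inputs
smoothed = lp_pool(activations, 2)    # lies between the average and the maximum
max_like = lp_pool(activations, 64)   # approaches max pooling as p grows

print(avg_like, smoothed, max_like, max(activations))
```

In this form, p interpolates between average pooling (p = 1) and max pooling (large p), so an intermediate p lets a bright response raise the pooled value without letting a single outlier dominate it. Note that PyTorch's `torch.nn.LPPool2d` uses the unnormalized variant, with a sum rather than a mean inside the root.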
2. Related Work
2.1. Infrared Small Target Detection
2.2. Lightweight Network
2.3. Width Extension Learning
2.4. Attention Mechanism
3. Methodology
3.1. Overall Architecture
3.2. Width Extension Module
3.3. Channel–Spatial Hybrid Attention
3.4. Fully Convolutional Network Head
4. Experiment
4.1. Implementation Details
4.2. Comparison to State-of-the-Art Methods
4.3. Comparison to Lightweight Methods
4.4. Deployment on Resource-Constrained Devices
4.5. Ablation Study
4.6. The Experimental Findings
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Zhao, M.; Li, W.; Li, L.; Hu, J.; Ma, P.; Tao, R. Single-frame infrared small-target detection: A survey. IEEE Geosci. Remote Sens. Mag. 2022, 10, 87–119.
2. Kou, R.; Wang, C.; Peng, Z.; Zhao, Z.; Chen, Y.; Han, J.; Huang, F.; Yu, Y.; Fu, Q. Infrared small target segmentation networks: A survey. Pattern Recognit. 2023, 143, 109788.
3. Deng, H.; Sun, X.; Liu, M.; Ye, C.; Zhou, X. Small infrared target detection based on weighted local difference measure. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4204–4214.
4. Zhang, J.; Tao, D. Empowering things with intelligence: A survey of the progress, challenges, and opportunities in artificial intelligence of things. IEEE Internet Things J. 2020, 8, 7789–7817.
5. Teutsch, M.; Krüger, W. Classification of small boats in infrared images for maritime surveillance. In Proceedings of the 2010 International Waterside Security Conference, Carrara, Italy, 3–5 November 2010; pp. 1–7.
6. Wu, X.; Hong, D.; Chanussot, J. UIU-Net: U-Net in U-Net for infrared small object detection. IEEE Trans. Image Process. 2022, 32, 364–376.
7. Rivest, J.F.; Fortin, R. Detection of dim targets in digital infrared imagery by morphological image processing. Opt. Eng. 1996, 35, 1886–1893.
8. Deshpande, S.D.; Er, M.H.; Venkateswarlu, R.; Chan, P. Max-mean and max-median filters for detection of small targets. In Signal and Data Processing of Small Targets 1999; SPIE: Bellingham, WA, USA, 1999; Volume 3809, pp. 74–83.
9. Qin, Y.; Bruzzone, L.; Gao, C.; Li, B. Infrared small target detection based on facet kernel and random walker. IEEE Trans. Geosci. Remote Sens. 2019, 57, 7104–7118.
10. Chen, C.P.; Li, H.; Wei, Y.; Xia, T.; Tang, Y.Y. A local contrast method for small infrared target detection. IEEE Trans. Geosci. Remote Sens. 2013, 52, 574–581.
11. Han, J.; Moradi, S.; Faramarzi, I.; Liu, C.; Zhang, H.; Zhao, Q. A local contrast method for infrared small-target detection utilizing a tri-layer window. IEEE Geosci. Remote Sens. Lett. 2019, 17, 1822–1826.
12. Han, J.; Moradi, S.; Faramarzi, I.; Zhang, H.; Zhao, Q.; Zhang, X.; Li, N. Infrared small target detection based on the weighted strengthened local contrast measure. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1670–1674.
13. Gao, C.; Meng, D.; Yang, Y.; Wang, Y.; Zhou, X.; Hauptmann, A.G. Infrared patch-image model for small target detection in a single image. IEEE Trans. Image Process. 2013, 22, 4996–5009.
14. Zhang, L.; Peng, L.; Zhang, T.; Cao, S.; Peng, Z. Infrared small target detection via non-convex rank approximation minimization joint l2,1 norm. Remote Sens. 2018, 10, 1821.
15. Dai, Y.; Wu, Y. Reweighted infrared patch-tensor model with both nonlocal and local priors for single-frame small target detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3752–3767.
16. Zhang, L.; Peng, Z. Infrared small target detection based on partial sum of the tensor nuclear norm. Remote Sens. 2019, 11, 382.
17. Sun, Y.; Yang, J.; An, W. Infrared dim and small target detection via multiple subspace learning and spatial-temporal patch-tensor model. IEEE Trans. Geosci. Remote Sens. 2020, 59, 3737–3752.
18. Wang, H.; Zhou, L.; Wang, L. Miss detection vs. false alarm: Adversarial learning for small object segmentation in infrared images. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8509–8518.
19. Dai, Y.; Wu, Y.; Zhou, F.; Barnard, K. Asymmetric contextual modulation for infrared small target detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 950–959.
20. Li, B.; Xiao, C.; Wang, L.; Wang, Y.; Lin, Z.; Li, M.; An, W.; Guo, Y. Dense nested attention network for infrared small target detection. IEEE Trans. Image Process. 2022, 32, 1745–1758.
21. Zhang, M.; Zhang, R.; Yang, Y.; Bai, H.; Zhang, J.; Guo, J. ISNet: Shape matters for infrared small target detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 877–886.
22. Wu, T.; Li, B.; Luo, Y.; Wang, Y.; Xiao, C.; Liu, T.; Yang, J.; An, W.; Guo, Y. MTU-Net: Multilevel TransUNet for space-based infrared tiny ship detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15.
23. Zhang, T.; Li, L.; Cao, S.; Pu, T.; Peng, Z. Attention-guided pyramid context networks for detecting infrared small target under complex background. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 4250–4261.
24. Liu, Q.; Liu, R.; Zheng, B.; Wang, H.; Fu, Y. Infrared small target detection with scale and location sensitivity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 17490–17499.
25. Zhang, M.; Wang, Y.; Guo, J.; Li, Y.; Gao, X.; Zhang, J. IRSAM: Advancing segment anything model for infrared small target detection. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2024; pp. 233–249.
26. Zhang, R.; Xu, L.; Yu, Z.; Shi, Y.; Mu, C.; Xu, M. Deep-IRTarget: An automatic target detector in infrared imagery using dual-domain feature extraction and allocation. IEEE Trans. Multimed. 2021, 24, 1735–1749.
27. Zhang, R.; Yang, B.; Xu, L.; Huang, Y.; Xu, X.; Zhang, Q.; Jiang, Z.; Liu, Y. A benchmark and frequency compression method for infrared few-shot object detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5001711.
28. Yuan, S.; Qin, H.; Yan, X.; Akhtar, N.; Mian, A. SCTransNet: Spatial-channel cross transformer network for infrared small target detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–15.
29. Liu, Y.; Ma, Z.; Zhu, W.; Li, N.; Li, C.; Xiong, K.; Wang, Z.; Feng, W.; Jiang, J.; Quan, Y. Forgetting the background: A masking approach for enhanced infrared small target detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5005615.
30. Cheng, K.; Ma, T.; Fei, R.; Li, J. A Lightweight Feature Enhancement Model for Infrared Small Target Detection. IEEE Sens. J. 2025, 25, 15224–15234.
31. Li, B.; Wang, Y.; Wang, L.; Zhang, F.; Liu, T.; Lin, Z.; An, W.; Guo, Y. Monte Carlo linear clustering with single-point supervision is enough for infrared small target detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 1009–1019.
32. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
33. Liu, M.; Du, H.Y.; Zhao, Y.J.; Dong, L.Q.; Hui, M.; Wang, S.X. Image small target detection based on deep learning with SNR controlled sample generation. In Current Trends in Computer Science and Mechanical Automation; De Gruyter: Berlin, Germany, 2017; Volume 1, pp. 211–220.
34. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More features from cheap operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1580–1589.
35. Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V.; et al. Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
36. Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50× fewer parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360.
37. Fan, M.; Lai, S.; Huang, J.; Wei, X.; Chai, Z.; Luo, J.; Wei, X. Rethinking BiSeNet for real-time semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 9716–9725.
38. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 6848–6856.
39. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 4510–4520.
40. Tan, M.; Le, Q.V. MixConv: Mixed depthwise convolutional kernels. arXiv 2019, arXiv:1907.09595.
41. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
42. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
43. Ma, N.; Zhang, X.; Zheng, H.T.; Sun, J. ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 116–131.
44. Zhao, M.; Cheng, L.; Yang, X.; Feng, P.; Liu, L.; Wu, N. TBC-Net: A real-time detector for infrared small target detection using semantic constraint. arXiv 2019, arXiv:2001.05852.
45. Wu, S.; Xiao, C.; Wang, L.; Wang, Y.; Yang, J.; An, W. RepISD-Net: Learning efficient infrared small-target detection network via structural re-parameterization. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5622712.
46. Kou, R.; Wang, C.; Yu, Y.; Peng, Z.; Yang, M.; Huang, F.; Fu, Q. LW-IRSTNet: Lightweight infrared small target segmentation network and application deployment. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5621313.
47. Lu, Z.; Pu, H.; Wang, F.; Hu, Z.; Wang, L. The expressive power of neural networks: A view from the width. Adv. Neural Inf. Process. Syst. 2017, 30.
48. Cheng, H.T.; Koc, L.; Harmsen, J.; Shaked, T.; Chandra, T.; Aradhye, H.; Anderson, G.; Corrado, G.; Chai, W.; Ispir, M.; et al. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA, 15 September 2016; pp. 7–10.
49. Zagoruyko, S.; Komodakis, N. Wide residual networks. arXiv 2016, arXiv:1605.07146.
50. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
51. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
52. Hou, Q.; Wang, Z.; Tan, F.; Zhao, Y.; Zheng, H.; Zhang, W. RISTDnet: Robust infrared small target detection network. IEEE Geosci. Remote Sens. Lett. 2021, 19, 7000805.
53. Sun, H.; Bai, J.; Yang, F.; Bai, X. Receptive-field and direction induced attention network for infrared dim small target detection with a large-scale dataset IRDST. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5000513.
54. Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 510–519.
55. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
56. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
57. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
58. Dai, Y.; Wu, Y.; Zhou, F.; Barnard, K. Attentional local contrast networks for infrared small target detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 9813–9824.
59. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
60. Dai, Y.; Gieseke, F.; Oehmcke, S.; Wu, Y.; Barnard, K. Attentional feature fusion. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 3560–3569.










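| Method | Params (M) | FLOPs (G) | SIRST FPS | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST FPS | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K FPS | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |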
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Top-Hat [7] | - | - | - | 7.142 | 79.841 | 10,120.03 | - | 20.724 | 78.408 | 1667.04 | - | 10.062 | 75.108 | 14,320.03 |
| Max-Median [8] | - | - | - | 1.168 | 30.196 | 553.32 | - | 4.201 | 58.413 | 368.88 | - | 7.003 | 65.213 | 597.31 |
| WSLCM [12] | - | - | - | 1.021 | 80.987 | 458,461.64 | - | 0.848 | 74.574 | 523,916.33 | - | 0.989 | 70.026 | 150,270.84 |
| TLLCM [11] | - | - | - | 11.034 | 79.473 | 72.68 | - | 7.059 | 62.014 | 461.18 | - | 5.357 | 63.966 | 49.28 |
| IPI [13] | - | - | - | 25.674 | 85.551 | 114.7 | - | 17.758 | 74.486 | 412.3 | - | 27.923 | 81.374 | 161.83 |
| NRAM [14] | - | - | - | 12.164 | 74.523 | 138.52 | - | 6.931 | 56.403 | 192.67 | - | 15.249 | 70.677 | 169.26 |
| RIPT [15] | - | - | - | 11.048 | 79.077 | 226.12 | - | 29.441 | 91.85 | 3443.03 | - | 14.106 | 77.548 | 283.1 |
| PSTNN [16] | - | - | - | 22.401 | 77.953 | 291.09 | - | 14.848 | 66.132 | 441.7 | - | 24.573 | 71.988 | 352.61 |
| MSLSTIPT [17] | - | - | - | 10.302 | 82.128 | 11,310.02 | - | 8.341 | 47.399 | 881.02 | - | 11.432 | 79.027 | 1524.004 |
| ISNet [21] | 0.966 | 30.618 | 19 | 70.491 | 95.057 | 6.798 | 24 | 81.236 | 97.778 | 0.634 | 13 | 61.852 | 90.236 | 3.156 |
| DNA–Net [20] | 4.697 | 14.261 | 21 | 76.169 | 97.338 | 1.454 | 33 | 93.331 | 98.672 | 0.549 | 12 | 65.466 | 94.276 | 1.615 |
| UIU–Net [6] | 50.540 | 54.426 | 16 | 76.187 | 95.057 | 1.077 | 32 | 92.393 | 97.989 | 0.356 | 11 | 64.200 | 89.226 | 2.517 |
| AGPCNet [23] | 12.360 | 43.181 | 14 | 69.730 | 94.677 | 1.604 | 15 | 73.910 | 97.672 | 2.321 | 14 | 61.382 | 84.014 | 2.057 |
| MTU–Net [22] | 8.221 | 99.437 | 10 | 69.081 | 97.719 | 3.500 | 15 | 79.024 | 97.884 | 2.874 | 9 | 61.401 | 90.416 | 2.874 |
| MSHNet [24] | 4.065 | 6.110 | 31 | 75.116 | 92.015 | 2.257 | 47 | 85.416 | 97.566 | 1.841 | 24 | 64.268 | 90.219 | 1.845 |
| SCTransNet [28] | 11.190 | 10.120 | 9 | 76.277 | 96.192 | 2.046 | 13 | 93.472 | 98.280 | 0.625 | 8 | 66.382 | 91.527 | 1.015 |
| BGM [29] | 4.076 | 6.773 | 32 | 75.372 | 95.502 | 4.175 | 50 | 81.262 | 95.793 | 1.225 | 26 | 63.718 | 90.498 | 1.896 |
| HFMNet [30] | 6.090 | 11.190 | 10 | 74.427 | 95.762 | 5.991 | 17 | 87.729 | 96.957 | 3.617 | 10 | 66.518 | 91.726 | 1.066 |
| WSNet (Ours) | 0.054 | 1.050 | 119 | 76.929 | 96.578 | 2.751 | 146 | 91.194 | 98.730 | 0.432 | 60 | 64.467 | 90.572 | 2.725 |
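IoU is one of the accuracy metrics quoted in the highlights. As a minimal sketch of how a pixel-level IoU can be computed for a single image, the example below compares two binary masks; the mask layout, the function name, and the handling of empty masks are assumptions, and the paper's dataset-level aggregation is not reproduced here.

```python
# Pixel-level IoU between a predicted and a ground-truth binary mask.
# Assumption: masks are equally sized 2D lists of 0/1 values; how the paper
# aggregates over a dataset is not shown here.

def pixel_iou(pred, target):
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention chosen here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 4x4 example: a 2x2 target, prediction shifted by one column.
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

iou = pixel_iou(pred, target)
print(f"IoU = {iou:.3f}")
```

For this 2×2 target, a one-pixel shift leaves 2 overlapping pixels out of a union of 6, so the IoU falls to one third. Small targets make the metric very sensitive to localization error.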
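
| Method | Params (M) | FLOPs (G) | SIRST FPS | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST FPS | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K FPS | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |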
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACM [19] | 0.398 | 0.402 | 68 | 67.942 | 91.255 | 2.957 | 72 | 68.902 | 95.979 | 1.342 | 43 | 62.281 | 90.909 | 4.648 |
| ALCNet [58] | 0.427 | 0.378 | 56 | 67.437 | 90.494 | 2.511 | 83 | 63.917 | 94.603 | 2.188 | 47 | 60.530 | 90.236 | 4.097 |
| ISNet [21] | 0.966 | 30.618 | 19 | 70.491 | 95.057 | 6.798 | 24 | 81.236 | 97.778 | 0.634 | 13 | 61.852 | 90.236 | 3.156 |
| RDIAN [53] | 0.217 | 3.718 | 83 | 69.190 | 92.395 | 3.416 | 97 | 78.639 | 96.508 | 1.744 | 42 | 60.613 | 87.205 | 3.004 |
| LW–IRSTNet [46] | 0.163 | 0.304 | 50 | 68.340 | 95.057 | 1.476 | 57 | 61.411 | 91.640 | 3.481 | 51 | 51.094 | 84.014 | 0.432 |
| RepISD [45] | 0.309 | 7.052 | 76 | 73.821 | 95.057 | 5.310 | 96 | 91.984 | 98.201 | 0.699 | 45 | 60.357 | 94.613 | 7.755 |
| HFMNet–Tiny [30] | 1.540 | 2.900 | 26 | 74.611 | 93.869 | 6.552 | 39 | 85.402 | 94.441 | 5.779 | 21 | 64.094 | 89.967 | 1.950 |
| WSNet (Ours) | 0.054 | 1.050 | 119 | 76.929 | 96.578 | 2.751 | 146 | 91.194 | 98.730 | 0.432 | 60 | 64.467 | 90.572 | 2.725 |

| Method | DNA-Net | MSHNet | SCTransNet | RepISD | HFMNet-Tiny | WSNet |
|---|---|---|---|---|---|---|
| FPS | 5 | 7 | 2 | 14 | 6 | 30 |
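The frame rates in the table above can be turned into relative speedups with a few lines of arithmetic. This is a minimal sketch that copies the FPS values from the table, assumes every entry was measured under the same conditions, and assumes the table reports the CPU frame rates referred to in the highlights; the variable names are illustrative.

```python
# Relative throughput, using the FPS values from the table above.
# Assumption: all entries were measured under the same conditions.

cpu_fps = {
    "DNANet": 5,
    "MSHNet": 7,
    "SCTransNet": 2,
    "RepISD": 14,
    "HFMNet-Tiny": 6,
    "WSNet": 30,
}

reference = cpu_fps["WSNet"]
speedup = {
    name: reference / fps
    for name, fps in cpu_fps.items()
    if name != "WSNet"
}

# List the comparisons from the smallest to the largest speedup.
for name, factor in sorted(speedup.items(), key=lambda item: item[1]):
    print(f"WSNet is {factor:.1f}x faster than {name}")
```

With these values the speedup ranges from about 2.1× (over RepISD) to 15× (over SCTransNet).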

| ResBlock | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 73.606 | 93.536 | 3.451 | 85.027 | 97.884 | 1.539 | 60.192 | 86.195 | 1.587 |
| 2 | 75.005 | 95.060 | 3.598 | 87.906 | 98.942 | 1.709 | 63.459 | 89.521 | 2.881 |
| 3 | 75.236 | 95.437 | 2.380 | 87.244 | 98.518 | 1.974 | 62.117 | 88.215 | 2.243 |
| 4 | 75.643 | 96.198 | 3.629 | 86.898 | 98.941 | 3.345 | 61.789 | 87.542 | 1.245 |
| 5 | 73.270 | 92.776 | 4.157 | - | - | - | - | - | - |
| 6 | 72.220 | 94.676 | 5.858 | - | - | - | - | - | - |

| Method | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |
|---|---|---|---|---|---|---|---|---|---|
| w/o WEM+CSHA | 51.494 | 85.932 | 12.629 | 69.052 | 87.196 | 11.635 | 38.742 | 80.808 | 15.500 |
| w/o WEM | 66.561 | 91.255 | 5.859 | 82.230 | 97.519 | 3.152 | 47.251 | 68.687 | 2.175 |
| w/o CSHA | 66.532 | 92.395 | 7.148 | 85.421 | 96.085 | 5.113 | 57.635 | 89.562 | 6.859 |
| WSNet | 76.929 | 96.578 | 2.751 | 91.194 | 98.730 | 0.432 | 64.467 | 90.572 | 2.725 |

| Number | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 66.561 | 91.255 | 5.859 | 82.230 | 97.519 | 3.152 | 47.251 | 68.687 | 2.175 |
| 1 | 67.903 | 95.817 | 7.086 | 84.596 | 98.624 | 2.847 | 48.748 | 89.226 | 13.928 |
| 2 | 71.544 | 96.958 | 4.315 | 85.487 | 98.836 | 1.534 | 57.215 | 81.481 | 1.892 |
| 3 | 73.237 | 96.198 | 3.382 | 86.104 | 98.307 | 1.021 | 58.775 | 84.175 | 1.353 |
| 4 | 75.824 | 97.338 | 2.586 | 88.625 | 98.624 | 0.845 | 59.640 | 86.195 | 1.539 |
| 5 | 75.712 | 95.635 | 2.720 | 90.012 | 98.519 | 0.622 | 62.272 | 88.215 | 2.422 |
| 6 | 76.929 | 96.578 | 2.751 | 91.194 | 98.730 | 0.432 | 64.467 | 90.572 | 2.725 |
| 7 | 75.842 | 93.550 | 3.023 | 90.352 | 98.112 | 0.912 | 63.742 | 88.409 | 2.450 |
| 8 | 75.224 | 95.621 | 3.200 | 90.082 | 96.992 | 1.020 | 62.875 | 89.710 | 2.818 |

| Type | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |
|---|---|---|---|---|---|---|---|---|---|
| baseline | 66.532 | 92.395 | 7.148 | 85.421 | 96.085 | 5.113 | 57.635 | 89.562 | 6.859 |
| + CA (SENet [55]) | 68.652 | 93.916 | 7.265 | 88.986 | 98.730 | 1.847 | 59.173 | 85.859 | 1.961 |
| + SA | 68.228 | 93.536 | 5.838 | 84.861 | 96.402 | 2.195 | 59.490 | 91.246 | 4.813 |
| +CBAM [32] | 73.279 | 93.536 | 4.123 | 89.856 | 98.730 | 1.081 | 62.181 | 86.195 | 1.959 |
| +CSHA (Ours) | 76.929 | 96.578 | 2.751 | 91.194 | 98.730 | 0.432 | 64.467 | 90.572 | 2.725 |
| | 73.473 | 94.704 | 4.233 | 90.142 | 98.116 | 0.824 | 63.072 | 90.351 | 3.893 |
| | 76.929 | 96.578 | 2.751 | 91.194 | 98.730 | 0.432 | 64.467 | 90.572 | 2.725 |
| | 74.679 | 96.192 | 3.424 | 89.758 | 96.667 | 0.936 | 63.861 | 90.385 | 3.367 |
| | 73.465 | 94.252 | 4.794 | 88.743 | 95.975 | 1.269 | 62.460 | 88.729 | 3.479 |

| Type | Choice | SIRST IoU | SIRST Pd | SIRST Fa | NUDT-SIRST IoU | NUDT-SIRST Pd | NUDT-SIRST Fa | IRSTD-1K IoU | IRSTD-1K Pd | IRSTD-1K Fa |
|---|---|---|---|---|---|---|---|---|---|---|
| low-level and high-level feature fusion choice | AFF-ResBlock [60] | 76.810 | 95.967 | 2.031 | 91.364 | 98.389 | 0.367 | 64.251 | 90.176 | 2.456 |
| low-level and high-level feature fusion choice | iAFF-ResBlock [60] | 75.802 | 95.143 | 2.769 | 90.730 | 98.107 | 0.958 | 64.002 | 89.793 | 2.675 |
| low-level and high-level feature fusion choice | skip connection (concatenation) | 74.672 | 94.446 | 3.110 | 90.497 | 97.142 | 1.097 | 62.501 | 88.775 | 2.729 |
| low-level and high-level feature fusion choice | skip connection (summation) | 76.929 | 96.578 | 2.751 | 91.194 | 98.730 | 0.432 | 64.467 | 90.572 | 2.725 |
| WEM's multi-scale features fusion choice | concatenation | 75.619 | 94.160 | 3.493 | 90.223 | 98.712 | 1.174 | 63.290 | 89.990 | 2.853 |
| WEM's multi-scale features fusion choice | summation | 76.929 | 96.578 | 2.751 | 91.194 | 98.730 | 0.432 | 64.467 | 90.572 | 2.725 |
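The table above compares summation with concatenation for fusing features. The structural difference between the two can be shown with a minimal sketch: summation keeps the channel count fixed, while concatenation multiplies it by the number of branches and therefore enlarges whatever layer consumes the fused features. The branch count, channel count, and the follow-on 1×1 convolution are hypothetical and are not taken from the paper.

```python
# Summation versus concatenation when fusing parallel branch outputs.
# Assumptions: three branches and four channels are hypothetical values; each
# branch output is a list of per-channel values at one spatial position.

def fuse_by_summation(branches):
    """Element-wise sum: the channel count stays the same."""
    return [sum(channel_values) for channel_values in zip(*branches)]


def fuse_by_concatenation(branches):
    """Channel-wise concatenation: channel counts add up."""
    return [value for branch in branches for value in branch]


def pointwise_conv_weights(in_channels, out_channels):
    """Weight count of a 1x1 convolution without bias."""
    return in_channels * out_channels


branches = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, 0.1, 0.0, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]

summed = fuse_by_summation(branches)
concatenated = fuse_by_concatenation(branches)

out_channels = 4  # hypothetical width of the layer that follows the fusion
print(len(summed), pointwise_conv_weights(len(summed), out_channels))
print(len(concatenated), pointwise_conv_weights(len(concatenated), out_channels))
```

In this toy setting, the layer after a summation needs 16 weights and the layer after a concatenation needs 48. In the table, summation also gives the higher value in the first metric column for each of the three datasets.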
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Lu, P.; Luo, Y.; Zhang, X.; Jia, H.; Xia, S.; Liu, Y. A Wide and Shallow Network Tailored for Infrared Small Target Detection. Remote Sens. 2026, 18, 307. https://doi.org/10.3390/rs18020307

