DTRFR: A Unified Detector for Diverse Target Detection in High-Spatial-Resolution Spaceborne Infrared Video
Highlights
- A unified end-to-end framework (DTRFR) is developed for mixed-size infrared small-target detection in high-spatial-resolution spaceborne videos.
- Multi-scale feature extraction and adaptive temporal alignment are jointly employed to enhance robustness under size variation and dynamic backgrounds.
- The proposed framework enables reliable detection of both in-distribution and distribution-shift targets in realistic spaceborne infrared scenarios.
- This work provides a practical foundation for high-spatial-resolution spaceborne infrared video analysis and remains compatible with existing multi-frame target detection requirements.
Abstract
1. Introduction
1.1. Single-Frame Infrared Small-Target Detection (SIRST)
1.2. Multi-Frame Infrared Small-Target Detection (MIRST)
- Construction of the SITP-QLSD dataset using real QLSAT-2 infrared backgrounds, featuring diverse scenes, mixed-size targets (–), and a generalization sub-test set (–) with extremely small targets, filling the gap in evaluating size-difference impacts.
- Design of a multi-scale IRFeatureExtractor using serial-to-parallel convolutions and dilated receptive fields to enhance cross-scale discriminability and clutter suppression.
- Proposal of an adaptive Gating Pyramid Deformable Alignment mechanism to optimize multi-frame feature alignment and improve detection robustness in sequences with dynamic backgrounds.
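The effect of combining serial convolutions with parallel dilated branches can be illustrated with simple receptive-field arithmetic. The sketch below is not the IRFeatureExtractor itself: the layer configuration (`serial_stem`, `parallel_dilations`) is hypothetical and chosen only to show how dilation widens the receptive field without adding kernel weights, assuming stride-1 convolutions throughout.

```python
def effective_kernel(kernel: int, dilation: int) -> int:
    """Span in pixels covered by one dilated convolution kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


def serial_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied one after another.

    `layers` is a list of (kernel, dilation) pairs.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += effective_kernel(kernel, dilation) - 1
    return rf


# Hypothetical configuration, for illustration only (not taken from the paper).
serial_stem = [(3, 1), (3, 1)]   # two plain 3x3 convolutions in series
parallel_dilations = [1, 2, 4]   # three parallel 3x3 branches with these dilations

stem_rf = serial_receptive_field(serial_stem)
branch_rfs = [
    serial_receptive_field(serial_stem + [(3, d)]) for d in parallel_dilations
]

print("stem receptive field:", stem_rf)
print("branch receptive fields:", branch_rfs)
```

Under these assumptions the parallel branches see neighborhoods of different sizes at the same depth, which is the property a mixed-size target detector relies on.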
2. Materials and Methods
2.1. Overall Network Architecture
2.1.1. IRFeatureExtractor
2.1.2. Adaptive Gating Pyramid Deformable Alignment (AGPDA)
2.1.3. Loss Function
3. Results
3.1. Dataset Preparation
3.2. Evaluation Metrics
3.2.1. Target-Level Metrics
- Detection Probability (Pd): The fraction of ground-truth targets that are successfully detected: $Pd = T_{\text{correct}} / T_{\text{all}}$, where $T_{\text{correct}}$ is the number of correctly matched ground-truth targets and $T_{\text{all}}$ is the total number of ground-truth targets.
- False Alarm Rate (FA): The average number of falsely detected instances per image (FA per image) [47], which is the most widely adopted form in infrared small-target detection literature for its direct engineering interpretability: $FA = P_{\text{false}} / P_{\text{all}}$, where $P_{\text{false}}$ denotes the number of background pixels incorrectly classified as target pixels and $P_{\text{all}}$ represents the total number of pixels in the image.
- Receiver Operating Characteristic (ROC) Curve: The trade-off curve obtained by plotting Pd against FA while varying the detection confidence threshold. The area under this curve serves as a threshold-independent indicator of the model’s ability to achieve high detection rates while maintaining low false alarm levels.
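The target-level metrics above can be sketched in a few lines. This is a minimal illustration, assuming that target matching has already been done and that FA is computed as the ratio of false-alarm pixels to total pixels, as the variable definitions suggest. The function names and the example counts are hypothetical.

```python
def detection_probability(matched_targets: int, total_targets: int) -> float:
    """Pd: correctly matched ground-truth targets over all ground-truth targets."""
    return matched_targets / total_targets if total_targets else 0.0


def false_alarm_rate(false_pixels: int, total_pixels: int) -> float:
    """FA: background pixels flagged as target over all pixels."""
    return false_pixels / total_pixels if total_pixels else 0.0


def roc_area(points):
    """Trapezoidal area under a list of (FA, Pd) points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical counts at three confidence thresholds: (matched targets, false pixels).
total_targets = 100
total_pixels = 1_000_000
counts = [(60, 10), (85, 50), (95, 400)]

roc_points = [
    (false_alarm_rate(fp, total_pixels), detection_probability(m, total_targets))
    for m, fp in counts
]
for fa, pd in roc_points:
    print(f"FA = {fa * 1e5:.2f} x 1e-5, Pd = {pd * 100:.2f} %")
print("partial area under the curve:", roc_area(roc_points))
```

Lowering the threshold moves along the curve toward higher Pd and higher FA. The area is only meaningful when compared over the same FA range.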
3.2.2. Pixel-Level Metrics
- Mean Intersection over Union (mIoU): Averages the per-sample IoU across all test images, effectively mitigating the dominance of larger targets: $\mathrm{mIoU} = \frac{1}{N} \sum_{i=1}^{N} \frac{TP_i}{TP_i + FP_i + FN_i}$, where $N$ is the total number of samples, and $TP_i$, $FP_i$, and $FN_i$ are the true positive, false positive, and false negative pixels in the $i$-th sample.
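A short sketch shows why per-sample averaging limits the influence of large targets. The pixel counts are hypothetical, and the convention of returning 0 for an empty sample is an assumption rather than something the text specifies.

```python
def sample_iou(tp: int, fp: int, fn: int) -> float:
    """IoU of one sample from its pixel counts (0.0 for an empty sample, by assumption)."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(samples) -> float:
    """Per-sample mIoU: average of the individual IoU values."""
    return sum(sample_iou(*s) for s in samples) / len(samples)


def pooled_iou(samples) -> float:
    """Alternative for comparison: IoU over pixel counts pooled across all samples."""
    tp = sum(s[0] for s in samples)
    fp = sum(s[1] for s in samples)
    fn = sum(s[2] for s in samples)
    return sample_iou(tp, fp, fn)


# Hypothetical (TP, FP, FN) counts: one large, well-segmented target and one tiny target.
samples = [(900, 50, 50), (2, 1, 1)]

print("per-sample mIoU:", mean_iou(samples))
print("pooled IoU:", pooled_iou(samples))
```

The pooled value is dominated by the large target, whereas the per-sample average gives the tiny target equal weight.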
3.3. Quantitative Results
| Frame | Methods | Pd (%) | FA (×10⁻⁵) | mIoU (%) | Time (ms) |
|---|---|---|---|---|---|
| Single Frame | ACM | 67.26 | 84.38 | 32.54 | 1.39 |
| | ALCNet | 67.18 | 72.86 | 34.46 | 1.34 |
| | DNANet | 61.65 | 20.24 | 44.23 | 17.54 |
| | ISTUD-UNet | 70.90 | 38.40 | 44.40 | 4.15 |
| | ResUNet | 63.14 | 25.37 | 43.42 | 0.10 |
| | UIU-Net | 88.97 | 4356.07 | 2.38 | 7.00 |
| | MSHNet | 61.31 | 27.16 | 42.22 | 7.92 |
| Multi Frame | RFR+ACM | 69.52 | 40.06 | 42.53 | 2.21 |
| | RFR+ALCNet | 74.24 | 82.17 | 35.39 | 2.16 |
| | RFR+DNANet | 63.08 | 15.71 | 46.36 | 18.71 |
| | RFR+ResUNet | 68.59 | 30.26 | 45.20 | 2.06 |
| | Ours | 94.51 | 5.88 | 74.32 | 2.68 |

| Frame | Methods | Pd (%) | FA (×10⁻⁵) | mIoU (%) |
|---|---|---|---|---|
| Single Frame | ACM | 63.99 | 132.27 | 15.41 |
| | ALCNet | 65.88 | 68.68 | 20.24 |
| | DNANet | 59.13 | 7.28 | 26.76 |
| | ISTUD-UNet | 64.44 | 24.27 | 22.97 |
| | ResUNet | 63.92 | 23.58 | 25.67 |
| | UIU-Net | 90.53 | 4296.88 | 4.29 |
| | MSHNet | 67.35 | 20.48 | 26.25 |
| Multi Frame | RFR+ACM | 72.34 | 35.67 | 24.10 |
| | RFR+ALCNet | 74.32 | 68.22 | 21.32 |
| | RFR+DNANet | 68.75 | 30.39 | 23.73 |
| | RFR+ResUNet | 70.95 | 22.60 | 22.63 |
| | Ours | 92.47 | 62.82 | 23.80 |

| Frame | Methods | SITP-QLSD Pd (%) | SITP-QLSD FA (×10⁻⁵) | SITP-QLSD mIoU (%) | Dataset 3 Pd (%) | Dataset 3 FA (×10⁻⁵) | Dataset 3 mIoU (%) |
|---|---|---|---|---|---|---|---|
| Single Frame | ACM | 65.65 | 104.38 | 24.67 | 93.90 | 1.91 | 25.10 |
| | ALCNet | 66.54 | 71.13 | 28.67 | 94.50 | 2.60 | 30.43 |
| | DNANet | 60.42 | 14.88 | 37.67 | 94.85 | 2.36 | 29.42 |
| | ISTUD-UNet | 67.73 | 32.02 | 36.66 | 94.37 | 7.23 | 28.75 |
| | ResUNet | 63.52 | 24.63 | 36.27 | 93.56 | 1.28 | 30.84 |
| | UIU-Net | 89.74 | 4331.57 | 1.90 | 83.12 | 859.31 | 3.80 |
| | MSHNet | 64.29 | 25.71 | 35.71 | 95.07 | 2.43 | 28.57 |
| Multi Frame | RFR+ACM | 70.91 | 38.24 | 34.85 | 93.41 | 3.73 | 30.10 |
| | RFR+ALCNet | 74.28 | 76.40 | 29.76 | 93.90 | 1.24 | 33.43 |
| | RFR+DNANet | 65.87 | 21.79 | 36.29 | 94.88 | 2.32 | 35.27 |
| | RFR+ResUNet | 69.74 | 27.09 | 37.53 | 94.60 | 2.14 | 31.79 |
| | Ours | 93.51 | 29.45 | 47.19 | 95.14 | 3.27 | 36.72 |
3.4. Visualization Results
3.5. Ablation Study
- Full model (only weight factor): The tanh operation is removed, reducing the gating layer to a simple linear channel-wise weighting.
- Full model: The complete model, which incorporates the proposed bounded non-linear gating mechanism.
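The difference between the two ablation variants can be sketched as follows. The exact gating formula is not given here, so the sketch assumes the simplest reading: a per-channel learned weight `w` is used directly in the linear variant and passed through tanh in the bounded variant. The function name, the feature values, and the weights are all hypothetical.

```python
import math


def gate_channels(channels, weights, bounded=True):
    """Scale each channel by a per-channel gate.

    bounded=True  -> gate = tanh(w), confined to (-1, 1)  (assumed form)
    bounded=False -> gate = w, plain linear channel-wise weighting
    """
    gated = []
    for values, w in zip(channels, weights):
        gate = math.tanh(w) if bounded else w
        gated.append([gate * v for v in values])
    return gated


# Hypothetical two-channel feature and learned weights, for illustration only.
channels = [[1.0, 2.0], [0.5, -1.0]]
weights = [3.0, -0.2]

linear = gate_channels(channels, weights, bounded=False)
bounded = gate_channels(channels, weights, bounded=True)

print("linear weighting: ", linear)
print("bounded gating:   ", bounded)
```

With the bounded gate, a large learned weight cannot amplify a channel beyond its input magnitude, which is one plausible reason a bounded gate would behave more stably than a free linear weight.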
4. Discussion
- 1. Idealized target modeling
- 2. Modest gains on near-distribution small targets
5. Conclusions
- A realistic SITP-QLSD dataset is constructed from QLSAT-2 infrared backgrounds, featuring diverse scenes, mixed-size small targets, and a dedicated generalization sub-test set with extremely small targets, providing a reliable benchmark for evaluating size-diverse and generalization detection in complex spaceborne scenarios.
- The multi-scale IRFeatureExtractor module, leveraging serial-to-parallel convolutions and dilated receptive fields, effectively enhances cross-scale feature representation and clutter suppression, enabling accurate target discrimination across different sizes.
- The adaptive gating pyramid deformable alignment mechanism optimizes multi-frame feature alignment through adaptive gating modulation, enhancing temporal coherence and delivering superior overall robustness.
- Extensive experiments demonstrate superior detection accuracy and false alarm suppression compared to single-frame and multi-frame baselines, with mIoU of 74.32% and Pd of 94.51% on the main set, and robust Pd of 92.37% on the generalization sub-test set. The core strength lies in learning multi-scale features, which enables reliable detection of extremely small targets under severe distribution shifts.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Sun, Y.; Yang, J.; An, W. Infrared Dim and Small Target Detection via Multiple Subspace Learning and Spatial-Temporal Patch-Tensor Model. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3737–3752.
- Zhao, M.; Li, L.; Li, W.; Tao, R.; Li, L.; Zhang, W. Infrared Small-Target Detection Based on Multiple Morphological Profiles. IEEE Trans. Geosci. Remote Sens. 2021, 59, 6077–6091.
- Wu, P.; Huang, H.; Qian, H.; Su, S.; Sun, B.; Zuo, Z. SRCANet: Stacked Residual Coordinate Attention Network for Infrared Ship Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5003614.
- Cui, Y.; Lei, T.; Chen, G.; Zhang, Y.; Zhang, G.; Hao, X. Infrared Small Target Detection via Modified Fast Saliency and Weighted Guided Image Filtering. Sensors 2025, 25, 4405.
- Driggers, R.; Pollak, E.; Grimming, R.; Velazquez, E.; Short, R.; Holst, G.; Furxhi, O. Detection of Small Targets in the Infrared: An Infrared Search and Track Tutorial. Appl. Opt. 2021, 60, 4762–4777.
- Guo, L.; Rao, P.; Gao, C.; Su, Y.; Li, F.; Chen, X. Adaptive Differential Event Detection for Space-Based Infrared Aerial Targets. Remote Sens. 2025, 17, 845.
- He, H.; Wan, M.; Xu, Y.; Kong, X.; Liu, Z.; Chen, Q.; Gu, G. WTAPNet: Wavelet Transform-Based Augmented Perception Network for Infrared Small-Target Detection. IEEE Trans. Instrum. Meas. 2024, 73, 5037217.
- Wang, Y.; Cao, L.; Su, K.; Dai, D.; Li, N.; Wu, D. Infrared Moving Small Target Detection Based on Space–Time Combination in Complex Scenes. Remote Sens. 2023, 15, 5380.
- Parry, I.; Hawker, G.; Gomez-Jenkins, M.; Goncalves, M.; Ang, E.; Barkhuysen, R.; Desborough, P.; Donaghy, J.; Dovhalenko, T.; Gonzalez, S.; et al. Innovative Technologies for Very-High-Resolution MWIR and LWIR Earth Observations. In Small Satellites Systems and Services Symposium (4S 2024); SPIE: Bellingham, WA, USA, 2025; Volume 13546, pp. 632–642.
- Fevgas, G.; Lagkas, T.; Argyriou, V.; Sarigiannidis, P. New vegetation stress assessment approach via WorldView-3 imagery, validated with UAV thermal imaging. Int. J. Remote Sens. 2025, 46, 4764–4780.
- Lin, M.; Jin, M.; Li, J.; Bai, Y. GEOSatDB: Global Civil Earth Observation Satellite Semantic Database. Big Earth Data 2024, 8, 522–539.
- Chapple, P.B.; Bertilone, D.C.; Caprari, R.S.; Angeli, S.; Newsam, G.N. Target Detection in Infrared and SAR Terrain Images Using a Non-Gaussian Stochastic Model. In Targets and Backgrounds: Characterization and Representation V; SPIE: Bellingham, WA, USA, 1999; Volume 3699, pp. 122–132.
- Zhao, M.; Li, W.; Li, L.; Hu, J.; Ma, P.; Tao, R. Single-Frame Infrared Small-Target Detection: A Survey. IEEE Geosci. Remote Sens. Mag. 2022, 10, 87–119.
- Arce, G.; McLoughlin, M. Theoretical Analysis of the Max/Median Filter. IEEE Trans. Acoust. Speech Signal Process. 1987, 35, 60–69.
- Chen, T.; Wu, Q.H.; Rahmani-Torkaman, R.; Hughes, J. A pseudo top-hat mathematical morphological approach to edge detection in dark regions. Pattern Recognit. 2002, 35, 199–210.
- Bai, X.; Zhou, F. Analysis of new top-hat transformation and the application for infrared dim small target detection. Pattern Recognit. 2010, 43, 2145–2156.
- Strickland, R.N.; Hahn, H.I. Wavelet transform methods for object detection and recovery. IEEE Trans. Image Process. 1997, 6, 724–735.
- Qi, S.; Ma, J.; Li, H.; Zhang, S.; Tian, J. Infrared small target enhancement via phase spectrum of quaternion Fourier transform. Infrared Phys. Technol. 2014, 62, 50–58.
- Ren, K.; Song, C.; Miao, X.; Wan, M.; Xiao, J.; Gu, G.; Chen, Q. Infrared small target detection based on non-subsampled shearlet transform and phase spectrum of quaternion Fourier transform. Opt. Quantum Electron. 2020, 52, 168.
- Xu, Y.; Shao, A.; Kong, X.; Wu, J.; Chen, Q.; Gu, G.; Wan, M. Infrared small target detection based on sub-maximum filtering and local intensity weighted gradient measure. IEEE Sens. J. 2024, 24, 22236–22248.
- Han, J.; Liang, K.; Zhou, B.; Zhu, X.; Zhao, J.; Zhao, L. Infrared small target detection utilizing the multiscale relative local contrast measure. IEEE Geosci. Remote Sens. Lett. 2018, 15, 612–616.
- Guan, X.; Peng, Z.; Huang, S.; Chen, Y. Gaussian scale-space enhanced local contrast measure for small infrared target detection. IEEE Geosci. Remote Sens. Lett. 2020, 17, 327–331.
- Chen, L.; Rao, P.; Chen, X. Infrared dim target detection method based on local feature contrast and energy concentration degree. Optik 2021, 248, 167651.
- Chen, Z.; Luo, S.; Xie, T.; Liu, J.; Wang, G.; Lei, G. A novel infrared small target detection method based on BEMD and local inverse entropy. Infrared Phys. Technol. 2014, 66, 114–124.
- Gao, C.; Meng, D.; Yang, Y.; Wang, Y.; Zhou, X.; Hauptmann, A.G. Infrared patch-image model for small target detection in a single image. IEEE Trans. Image Process. 2013, 22, 4996–5009.
- Dai, Y.; Wu, Y.; Song, Y. Infrared small target and background separation via column-wise weighted robust principal component analysis. Infrared Phys. Technol. 2016, 77, 421–430.
- Dai, Y.; Wu, Y.; Song, Y.; Guo, J. Non-negative infrared patch-image model: Robust target-background separation via partial sum minimization of singular values. Infrared Phys. Technol. 2017, 81, 182–194.
- Zhang, L.; Peng, L.; Zhang, T.; Cao, S.; Peng, Z. Infrared small target detection via non-convex rank approximation minimization joint l2,1 norm. Remote Sens. 2018, 10, 1821.
- Dai, Y.; Wu, Y. Reweighted infrared patch-tensor model with both nonlocal and local priors for single-frame small target detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3752–3767.
- Zhang, M.; Zhang, R.; Yang, Y.; Bai, H.; Zhang, J.; Guo, J. ISNet: Shape matters for infrared small target detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 867–876.
- Wu, T.; Li, B.; Luo, Y.; Wang, Y.; Xiao, C.; Liu, T.; Yang, J.; An, W.; Guo, Y. MTU-Net: Multilevel TransUNet for space-based infrared tiny ship detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5601015.
- Jiang, N.; Wang, K.; Peng, X.; Yu, X.; Wang, Q.; Xing, J.; Li, G.; Guo, G.; Ye, Q.; Jiao, J.; et al. Anti-UAV: A large-scale benchmark for vision-based UAV tracking. IEEE Trans. Multimed. 2023, 25, 486–500.
- Sun, H.; Bai, J.; Yang, F.; Bai, X. Receptive-field and direction induced attention network for infrared dim small target detection with a large-scale dataset IRDST. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5000513.
- Wang, H.; Zhou, L.; Wang, L. Miss detection vs. false alarm: Adversarial learning for small object segmentation in infrared images. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 29 October–2 November 2019; pp. 8508–8517.
- Zhao, B.; Wang, C.; Fu, Q.; Han, Z. A novel pattern for infrared small target detection with generative adversarial network. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4481–4492.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representations (ICLR), Vienna, Austria, 4 May 2021; Available online: https://openreview.net/forum?id=YicbFdNTTy (accessed on 23 February 2026).
- Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2024, arXiv:2312.00752.
- Chen, T.; Ye, Z.; Tan, Z.; Gong, T.; Wu, Y.; Chu, Q.; Liu, B.; Yu, N.; Ye, J. MiM-ISTD: Mamba-in-Mamba for efficient infrared small target detection. arXiv 2024, arXiv:2403.02148.
- Wang, Y.; Xu, Z.; Wang, X.; Shen, C.; Cheng, B.; Shen, H.; Xia, H. End-to-end video instance segmentation with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8741–8750.
- Chen, S.; Ji, L.; Zhu, J.; Ye, M.; Yao, X. SSTNet: Sliced spatio-temporal network with cross-slice ConvLSTM for moving infrared dim-small target detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5000912.
- Li, J.; Liu, P.; Huang, X.; Cui, W.; Zhang, T. Learning motion constraint-based spatio-temporal networks for infrared dim target detections. Appl. Sci. 2022, 12, 11519.
- Yan, P.; Hou, R.; Duan, X.; Yue, C.; Wang, X.; Cao, X. STDMANet: Spatio-temporal differential multiscale attention network for small moving infrared target detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5602516.
- Wang, P.; Niu, W.; Gao, W.; Guo, Y.; Peng, X. Dim moving point target detection in cloud clutter scenes based on temporal profile learning. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6006905.
- Tong, X.; Zuo, Z.; Su, S.; Wei, J.; Sun, X.; Wu, P.; Zhao, Z. ST-Trans: Spatial-temporal transformer for infrared small target detection in sequential images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5001819.
- Zhang, Z.; Shao, F.; Dai, Z.; Zhu, S. Towards robust video instance segmentation with temporal-aware transformer. arXiv 2023, arXiv:2301.09416.
- Karim, R.; Zhao, H.; Wildes, R.P.; Siam, M. MED-VT++: Unifying multimodal learning with a multiscale encoder-decoder video transformer. arXiv 2024, arXiv:2304.05930.
- Huang, Y.; Zhi, X.; Hu, J.; Yu, L.; Han, Q.; Chen, W.; Zhang, W. LMAFormer: Local motion aware transformer for small moving infrared target detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5008117.
- Li, R.; An, W.; Xiao, C.; Li, B.; Wang, Y.; Li, M.; Guo, Y. Direction-coded temporal U-shape module for multiframe infrared small target detection. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 555–568.
- Ying, X.; Liu, L.; Lin, Z.; Shi, Y.; Wang, Y.; Li, R.; Cao, X.; Li, B.; Zhou, S.; An, W. Infrared small target detection in satellite videos: A new dataset and a novel recurrent feature refinement framework. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5002818.
- Deng, C.; Guo, Y.; Xu, X.; Zhao, Z.; Xia, Y.; An, R.; Li, J.; Plaza, A. Learning Global Dynamic Query for Large-Motion Infrared Small Target Detection. IEEE Trans. Geosci. Remote Sens. 2026, 64, 5002016.
- Li, F.; Rao, P.; Sun, W.; Su, Y.; Chen, X. A New Motion Feature-Enhanced Multiframe Spatial–Temporal Infrared Target Detection Network. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5006819.
- Li, F.; Xu, Q.; Bao, S.; Yang, Z.; Cong, R.; Cao, X.; Huang, Q. Size-invariance matters: Rethinking metrics and losses for imbalanced multi-object salient object detection. arXiv 2024, arXiv:2405.09782.
- Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- Sun, H.; Hao, X.; Wang, J.; Pan, B.; Pei, P.; Tai, B.; Zhao, Y.; Feng, S. Flame edge detection method based on a convolutional neural network. ACS Omega 2022, 7, 26680–26686.
- Hu, J.; Shen, L.; Sun, G.; Albanie, S. Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023.
- Dai, Y.; Wu, Y.; Zhou, F.; Barnard, K. Asymmetric contextual modulation for infrared small target detection. In Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Virtual, 5–9 January 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1069–1078.
- Dai, Y.; Wu, Y.; Zhou, F.; Barnard, K. Attentional local contrast networks for infrared small target detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 9813–9824.
- Li, B.; Xiao, C.; Wang, L.; Wang, Y.; Lin, Z.; Li, M.; An, W.; Guo, Y. Dense nested attention network for infrared small target detection. IEEE Trans. Image Process. 2023, 32, 972–986.
- Hou, Q.; Zhang, L.; Tan, F.; Xi, Y.; Zheng, H.; Li, N. Istdu-net: Infrared small-target detection u-net. IEEE Geosci. Remote Sens. Lett. 2022, 19, 7506205.
- Xiao, X.; Lian, S.; Luo, Z.; Li, S. Weighted res-unet for high-quality retina vessel segmentation. In Proceedings of the 2018 9th International Conference on Information Technology in Medicine and Education (ITME), Hangzhou, China, 19–21 October 2018; pp. 327–331.
- Wu, X.; Hong, D.; Chanussot, J. Uiu-net: U-net in u-net for infrared small object detection. IEEE Trans. Image Process. 2023, 32, 364–376.
- Liu, Q.; Liu, R.; Zheng, B.; Wang, H.; Fu, Y. Infrared small target detection with scale and location sensitivity. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 17489–17498.

| Dataset | Target Size | Target Pixel | Background Std | Seq. | Mode | Frames | T-Num | SNR | GSD | Resolution | Band |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | – | 69.20 | 88 | Push-broom | 45,425 | 80,483 | 2–4 | MWIR | |||
| 2 | – | – | 12 | Push-broom | 6395 | 15,956 | 2–4 | MWIR | |||
| 3 | 25 | 26.52 | 67 | Staring | 11,029 | 23,796 | 4 | MWIR |

| A | B | FLOPs (G) | Params (M) | Dataset 1 mIoU (%) | Dataset 1 Pd (%) | Dataset 1 FA (×10⁻⁵) | Dataset 3 mIoU (%) | Dataset 3 Pd (%) | Dataset 3 FA (×10⁻⁵) |
|---|---|---|---|---|---|---|---|---|---|
| × | × | 72.28 | 1.01 | 45.20 | 68.59 | 30.26 | 31.79 | 94.60 | 2.14 |
| √ | × | 96.07 | 1.04 | 71.93 | 95.14 | 11.87 | 35.51 | 94.22 | 1.41 |
| × | √ | 62.85 | 0.98 | 46.21 | 67.81 | 26.49 | 33.38 | 95.12 | 4.96 |
| √ | √ | 86.64 | 1.02 | 74.32 | 94.51 | 5.88 | 36.72 | 95.14 | 3.27 |

| Method | Pd (%) | FA (×10⁻⁵) | mIoU (%) |
|---|---|---|---|
| Full model (only weight factor) | 94.02 | 11.14 | 70.61 |
| Full model | 94.51 | 5.88 | 74.32 |

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Wu, X.; Li, D.; Chen, X.; Hu, K.; Rao, P. DTRFR: A Unified Detector for Diverse Target Detection in High-Spatial-Resolution Spaceborne Infrared Video. Remote Sens. 2026, 18, 780. https://doi.org/10.3390/rs18050780

