GMF-Net: A Gaussian-Matched Fusion Network for Weak Small Object Detection in Satellite Laser Ranging Imagery
Abstract
1. Introduction
1.1. Technical Bottlenecks in Target Detection in SLR-CCD Images
- Inadequate feature modeling and target adaptability: Objects in SLR images typically follow an approximately Gaussian distribution, with energy highly concentrated in the central region and decaying rapidly toward the periphery (see Section 2.2 for data analysis). Traditional convolutional structures, with fixed receptive fields and no directional sensitivity, struggle to extract these spatial features effectively. Moreover, when standard convolution kernels slide across a uniform grid, they assign equal weight to low-energy peripheral regions and high-energy central regions, which amplifies noise interference and dilutes the true signal.
- Mismatch between algorithms and practical applications: Traditional detection methods rely on manually designed features (e.g., Haar wavelets [12], the Viola-Jones integral image [13], and HOG features [14]), which lack robustness in complex backgrounds. Among deep learning methods, two-stage detectors (e.g., the R-CNN series [15,16,17]) achieve high accuracy but incur high computational cost due to their multi-stage structure, making them unsuitable for real-time deployment. Single-stage detectors (e.g., SSD [18] and the YOLO series [19,20,21,22,23,24]) have become the mainstream solution for real-time inference and have been applied to drone imagery [25], infrared detection [26], satellite-component detection [27], landslide monitoring [28], and fruit classification [29]. However, most still rely on generic feature modeling and do not exploit the dual characteristics of SLR targets, namely Gaussian energy distribution and extremely small scale, for structural optimization, resulting in suboptimal performance.
- Overly simple and redundant feature fusion strategies: Current networks such as YOLO suffer from limited inter-layer interaction and channel utilization. Although fusion modules such as FPN, PANet, and CSP/C2f have been introduced, they typically merge multi-scale features via simple concatenation or static weighting, without dynamic adaptation to differing energy distributions, which makes it difficult to jointly enhance small-target distinguishability. Furthermore, the abundance of redundant channels and convolutional operations, while enriching representation, substantially increases model size and computational overhead, hindering lightweight deployment.
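The Gaussian energy profile described above can be made concrete with a small sketch. The function below generates a synthetic spot whose intensity peaks at the center and decays as a 2-D Gaussian, mirroring the qualitative description of SLR targets; the size and sigma values are illustrative choices of ours, not parameters from the paper.

```python
import numpy as np

def gaussian_spot(size=15, sigma=2.0, amplitude=1.0):
    """Synthetic SLR-like target: intensity peaks at the center pixel
    and decays approximately as a 2-D Gaussian toward the periphery."""
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    r2 = (x - c) ** 2 + (y - c) ** 2
    return amplitude * np.exp(-r2 / (2.0 * sigma ** 2))

spot = gaussian_spot()
center_energy = spot[7, 7]          # peak value at the central pixel
core_energy = spot[6:9, 6:9].sum()  # energy in the 3x3 core around the peak
total_energy = spot.sum()
```

Even this toy profile shows why uniform kernels are wasteful: a 3x3 core around the peak already holds a large share of the total energy, while the wide periphery contributes little signal.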
1.2. Innovations and Contributions of This Paper
- Gaussian-Matched Convolution (GMConv): Traditional rectangular convolution kernels with uniform sampling cannot accurately match the approximately Gaussian distribution of SLR targets, which exhibit central energy concentration and rapid outward decay (analyzed in Section 2.2). To address this, we develop the GMConv module, which combines multi-directional heterogeneous sampling with dynamic channel calibration to adaptively capture the Gaussian energy distribution, enhancing the central response while effectively suppressing background noise.
- Cross-Stage Partial Pyramid Convolution (CSPPC): To make the model lightweight, this paper introduces a partial convolution (Partial Conv) mechanism and a pyramid structure into the neck network, constructing the CSPPC module. This design reduces model parameters and computational complexity while enhancing the network's multi-scale fusion capability.
- Cross-Feature Attention (CFA): The CFA module addresses a limitation of traditional detectors, which make independent predictions at each pyramid level and thereby sever cross-scale feature correlations. Through cross-layer feature-map fusion and a local attention mechanism, the module integrates shallow high-resolution details with deep semantic information, significantly enhancing multi-scale object detection.
- Construction of the SLR-CCD dataset: To address the paucity of data in this domain, this paper establishes the first SLR-CCD dataset covering complex noise and multi-scale targets, providing a solid data foundation for model training and algorithm validation.
2. Dataset Construction and Objective Analysis
2.1. Dataset Construction
2.2. Characterization of the Dataset
3. Method
3.1. GMF-Net Overall Architecture
- Gaussian-Matched Convolution (GMConv): The GMConv module is designed based on the Gaussian energy distribution characteristics analyzed in Section 2.2. Unlike conventional convolutions with uniform receptive fields, GMConv employs a synergistic mechanism of multi-directional heterogeneous sampling and dynamic channel calibration. This design precisely models the spatial energy decay of the target (i.e., dense center and sparse periphery), thereby adaptively enhancing target responses across channels while effectively suppressing background noise.
- Cross-Stage Partial Pyramid Convolution (CSPPC): The CSPPC module constructs the feature-fusion neck of the network. It first splits the input features by channel into a modeling stream and an information stream, then expands multi-scale receptive fields through pyramid-cascaded partial convolutions (Partial Conv). This architecture achieves a 30% reduction in parameters and a 25% decrease in GFLOPs relative to the baseline neck, maintaining competitive detection accuracy with significantly improved computational efficiency.
- Cross-Feature Attention (CFA): To enhance fusion efficiency in the multi-scale detection head, the CFA module dynamically integrates shallow high-resolution features (P3) with deep semantic features (P4) using a local attention mask. This mechanism establishes an adaptive balance between fine-grained spatial details and global contextual semantics, resulting in more precise bounding box regression and improved robustness in small-target detection.
3.2. Gaussian Matching Convolution (GMConv)
GMConv Feature Map Visualization
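GMConv's full design couples multi-directional heterogeneous sampling with dynamic channel calibration, which is more involved than any short snippet can show. As a rough illustration of the core idea only, the NumPy sketch below compares a uniform 3x3 kernel with a Gaussian-weighted one on a synthetic spot; all function names and values here are ours, not from the paper.

```python
import numpy as np

def gaussian_mask(k=3, sigma=1.0):
    """Fixed Gaussian prior over the kernel window: center-heavy,
    periphery-light, matching the target's energy decay."""
    c = (k - 1) / 2.0
    y, x = np.mgrid[0:k, 0:k]
    m = np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma ** 2))
    return m / m.sum()

def conv2d_valid(img, kernel):
    """Plain single-channel 'valid' 2-D cross-correlation."""
    k = kernel.shape[0]
    h, w = img.shape[0] - k + 1, img.shape[1] - k + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (img[i:i + k, j:j + k] * kernel).sum()
    return out

# A bright Gaussian-like spot (center at [4, 4]) on a flat background.
img = np.full((9, 9), 0.1)
img[3:6, 3:6] += np.array([[0.3, 0.6, 0.3],
                           [0.6, 1.0, 0.6],
                           [0.3, 0.6, 0.3]])

uniform = np.full((3, 3), 1.0 / 9.0)   # standard kernel: equal weights
gmk = gaussian_mask(3, sigma=0.8)      # Gaussian-matched weighting

resp_uniform = conv2d_valid(img, uniform)
resp_gm = conv2d_valid(img, gmk)
```

Because both kernels sum to one, the flat-background response is identical, but the Gaussian-weighted kernel produces a higher peak at the target center, i.e., better target-to-background contrast.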
3.3. Lightweight Neck Network: CSPPC
CSPPC Improvement Comparison
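The efficiency gain of the partial-convolution idea inside CSPPC can be sketched with a back-of-the-envelope FLOP count: a Partial Conv applies the kernel to only a fraction of the channels and forwards the rest via identity. The ratio and shapes below are illustrative assumptions of ours, not the paper's configuration.

```python
def partial_conv_cost(c, h, w, k=3, ratio=0.25):
    """Approximate FLOP counts for a c->c conv on an h x w map:
    a dense conv touches all channels, while Partial Conv convolves
    only `ratio` of them and passes the remainder through unchanged."""
    cp = int(c * ratio)                  # channels actually convolved
    full = c * c * h * w * k * k         # dense conv cost (approx.)
    partial = cp * cp * h * w * k * k    # partial conv cost (approx.)
    return full, partial

full, partial = partial_conv_cost(c=64, h=32, w=32, ratio=0.25)
# Convolving 1/4 of the channels costs ~1/16 of the dense conv,
# which is why stacking such blocks keeps the neck lightweight.
```

This quadratic saving in the channel dimension is what lets CSPPC cut parameters and GFLOPs while the pyramid cascade restores multi-scale receptive fields.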
3.4. Cross-Feature Fusion Attention (CFA)
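The cross-layer fusion described in Section 3.1 (shallow P3 details reweighted by a local attention mask derived from deep P4 semantics) can be sketched as follows. This is a minimal stand-in, assuming nearest-neighbor upsampling and a sigmoid mask; the actual CFA module is learned and more elaborate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cfa_fuse(p3, p4, gain=1.0):
    """CFA-style fusion sketch: upsample the deep map P4 to P3's
    resolution, derive a local attention mask from the upsampled
    semantics, and blend fine detail with context adaptively."""
    up = p4.repeat(2, axis=0).repeat(2, axis=1)  # 2x nearest-neighbor upsample
    mask = sigmoid(gain * up)                    # attention mask in (0, 1)
    return mask * p3 + (1.0 - mask) * up         # convex detail/semantic blend

rng = np.random.default_rng(0)
p3 = rng.random((8, 8))   # shallow, high-resolution detail map
p4 = rng.random((4, 4))   # deep, low-resolution semantic map
fused = cfa_fuse(p3, p4)
```

Because the output is a convex combination, every fused value stays between the corresponding detail and semantic responses, so neither source can be entirely discarded.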
4. Experiment
4.1. Experimental Environment and Implementation Details
4.2. Experimental Evaluation Criteria
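The P, R, and mAP columns in the tables below follow the standard detection definitions: precision and recall over true/false positives and false negatives, with a detection matched to ground truth by an IoU threshold (0.5 for mAP50). A minimal reference implementation, with toy numbers that are not taken from the paper's experiments:

```python
def precision_recall(tp, fp, fn):
    """Precision: fraction of detections that are correct.
    Recall: fraction of ground-truth targets that are found."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2);
    a detection counts as a true positive when IoU >= 0.5 for mAP50."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

# Toy counts for illustration only.
p, r = precision_recall(tp=90, fp=10, fn=15)
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

mAP50–95 averages the AP over IoU thresholds from 0.5 to 0.95 in steps of 0.05, which is why it is consistently lower than mAP50 in the result tables.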
4.3. Comparison Test: Comparing Different Backbone Networks
Comprehensive Performance Analysis
- Optimal Global Detection Accuracy: GMConv achieves the highest precision (P = 90.0%) and recall (R = 85.3%), representing improvements of 4.8 and 6.5 percentage points, respectively, over the baseline CSPDarknet53 (P = 85.2%, R = 78.8%). Concurrently, its mAP@50 reaches 91.8%, which is 7.8 percentage points higher than CSPDarknet53 and significantly superior to PConv (89.9%) and ShuffleNetV1 (89.1%), demonstrating the strongest global detection capability.
- Accuracy–Efficiency Trade-off: GMConv demonstrates outstanding efficiency in the trade-off between parameters and computational cost. Its parameter count is 3.0 M, on par with the baseline model and PConv. Its computation is 4.3 GFLOPs, only a minor increase over the baseline (4.0 GFLOPs) and PConv (4.2 GFLOPs). This minimal overhead (+0.3 GFLOPs) is exchanged for a substantial 7.8-percentage-point increase in mAP@50, proving that the module is highly efficient.
4.4. Ablation Experiments
Ablation Experiments Conclusion
- Single-module Performance
- GMConv (Group 2 vs. 1): GMConv yields the most significant gains, with precision up 4.8 points, recall up 6.5 points, mAP50 up 7.8 points, and mAP50–95 up 7.1 points, while increasing computation by only 0.3 GFLOPs.
- CSPPC (Group 3 vs. 1): Achieves a lightweight design alongside performance improvement: precision +3.4, recall +2.3, mAP50 +3.1, and mAP50–95 +1.5 points, with model size reduced to 2.1 M parameters and 3.0 GFLOPs.
- CFA (Group 4 vs. 1): Improves accuracy, with precision +1.7, recall +1.0, mAP50 +3.5, and mAP50–95 +2.5 points; computation is slightly reduced to 3.8 GFLOPs.
- Multi-module Collaboration
- Two-module Combinations
- GMConv + CSPPC (Group 6): Maintains a lightweight footprint (2.2 M parameters) while further boosting mAP50 to 91.1% and mAP50–95 to 50.4%, enhancing small object detection.
- CSPPC + CFA (Group 5): Although this is the most computation-efficient combination (2.2 M parameters, 2.7 GFLOPs), its recall and mAP50–95 fall below those of the GMConv-based configurations, highlighting the critical role of GMConv in fine-feature extraction.
- Three-module Fusion: Integrating GMConv, CSPPC, and CFA (Group 7) achieves the best balance: precision 93.0%, recall 85.6%, mAP50 93.1%, and mAP50–95 52.4%, with only 2.2 M parameters and 2.9 GFLOPs. Compared to the YOLOv8 baseline, it substantially improves detection accuracy while reducing both computation and model size.
- Overall Conclusions
- GMConv alone provides the largest boost, significantly enhancing small object detection capability.
- CSPPC and CFA each contribute significantly to lightweight design and detection capabilities.
- The combination of all three modules delivers the highest detection accuracy and recall, with an extremely compact model and excellent real-time performance, validating the overall effectiveness of the GMF-Net design.
4.5. Comparison Experiments of Different Models
- Optimal Detection Accuracy and Robustness
- GMF-Net: GMF-Net ranks first in all key metrics: precision reaches 93.0% and mAP50 reaches 93.1%, both the highest values; recall is 85.6%, on par with YOLOv9 and far superior to the other detectors. This balance is critical for practical SLR applications: high recall prevents the loss of valuable satellite passes (avoiding false negatives), while high precision minimizes false tracking commands triggered by background noise (avoiding false positives). Unlike baseline models that exhibit trade-offs, such as YOLOv5 (high P, moderate R), which risks data loss, or YOLOv9 (high R, lower P), which introduces noise, GMF-Net achieves the optimal equilibrium, ensuring both data completeness and tracking efficiency.
- Transformer-Based RT-DETR: Although RT-DETR has inherent advantages in context modeling, its mAP50 remains below 87% because of the extremely small target sizes and complex background noise in CCD images, which limit its detection performance.
- Classical Detectors (Faster R-CNN, SSD300): Both perform poorly on SLR imagery, demonstrating limited adaptability to small-object detection in this task.
- Lightweight Model and Easy Deployment
- GMF-Net: Achieves optimal accuracy with only 2.2 million parameters and 2.9 GFLOPs of computational complexity, outperforming mainstream lightweight detectors such as YOLOv9/v10/v11.
- RT-DETR: Requires over 60 M parameters and more than 100 GFLOPs of computation, making it highly unsuitable for deployment on edge devices. Our model demonstrates a clear advantage in resource-constrained scenarios.
4.6. Statistical Significance and Reproducibility Analysis
4.7. Visualization Experiments
4.7.1. Detection Result Visualization
4.7.2. Failure Case Analysis
- False Positives (FPs): In the baseline YOLOv8 results, FPs primarily originate from high-intensity background noise clusters (e.g., star clutter or hot pixels) that structurally mimic the intensity peaks of small targets. The standard convolution kernels struggle to differentiate these noise artifacts from valid Gaussian targets. In contrast, GMF-Net effectively suppresses these FPs by utilizing the GMConv module, which is physically grounded in the Gaussian energy distribution, allowing the network to filter out irregular noise patterns that do not match the target’s physical morphology.
- False Negatives (FNs): FNs predominantly occur when targets exhibit extremely low signal-to-noise ratios (SNRs), causing them to blend into the background readout noise. The baseline model fails to extract these faint features due to insufficient feature enhancement. GMF-Net mitigates this through the synergistic effect of GMConv (which enhances central feature response) and the CFA module (which fuses multi-scale semantic information), thereby significantly improving the recall rate for ultra-weak targets.
4.7.3. Heatmap Response Analysis
- GMF-Net produces prominent activations only at true target locations with a high signal-to-noise ratio.
- YOLOv8 shows general "heating" with less distinct target peaks, often responding to background textures.
- For extremely weak targets, GMF-Net remains stable and concentrated, while YOLOv8 gives almost no response or only scattered weak signals.
- For clustered targets, GMF-Net’s hotspots align closely with true positions, while YOLOv8 generates multiple false alarms in background regions.
- For low-contrast isolated points, GMF-Net maintains concentrated peaks (albeit weaker ones), indicating greater robustness; YOLOv8’s responses are scattered and prone to misses.
4.7.4. Generalization Experiment
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Long, M.-L.; Zhang, H.-F.; Men, L.-L.; Wu, Z.-B.; Deng, H.-R.; Qin, S.; Zhang, Z.-P. Satellite Laser Ranging at 10 kHz Repetition Rate in All Day. J. Infrared Millim. Waves 2020, 39, 778–785. [Google Scholar] [CrossRef]
- Ding, R.; Wu, Z.; Deng, H.; Tang, K.; Zhang, Z. Research and Design of High Automation Satellite Laser Ranging System. Laser Infrared 2017, 47, 1102–1107. [Google Scholar]
- Wang, X.; Xiao, W.; Li, W.; Yang, X.; Li, Z. Progress of the Satellite Laser Ranging System of National Time Service Center in Xi’an. J. Time Freq. 2024, 47, 268–276. [Google Scholar]
- Degnan, J.J. Satellite Laser Ranging: Current Status and Future Prospects. IEEE Trans. Geosci. Remote Sens. 1985, 23, 398–413. [Google Scholar] [CrossRef]
- Altamimi, Z.; Rebischung, P.; Metivier, L.; Collilieux, X. ITRF2014: A New Release of the International Terrestrial Reference Frame Modeling Nonlinear Station Motions. J. Geophys. Res.-Solid Earth 2016, 121, 6109–6131. [Google Scholar] [CrossRef]
- Sosnica, K.; Thaller, D.; Dach, R.; Steigenberger, P.; Beutler, G.; Arnold, D.; Jaeggi, A. Satellite Laser Ranging to GPS and GLONASS. J. Geod. 2015, 89, 725–743. [Google Scholar] [CrossRef]
- Kong, Y.; Sun, B.; Yang, X.; Cao, F.; He, Z.; Yang, H. Precision Analysis of BeiDou Broadcast Ephemeris by Using SLR Data. Geomat. Inf. Sci. Wuhan Univ. 2017, 42, 831–837. [Google Scholar]
- Lin, H.; Wu, Z.; Zheng, M.; Long, M.; Geng, R.; Yu, R.; Zhang, Z. Research and Application of Picosecond Accuracy Time Delay Calibration for Satellite Laser Ranging System. Infrared Laser Eng. 2023, 52, 20230070-1. [Google Scholar]
- Wang, J.; Qi, K.; Wang, S.; Gao, R.; Li, P.; Yang, R.; Liu, H.; Luo, Z. Advance and Prospect in the Study of Laser Interferometry Technology for Space Gravitational Wave Detection. Sci. Sin. Phys. Mech. Astron. 2024, 54, 270405. [Google Scholar] [CrossRef]
- Magruder, L.A.; Farrell, S.L.; Neuenschwander, A.; Duncanson, L.; Csatho, B.; Kacimi, S.; Fricker, H.A. Monitoring Earth’s Climate Variables with Satellite Laser Altimetry. Nat. Rev. Earth Environ. 2024, 5, 120–136. [Google Scholar] [CrossRef]
- Steindorfer, M.A.; Wang, P.; Koidl, F.; Kirchner, G. Space Debris and Satellite Laser Ranging Combined Using a Megahertz System. Nat. Commun. 2025, 16, 575. [Google Scholar] [CrossRef] [PubMed]
- Papageorgiou, C.P.; Oren, M.; Poggio, T. A General Framework for Object Detection. In Proceedings of the Sixth International Conference on Computer Vision, Bombay, India, 7 January 1998; IEEE: Piscataway, NJ, USA, 1998; pp. 555–562. [Google Scholar] [CrossRef]
- Viola, P.; Jones, M. Rapid Object Detection Using a Boosted Cascade of Simple Features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Kauai, HI, USA, 8–14 December 2001, Jacobs, A., Baldwin, T., Eds.; IEEE: Piscataway, NJ, USA, 2001; Volume 1, pp. 511–518. [Google Scholar] [CrossRef]
- Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition; San Diego, CA, USA, 20–25 June 2005, Schmid, C., Soatto, S., Tomasi, C., Eds.; IEEE: Piscataway, NJ, USA, 2005; Volume 1, pp. 886–893. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 580–587. [Google Scholar] [CrossRef]
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1440–1448. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Advances in Neural Information Processing Systems 28 (NIPS 2015); Montreal, QC, Canada, 7–12 December 2015, Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R., Eds.; Volume 28.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision—ECCV 2016, PT I; Amsterdam, The Netherlands, 11–14 October 2016, Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Volume 9905, pp. 21–37. [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 779–788. [Google Scholar] [CrossRef]
- Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 30TH IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 6517–6525. [Google Scholar] [CrossRef]
- Wang, C.Y.; Yeh, J.; Liao, H.Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. In Proceedings of the Computer Vision—ECCV 2024, PT XXXI; Milan, Italy, 29 September–4 October 2024, Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G., Eds.; Volume 15089, pp. 1–21. [CrossRef]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
- Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J. Yolov10: Real-time End-to-End Object Detection. Adv. Neural Inf. Process. Syst. 2024, 37, 107984–108011. [Google Scholar]
- Cao, J.; Bao, W.; Shang, H.; Yuan, M.; Cheng, Q. GCL-YOLO: A GhostConv-Based Lightweight YOLO Network for UAV Small Object Detection. Remote Sens. 2023, 15, 4932. [Google Scholar] [CrossRef]
- Cao, L.; Wang, Q.; Luo, Y.; Hou, Y.; Cao, J.; Zheng, W. YOLO-TSL: A Lightweight Target Detection Algorithm for UAV Infrared Images Based on Triplet Attention and Slim-neck. Infrared Phys. Technol. 2024, 141, 105487. [Google Scholar] [CrossRef]
- Tang, Z.; Zhang, W.; Li, J.; Liu, R.; Xu, Y.; Chen, S.; Fang, Z.; Zhao, F. LTSCD-YOLO: A Lightweight Algorithm for Detecting Typical Satellite Components Based on Improved YOLOv8. Remote Sens. 2024, 16, 3101. [Google Scholar] [CrossRef]
- Yang, Y.; Miao, Z.; Zhang, H.; Wang, B.; Wu, L. Lightweight Attention-Guided YOLO With Level Set Layer for Landslide Detection From Optical Satellite Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 3543–3559. [Google Scholar] [CrossRef]
- Song, H.; Ma, B.; Shang, Y.; Wen, Y.; Zhang, S. Detection of Young Apple Fruits Based on YOLO V7-ECA Model. Trans. Chin. Soc. Agric. Mach. 2023, 54, 233–242. [Google Scholar]
- Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773. [Google Scholar]
- Yang, J.; Liu, S.; Wu, J.; Su, X.; Hai, N.; Huang, X. Pinwheel-Shaped Convolution and Scale-Based Dynamic Loss for Infrared Small Target Detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 9202–9210. [Google Scholar]
- Chen, J.; Kao, S.H.; He, H.; Zhuo, W.; Wen, S.; Lee, C.H.; Chan, S.H.G. Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 12021–12031. [Google Scholar] [CrossRef]
- Tang, Y.; Han, K.; Guo, J.; Xu, C.; Xu, C.; Wang, Y. GhostNetV2: Enhance Cheap Operation with Long-Range Attention. In Proceedings of the Advances in Neural Information Processing Systems 35 (NEURIPS 2022); New Orleans, LA, USA, 28 November–9 December 2022, Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Neural Information Processing Systems Foundation, Inc. (NeurIPS): San Diego, CA, USA, 2022. [Google Scholar]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856. [Google Scholar] [CrossRef]
- Ma, X.; Dai, X.; Bai, Y.; Wang, Y.; Fu, Y. Rewrite the Stars. arXiv 2024, arXiv:2403.19967. [Google Scholar] [CrossRef]
- Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv 2022, arXiv:2209.02976. [Google Scholar] [CrossRef]
- Wang, C.Y.; Yeh, I.H.; Liao, H.Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv 2024, arXiv:2402.13616. [Google Scholar] [CrossRef]
- Khanam, R.; Hussain, M. YOLOv11: An Overview of the Key Architectural Enhancements. arXiv 2024, arXiv:2410.17725. [Google Scholar] [CrossRef]
- Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524. [Google Scholar] [CrossRef]
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs Beat YOLOs on Real-time Object Detection. arXiv 2024, arXiv:2304.08069. [Google Scholar] [CrossRef]
- Zhang, M.; Zhang, R.; Yang, Y.; Bai, H.; Zhang, J.; Guo, J. ISNet: Shape Matters for Infrared Small Target Detection. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 867–876. [Google Scholar] [CrossRef]
| Component | Configuration |
|---|---|
| Operating System | Ubuntu 22.04 |
| GPU | RTX 4090 (24 GB) |
| CPU | 16 VCPU Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10 GHz |
| Memory | 120 GB |
| Programming Languages | Python 3.10 |
| Frameworks | PyTorch 2.1.0 + CUDA 12.1 |
| IDE | JupyterLab |
| Backbone Network | P% | R% | mAP50% | mAP50–95% | Params (M) | GFLOPs (G) |
|---|---|---|---|---|---|---|
| CSPDarknet53 (baseline) | 85.2 | 78.8 | 84.0 | 44.0 | 3.0 | 4.0 |
| FasterNet [32] | 88.7 | 84.9 | 91.1 | 51.4 | 15.2 | 18.5 |
| GhostNetV2 [33] | 86.2 | 80.3 | 85.8 | 42.2 | 6.3 | 4.3 |
| MobileNetV1 [34] | 87.3 | 84.3 | 89.9 | 52.4 | 6.1 | 7.8 |
| PConv [31] | 88.9 | 84.0 | 89.9 | 49.2 | 3.0 | 4.2 |
| ShuffleNetV1 [35] | 88.5 | 82.1 | 89.1 | 50.6 | 3.5 | 3.2 |
| StarNet [36] | 87.2 | 81.3 | 87.5 | 48.1 | 3.0 | 8.3 |
| Ours (GMConv) | 90.0 | 85.3 | 91.8 | 51.1 | 3.0 | 4.3 |
| Group | GMConv | CSPPC | CFA | P% | R% | mAP50% | mAP50–95% | Params (M) | GFLOPs (G) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | | | | 85.2 | 78.8 | 84.0 | 44.0 | 3.0 | 4.0 |
| 2 | ✓ | | | 90.0 | 85.3 | 91.8 | 51.1 | 3.0 | 4.3 |
| 3 | | ✓ | | 88.6 | 81.1 | 87.1 | 45.5 | 2.1 | 3.0 |
| 4 | | | ✓ | 86.9 | 79.8 | 87.5 | 46.5 | 3.0 | 3.8 |
| 5 | | ✓ | ✓ | 90.0 | 81.3 | 89.4 | 45.4 | 2.2 | 2.7 |
| 6 | ✓ | ✓ | | 91.4 | 81.9 | 91.1 | 50.4 | 2.2 | 3.2 |
| 7 | ✓ | ✓ | ✓ | 93.0 | 85.6 | 93.1 | 52.4 | 2.2 | 2.9 |
| Model | P% | R% | mAP50% | mAP50–95% | Params (M) | GFLOPs (G) |
|---|---|---|---|---|---|---|
| YOLOv5 | 91.9 | 83.4 | 90.2 | 53.0 | 2.5 | 3.5 |
| YOLOv6 [37] | 86.8 | 78.9 | 84.2 | 44.0 | 4.2 | 5.9 |
| YOLOv9 [38] | 90.9 | 85.6 | 91.1 | 52.4 | 2.0 | 3.8 |
| YOLOv10 [24] | 88.1 | 81.2 | 89.1 | 51.7 | 2.7 | 4.1 |
| YOLOv11 [39] | 88.7 | 81.4 | 89.7 | 48.9 | 2.6 | 3.2 |
| YOLOv12 [40] | 88.6 | 78.1 | 84.9 | 42.8 | 2.6 | 3.2 |
| Faster R-CNN [17] | 29.4 | 6.3 | 2.96 | 0.1 | 137.1 | 370.2 |
| SSD300 (VGG) [18] | 0 | 0 | 0 | 0 | 26.2 | 62.7 |
| RT-DETR-resnet101 [41] | 88.4 | 82.4 | 86.4 | 39.1 | 60.9 | 186.2 |
| RT-DETR-l [41] | 83.2 | 79.6 | 81.8 | 40.9 | 63.1 | 103.4 |
| Ours (GMF-Net) | 93.0 | 85.6 | 93.1 | 52.4 | 2.2 | 2.9 |
| Metric | YOLOv8 | GMF-Net (Ours) | Mean | t-Value |
|---|---|---|---|---|
| mAP50 | * | (+0.093) | ||
| mAP50–95 | * | (+0.085) | ||
| Precision | * | (+0.075) | ||
| Recall | * | (+0.072) |
| Model | P% | R% | mAP50% | mAP50–95% |
|---|---|---|---|---|
| YOLOv8 (base) | 89.6 | 75.8 | 84.7 | 39.8 |
| GMF-Net | 93.4 | 77.6 | 85.3 | 42.1 |
Share and Cite
Zhu, W.; Gong, W.; Wang, Y.; Zhang, Y.; Hu, J. GMF-Net: A Gaussian-Matched Fusion Network for Weak Small Object Detection in Satellite Laser Ranging Imagery. Sensors 2026, 26, 407. https://doi.org/10.3390/s26020407