Individual Planted Tree Seedling Detection from UAV Multimodal Data with the Alternate Scanning Fusion Method
Highlights
- A multimodal detection framework using UAV-based DOM and DSM data for individual planted tree seedling detection.
- A novel SSM-based fusion module for feature-level multimodal fusion within the detection algorithm.
- Improved accuracy and robustness of individual planted tree seedling detection through feature fusion of DOM and DSM data.
- Improved accuracy and efficiency of the multimodal detection algorithm through the SSM-based fusion module, which performs global feature fusion at linear computational complexity.
Abstract
1. Introduction
- (1) We propose a multimodal framework that integrates DOM and DSM data for the detection of individual planted tree seedlings, significantly improving detection accuracy in forested scenarios compared with unimodal detection systems.
- (2) We develop the ASF module, an SSM-based multimodal fusion network enabling linear-complexity global feature fusion. ASF modules are embedded into a dual-backbone YOLOv5 framework for feature-level fusion, yielding end-to-end multimodal detection (a conceptual sketch follows this list).
- (3) We collect and establish the PTS dataset for the task of detecting planted tree seedlings; it covers 96 hectares of forest area and contains high-resolution aerial DOM and DSM data. The dataset is specifically designed for training and evaluating multimodal object detection algorithms.
- (4) Within the YOLOv5 framework, our ASF outperforms existing representative fusion methods for multimodal object detection, achieving superior detection performance on both the PTS dataset and the public VEDAI [20] benchmark.
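To make contribution (2) concrete, here is a minimal, hypothetical sketch of the alternate-scanning idea: tokens from the DOM-branch and DSM-branch feature maps are interleaved so that a single linear-complexity scan alternates between modalities, then de-interleaved and returned with residual connections. The class name, the depthwise-convolution stand-in for the SS2D selective-scan block, and all shapes are our illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of alternate-scanning fusion (not the authors' code).
# Two same-shaped feature maps (DOM and DSM branches) are flattened into
# token sequences, interleaved token-by-token so the scan alternates
# between modalities, passed through a linear-complexity sequence mixer
# (a stand-in for an SS2D selective-scan block), then de-interleaved
# and added back to the inputs as residual connections.
import torch
import torch.nn as nn


class AlternateScanFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        # Placeholder O(L) mixer; the paper's module uses an SSM (SS2D).
        self.mixer = nn.Conv1d(channels, channels, kernel_size=3,
                               padding=1, groups=channels)  # depthwise

    def forward(self, dom: torch.Tensor, dsm: torch.Tensor):
        b, c, h, w = dom.shape
        # Flatten each modality to a token sequence: (B, L, C), L = H*W.
        t_dom = dom.flatten(2).transpose(1, 2)
        t_dsm = dsm.flatten(2).transpose(1, 2)
        # Interleave tokens: [dom_0, dsm_0, dom_1, dsm_1, ...] -> (B, 2L, C).
        seq = torch.stack((t_dom, t_dsm), dim=2).reshape(b, 2 * h * w, c)
        seq = self.norm(seq)
        # Linear-complexity mixing over the alternating sequence.
        seq = self.mixer(seq.transpose(1, 2)).transpose(1, 2)
        # De-interleave back into per-modality sequences.
        seq = seq.reshape(b, h * w, 2, c)
        f_dom = seq[:, :, 0].transpose(1, 2).reshape(b, c, h, w)
        f_dsm = seq[:, :, 1].transpose(1, 2).reshape(b, c, h, w)
        # Residual connections keep each branch's original features.
        return dom + f_dom, dsm + f_dsm


if __name__ == "__main__":
    fuse = AlternateScanFusion(channels=64)
    dom = torch.randn(2, 64, 40, 40)   # DOM-branch feature map
    dsm = torch.randn(2, 64, 40, 40)   # DSM-branch feature map
    out_dom, out_dsm = fuse(dom, dsm)
    print(out_dom.shape, out_dsm.shape)  # torch.Size([2, 64, 40, 40]) x2
```

Because the sequence mixer runs once over a length-2L sequence, the cost grows linearly with the number of tokens, in contrast to the quadratic cost of cross-attention fusion.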
2. Related Work
2.1. Fusion Methods for Multimodal Object Detection
2.2. SSM-Based Models
3. Methodology
3.1. Overview of Multimodal Individual Tree Seedling Detection Framework
3.2. Preliminaries of SSM
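The body of this section is not reproduced here. As general background (standard S4/Mamba notation [17,30], not necessarily the paper's), SSM-based models start from a continuous linear state space model and discretize it with a zero-order hold:

```latex
% Continuous-time linear SSM:
\[
h'(t) = A\,h(t) + B\,x(t), \qquad y(t) = C\,h(t).
\]
% Zero-order-hold discretization with step size $\Delta$:
\[
\bar{A} = \exp(\Delta A), \qquad
\bar{B} = (\Delta A)^{-1}\bigl(\exp(\Delta A) - I\bigr)\,\Delta B,
\]
\[
h_k = \bar{A}\,h_{k-1} + \bar{B}\,x_k, \qquad y_k = C\,h_k,
\]
% giving a recurrence that processes a length-$L$ sequence in $O(L)$ time.
```

Selective SSMs such as Mamba [30] additionally make B, C, and the step size input-dependent, which is what allows content-aware mixing at linear cost.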
3.3. ASF: Alternate Scanning Fusion
4. Materials and Experimental Setup
4.1. Datasets
4.1.1. PTS: Planted Tree Seedlings
4.1.2. VEDAI: Vehicle Detection in Aerial Imagery
4.2. Implementation Details
4.3. Evaluation Metrics
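The experiments report COCO-style average precision. For reference (standard definitions, not quoted from the paper): a detection counts as correct when the IoU between its box and a ground-truth box meets the threshold, AP at threshold t is the area under the resulting precision-recall curve, and AP50:95 averages AP over ten thresholds:

```latex
\[
\mathrm{IoU}(B_p, B_g) = \frac{\lvert B_p \cap B_g \rvert}{\lvert B_p \cup B_g \rvert},
\qquad
\mathrm{AP}_{50:95} = \frac{1}{10}\sum_{t \in \{0.50,\,0.55,\,\ldots,\,0.95\}} \mathrm{AP}_t .
\]
```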
5. Results and Discussion
5.1. Ablation Study
5.1.1. Necessity of Multimodality and ASF
5.1.2. Necessity of Alternate Scanning
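This ablation varies the ordering of the fused token sequence (see the AS/CS/PS table in the results). As a hedged illustration of what changes, the snippet below contrasts an alternate (token-interleaved) ordering with a plain concatenated ordering; the names and any mapping to the table's AS/CS/PS labels are our assumptions, not the paper's definitions.

```python
# Hypothetical illustration of fused-scan token orderings (not the authors' code).
import torch

b, l, c = 2, 4, 8                 # batch, tokens per modality, channels
dom = torch.randn(b, l, c)        # DOM-branch tokens
dsm = torch.randn(b, l, c)        # DSM-branch tokens

# Alternate ordering: [dom_0, dsm_0, dom_1, dsm_1, ...]
alternate = torch.stack((dom, dsm), dim=2).reshape(b, 2 * l, c)

# Concatenated ordering: [dom_0, ..., dom_{L-1}, dsm_0, ..., dsm_{L-1}]
concatenated = torch.cat((dom, dsm), dim=1)

# Both sequences have length 2L, so a linear-complexity scan costs the same;
# only the order in which the scan encounters cross-modal neighbors differs.
assert alternate.shape == concatenated.shape == (b, 2 * l, c)
```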
5.1.3. Ablation Study on ASF Module Components
5.2. Hyperparameter Selection for Individual Tree Seedling Detection
5.3. RGB vs. NIR-R-G: Performance on Individual Tree Seedling Detection Task
5.4. Comparisons with Previous Methods
6. Conclusions and Future Work
- (1) The current dataset is limited to a district-level region and has narrow temporal coverage. Therefore, the model’s generalization capability across different forest types, larger spatial scales, and varying seasonal conditions has not yet been fully verified.
- (2) The current detection task is still at an early stage of information extraction for planted tree seedlings, focusing mainly on locating and delineating individual targets, while attributes such as species, health status, and growth conditions have not yet been addressed.
- (3) The interpretability of the proposed ASF method regarding the mechanisms of feature fusion remains limited.
- (1) We will expand the dataset in both spatial and temporal dimensions to support studies on model generalization across diverse forest environments and seasonal conditions.
- (2) We will build upon the findings of this study to develop more refined methods for extracting detailed attributes of tree seedlings, supporting practical forestry applications such as pest monitoring and plantation optimization.
- (3) We will explore the interpretability of the ASF module in multimodal feature fusion and integrate the insights into practical remote sensing tasks to achieve more task-adaptive fusion strategies.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Teng, M.; Ouaknine, A.; Laliberté, E.; Bengio, Y.; Rolnick, D.; Larochelle, H. Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery. arXiv 2025, arXiv:2506.04970.
- Lett, S.; Dorrepaal, E. Global drivers of tree seedling establishment at alpine treelines in a changing climate. Funct. Ecol. 2018, 32, 1666–1680.
- Browne, L.; Markesteijn, L.; Engelbrecht, B.M.; Jones, F.A.; Lewis, O.T.; Manzané-Pinzón, E.; Wright, S.J.; Comita, L.S. Increased mortality of tropical tree seedlings during the extreme 2015–16 El Niño. Glob. Chang. Biol. 2021, 27, 5043–5053.
- Ibáñez, T.S.; Wardle, D.A.; Gundale, M.J.; Nilsson, M.C. Effects of soil abiotic and biotic factors on tree seedling regeneration following a boreal forest wildfire. Ecosystems 2022, 25, 471–487.
- Holl, K.D.; Zahawi, R.A.; Cole, R.J.; Ostertag, R.; Cordell, S. Planting seedlings in tree islands versus plantations as a large-scale tropical forest restoration strategy. Restor. Ecol. 2011, 19, 470–479.
- Brancalion, P.H.; Holl, K.D. Guidance for successful tree planting initiatives. J. Appl. Ecol. 2020, 57, 2349–2361.
- Hyyppä, J. Detecting and estimating attributes for single trees using laser scanner. Photogramm. J. Finl. 1999, 16, 27–42.
- Li, W.; Guo, Q.; Jakubowski, M.K.; Kelly, M. A new method for segmenting individual trees from the lidar point cloud. Photogramm. Eng. Remote Sens. 2012, 78, 75–84.
- Næsset, E.; Nelson, R. Using airborne laser scanning to monitor tree migration in the boreal–alpine transition zone. Remote Sens. Environ. 2007, 110, 357–369.
- Stumberg, N.; Bollandsås, O.M.; Gobakken, T.; Næsset, E. Automatic detection of small single trees in the forest-tundra ecotone using airborne laser scanning. Remote Sens. 2014, 6, 10152–10170.
- Pearse, G.D.; Tan, A.Y.; Watt, M.S.; Franz, M.O.; Dash, J.P. Detecting and mapping tree seedlings in UAV imagery using convolutional neural networks and field-verified data. ISPRS J. Photogramm. Remote Sens. 2020, 168, 156–169.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Ecke, S.; Dempewolf, J.; Frey, J.; Schwaller, A.; Endres, E.; Klemmt, H.J.; Tiede, D.; Seifert, T. UAV-based forest health monitoring: A systematic review. Remote Sens. 2022, 14, 3205.
- Jarahizadeh, S.; Salehi, B. A comparative analysis of UAV photogrammetric software performance for forest 3D modeling: A case study using Agisoft PhotoScan, Pix4DMapper, and DJI Terra. Sensors 2024, 24, 286.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
- Gu, A.; Goel, K.; Ré, C. Efficiently modeling long sequences with structured state spaces. arXiv 2021, arXiv:2111.00396.
- Chen, H.; Song, J.; Han, C.; Xia, J.; Yokoya, N. ChangeMamba: Remote sensing change detection with spatiotemporal state space model. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–20.
- Jocher, G.; Stoken, A.; Borovec, J.; Chaurasia, A.; Changyu, L.; Hogan, A.; Hajek, J.; Diaconu, L.; Kwon, Y.; Defretin, Y.; et al. ultralytics/yolov5: v5.0 - YOLOv5-P6 1280 models, AWS, Supervise.ly and YouTube integrations. Zenodo, 2021. Available online: https://github.com/ultralytics/yolov5/releases/tag/v5.0 (accessed on 12 April 2021).
- Razakarivony, S.; Jurie, F. Vehicle detection in aerial imagery: A small target detection benchmark. J. Vis. Commun. Image Represent. 2016, 34, 187–203.
- Li, J.; Hong, D.; Gao, L.; Yao, J.; Zheng, K.; Zhang, B.; Chanussot, J. Deep learning in multimodal remote sensing data fusion: A comprehensive review. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102926.
- Gómez-Chova, L.; Tuia, D.; Moser, G.; Camps-Valls, G. Multimodal classification of remote sensing images: A review and future directions. Proc. IEEE 2015, 103, 1560–1584.
- Zhang, J.; Lei, J.; Xie, W.; Fang, Z.; Li, Y.; Du, Q. SuperYOLO: Super resolution assisted object detection in multimodal remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15.
- Liu, J.; Zhang, S.; Wang, S.; Metaxas, D.N. Multispectral deep neural networks for pedestrian detection. arXiv 2016, arXiv:1611.02644.
- Cao, Z.; Yang, H.; Zhao, J.; Guo, S.; Li, L. Attention fusion for one-stage multispectral pedestrian detection. Sensors 2021, 21, 4184.
- Qingyun, F.; Zhaokui, W. Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery. Pattern Recognit. 2022, 130, 108786.
- Qingyun, F.; Dapeng, H.; Zhaokui, W. Cross-modality fusion transformer for multispectral object detection. arXiv 2021, arXiv:2111.00273.
- Shen, J.; Chen, Y.; Liu, Y.; Zuo, X.; Fan, H.; Yang, W. ICAFusion: Iterative cross-attention guided feature fusion for multispectral object detection. Pattern Recognit. 2024, 145, 109913.
- Fu, D.Y.; Dao, T.; Saab, K.K.; Thomas, A.W.; Rudra, A.; Ré, C. Hungry Hungry Hippos: Towards language modeling with state space models. arXiv 2022, arXiv:2212.14052.
- Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752.
- Liu, Y.; Tian, Y.; Zhao, Y.; Yu, H.; Xie, L.; Wang, Y.; Ye, Q.; Jiao, J.; Liu, Y. VMamba: Visual state space model. Adv. Neural Inf. Process. Syst. 2024, 37, 103031–103063.
- Xie, X.; Cui, Y.; Tan, T.; Zheng, X.; Yu, Z. FusionMamba: Dynamic feature enhancement for multimodal image fusion with Mamba. Vis. Intell. 2024, 2, 37.
- He, X.; Cao, K.; Zhang, J.; Yan, K.; Wang, Y.; Li, R.; Xie, C.; Hong, D.; Zhou, M. Pan-Mamba: Effective pan-sharpening with state space model. Inf. Fusion 2025, 115, 102779.
- Yang, Y.; Ma, C.; Yao, J.; Zhong, Z.; Zhang, Y.; Wang, Y. ReMamber: Referring image segmentation with Mamba twister. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 108–126.
- Dong, Z.; Beedu, A.; Sheinkopf, J.; Essa, I. Mamba fusion: Learning actions through questioning. arXiv 2024, arXiv:2409.11513.
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
- Song, A.; Zhao, Z.; Xiong, Q.; Guo, J. Lightweight the Focus module in YOLOv5 by dilated convolution. In Proceedings of the 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA), Changchun, China, 20–22 May 2022; pp. 111–114.
- Quan, J.; Deng, Y. Enhancing YOLOv3 object detection: An in-depth analysis of C3 module integrated architecture. In Proceedings of the 2024 9th International Conference on Image, Vision and Computing (ICIVC), Suzhou, China, 15–17 July 2024; pp. 117–121.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
- Rothe, R.; Guillaumin, M.; Van Gool, L. Non-maximum suppression for object detection by passing messages between windows. In Proceedings of the Asian Conference on Computer Vision, Singapore, 1–5 November 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 290–306.
- Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
- Elfwing, S.; Uchibe, E.; Doya, K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 2018, 107, 3–11.
- Maguire, D.J. ArcGIS: General purpose GIS software system. In Encyclopedia of GIS; Springer: Berlin/Heidelberg, Germany, 2008; pp. 25–31.
- Zhao, Z.; Fan, C.; Liu, L. Geo-SAM: A QGIS Plugin Using Segment Anything Model (SAM) to Accelerate Geospatial Image Segmentation, Version 1.1.0; Zenodo, 2023. Available online: https://zenodo.org/records/8191039 (accessed on 2 November 2025).
- Moyroud, N.; Portet, F. Introduction to QGIS. QGIS Generic Tools 2018, 1, 1–17.
- Chen, K.; Wang, J.; Pang, J.; Cao, Y.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Xu, J.; et al. MMDetection: Open MMLab detection toolbox and benchmark. arXiv 2019, arXiv:1906.07155.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162.
- Zhang, H.; Li, F.; Liu, S.; Zhang, L.; Su, H.; Zhu, J.; Ni, L.M.; Shum, H.Y. DINO: DETR with improved denoising anchor boxes for end-to-end object detection. arXiv 2022, arXiv:2203.03605.
- Zhang, S.; Wang, X.; Wang, J.; Pang, J.; Lyu, C.; Zhang, W.; Luo, P.; Chen, K. Dense distinct query for end-to-end object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7329–7338.
- Khanam, R.; Hussain, M. YOLOv11: An overview of the key architectural enhancements. arXiv 2024, arXiv:2410.17725.
- Wang, Z.; Li, C.; Xu, H.; Zhu, X.; Li, H. MambaYOLO: A simple baseline for object detection with state space model. Proc. AAAI Conf. Artif. Intell. 2025, 39, 8205–8213.
Detection results on the PTS dataset: unimodal baselines vs. feature-level fusion variants (cf. Section 5.1.1).

| Method | Modality | AP50 | AP75 | AP50:95 |
|---|---|---|---|---|
| Faster R-CNN [12] | DOM | 59.3 | 23.9 | 28.1 |
| | DSM | 51.0 | 23.4 | 26.1 |
| Cascade R-CNN [52] | DOM | 61.2 | 28.1 | 30.9 |
| | DSM | 50.5 | 23.5 | 26.3 |
| DINO [53] | DOM | 63.3 | 20.9 | 27.2 |
| | DSM | 54.9 | 24.0 | 27.1 |
| DDQ-DETR [54] | DOM | 65.5 | 24.5 | 29.7 |
| | DSM | 55.9 | 25.2 | 28.3 |
| RetinaNet [51] | DOM | 55.4 | 15.8 | 23.8 |
| | DSM | 43.0 | 19.6 | 21.8 |
| YOLOv5 [19] | DOM | 63.5 | 27.1 | 30.7 |
| | DSM | 55.9 | 25.6 | 28.4 |
| YOLOv5 (Add) | DOM | 63.6 | 27.8 | 30.4 |
| | DSM | 56.3 | 25.8 | 28.6 |
| | DOM + DSM | 70.2 | 37.9 | 37.5 |
| YOLOv5 (1 × 1 Conv) | DOM | 63.4 | 24.4 | 29.2 |
| | DSM | 56.6 | 25.3 | 28.5 |
| | DOM + DSM | 70.0 | 37.5 | 37.7 |
| ASFYOLO | DOM | 65.1 | 25.8 | 30.6 |
| | DSM | 58.1 | 26.4 | 29.3 |
| | DOM + DSM | 72.6 | 39.8 | 39.7 |
Ablation on the scanning strategy (AS vs. CS vs. PS) on the PTS and VEDAI datasets (cf. Section 5.1.2).

| Method | Dataset | mAP50 | mAP75 | mAP50:95 |
|---|---|---|---|---|
| AS | PTS | 72.6 | 39.8 | 39.7 |
| CS | PTS | 71.8 | 40.4 | 39.9 |
| PS | PTS | 72.1 | 38.4 | 38.9 |
| AS | VEDAI | 80.7 | 54.6 | 46.6 |
| CS | VEDAI | 77.7 | 54.7 | 48.4 |
| PS | VEDAI | 77.5 | 52.2 | 46.0 |
Component ablation of the ASF module on the PTS dataset (cf. Section 5.1.3); "None" denotes the full module.

| Removed Component | AP50 | AP75 | AP50:95 |
|---|---|---|---|
| Residual Connection 1 | 72.1 | 39.1 | 39.3 |
| Residual Connection 2 | 72.1 | 38.1 | 38.6 |
| Channel Swapping 1 | 71.7 | 38.7 | 38.9 |
| Channel Swapping 2 | 71.5 | 39.0 | 38.9 |
| SS2D | 70.7 | 37.7 | 37.8 |
| None | 72.6 | 39.8 | 39.7 |
Effect of model scale on the PTS dataset (cf. Section 5.2).

| Scale | AP50 | AP75 | AP50:95 | Params (M) | GFLOPs |
|---|---|---|---|---|---|
| Small | 72.6 | 39.8 | 39.7 | 25.07 | 14.64 |
| Medium | 71.5 | 38.7 | 39.0 | 64.42 | 44.12 |
| Large | 70.7 | 37.2 | 38.0 | 128.82 | 98.87 |
| Extra-large | 70.3 | 37.9 | 37.9 | 223.64 | 186.76 |
Batch size and learning rate selection on the PTS dataset (cf. Section 5.2).

| Batch Size | Learning Rate | AP50 | AP75 | AP50:95 |
|---|---|---|---|---|
| 8 | 0.005 | 72.3 | 40.3 | 39.9 |
| 8 | 0.01 | 72.3 | 40.1 | 39.6 |
| 8 | 0.02 | 72.3 | 40.1 | 39.6 |
| 16 | 0.005 | 72.0 | 38.6 | 39.1 |
| 16 | 0.01 | 72.1 | 39.7 | 39.7 |
| 16 | 0.02 | 72.0 | 40.0 | 39.2 |
| 32 | 0.005 | 71.6 | 38.9 | 38.7 |
| 32 | 0.01 | 72.6 | 39.8 | 39.7 |
| 32 | 0.02 | 72.3 | 39.0 | 39.4 |
RGB vs. NIR-R-G band composition of the DOM input (cf. Section 5.3).

| DOM Input | Modality | AP50 | AP75 | AP50:95 |
|---|---|---|---|---|
| RGB | DOM | 56.7 | 6.4 | 19.2 |
| | DSM | 46.6 | 10.7 | 17.1 |
| | DOM + DSM | 58.6 | 10.6 | 20.3 |
| NIR-R-G | DOM | 55.9 | 5.3 | 18.3 |
| | DSM | 46.6 | 10.7 | 17.1 |
| | DOM + DSM | 59.8 | 12.0 | 21.6 |
Comparison with previous methods on the PTS dataset (cf. Section 5.4); parameters in millions (M).

| Method | Modality | AP50 | AP75 | AP50:95 | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| YOLOv11 [55] | DOM | 63.3 | – | 33.1 | 9.43 | 21.50 |
| | DSM | 55.7 | – | 29.3 | | |
| DDQ-DETR [54] | DOM | 65.5 | 24.5 | 29.7 | – | – |
| | DSM | 55.9 | 25.2 | 28.3 | | |
| MambaYOLO [56] | DOM | 62.7 | – | 31.4 | 5.98 | 13.60 |
| | DSM | 57.3 | – | 30.8 | | |
| SuperYOLO [23] | DOM | 63.7 | 25.0 | 29.5 | 7.07 | 9.12 |
| | DSM | 50.9 | 26.2 | 28.8 | | |
| | DOM + DSM | 70.5 | 38.6 | 38.2 | | |
| YOLOv5 (Add) | DOM | 63.2 | 24.2 | 29.0 | 11.28 | 13.75 |
| | DSM | 56.2 | 26.2 | 28.7 | | |
| | DOM + DSM | 70.2 | 37.9 | 37.5 | | |
| YOLOFusion [26] | DOM | 63.3 | 25.5 | 29.6 | 11.36 | 13.80 |
| | DSM | 56.7 | 25.1 | 28.2 | | |
| | DOM + DSM | 71.3 | 38.1 | 38.3 | | |
| CFT [27] | DOM | 65.1 | 25.7 | 29.8 | 44.40 | 17.99 |
| | DSM | 57.5 | 27.1 | 29.3 | | |
| | DOM + DSM | 71.6 | 39.6 | 39.1 | | |
| ICAFusion [28] | DOM | 63.8 | – | 29.3 | 20.18 | 15.08 |
| | DSM | 56.2 | – | 27.6 | | |
| | DOM + DSM | 70.8 | – | 38.0 | | |
| ASFYOLO | DOM | 65.1 | 25.8 | 30.6 | 25.07 | 14.64 |
| | DSM | 58.1 | 26.4 | 29.3 | | |
| | DOM + DSM | 72.6 | 39.8 | 39.7 | | |
Per-class AP50 and overall results on the VEDAI dataset (cf. Section 5.4).

| Method | Car | Truck | Pickup | Tractor | Camping Car | Boat | Plane | Van | mAP50 | mAP75 | mAP50:95 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SuperYOLO [23] | 87.4 | 68.1 | 79.4 | 73.4 | 72.1 | 46.7 | 72.5 | 86.0 | 73.2 | 55.1 | 45.3 |
| YOLOv5 (Add) | 86.5 | 64.5 | 78.1 | 74.1 | 70.5 | 55.4 | 68.3 | 96.6 | 74.2 | 54.3 | 45.7 |
| YOLOFusion [26] | 87.3 | 64.0 | 79.7 | 74.9 | 73.3 | 54.2 | 70.5 | 97.2 | 75.1 | 53.5 | 45.6 |
| CFT [27] | 88.1 | 63.3 | 80.7 | 76.4 | 73.7 | 55.6 | 73.0 | 94.9 | 75.7 | 52.2 | 45.4 |
| ICAFusion [28] | 87.3 | 57.7 | 80.5 | 73.8 | 70.1 | 51.7 | 71.4 | 98.9 | 73.9 | 45.2 | 42.7 |
| ASFYOLO | 89.9 | 65.7 | 83.8 | 77.7 | 77.2 | 56.2 | 75.0 | 93.5 | 77.4 | 53.1 | 45.8 |