Fine-Scale Grassland Classification Using UAV-Based Multi-Sensor Image Fusion and Deep Learning
Abstract
Highlights
- MSPF-Net achieved the best fusion quality and the sharpest edges among the four fusion networks when fusing RGB, MS, TIR, and LiDAR data.
- UNet++ with SE attention and Focal Dice loss produced boundary-coherent grassland segmentation with the highest accuracy among the tested models.
- The approach enables boundary-coherent grassland monitoring and planning at the parcel level.
- The study provides a reproducible pipeline and a modality value ranking (CHM > indices > RGB > TIR) for multimodal mapping.
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Data Collection and Preprocessing
2.2.1. Ground Data Collection
2.2.2. UAV Data Acquisition and Processing
2.3. Multi-Sensor Image Fusion
2.3.1. Resolution Unification and Blocking Strategy
2.3.2. Core Modules for Deep Feature Fusion
2.3.3. Architecture of Four Fusion Networks
2.3.4. Metrics for Fusion Quality Assessment
2.3.5. Training Strategy for the Fusion Networks
2.4. Image Segmentation Based on Deep Learning
2.4.1. Training Data and Labeling Strategy
2.4.2. Segmentation Networks and Attention Mechanism Design
2.4.3. Loss Function Design
2.4.4. Training Strategy and Experimental Setup
2.4.5. Evaluation Metrics
3. Results
3.1. Image Fusion Quality Evaluation
3.2. Model Comparison and Ablation Experiments
3.2.1. Overall Performance of the Combination of Multi-Model Multi-Attention Mechanisms
3.2.2. Loss Functions: Training Dynamics and Test Performance
3.2.3. Specific Category Accuracy Assessment
3.2.4. Attention Mechanism Ablation Study
3.2.5. Segmentation Aware Modality Occlusion Sensitivity
3.3. Panoramic Grassland Mapping Verification
4. Discussion
4.1. The Technical Value of Multiscale Fusion
4.2. Channel Attention Gain in High Resolution Segmentation
4.3. Analysis of Method Limitations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Metric | RCAF-Net | DAEF-Net | DAF-Net | MSPF-Net | Unfused
---|---|---|---|---|---|
Entropy | 6.80 | 6.79 | 6.80 | 6.84 | 6.68 |
Spatial Frequency | 15.52 | 15.52 | 15.54 | 15.56 | 15.33 |
Mean Gradient | 12.51 | 12.51 | 12.53 | 12.54 | 12.28 |
Variance of Laplacian | 4996.84 | 5006.41 | 5000.51 | 5022.92 | 4940.76 |
Tenengrad | 76.17 | 76.12 | 76.35 | 76.42 | 75.12 |
RMS | 30.68 | 30.56 | 30.84 | 31.22 | 30.18 |
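The no-reference quality metrics in the table above can be illustrated with a minimal sketch. The definitions below are common textbook variants of entropy, spatial frequency, and RMS contrast; the exact formulas and normalisation used in the study are not given here, so these are assumptions, and the function names are hypothetical. The code uses only the standard library and expects a greyscale image as a list of rows with at least two rows and two columns.

```python
import math
from collections import Counter


def entropy(img):
    """Shannon entropy (in bits) of the grey-level histogram."""
    pixels = [v for row in img for v in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def spatial_frequency(img):
    """sqrt(RF^2 + CF^2), with RF/CF from horizontal/vertical first differences.

    Assumption: squared differences are averaged over the number of
    differences; other variants divide by the number of pixels instead.
    """
    rows, cols = len(img), len(img[0])
    row_sq = [(img[r][c] - img[r][c - 1]) ** 2
              for r in range(rows) for c in range(1, cols)]
    col_sq = [(img[r][c] - img[r - 1][c]) ** 2
              for r in range(1, rows) for c in range(cols)]
    rf2 = sum(row_sq) / len(row_sq)
    cf2 = sum(col_sq) / len(col_sq)
    return math.sqrt(rf2 + cf2)


def rms_contrast(img):
    """RMS contrast, taken here as the standard deviation of intensities."""
    pixels = [v for row in img for v in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((v - mean) ** 2 for v in pixels) / len(pixels))


if __name__ == "__main__":
    checker = [[0, 255], [255, 0]]  # toy image, not study data
    print(entropy(checker), spatial_frequency(checker), rms_contrast(checker))
```

All three values grow with the amount of detail in the image, which is why a higher score is read as better fusion quality. Mean Gradient, Variance of Laplacian, and Tenengrad follow the same pattern but are built on gradient or Laplacian filter responses.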
Model | mIoU (%) | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | Kappa (%) |
---|---|---|---|---|---|---|
DeepLabV3+_SE | 70.53 | 79.17 | 75.27 | 72.72 | 73.97 | 72.44 |
DeepLabV3+_ECA | 68.28 | 78.84 | 75.27 | 72.51 | 73.86 | 70.02 |
DeepLabV3+_CBAM | 68.10 | 78.72 | 75.10 | 72.43 | 73.74 | 69.85 |
UNet++_SE | 77.68 | 86.98 | 83.16 | 79.87 | 81.48 | 82.47 |
UNet++_ECA | 72.97 | 84.32 | 79.82 | 75.58 | 77.64 | 74.68 |
UNet++_CBAM | 72.45 | 83.59 | 79.23 | 75.32 | 77.23 | 73.85 |
PSPNet_SE | 57.41 | 67.65 | 54.75 | 57.68 | 56.18 | 54.15 |
PSPNet_ECA | 54.76 | 69.26 | 63.05 | 59.36 | 61.15 | 53.39 |
PSPNet_CBAM | 53.36 | 66.21 | 59.61 | 60.50 | 60.05 | 51.28 |
FPN_SE | 62.70 | 76.94 | 70.10 | 68.11 | 69.09 | 63.81 |
FPN_ECA | 62.31 | 76.39 | 69.04 | 68.64 | 68.84 | 63.30 |
FPN_CBAM | 62.89 | 77.32 | 69.81 | 68.87 | 69.34 | 64.10 |
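As a reading aid for the columns above, the following sketch derives the six scores from a confusion matrix. It assumes macro-averaged precision and recall and the usual definitions of IoU and Cohen's Kappa; the averaging actually used in the study is not stated here. The reported F1-Score values are consistent with the harmonic mean of the reported Precision and Recall (for UNet++_SE, 83.16 and 79.87 give 81.48), which is what the helper `f1_from` computes. All function names are hypothetical, and the sketch assumes every class occurs at least once in both the reference and the prediction.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def segmentation_metrics(cm):
    """Scores from a confusion matrix.

    cm[i][j] = number of pixels with reference class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(k)]
    row_sum = [sum(cm[i]) for i in range(k)]                       # reference totals
    col_sum = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted totals

    oa = sum(diag) / total
    miou = sum(diag[i] / (row_sum[i] + col_sum[i] - diag[i]) for i in range(k)) / k
    precision = sum(diag[i] / col_sum[i] for i in range(k)) / k    # macro average
    recall = sum(diag[i] / row_sum[i] for i in range(k)) / k       # macro average
    expected = sum(row_sum[i] * col_sum[i] for i in range(k)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return {
        "mIoU": miou,
        "OA": oa,
        "Precision": precision,
        "Recall": recall,
        "F1": f1_from(precision, recall),
        "Kappa": kappa,
    }


if __name__ == "__main__":
    toy_cm = [[8, 2], [1, 9]]  # made-up counts, not study data
    for name, value in segmentation_metrics(toy_cm).items():
        print(f"{name}: {100 * value:.2f}%")
```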
Loss | mIoU (%) | OA (%) | F1 (%) | Kappa (%) |
---|---|---|---|---|
Dice | 75.90 | 86.50 | 82.60 | 80.70 |
Focal | 76.40 | 86.90 | 83.10 | 81.20 |
Focal Dice | 77.70 | 87.50 | 84.20 | 82.50 |
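The three losses compared above can be sketched as follows. This is a minimal illustration under stated assumptions: the focal term uses the standard form with focusing parameter `gamma = 2`, the Dice term is a soft Dice averaged over classes, and the two are summed with a hypothetical `weight` of 1. The weighting, class balancing, and smoothing used in the study are not given here. Inputs are per-pixel class probabilities (each a list summing to 1) and integer labels.

```python
import math


def focal_loss(probs, labels, gamma=2.0):
    """Mean of -(1 - p_t)^gamma * log(p_t), where p_t is the true-class probability."""
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = min(max(p[y], 1e-7), 1.0)  # clamp to avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(labels)


def dice_loss(probs, labels, eps=1e-6):
    """1 minus the soft Dice coefficient averaged over classes."""
    n_classes = len(probs[0])
    dice = 0.0
    for c in range(n_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        denominator = sum(p[c] for p in probs) + sum(1 for y in labels if y == c)
        dice += (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice / n_classes


def focal_dice_loss(probs, labels, weight=1.0, gamma=2.0):
    """Assumed combination: focal term plus weighted Dice term."""
    return focal_loss(probs, labels, gamma) + weight * dice_loss(probs, labels)


if __name__ == "__main__":
    labels = [0, 1]
    perfect = [[1.0, 0.0], [0.0, 1.0]]
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    print(focal_dice_loss(perfect, labels))  # close to zero
    print(focal_dice_loss(uniform, labels))  # clearly positive
```

The focal term down-weights pixels that are already classified confidently, while the Dice term scores region overlap per class, so the sum addresses both hard pixels and class imbalance.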
Model | Accuracy Index (%) | Leymus chinensis | Puccinellia distans | Phragmites australis | Bare Land | Others |
---|---|---|---|---|---|---|
UNet++_SE | PA | 86.46 | 82.40 | 79.62 | 93.13 | 75.19 |
UNet++_SE | UA | 82.30 | 77.68 | 78.05 | 89.10 | 70.05 |
UNet++_ECA | PA | 82.62 | 81.32 | 79.18 | 95.05 | 70.20 |
UNet++_ECA | UA | 76.70 | 76.10 | 68.13 | 86.05 | 72.21 |
UNet++_CBAM | PA | 81.16 | 83.40 | 78.85 | 92.33 | 61.83 |
UNet++_CBAM | UA | 77.20 | 75.51 | 72.10 | 84.40 | 67.32 |
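PA and UA in the table above are read here as producer's and user's accuracy, the usual remote sensing terms for per-class recall and precision. The sketch below shows how both follow from a confusion matrix whose rows are reference classes and whose columns are predicted classes. The function name, the class labels in the example, and the counts are hypothetical.

```python
def producer_user_accuracy(cm, class_names):
    """Per-class producer's (PA) and user's (UA) accuracy in percent.

    cm[i][j] = number of pixels with reference class i predicted as class j.
    PA = correct / reference total (omission side),
    UA = correct / predicted total (commission side).
    """
    result = {}
    for i, name in enumerate(class_names):
        reference_total = sum(cm[i])
        predicted_total = sum(row[i] for row in cm)
        result[name] = {
            "PA": 100 * cm[i][i] / reference_total,
            "UA": 100 * cm[i][i] / predicted_total,
        }
    return result


if __name__ == "__main__":
    toy_cm = [[8, 2], [1, 9]]  # made-up counts, not study data
    for name, acc in producer_user_accuracy(toy_cm, ["grass", "bare"]).items():
        print(f"{name}: PA={acc['PA']:.2f}%  UA={acc['UA']:.2f}%")
```

A class with high PA but lower UA, such as Bare Land for UNet++_ECA (95.05 versus 86.05), is rarely missed but is over-predicted onto other classes.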
Model | mIoU (%) | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | Kappa (%) |
---|---|---|---|---|---|---|
UNet++ | 73.95 | 83.76 | 79.82 | 76.48 | 78.12 | 75.84 |
UNet++_SE | 77.68 | 86.98 | 83.16 | 79.87 | 81.48 | 82.47 |
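The gain from adding SE attention can be read directly from the two rows above. The short sketch below computes the differences in percentage points; the values are copied from the table, and only the variable names are introduced here.

```python
# Scores (%) copied from the ablation table above.
baseline = {"mIoU": 73.95, "OA": 83.76, "Precision": 79.82,
            "Recall": 76.48, "F1-Score": 78.12, "Kappa": 75.84}
with_se = {"mIoU": 77.68, "OA": 86.98, "Precision": 83.16,
           "Recall": 79.87, "F1-Score": 81.48, "Kappa": 82.47}

# Absolute gain in percentage points, rounded to the table's precision.
gain = {name: round(with_se[name] - baseline[name], 2) for name in baseline}

for name, delta in gain.items():
    print(f"{name}: +{delta:.2f}")
```

The gains are 3.73 (mIoU), 3.22 (OA), 3.34 (Precision), 3.39 (Recall), 3.36 (F1-Score), and 6.63 (Kappa) percentage points, so the largest improvement is in Kappa.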
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cai, Z.; Wen, C.; Bao, L.; Ma, H.; Yan, Z.; Li, J.; Gao, X.; Yu, L. Fine-Scale Grassland Classification Using UAV-Based Multi-Sensor Image Fusion and Deep Learning. Remote Sens. 2025, 17, 3190. https://doi.org/10.3390/rs17183190