An NSGA-II-XGBoost Machine Learning Approach for High-Precision Cropland Identification in Highland Areas: A Case Study of Xundian County, Yunnan, China
Highlights
- By integrating the NSGA-II optimization framework with the XGBoost algorithm, the proposed model markedly enhances both the accuracy and generalization capability of cropland recognition.
- It performs exceptionally well in distinguishing croplands from other land types—especially those with similar spectral characteristics or ambiguous boundaries—in plateau areas.
- This study effectively boosts cropland classification accuracy in high-altitude and complex terrain regions through the integration of spectral, radar, and topographic features (e.g., slope and elevation).
- When combined with the percentile-based method and texture features applied in Google Earth Engine (GEE), this data fusion strategy further confirms the critical role of topographic factors and other auxiliary features in high-precision cropland identification.
Abstract
1. Introduction
- (1) Clarification of key challenges in existing farmland mapping, particularly instability, spectral confusion, and limited generalization in high-altitude, topographically fragmented regions.
- (2) Introduction of a multi-objective NSGA-II-XGBoost optimization framework that jointly optimizes overall accuracy (OA) and the Kappa coefficient, balancing precision and robustness under complex geomorphological conditions.
- (3) Integration of multi-source remote sensing data (multispectral, SAR, topographic, texture, and time-series features) into a unified feature space that captures the multidimensional heterogeneity of plateau farmland.
- (4) Demonstration of practical improvements showing that even modest accuracy gains (e.g., a 0.25 percentage-point increase in OA) correspond to significant enhancements in model stability, interpretability, and transferability, which are essential for operational farmland monitoring and land management in mountainous regions.
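The joint optimization of OA and Kappa in contribution (2) rests on Pareto dominance: one hyperparameter set is preferred over another only if it is no worse on both objectives and strictly better on at least one. The following is a minimal sketch of that ranking step, not the authors' implementation. The function names and the candidate scores are hypothetical, and a full NSGA-II run would add crowding-distance sorting, selection, crossover, and mutation on top of it.

```python
from typing import List, Tuple

Score = Tuple[float, float]  # (overall accuracy, Cohen's kappa), both maximized


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(scores: List[Score]) -> List[List[int]]:
    """Group candidate indices into successive Pareto fronts (front 0 is best)."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        # A candidate joins the current front if nothing still remaining dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical validation scores for four hyperparameter sets (not study results).
candidates = [(0.950, 0.89), (0.955, 0.90), (0.952, 0.91), (0.940, 0.88)]
fronts = non_dominated_sort(candidates)
print(fronts)
```

In this toy example, candidates 1 and 2 trade OA against Kappa and therefore share the first front, which is the set a multi-objective search would keep as equally valid compromises.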
2. Study Area and Data Processing
2.1. Study Area
2.2. Data Sources and Preprocessing
2.2.1. Data Sources
2.2.2. Data Preprocessing
3. Research Methods
3.1. Construction of Feature Dataset
3.1.1. Feature Selection
3.1.2. Temporal Features for Classification
3.1.3. Feature Selection and Dimensionality Reduction
3.2. Classifier Construction
3.3. Accuracy Evaluation Methods
4. Results and Analysis
4.1. Accuracy Analysis of Classification Algorithms
4.2. Feature Analysis
4.2.1. Overall Analysis
4.2.2. Feature Dependence Analysis
5. Discussion
5.1. Comparison with the Third National Land Survey Data
5.2. Comparative Performance and Model Advantages
5.3. Applicability and Generalization Analysis
5.4. Limitations and Future Perspectives
- (1) Integrating high-resolution topographic and spectral data to enhance detection of small terraces and steep-slope cropland.
- (2) Developing spatio-temporal adaptive optimization frameworks that enable automatic parameter adjustment across heterogeneous landscapes.
- (3) Incorporating explainable and uncertainty-aware AI approaches to improve model transparency and support operational decision-making.
6. Conclusions
- (1) Model performance: The NSGA-II-XGBoost model significantly improved classification accuracy and generalization through multi-objective optimization of hyperparameters and feature combinations. Compared with RF, SVM, TABM, and standard XGBoost, it achieved superior accuracy, computational efficiency, and robustness, particularly in identifying cropland with complex boundaries or spectral similarity.
- (2) Feature contributions: Feature importance analysis showed that spectral, radar, and topographic variables all contributed substantially to cropland recognition. Terrain factors such as slope and elevation were especially influential in plateau and hilly regions. Percentile-based features (e.g., VH_p5) effectively captured detailed feature distributions, improving classification reliability.
- (3) Sample strategy: The integration of multiple land cover products and uniform random sampling enhanced the balance and representativeness of training data, leading to improved model performance.
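To make the percentile-based features in conclusion (2) concrete, the sketch below computes per-pixel percentiles of a one-year backscatter series and names them in the `VH_p5` style used above. It is an illustration under stated assumptions: the helper names, the chosen percentiles (5, 50, 95), the linear interpolation rule, and the sample values are all hypothetical, and the interpolation used in the study's GEE workflow may differ.

```python
import math
from typing import Dict, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) using linear interpolation between order statistics."""
    if not values:
        raise ValueError("percentile of an empty series is undefined")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def percentile_features(series: Sequence[float], band: str,
                        qs: Sequence[int] = (5, 50, 95)) -> Dict[str, float]:
    """Summarize one pixel's time series as named percentile features."""
    return {f"{band}_p{q}": percentile(series, q) for q in qs}


# Hypothetical VH backscatter values (dB) for a single pixel, not study data.
vh_series = [-21.0, -19.5, -18.0, -16.5, -15.0]
features = percentile_features(vh_series, "VH")
print(sorted(features))
```

Low percentiles such as `VH_p5` describe the weakest backscatter a pixel shows during the year while ignoring isolated outliers, which is one plausible reason such features separate cropland from spectrally similar classes.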
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest

| Data Type | Data Name | Resolution | Year | Data Source |
|---|---|---|---|---|
| Sentinel-2 | Blue (B2) | 10 m | 2020 | https://developers.google.cn/earth-engine/datasets/catalog/sentinel-2?hl=zh-cn (accessed on 5 July 2025) |
| Sentinel-2 | Green (B3) | 10 m | 2020 | |
| Sentinel-2 | Red (B4) | 10 m | 2020 | |
| Sentinel-2 | Red edge 1 (B5) | 20 m | 2020 | |
| Sentinel-2 | Red edge 2 (B6) | 20 m | 2020 | |
| Sentinel-2 | Red edge 3 (B7) | 20 m | 2020 | |
| Sentinel-2 | Near-Infrared (B8) | 10 m | 2020 | |
| Sentinel-2 | Narrow Near-Infrared (B8A) | 20 m | 2020 | |
| Sentinel-2 | Short-Wave Infrared 2 (B11) | 20 m | 2020 | |
| Sentinel-2 | Short-Wave Infrared 3 (B12) | 20 m | 2020 | |
| Sentinel-1 | VV, VH | 10 m | 2020 | Sentinel-1 SAR GRD: C-band Synthetic Aperture Radar Ground Range Detected, log scaling (Earth Engine Data Catalog, Google for Developers) |
| LULC | ESA | 10 m | 2020 | https://developers.google.com/earth-engine/datasets/catalog/ESA_WorldCover_v100 (accessed on 5 July 2025) |
| LULC | ESRI | 10 m | 2020 | https://www.arcgis.com/home/item.html?id=cfcb7609de5f478eb7666240902d4d3d (accessed on 5 July 2025) |
| LULC | CRLC | 10 m | 2020 | https://github.com/LiuGalaxy/CRLC?tab=readme-ov-file (accessed on 5 July 2025) |
| Topographic data | SRTM | 30 m | 2007 | 30-Meter SRTM Elevation Data Downloader |

| Classifier | OA (%) | Kappa | F1 | Recall |
|---|---|---|---|---|
| RF | 94.33 | 0.88 | 0.94 | 0.94 |
| SVM | 94.33 | 0.88 | 0.94 | 0.92 |
| TABM | 94.88 | 0.90 | 0.95 | 0.95 |
| XGBoost | 95.50 | 0.90 | 0.95 | 0.96 |
| NSGA-II-XGBoost | 95.75 | 0.91 | 0.96 | 0.96 |
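For readers who want to relate the columns of the classifier table to one another, the sketch below derives OA, Kappa, precision, recall, and F1 from a two-class confusion matrix using their standard definitions. It is a minimal illustration, not the study's evaluation code: the function name and the counts are hypothetical, and whether the reported F1 and recall refer to the cropland class or to a class average is not stated in this excerpt.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy metrics for a two-class (cropland vs. other) confusion matrix.

    Assumes both classes occur in the reference and predicted labels,
    so none of the denominators below is zero.
    """
    n = tp + fp + fn + tn
    oa = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the predicted and reference totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - pe) / (1 - pe)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"OA": oa, "Kappa": kappa, "Precision": precision,
            "Recall": recall, "F1": f1}


# Hypothetical counts for 100 validation samples (not the study's data).
m = binary_metrics(tp=45, fp=5, fn=5, tn=45)
print({name: round(value, 4) for name, value in m.items()})
```

Because Kappa subtracts chance agreement, it sits below OA for the same matrix (0.8 versus 0.9 in this toy case), which is why the two are treated as complementary objectives.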
| Model | OA | Precision | Recall | F1 | IoU | BIoU |
|---|---|---|---|---|---|---|
| RF | 0.7424 | 0.7265 | 0.4064 | 0.5271 | 0.3446 | 0.2729 |
| SVM | 0.7577 | 0.7603 | 0.4167 | 0.5383 | 0.3683 | 0.2867 |
| XGBoost | 0.774 | 0.6494 | 0.7249 | 0.6851 | 0.521 | 0.2738 |
| TABM | 0.828 | 0.7171 | 0.8137 | 0.7624 | 0.616 | 0.2974 |
| NSGA-II-XGBoost | 0.8325 | 0.7321 | 0.798 | 0.7637 | 0.6177 | 0.3049 |
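The IoU and BIoU columns above compare predicted and reference cropland masks by area overlap and by boundary overlap. The sketch below illustrates both on a toy grid, assuming that BIoU denotes boundary IoU and simplifying it to a one-pixel inner boundary under 4-connectivity. The helper names, the boundary width, the treatment of pixels outside the grid as background, and the masks themselves are assumptions for illustration, not the study's procedure.

```python
from typing import List, Set, Tuple

Mask = List[List[int]]
Pixel = Tuple[int, int]


def foreground(mask: Mask) -> Set[Pixel]:
    """Coordinates of all non-zero (cropland) pixels."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def boundary(pixels: Set[Pixel]) -> Set[Pixel]:
    """Foreground pixels with at least one 4-neighbour outside the foreground."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c) for r, c in pixels
            if any((r + dr, c + dc) not in pixels for dr, dc in steps)}


def iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy reference and predicted masks; the prediction is one column too wide.
truth = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
pred = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1],
        [0, 1, 1, 1, 1],
        [0, 1, 1, 1, 1],
        [0, 0, 0, 0, 0]]

t, p = foreground(truth), foreground(pred)
area_iou = iou(t, p)
boundary_iou = iou(boundary(t), boundary(p))
print(round(area_iou, 4), round(boundary_iou, 4))
```

The boundary score is lower than the area score here because a one-column shift leaves most of the interior intact but moves an entire edge, which is the kind of error a boundary metric is meant to expose.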
| Model | SPLIT | SHDI |
|---|---|---|
| Ground Truth | 2.5119 | 0.6404 |
| NSGA-II-XGBoost | 3.2952 | 0.6589 |
| XGBoost | 3.5034 | 0.6636 |
| TABM | 4.2615 | 0.6665 |
| SVM | 4.5612 | 0.6803 |
| RF | 4.6595 | 0.7793 |
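The SPLIT and SHDI columns above are landscape-pattern metrics. Assuming they follow the usual FRAGSTATS-style definitions, SHDI is the Shannon entropy of class area proportions and SPLIT is the squared total area divided by the sum of squared patch areas, so a map broken into more and smaller patches scores higher. The sketch below is a hedged illustration of those two formulas. The function names and areas are hypothetical, and whether the study computed SPLIT at class or landscape level is not stated in this excerpt.

```python
import math
from typing import Dict, Sequence


def shdi(class_areas: Dict[str, float]) -> float:
    """Shannon's diversity index: -sum(p_i * ln p_i) over class area proportions."""
    total = sum(class_areas.values())
    return -sum((a / total) * math.log(a / total)
                for a in class_areas.values() if a > 0)


def split_index(patch_areas: Sequence[float]) -> float:
    """Splitting index: squared total area over the sum of squared patch areas."""
    total = sum(patch_areas)
    return total ** 2 / sum(a ** 2 for a in patch_areas)


# Hypothetical areas in arbitrary units (not the study's data).
diversity = shdi({"cropland": 50.0, "other": 50.0})
fragmentation = split_index([50.0, 25.0, 25.0])
single_patch = split_index([100.0])
print(round(diversity, 4), round(fragmentation, 4), single_patch)
```

Under these definitions an unfragmented landscape has SPLIT equal to 1, so values closer to the ground-truth row indicate a predicted map whose patch structure better matches the reference.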
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Chen, G.; Wang, Z.; Gui, S.; Zhao, J.; Wang, Y.; Li, L. An NSGA-II-XGBoost Machine Learning Approach for High-Precision Cropland Identification in Highland Areas: A Case Study of Xundian County, Yunnan, China. Remote Sens. 2026, 18, 81. https://doi.org/10.3390/rs18010081

