Predicting Multiple Traits of Rice and Cotton Across Varieties and Regions Using Multi-Source Data and a Meta-Hybrid Regression Ensemble

Yu Qin; Moughal Tauqir; Xiang Yu; Xin Zheng; Xin Jiang; Nuo Xu; Jiahua Zhang

doi:10.3390/s26020375

¹ Remote Sensing Information and Digital Earth Center, College of Computer Science and Technology, Qingdao University, Qingdao 266071, China

² Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

³ Department of Biological and Agriculture Engineering, University of California, Davis, CA 95616, USA

* Author to whom correspondence should be addressed.

Sensors 2026, 26(2), 375; https://doi.org/10.3390/s26020375

This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring

Abstract

Timely and accurate prediction of crop traits is critical for precision breeding and regional agricultural production. Previous studies have focused primarily on yield as a single trait, neglecting other crop traits and variety-specific analyses. To address this gap, we employed a Meta-Hybrid Regression Ensemble (MHRE) that uses multiple machine learning (ML) models as base learners and integrates regional multi-year, multi-variety crop field trials with satellite remote sensing indices, meteorological data, and phenological data to predict major crop traits. MHRE performed best for both rice and cotton, significantly outperforming the individual models (RF, XGBoost, CatBoost, and LightGBM). Specifically, for rice, MHRE achieved the highest accuracy for yield (R² = 0.78, RMSE = 0.59 t ha⁻¹) compared with the best individual model (XGBoost: R² = 0.76, RMSE = 0.61 t ha⁻¹); traits such as effective spike also showed strong predictability (R² = 0.64, RMSE = 27.81 × 10⁴ spikes ha⁻¹). Similarly, for cotton, MHRE substantially improved yield prediction (R² = 0.82, RMSE = 0.33 t ha⁻¹) over the best individual model (RF: R² = 0.77, RMSE = 0.36 t ha⁻¹); prediction accuracy was highest for bolls per plant (R² = 0.93, RMSE = 2.27 bolls plant⁻¹). Moreover, rigorous validation confirmed that the crop-specific MHRE models are robust across five rice and three cotton varietal groups and are applicable across six distinct regions in China. Furthermore, we applied the SHAP (SHapley Additive exPlanations) method to identify the growth stages and key environmental factors that affect the major traits. Our study illustrates a practical framework for regional-scale crop trait prediction that fuses multi-source data with ensemble machine learning, offering new insights for precision agriculture and crop management.

Keywords: crop traits prediction; crop varieties; Meta-Hybrid Regression Ensemble; multi-source data fusion; machine learning; remote sensing
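
The abstract reports model accuracy as R² and RMSE. As a reading aid, the sketch below shows how these two metrics are commonly computed from observed and predicted trait values. It assumes R² is the coefficient of determination (1 − SS_res / SS_tot); the abstract does not say which definition the authors used, and some studies report the squared Pearson correlation instead. The function names and the yield values are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the trait (e.g. t/ha)."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical plot-level yields in t/ha, for illustration only.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 6.5, 8.5, 8.5]

print(f"RMSE = {rmse(observed, predicted):.2f} t/ha")
print(f"R2   = {r_squared(observed, predicted):.2f}")
```

RMSE is expressed in the units of the trait, so values for different traits (for example yield in t ha⁻¹ and bolls per plant) are not directly comparable, whereas R² is unitless.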
