Article

High-Throughput Evaluation of Cotton Drought Tolerance Using UAV Multispectral Imagery and XGBoost-Based Machine Learning

1
Cotton Research Institute, Xinjiang Academy of Agricultural and Reclamation Science/Northwest Inland Region Key Laboratory of Cotton Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, Shihezi 832000, China
2
College of Modern Agriculture and Food Engineering, Xinjiang Vocational University of Technology, Kashgar Prefecture 844004, China
3
College of Computer Science and Information Engineering, Anyang Institute of Technology, Anyang 455000, China
4
National Key Laboratory of Crop Genetic Improvement/College of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Agronomy 2026, 16(5), 526; https://doi.org/10.3390/agronomy16050526
Submission received: 29 December 2025 / Revised: 19 February 2026 / Accepted: 25 February 2026 / Published: 28 February 2026

Abstract

Drought stress severely constrains cotton yield and fiber quality, but conventional evaluation methods are inefficient and time-consuming. To address this, we developed a high-throughput, non-destructive phenotyping framework by integrating UAV-based multispectral remote sensing with machine learning, using 225 upland cotton (Gossypium hirsutum L.) accessions. The accessions were subjected to well-watered (CK) and drought stress (DS) treatments at the flowering and boll-setting stage. Canopy multispectral imagery (Green/Red/Red_edge/Near-infrared bands) was acquired via DJI Mavic 3 Multispectral UAV, and 16 vegetation indices (VIs) were derived. Concurrently, 15 agronomic and fiber quality traits were measured to calculate drought resistance coefficients (DRCs), which were used for principal component analysis (PCA) and comprehensive drought tolerance index (D) construction. Hierarchical clustering categorized the accessions into 6 drought tolerance grades (Groups I–VI). Variable importance analysis identified GNDVI, NGRVI, and NDRE as the most drought-sensitive VIs (% IncMSE > 11). Among four regression models (LR, KNN, LGBM, XGBoost), XGBoost achieved the best performance for D prediction (test set: R2 = 0.785, RMSE = 0.032, MAE = 0.024). This study demonstrates that UAV multispectral data coupled with XGBoost enables accurate, efficient drought tolerance assessment, providing a robust tool for high-throughput germplasm screening and smart agricultural management.

1. Introduction

Water scarcity has emerged as a critical bottleneck constraining global agricultural sustainability. It is estimated that approximately 40% of the world’s arable land is subjected to varying degrees of drought stress, and under the influence of climate change, both the frequency and intensity of drought events continue to escalate [1]. Drought not only directly impairs key physiological processes in crops, such as photosynthesis and nutrient uptake [2], but also causes an average annual yield loss of up to 50% in major staple crops worldwide. Over the past decade, drought-induced economic losses in global food production have exceeded USD 30 billion. In China, Xinjiang Uygur Autonomous Region serves as the nation’s premier production base for high-quality cotton. In 2022, cotton was cultivated on 2.4969 million hectares in Xinjiang, yielding 5.391 million metric tons, accounting for 90.2% of the country’s total cotton output. The region’s cotton industry is not only a cornerstone of local economic development but also vital to the security and stability of China’s textile supply chain. However, Xinjiang features a typical temperate continental arid climate, with mean annual precipitation ranging from merely 28.9 to 194.7 mm, while potential evaporation exceeds 2000 mm annually [3,4]. Compounding this water imbalance, the allocation of water resources to domestic, industrial, and ecological sectors has been steadily increasing in recent years, leading to a continuous decline in water availability for agriculture. Consequently, cotton production in Xinjiang faces increasingly severe water stress, heightening the urgency for innovative strategies to enhance drought resilience.
Cotton is particularly vulnerable to water deficit during the flowering and boll-setting stage (July–August), which represents a critical phenological window. This period coincides with peak flowering, boll development, fiber elongation, and maximum transpirational water loss; thus, water availability directly governs boll number, boll weight, and fiber quality [5]. From a physiological perspective, even short-term water stress rapidly triggers stomatal closure in cotton leaves, leading to significant reductions in net photosynthetic rate (Pn), stomatal conductance (gs), and intercellular CO2 concentration (Ci), thereby impairing photoassimilate accumulation [6]. Prolonged drought further disrupts chloroplast ultrastructure and compromises key photosynthetic enzymes, accelerating premature senescence. Under such conditions, boll-setting intensity declines by 30–50%, and final boll number is reduced by 20–35%, resulting in concurrent losses in both yield and fiber quality [7,8]. For instance, in Xinjiang cotton-growing regions, drought stress lasting more than 15 days during the flowering-boll stage can reduce lint yield by 15–25% and decrease fiber breaking strength by 1.5–2.0 cN/tex [9]. Consequently, the development of accurate and efficient methodologies for evaluating drought tolerance across cotton germplasm collections is of strategic importance, not only for accelerating the breeding of drought-resilient cultivars but also for enabling precision irrigation management and ensuring the long-term sustainability of cotton production in water-scarce regions such as Xinjiang.
In the field of cotton drought tolerance evaluation, conventional approaches have primarily relied on field-based phenotypic observations and laboratory-based physiological and biochemical assays. These methods typically focus on two categories of key indicators: (i) field-expressed phenotypic traits, such as wilting severity, leaf rolling index, plant height growth rate, biomass accumulation, and final yield; and (ii) physiological and biochemical parameters, including leaf relative water content (LWC), SPAD chlorophyll meter readings, activities of antioxidant enzymes—such as peroxidase (POD) and superoxide dismutase (SOD)—and malondialdehyde (MDA) content. These latter metrics serve as indirect proxies for the plant’s water stress response and oxidative defense capacity [7,8]. Although these traditional methods can partially capture genotypic differences in drought tolerance, they are inherently constrained by high labor and resource demands, prolonged assessment cycles, and limited digital integration. Data acquisition and analysis often depend on manual recording and subjective interpretation, which hampers seamless interoperability with modern breeding databases. Consequently, these limitations significantly impede the digitization and scalability of drought-resilient cotton breeding programs [8].
With the advancement of remote sensing technologies and intelligent equipment, crop phenotyping has evolved from ground-based point measurements to aerial, area-based monitoring. Satellite remote sensing was among the earliest such applications in agriculture, using multispectral data from platforms such as MODIS and Landsat to support regional-scale monitoring of crop growth conditions and water status [10,11,12,13,14,15]. However, MODIS data have a spatial resolution of only 250–1000 m, while Landsat data, though offering a higher resolution of 30 m, have a revisit period of up to 16 days, which is insufficient for capturing rapid drought stress responses during the flowering and boll-setting stage. Moreover, satellite imagery is susceptible to weather factors such as cloud cover and atmospheric scattering, leading to discontinuous monitoring.
The emergence of UAV-based remote sensing offers an ideal solution to these limitations. Compared to satellite platforms, UAVs offer three key advantages: high spatial resolution, flexible and short data acquisition intervals, and non-destructive large-area monitoring. Equipped with multispectral cameras, UAVs can acquire data at resolutions ranging from 2 to 20 cm. For instance, a single cotton field (10 ha) can be imaged within 1–2 h, enabling the calculation of vegetation indices (e.g., NDVI, WDRVI, RVI), which facilitate the non-destructive estimation of critical parameters related to cotton water status (leaf water content, root zone soil moisture), and biomass accumulation. This approach enhances monitoring efficiency by 50–100 times compared to traditional methods [16]. Zhang et al. [17] demonstrated the rapid diagnosis of water stress during the flowering and boll-setting stage using UAV-derived thermal infrared imagery to calculate canopy temperature standard deviation (CTSD). Their model achieved a coefficient of determination (R2) of 0.884. Similarly, Yan et al. [18] developed models to estimate SPAD values and leaf water content in cotton using radial basis function algorithms based on UAV multispectral data. These models yielded R2 values of 0.8488 and 0.9366, respectively, with RMSE of only 2.005 and 0.930. These studies provide robust technical support for the quantitative evaluation of cotton drought tolerance.
Building upon these advances, this study focuses on the flowering and boll-setting stage—cotton’s peak water sensitivity period—and aims to establish a high-throughput digital framework for drought tolerance screening in cotton using UAV-based multispectral remote sensing. The central hypothesis of this study is that optimal vegetation indices (VIs) derived from UAV multispectral imagery can effectively characterize cotton drought tolerance at the canopy level, and integrating these VIs into machine learning algorithms will enable accurate prediction of comprehensive drought tolerance in cotton. The specific aim is to develop an end-to-end digital pipeline that realizes rapid, non-destructive, and scalable high-throughput phenotyping for cotton drought tolerance, providing technical support for germplasm screening and breeding.

2. Materials and Methods

This study presents an integrated workflow for acquiring and preprocessing UAV-based data to construct a cotton drought tolerance prediction dataset. The platform was the DJI Mavic 3 Multispectral UAV (DJI Technology Co., Ltd., Shenzhen, China), selected for its high spatial resolution, spectral fidelity, and operational flexibility in field phenotyping. During the flowering and boll-setting stage, multispectral imagery was collected concurrently with ground-truth phenotypic measurements, including yield-related traits (e.g., plant height, boll number, seed cotton yield) and fiber quality parameters, across experimental plots under both well-watered and drought-stressed conditions (Figure 1A). The raw imagery underwent a standardized preprocessing pipeline comprising three steps: (i) radiometric correction, using the onboard sunlight sensor and calibrated reflectance panels to convert digital numbers to surface reflectance; (ii) geometric correction, through structure-from-motion (SfM) photogrammetry to generate georeferenced orthomosaics with sub-decimeter accuracy; and (iii) extraction of cotton canopy regions of interest (ROIs), achieved by combining NDVI-based vegetation masking (threshold > 0.3) with manual plot delineation to exclude soil background and non-target elements. Subsequently, a machine learning-based predictive modeling framework was developed by combining multispectral-derived vegetation indices with four regression algorithms, namely linear regression (LR), k-nearest neighbors (KNN), LightGBM (LGBM), and XGBoost, to enable real-time inference of cotton drought tolerance directly from field-collected UAV data. Model performance was evaluated using three standard metrics: the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) (Figure 1B).
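The NDVI-based masking in step (iii) can be illustrated with a minimal sketch. It assumes that per-pixel red and near-infrared reflectance values are already available as plain Python lists; the function names and the sample values are hypothetical, and only the 0.3 threshold comes from the text.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for a single pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def canopy_mask(nir_band, red_band, threshold=0.3):
    """Flag pixels whose NDVI exceeds the threshold as canopy (True)."""
    return [ndvi(n, r) > threshold for n, r in zip(nir_band, red_band)]


# Hypothetical reflectance values: two canopy pixels and one bare-soil pixel.
nir = [0.52, 0.48, 0.25]
red = [0.06, 0.08, 0.20]

mask = canopy_mask(nir, red)
# Keep only the NIR values that belong to canopy pixels.
canopy_nir = [n for n, keep in zip(nir, mask) if keep]
```

In practice the same comparison would be applied to whole raster arrays, and the manual plot boundaries would further restrict which pixels are kept.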

2.1. Materials and Experimental Design

The study utilized 225 upland cotton accessions sourced from the germplasm repository of Xinjiang Agricultural Reclamation Academy. Among these accessions, two originated from the United States, two from the former Soviet Union, and the remainder from various regions within China, including Xinjiang, Henan, Hebei, Jiangsu, and 13 other provinces and municipalities (Table S1).
The drought tolerance experiment was conducted from April to October 2024 at the Experimental Station of Xinjiang Agricultural Reclamation Academy, China (86°0′3.920″ E, 44°18′23.911″ N) (Figure 2A). The site has an arid climate with negligible natural precipitation during the growing season and a sandy loam soil. A completely randomized block design (CRBD) was employed with two treatments, DS and CK, each with two replicates. Cotton was grown under mulched drip irrigation, with a film width of 2.08 m and six rows planted per film. Each accession was planted in two rows, with a row length of 4.0 m, row spacing of 0.67 m, and plant spacing of 0.10 m. Seeds were sown in mid-April. During the flowering and boll-setting stage, the CK treatment was irrigated every 10 days (10 h per irrigation event, at an application rate of 35 m3 per mu per irrigation), while the DS treatment experienced two periods of water deficit. Soil moisture content in the 0–60 cm layer was determined using the gravimetric method combined with the five-point sampling approach, yielding values of 38.65% for the well-watered control (CK) and 23.22% for the drought-stressed (DS) treatment. Soil samples were collected from three replicates per treatment. The gravimetric measurement followed standard procedures: an aluminum box was first dried in an oven at 105 °C ± 5 °C and weighed (m0); approximately 10 g of fresh soil was then added to the box, and the combined weight was recorded as m1; the box with soil was returned to the oven and dried to constant weight, after which the final weight (m2) was recorded. Soil moisture content (%) was calculated as (m1 − m2)/(m2 − m0) × 100.
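The gravimetric calculation described above can be sketched as follows. The sketch assumes the conventional dry-mass basis, that is, the water lost on drying divided by the mass of oven-dry soil; the function name and the sample weights are hypothetical.

```python
def gravimetric_moisture(m0, m1, m2):
    """Soil moisture content (%) on a dry-mass basis.

    m0: weight of the empty dried box (g)
    m1: weight of the box plus fresh soil (g)
    m2: weight of the box plus oven-dry soil (g)
    """
    dry_soil = m2 - m0
    if dry_soil <= 0:
        raise ValueError("m2 must be greater than m0")
    water = m1 - m2
    return water / dry_soil * 100


# Hypothetical weights: 10 g of fresh soil that loses 2 g of water on drying.
moisture = gravimetric_moisture(m0=20.0, m1=30.0, m2=28.0)
```

With these illustrative weights, 2 g of water over 8 g of dry soil gives a moisture content of 25%.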
High-throughput phenotypic data collection via drones was conducted three days before the end of the water control period for both CK and DS treatments. Following the drought stress period, normal irrigation was resumed for all plots according to the same schedule as the CK treatment.

2.2. High-Throughput Data Acquisition

Data collection encompassed both canopy-level phenotypic data and UAV-based multispectral imagery from both DS and CK treatments. The experiment utilized the DJI Mavic 3 Multispectral UAV (DJI Technology Co., Ltd., Shenzhen, China), which features one visible light sensor and four multispectral sensors centered at 560 nm (green), 650 nm (red), 730 nm (red edge), and 840 nm (near-infrared) wavelengths. This platform supports high-resolution imaging and precise geolocation, fulfilling the requirements for detailed data analysis. Additionally, it is equipped with a TimeSync system that ensures centimeter-level positioning accuracy, thereby enhancing the spatial alignment of captured images. The integrated top-of-aircraft irradiance sensor collects solar irradiance data, enabling radiometric calibration to mitigate environmental light variations and improve the consistency and accuracy of data collected at different times. To ensure data quality, multispectral image acquisition was conducted under clear, windless conditions around midday. The UAV followed pre-programmed flight paths autonomously to complete image capture. Specific flight parameters are summarized in Table 1.

2.3. Cotton Phenotypic Trait Measurements

A suite of agronomic and yield-related traits was measured in each experimental plot: plant height (PH, cm); fruiting branch number per plant (FBN); non-effective fruiting branch number per plant (Non-FBN); boll number per plant (BN); height of the first fruiting node (HFNFB, cm); and node position of the first fruiting branch (FNFB). At maturity, all naturally opened bolls within each plot were manually harvested. After ginning and seed-cotton processing, the following yield components were calculated: seed cotton yield (SY, kg ha−1), boll weight (BW, g), lint percentage (LP, %), and seed index (SI, g; defined as the weight of 100 seeds). Fiber quality traits were analyzed by China Colored Cotton Co., Ltd. (Ürümqi, China) using an HFT9000 high-volume instrument (HVI) system. The measured parameters were fiber length (FL, mm), fiber uniformity (FU, %), micronaire (FM), fiber strength (FS, cN/tex), and fiber elongation (FE, %) (Figure 1A).

2.4. Image Processing

The raw multispectral images were preprocessed in Pix4Dmapper 1.86.0 (Pix4D SA, Lausanne, Switzerland; https://pix4d.com/, accessed on 11 January 2025). Before image stitching, geometric refinement was carried out using surveyed ground control points (GCPs) to rectify lens distortion and improve spatial accuracy, which allowed a standardized digital orthophoto map (DOM) to be generated. Radiometric calibration was then performed with the reflectance panel method, in which pixel digital numbers are converted to surface reflectance values based on in situ measurements of calibrated white reference panels deployed during each flight. This produced atmospherically corrected, reflectance-calibrated imagery across all spectral bands. Finally, plot-level reflectance data were extracted in ArcGIS 10.3.1 (Esri Inc., Redlands, CA, USA; https://www.esri.com/en-us/arcgis/about-arcgis, accessed on 19 January 2025). Vector boundaries, predefined according to the field layout and corresponding to the experimental plots, were overlaid onto the DOM to delineate the regions of interest. The resulting masked reflectance dataset was the basis for the subsequent computation of vegetation indices and phenotypic analysis. The complete image processing workflow is depicted in Figure 2B.
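The principle behind reflectance panel calibration can be shown with a short sketch. It assumes a single reference panel of known reflectance and a linear sensor response, which is a simplification of what photogrammetry software does internally; the function names, digital numbers, and panel reflectance below are all hypothetical.

```python
def dn_to_reflectance(pixel_dn, panel_dn, panel_reflectance):
    """Scale a digital number to reflectance using one reference panel.

    Assumes a linear sensor response passing through zero.
    """
    if panel_dn <= 0:
        raise ValueError("panel_dn must be positive")
    return pixel_dn / panel_dn * panel_reflectance


def plot_mean(values):
    """Average the reflectance of all pixels inside one plot boundary."""
    return sum(values) / len(values)


# Hypothetical values: a panel of 50% reflectance imaged at DN 40000.
plot_dns = [8000, 16000, 24000]
plot_reflectance = [dn_to_reflectance(dn, 40000, 0.5) for dn in plot_dns]
mean_reflectance = plot_mean(plot_reflectance)
```

The plot-level mean computed at the end corresponds to the per-plot value that would feed the vegetation index calculations.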

2.5. Selection of Vegetation Indices

Single-band reflectance values retrieved from multispectral imagery were combined with sixteen vegetation indices (VIs), and the Increase in Mean Squared Error percentage (IncMSE%) metric in the random forest algorithm was employed to assess variable importance. This metric quantifies the percentage increase in the model’s prediction error (Mean Squared Error, MSE) when the values of a given predictor variable are randomly permuted across out-of-bag samples, where a higher IncMSE% value denotes a greater contribution of the corresponding variable to model performance. Five indices with a strong correlation with the D value were thus identified for further model construction and predictive analysis. Model inversion and drought stress prediction for cotton were subsequently conducted using these optimally selected vegetation indices. The mathematical formulas for calculating the aforementioned vegetation indices are presented in Table 2.
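The permutation idea behind IncMSE% can be sketched in a few lines. This follows the definition given in the text (percentage increase in MSE after shuffling one predictor) but uses a toy predictor in place of a fitted random forest and omits out-of-bag handling; all names and data are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_increase(predict, rows, y, col, n_repeats=20, seed=0):
    """Mean percentage increase in MSE after shuffling column `col`."""
    rng = random.Random(seed)
    base = mse(y, [predict(r) for r in rows])
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in place of the original.
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        perm_mse = mse(y, [predict(r) for r in permuted])
        increases.append((perm_mse - base) / base * 100)
    return sum(increases) / n_repeats


# Toy data: the target depends on feature 0 only; feature 1 is irrelevant.
rows = [[i / 10, float(i % 2)] for i in range(10)]
y = [0.5 * r[0] + (0.01 if i % 2 == 0 else -0.01) for i, r in enumerate(rows)]


def predict(row):
    """Stand-in for a trained model; it ignores feature 1 entirely."""
    return 0.5 * row[0]


informative = permutation_increase(predict, rows, y, col=0)
irrelevant = permutation_increase(predict, rows, y, col=1)
```

Shuffling the informative feature inflates the error, whereas shuffling the ignored feature leaves it unchanged, which is the contrast that the importance ranking relies on.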

2.6. Drought Tolerance Evaluation in Cotton

Descriptive statistics, including maximum, minimum, mean, standard deviation, and coefficient of variation, were computed for all ground truth phenotypic traits using R version 4.3.0 (R Core Team, Boston, MA, USA, 2020). The drought resistance coefficient (DRC) for each trait was calculated as:
$$\mathrm{DRC} = \frac{Y_{\text{drought stress}}}{Y_{\text{control}}}$$
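As a brief illustration of the DRC formula, the ratio can be computed per trait as below. The trait values are hypothetical and the function name is chosen for this sketch only.

```python
def drought_resistance_coefficient(y_ds, y_ck):
    """Ratio of a trait value under drought stress to its control value."""
    if y_ck == 0:
        raise ValueError("control value must be non-zero")
    return y_ds / y_ck


# Hypothetical plot means for one accession under CK and DS.
traits_ck = {"PH": 80.0, "BN": 8.0}
traits_ds = {"PH": 60.0, "BN": 6.0}

drc = {
    trait: drought_resistance_coefficient(traits_ds[trait], traits_ck[trait])
    for trait in traits_ck
}
```

A DRC below 1 indicates that the trait value declined under drought, and a value above 1 indicates that it increased.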
The comprehensive drought tolerance index (D) for upland cotton was subsequently calculated using SPSS software (version 31.0; IBM Corp., Armonk, NY, USA), based on the sequential integration of PCA, $W_i$, and $U(X_i)$.
$$W_i = \frac{R_i}{\sum_{j=1}^{n} R_j}, \quad i = 1, 2, \ldots, n$$
where $R_i$ denotes the contribution rate of the i-th principal component, defined as the proportion of total variance explained by that component.
$$U(X_i) = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}, \quad i = 1, 2, \ldots, n$$
where $X_i$ denotes the score of the i-th principal component, and $X_{\max}$ and $X_{\min}$ denote the maximum and minimum scores of the i-th principal component, respectively.
$$D = \sum_{i=1}^{n} \left[ U(X_i) \times W_i \right], \quad i = 1, 2, \ldots, n$$
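The weighting, membership, and summation steps can be sketched together. The sketch assumes that principal component scores and contribution rates have already been obtained from the PCA; the function names and the numbers are hypothetical.

```python
def weights(contribution_rates):
    """W_i: each component's contribution rate divided by their sum."""
    total = sum(contribution_rates)
    return [r / total for r in contribution_rates]


def membership(scores):
    """U(X_i): min-max scaling of one component's scores across accessions."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def comprehensive_index(pc_scores, contribution_rates):
    """D per accession; pc_scores[i] holds the i-th PC scores of all accessions."""
    w = weights(contribution_rates)
    u = [membership(scores) for scores in pc_scores]
    n_accessions = len(pc_scores[0])
    return [
        sum(w[i] * u[i][a] for i in range(len(w)))
        for a in range(n_accessions)
    ]


# Hypothetical example: two components, three accessions.
pc_scores = [[-1.0, 0.0, 1.0], [2.0, 0.0, 1.0]]
contribution_rates = [0.6, 0.2]
d_values = comprehensive_index(pc_scores, contribution_rates)
```

Because the weights sum to 1 and each membership value lies between 0 and 1, D is also bounded between 0 and 1, with larger values indicating stronger drought tolerance.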

2.7. Analytical Methods

To predict the cotton drought tolerance index (D), a continuous phenotypic trait, we framed the problem as a supervised regression task and evaluated four machine learning regression algorithms: Linear Regression (LR), k-Nearest Neighbors (KNN), Light Gradient Boosting Machine (LGBM), and XGBoost. Samples were randomly partitioned into training (70%) and testing (30%) sets using a fixed random seed (random_state = 42) to ensure reproducibility. To optimize hyperparameters while mitigating overfitting, 5-fold cross-validation was applied exclusively to the training set (via GridSearchCV in scikit-learn v1.7.2), with the root mean square error (RMSE) minimized as the tuning objective. Because the test set was never used during tuning, it remained an independent benchmark for final model assessment. The hyperparameter settings for the different models are as follows:
- LR: fit_intercept = True, normalize = False;
- KNN: n_neighbors = 7, metric = 'manhattan';
- LGBM: learning_rate = 0.08, max_depth = 5, n_estimators = 500, subsample = 0.8;
- XGBoost: learning_rate = 0.1, max_depth = 6, n_estimators = 600, colsample_bytree = 0.7, reg_alpha = 0.1, reg_lambda = 0.2.
The 5-fold cross-validation was stratified by drought tolerance grade (Groups I–VI) to keep the grade distribution consistent across folds and to avoid bias from unbalanced grades. All analyses were performed using Python 3.9.16 with the following library versions: scikit-learn 1.7.2, XGBoost 2.0.3, LightGBM 4.1.0, Pandas 2.1.4, and NumPy 1.26.0. The algorithm best suited to predicting the D value was identified by comparing the four models across repeated experimental runs, providing a reliable modeling framework for future cotton drought tolerance assessments. The detailed implementation workflow is presented in Figure 2C.
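Stratifying folds by drought tolerance grade can be sketched without any external library. The function below is a hypothetical stand-in for what scikit-learn's `StratifiedKFold` provides, and the grade counts are illustrative rather than taken from the study's training set.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(grades, n_splits=5, seed=42):
    """Assign each sample to a fold so every grade is spread across folds."""
    rng = random.Random(seed)
    by_grade = defaultdict(list)
    for idx, grade in enumerate(grades):
        by_grade[grade].append(idx)

    fold_of = [None] * len(grades)
    for members in by_grade.values():
        rng.shuffle(members)
        # Deal the shuffled members of one grade round-robin into the folds.
        for position, idx in enumerate(members):
            fold_of[idx] = position % n_splits
    return fold_of


# Hypothetical grade labels for 60 training samples.
grades = ["I"] * 10 + ["III"] * 40 + ["VI"] * 10
folds = stratified_folds(grades)

# Count how many samples of each grade ended up in each fold.
per_fold = Counter(zip(folds, grades))
```

Each fold then holds the same share of every grade, so a validation fold never lacks the rare extreme grades.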

2.8. Model Evaluation Metrics

In this study, model performance was evaluated using three complementary metrics: the coefficient of determination (R2), the root mean square error (RMSE), and the mean absolute error (MAE). Among these, R2 served as the primary indicator of overall model fit, representing the proportion of variance in the observed target variable that is explained by the model predictions. Values of R2 closer to 1 indicate stronger explanatory power and better agreement between predicted and measured values. RMSE and MAE provide additional perspectives on prediction accuracy by quantifying the average magnitude of errors in the same units as the target variable; lower values of both metrics signify higher predictive precision. Together, a high R2 combined with low RMSE and MAE reflects accurate, reliable, and well-calibrated model performance. All statistical analyses and model evaluations were implemented in Python, primarily using the scikit-learn (sklearn) library (version 1.7.2).
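For reference, the three metrics can be written out directly from their definitions. This is a plain-Python sketch with hypothetical values, not the scikit-learn code used in the study.

```python
import math


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured and predicted D values.
measured = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]

metrics = {
    "R2": r2(measured, predicted),
    "RMSE": rmse(measured, predicted),
    "MAE": mae(measured, predicted),
}
```

Here every prediction is off by 0.05, so RMSE and MAE coincide; RMSE exceeds MAE whenever the errors are unequal, because squaring penalizes large errors more heavily.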

3. Results

3.1. Comprehensive Evaluation of Drought Resistance in Cotton

A paired-samples t-test was used to test for significant differences between the control (CK) and drought stress (DS) treatments for the traits presented in Table 3. The unit of replication was biological, with three independent experimental plots assigned to each treatment; all measurements within each plot were averaged to form one replicate for statistical testing. The significance level was set at p < 0.05 for all analyses. Under drought stress, the yield-related traits plant height (PH), fruiting branch number (FBN), boll number (BN), seed cotton yield (SY), boll weight (BW), and lint percentage (LP) decreased significantly (p < 0.001), while seed index (SI) showed a significant reduction (p < 0.05). Among the plant-type-related traits, the node position of the first fruiting branch (FNFB) and the height of the first fruiting node (HFNFB) increased significantly under water stress (p < 0.001). Among the fiber quality traits, fiber length (FL), fiber uniformity (FU), fiber strength (FS), and fiber elongation (FE) decreased significantly under water stress (p < 0.001), whereas micronaire (FM) increased significantly. In summary, water deficit at the flowering and boll-setting stage significantly inhibited cotton yield formation and fiber quality development (Table 3). Drought resistance coefficients (DRCs) were then calculated for correlation analysis, which revealed that BW had an extremely significant positive correlation with SY (p < 0.001) but an extremely significant negative correlation with LP (p < 0.001). Additionally, FM was extremely significantly negatively correlated with FL and FS (p < 0.001), while FL showed an extremely significant positive correlation with FS and FE (p < 0.001) (Figure S1).
To further evaluate cotton drought resistance, principal component analysis (PCA) was performed on the DRC values. Six principal components (PCs) with eigenvalues greater than 1 were retained, giving a cumulative variance contribution rate of 67.32% (KMO = 0.63, Sig < 0.001, indicating that the data are suitable for PCA). Comprehensive drought resistance evaluation values (D) were then calculated using Equations (1)–(3) for subsequent model construction (Tables S1 and S2) [35,36,37].

3.2. Cluster Evaluation of Drought Resistance in Cotton

Cluster analysis was performed on 225 cotton accessions, and the results indicated that these materials could be clearly classified into 6 drought resistance grades (Figure S2 and Table S1). The number and proportion of materials in each grade are summarized as follows: highly drought-resistant grade (Grade I): 14 accessions, accounting for 6.22% of the total; drought-resistant grade (Grade II): 47 accessions, representing 20.89% of the total; moderately drought-resistant grade (Grade III): 58 accessions, accounting for 25.78% of the total; drought-sensitive grade (Grade IV): 45 accessions, representing 20.00% of the total; highly drought-sensitive grade (Grade V): 47 accessions, accounting for 20.89% of the total; and extremely drought-sensitive grade (Grade VI): 14 accessions, representing 6.22% of the total (Table S1). From the perspective of grade distribution characteristics, the 6 drought resistance grades exhibited a gradient distribution pattern of “more in the middle and fewer at both ends”. Specifically, the moderately drought-resistant grade (Grade III) had the highest proportion, while the highly drought-resistant grade (Grade I) and extremely drought-sensitive grade (Grade VI) had the lowest and identical proportions. Additionally, the drought-resistant grade (Grade II) and highly drought-sensitive grade (Grade V) shared the same proportion. This distribution pattern aligns with the biological principle observed in phenotypic evaluations of plant stress resistance, wherein the majority of materials display intermediate phenotypes, whereas only a minority exhibit extreme phenotypic expressions.
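The reported grade proportions can be checked with a short calculation. The counts are those given above; the variable names are chosen for this sketch.

```python
# Number of accessions per drought resistance grade, as reported in the text.
counts = {"I": 14, "II": 47, "III": 58, "IV": 45, "V": 47, "VI": 14}

total = sum(counts.values())
# Percentage share of each grade, rounded to two decimals.
shares = {grade: round(100 * n / total, 2) for grade, n in counts.items()}
# Grade with the largest number of accessions.
largest = max(counts, key=counts.get)
```

The counts sum to the 225 accessions in the panel, and the symmetric shares of Grades I and VI and of Grades II and V reproduce the "more in the middle and fewer at both ends" pattern.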

3.3. Feature Selection of Vegetation Indices for Drought Resistance in Cotton

Reflectance in the green (G), red (R), red-edge (RE), and near-infrared (NIR) bands was obtained from the multispectral images (Figure 3A). From these bands, 16 commonly used vegetation indices (VIs) were derived: GNDVI, NGRVI, NDRE, MSAVI, SAVI, NLI, TDVI, IPVI, LCI, NDVI, OSAVI, WDRVI, RVI, DVI, EVI, and MSRI. These indices served as multi-dimensional indicators for the quantitative analysis of vegetation characteristics. As shown in Figure 3B, NGRVI (% IncMSE of 20.12), GNDVI (17.84), NDRE (11.81), SAVI (11.61), and NLI (11.09) contributed most to the model, as indicated by their relatively high % IncMSE values, and were therefore the most informative indices for characterizing drought stress. In contrast, OSAVI (4.81), MSAVI (4.48), DVI (4.14), and WDRVI (0.67) had low % IncMSE values, indicating a lower capacity to explain vegetation characteristics under drought stress. Deriving multiple VIs from the multispectral bands and ranking them by variable importance thus identified the indices most relevant to drought assessment and provides a basis for VI selection in subsequent applications such as precision agriculture and vegetation stress assessment.
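A few of these indices can be computed from band reflectance as sketched below. The formulas are the commonly cited definitions and are an assumption here, since they may differ in detail from those in Table 2; the soil adjustment factor of 0.5 for SAVI and the sample reflectance values are likewise illustrative.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)


def vegetation_indices(green, red, red_edge, nir, soil_factor=0.5):
    """Compute a subset of common vegetation indices from band reflectance."""
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        "NDRE": normalized_difference(nir, red_edge),
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        "RVI": nir / red,
        "DVI": nir - red,
    }


# Hypothetical plot-level mean reflectance for a healthy canopy.
indices = vegetation_indices(green=0.10, red=0.05, red_edge=0.30, nir=0.45)
```

GNDVI and NDRE replace the red band with the green and red-edge bands, respectively, which is why they respond differently from NDVI to changes in chlorophyll content and canopy water status.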

3.4. Accuracy Evaluation of Prediction Models

Four models were used to predict the D value: linear regression (LR), k-nearest neighbors (KNN), light gradient boosting machine (LGBM), and XGBoost. Their performance was assessed using the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) on both the training and test sets. For the LR model, the training set R2, RMSE, and MAE were 0.549, 0.050, and 0.041, respectively. On the test set, R2 dropped to 0.523, while the RMSE and MAE were 0.040 and 0.033, respectively. Although the data points in the scatter plot were distributed around the diagonal line, their dispersion was relatively high. This suggests that the LR model, which can represent only linear relationships, was unable to capture the complex non-linear associations between the predictors and D (Figure 4A). The KNN model outperformed the LR model. On the training set, it achieved an R2 of 0.747, an RMSE of 0.034, and an MAE of 0.023, and the data points lay closer to the diagonal, reflecting its advantage in fitting local patterns. However, its performance declined on the test set, with an R2 of 0.664, an RMSE of 0.047, and an MAE of 0.029. This indicates a degree of overfitting and limited generalization to data outside the training set (Figure 4B). The LGBM model performed substantially better. On the training set, it obtained an R2 of 0.858, an RMSE of 0.026, and an MAE of 0.019, with data points closely distributed around the diagonal. On the test set, the R2 was 0.761, the RMSE was 0.037, and the MAE was 0.026. Although these values were lower than those on the training set, the model still maintained high accuracy (Figure 4C).
The XGBoost model achieved the best overall performance. Its training set metrics were comparable to those of the LGBM model, with an R2 of 0.868, an RMSE of 0.026, and an MAE of 0.018 (Figure 4D). On the test set, it reached an R2 of 0.785, an RMSE of 0.032, and an MAE of 0.022, outperforming the other three models on every evaluation metric. Its data points also showed the least dispersion, indicating a better balance between fitting complex relationships and controlling overfitting (Figure 4E). XGBoost is therefore recommended as the model for predicting cotton drought resistance.
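The three metrics reported above can be computed from paired measured and predicted values. The following is a minimal sketch using only the Python standard library; the function name `regression_metrics` and the sample D values are hypothetical and are not data from this study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired measured and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical D values, for illustration only (not data from the study).
d_measured = [0.40, 0.45, 0.50, 0.55, 0.60]
d_predicted = [0.42, 0.44, 0.53, 0.52, 0.61]

r2, rmse, mae = regression_metrics(d_measured, d_predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

Running the same function on the training and test predictions of each model would give the train/test gap that the comparison above uses to judge overfitting.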

3.5. Evaluation of Cluster Prediction Performance

Figure 5 compares the predicted and measured values for the six clusters using scatter plots with linear fits (Figure 5A–F) and line plots (Figure 5G–L). Predicted and measured values were generally consistent, with the coefficient of determination (R2) ranging from 0.67 to 0.89. Cluster II (R2 = 0.89) and Cluster V (R2 = 0.88) showed the best fit (Figure 5B,E): their data points clustered tightly around the y = x line and the regression slopes were very close to 1, indicating high accuracy and stability for these two clusters. For Clusters I, III, and VI, the R2 values were 0.71, 0.70, and 0.80, respectively (Figure 5A,C,F), indicating good fits with minor deviations. Cluster IV had the lowest R2 (0.67; Figure 5D). Its prediction errors were relatively large and its data points scattered, suggesting that noise or non-linear factors affected this cluster more strongly (Figure 5J). The trend line equations showed that the regression coefficients varied considerably among clusters, and some clusters exhibited intercept shifts, which points to possible systematic bias in the model. In summary, the model predicted most clusters well, especially Clusters II and V (Figure 5H,K), whereas Cluster IV requires further algorithm optimization or additional features to improve prediction accuracy.
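The per-cluster slopes, intercepts, and R2 values discussed above can be reproduced with an ordinary least-squares fit within each cluster. The sketch below assumes that the per-cluster R2 comes from a straight-line fit of predicted against measured values; the function names and the sample records are hypothetical.

```python
from collections import defaultdict


def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; also returns R2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2


def fit_by_cluster(records):
    """records: iterable of (cluster_label, measured, predicted)."""
    groups = defaultdict(list)
    for label, measured, predicted in records:
        groups[label].append((measured, predicted))
    return {
        label: linear_fit([m for m, _ in pairs], [p for _, p in pairs])
        for label, pairs in groups.items()
    }


# Hypothetical records, for illustration only (not data from the study).
records = [
    ("II", 0.50, 0.50), ("II", 0.52, 0.52), ("II", 0.54, 0.54),
    ("IV", 0.40, 0.43), ("IV", 0.42, 0.41), ("IV", 0.44, 0.45), ("IV", 0.46, 0.44),
]

fits = fit_by_cluster(records)
for label, (slope, intercept, r2) in fits.items():
    print(f"Cluster {label}: slope={slope:.2f} intercept={intercept:.3f} R2={r2:.2f}")
```

A slope near 1 with an intercept near 0 indicates agreement with the y = x line, while a slope far from 1 or a shifted intercept is the kind of systematic bias noted for some clusters.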

4. Discussion

4.1. Coupled Effects of Drought Stress on Cotton Phenotypic Traits and Canopy Spectral Characteristics

Drought is one of the most severe abiotic stresses and substantially inhibits cotton growth and development. Our results show that drought stress significantly reduced agronomic and yield traits, including plant height (PH), fruit branch number (FBN), boll number (BN), seed cotton yield (SY), boll weight (BW), lint percentage (LP), and seed index (SI), whereas the number of non-fruiting branches (Non-FBN) increased markedly. Prolonged water deficit also significantly reduced the fiber quality traits fiber length (FL), fiber strength (FS), fiber uniformity (FU), and fiber elongation (FE), and significantly increased fiber maturity (FM) (p < 0.001) [9]. Water deficit damages the chloroplast structure of cotton, accelerates chlorophyll degradation, and impairs yield formation. It also disrupts fiber cell elongation and secondary wall thickening, thereby compromising fiber quality [38,39].
These physiological changes produce canopy-level spectral signals that can be detected by remote sensing. In the unmanned aerial vehicle (UAV) multispectral imagery acquired in this study, the reflectance of the red band increased slightly under drought stress, while the reflectance of the near-infrared (NIR) band decreased significantly, causing a systematic shift in the vegetation indices (VIs). Variable importance analysis indicated that GNDVI, NGRVI, and NDRE were the most sensitive to drought, with % IncMSE values of 17.84, 20.12, and 11.81, respectively. GNDVI, which incorporates the green band (560 nm), is highly responsive to changes in chlorophyll content [21]. NGRVI, the normalized difference between green and red reflectance (Table 2), reflects the photosynthetic activity of the canopy. NDRE, based on the red edge band (730 nm), can detect subtle changes in chlorophyll concentration and the internal structure of leaves [23,27]. Together, these three indices trace the core physiological pathway triggered by drought stress, which begins with reduced photosynthetic capacity, progresses through chlorophyll degradation, and ends in impaired biomass accumulation. In contrast, traditional indices such as NDVI contributed relatively little in this study, mainly because they are susceptible to soil background interference and canopy saturation effects. Under severe drought stress, NDVI tends to saturate, losing sensitivity to further declines in canopy greenness, while during early growth stages or under sparse vegetation cover its values are strongly confounded by soil background reflectance. Consequently, vegetation indices that are less sensitive to soil effects or specifically designed for stressed conditions outperformed NDVI in our study.
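To make the direction of the spectral shift concrete, the three key indices can be evaluated from band reflectances using the definitions in Table 2. This is a minimal sketch: the reflectance values are hypothetical and were chosen only to mimic the pattern described above (red slightly higher and NIR lower under drought); they are not measurements from this study.

```python
def gndvi(nir, green):
    """Green NDVI: (NIR - Green) / (NIR + Green)."""
    return (nir - green) / (nir + green)


def ngrvi(green, red):
    """NGRVI as defined in Table 2: (Green - Red) / (Green + Red)."""
    return (green - red) / (green + red)


def ndre(nir, red_edge):
    """NDRE: (NIR - Red_edge) / (NIR + Red_edge)."""
    return (nir - red_edge) / (nir + red_edge)


def key_indices(bands):
    """Compute the three indices from a dict of band reflectances."""
    return {
        "GNDVI": gndvi(bands["nir"], bands["green"]),
        "NGRVI": ngrvi(bands["green"], bands["red"]),
        "NDRE": ndre(bands["nir"], bands["red_edge"]),
    }


# Hypothetical plot-mean reflectances (assumed values, for illustration only).
well_watered = {"green": 0.08, "red": 0.04, "red_edge": 0.20, "nir": 0.50}
drought = {"green": 0.08, "red": 0.05, "red_edge": 0.20, "nir": 0.40}

ck = key_indices(well_watered)
ds = key_indices(drought)
for name in ck:
    print(f"{name}: CK={ck[name]:.3f} DS={ds[name]:.3f} change={ds[name] - ck[name]:+.3f}")
```

With these assumed inputs, all three indices decrease under drought, which is the kind of systematic shift that the variable importance analysis picks up.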

4.2. Generality of the Drought Resistance Evaluation Model

The generalizability of prediction models is essential for translating experimental findings into large-scale field applications. In this study, the developed extreme gradient boosting (XGBoost) model demonstrated robust performance under sandy loam soil conditions in Shihezi (test set: R2 = 0.785, RMSE = 0.032). This strong generalization capability stems from two key attributes: (1) the genetically diverse training population enhanced model adaptability to genotype-specific responses; and (2) the selected core vegetation indices (GNDVI, NGRVI, NDRE) capture universal physiological responses of cotton to drought stress, conferring broad applicability across cultivars.
However, the model’s applicability exhibits clear limitations. First, performance is critically dependent on growth stage. Constructed exclusively with data from the cotton flowering and boll-setting stage, the model leverages conditions specific to this phenological phase. During this stage, canopy closure typically exceeds 70%, effectively minimizing soil background interference, while drought stress impacts on photosynthesis and boll formation are maximally expressed [10,40,41,42]. Second, environmental adaptability is currently constrained. Validation was performed solely within the Shihezi reclamation area. As climate and soil texture vary regionally, these factors may alter spectral-phenotypic relationships [43,44,45,46]. Future experiments will be conducted under other environments, soil types, and growth stages to further enhance the model’s generalizability.
Notably, the UAV-based multispectral drought resistance evaluation model (with XGBoost as the optimal model) demonstrated high stability across cotton varieties from different sources. This stability is primarily attributed to the variety-wide universality of the key VIs (GNDVI, NGRVI, NDRE): these indices capture common canopy changes induced by drought stress. Although different varieties exhibit variations in the intensity of physiological responses to drought, the underlying patterns of spectral characteristic changes remain consistent. The flowering and boll-setting stage is particularly suitable for model application because: (1) canopy closure exceeds 70%, reducing soil background interference; and (2) drought-induced impacts on photosynthesis and boll formation are significantly reflected in canopy spectra, enabling the model to stably capture these associations (XGBoost model R2 = 0.785 at the flowering and boll-setting stage).

5. Conclusions

This study established a high-throughput, non-destructive framework for evaluating cotton drought tolerance at the flowering and boll-setting stage using UAV multispectral remote sensing and XGBoost. The key conclusions are as follows: (1) drought stress significantly reduced yield-related traits (PH, FBN, BN, SY) and fiber quality traits (FL, FS, FU, FE), while increasing Non-FBN and FM; (2) GNDVI, NGRVI, and NDRE were identified as the most drought-sensitive VIs, effectively capturing chlorophyll degradation and reduced photosynthetic capacity; (3) the XGBoost model achieved the highest prediction accuracy for the comprehensive drought tolerance index (D) (test set: R2 = 0.785, RMSE = 0.032); and (4) the 225 cotton accessions were classified into six drought tolerance grades, showing high consistency (R2 > 0.88) with the membership function method. This framework enables rapid and accurate drought tolerance assessment, providing technical support for high-throughput germplasm screening, precision irrigation, and smart cotton breeding. However, the model was built under a single environmental condition, which limits its generality. Multi-year and multi-site validation will be conducted in subsequent experiments to improve it.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16050526/s1, Table S1: This study involved 225 phenotypic traits of Upland Cotton. Table S2: Principal Component Analysis of Drought Resistance Coefficients of Upland Cotton. Figure S1: Correlation of drought resistance coefficient. Figure S2: D-value clustering figure.

Author Contributions

Methodology, F.Z., T.Y., W.W., W.H., G.W., X.W., X.Y. and Y.Y.; software, F.Z. and T.Y.; investigation, F.Z., T.Y., W.H., J.Q., X.K., L.L., A.S. and F.W.; writing—original draft preparation, F.Z. and T.Y.; writing—review and editing, F.Z., T.Y., X.W., X.Y. and Y.Y.; resources, F.Z., X.W. and Y.Y.; data curation, F.Z., T.Y., W.H., J.Q., X.K., L.L., A.S. and F.W.; formal analysis, F.Z.; visualization, T.Y. and F.Z.; conceptualization, X.W., X.Y. and Y.Y.; project administration, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This project was supported by the Corps Key Areas of Science and Technology Research Program (2024AB001 and 2022TSYCTD0022) and the Xinjiang Academy of Agricultural and Reclamation Sciences Research Program (2024YSRC07).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, W.; Xu, H.; Duan, X.; Hu, J.; Li, J.; Zhao, L.; Ma, Y.P. Characterizing the Leaf Transcriptome of Chrysanthemum rhombifolium (Ling et C. Shih), a Drought Resistant, Endemic Plant from China. Front. Genet. 2021, 12, 625985.
  2. Hao, P.; Lin, B.; Ren, Y. How Antioxidants, Osmoregulation, Genes and Metabolites Regulate the Late Seeding Tolerance of Rapeseeds under Low-Temperature Stress. Antioxidants 2023, 12, 1915.
  3. Maimaitijiang, M.; Ghulam, A.; Sidike, P.; Hartling, S.; Maimaitiyiming, M.; Peterson, K.; Shavers, E.; Fishman, J.; Peterson, J.; Kadam, S.; et al. Unmanned Aerial System (UAS)-based phenotyping of soybean using multi-sensor data fusion and extreme learning machine. ISPRS J. Photogramm. Remote Sens. 2017, 134, 43–58.
  4. Zonta, J.H.; Brandão, Z.N.; Rodrigues, J.I.D.S.; Sofiatti, V. Cotton response to water deficits at different growth stages. Rev. Caatinga 2017, 30, 980–990.
  5. Zafar, S.; Afzal, H.; Ijaz, A.; Mahmood, A.; Ayub, A.; Nayab, A.; Hussain, S.; UL-Hussan, M.; Sabir, M.A.; Zulfiqar, U.; et al. Cotton and drought stress: An updated overview for improving stress tolerance. S. Afr. J. Bot. 2023, 161, 258–268.
  6. Zhang, H.; Ni, Z.; Chen, Q. Proteomic responses of drought-tolerant and drought-sensitive cotton varieties to drought stress. Mol. Genet. Genom. 2016, 291, 1293–1303.
  7. Hasan, M.M.; Ma, F.; Prodhan, Z.H. Molecular and Physio-Biochemical Characterization of Cotton Species for Assessing Drought Stress Tolerance. Int. J. Mol. Sci. 2018, 19, 2636.
  8. White, J.W.; Andrade-Sanchez, P.; Gore, M.A. Field-based phenomics for plant genetics research. Field Crop. Res. 2012, 133, 101–112.
  9. Niaz, N.; Tang, C. Effect of surface water and underground water drip irrigation on cotton growth and yield under two different irrigation schemes. PLoS ONE 2022, 17, e0274574.
  10. Maes, W.H.; Steppe, K. Perspectives for Remote Sensing with Unmanned Aerial Vehicles in Precision Agriculture. Comput. Electron. Agric. 2018, 24, 152–164.
  11. Zhao, J.Q.; Yan, J.W.; Xue, T.J.; Wang, S.W.; Qiu, X.L.; Yao, X.; Tian, Y.C.; Zhu, Y.; Cao, W.X.; Zhang, X.H. A deep learning method for oriented and small wheat spike detection (OSWSDet) in UAV images. Comput. Electron. Agric. 2022, 198, 107087.
  12. Zhao, L.C.; Guo, W.; Wang, J.; Wang, H.Z.; Duan, Y.L.; Wang, C.; Wu, W.B.; Shi, Y. An Efficient Method for Estimating Wheat Heading Dates Using UAV Images. Remote Sens. 2021, 13, 3067.
  13. Zhu, W.J.; Feng, Z.K.; Dai, S.Y.; Zhang, P.P.; Wei, X.H. Using UAV Multispectral Remote Sensing with Appropriate Spatial Resolution and Machine Learning to Monitor Wheat Scab. Agriculture 2022, 12, 1785.
  14. Thompson, A.L.; Thorp, K.R.; Matthew, C.; Pedro, A.S.; Heun, J.T.; Dyer, J.M.; White, J.W. Deploying a Proximal Sensing Cart to Identify Drought-Adaptive Traits in Upland Cotton for High-Throughput Phenotyping. Front. Plant Sci. 2018, 9, 507.
  15. Ravichandran, P.; Singh, K.D.; Randhawa, H.S.; Dhariwal, R.; Sangha, J.S.; Ellert, B.; Wang, H.; Chegoonian, A.; Natarajan, M. High-Throughput Screening of Wheat Genotypes for Drought Tolerance Using Aerial Thermal Imagery. In Proceedings of the 2025 13th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Boulder, CO, USA, 7–10 July 2025.
  16. Ludovisi, R.; Tauro, F.; Salvati, R.; Khoury, S.; Scarascia, G.M.; Harfouche, A. UAV-Based Thermal Imaging for High-Throughput Field Phenotyping of Black Poplar Response to Drought. Front. Plant Sci. 2017, 8, 1681.
  17. Zhang, Z.; Bian, J.; Han, W.; Fu, Q.; Chen, S.; Cui, T. Diagnosis of cotton water stress using canopy temperature characteristic parameters calculated from UAV thermal infrared images. Trans. Chin. Soc. Agric. Eng. 2018, 34, 77–83.
  18. Yan, C.C.; Qu, Y.Y.; Chen, Q.J.; Wu, H.Q.; Zhang, B.; Peng, H.L.; Chen, Q. Estimation of cotton SPAD value and leaf water content based on UAV multispectral imagery. Trans. Chin. Soc. Agric. Eng. 2023, 39, 61–67.
  19. Zhang, X.; Zhang, F.; Qi, Y.; Deng, L.; Wang, X.; Yang, S. New research methods for vegetation information extraction based on visible light remote sensing images from an unmanned aerial vehicle (UAV). Int. J. Appl. Earth Obs. Geoinf. 2019, 78, 215–226.
  20. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with ERTS; NASA: Washington, DC, USA, 1974.
  21. Gitelson, A.A.; Merzlyak, M.N. Remote estimation of chlorophyll content in higher plant leaves. Int. J. Remote Sens. 1997, 18, 2691–2697.
  22. Gitelson, A.A. Wide Dynamic Range Vegetation Index for Remote Quantification of Biophysical Characteristics of Vegetation. J. Plant Physiol. 2004, 161, 165–173.
  23. Gitelson, A.A.; Gritz, Y.; Merzlyak, M.N. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282.
  24. Qi, J.G.; Chehbouni, A.R.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A Modified Soil Adjusted Vegetation Index. Remote Sens. Environ. 1994, 48, 119–126.
  25. Crippen, R.E. Calculating the vegetation index faster. Remote Sens. Environ. 1990, 34, 71–73.
  26. Leblanc, S.G.; Chen, J.M.; Fernandes, R.; Deering, D.W.; Conley, A. Methodology comparison for canopy structure parameters extraction from digital hemispherical photography in boreal forests. Agric. For. Meteorol. 2005, 129, 187–207.
  27. Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated narrow-band vegetation indices for prediction of crop chlorophyll content for application to precision agriculture. Remote Sens. Environ. 2002, 81, 416–426.
  28. Miller, J.R.; Hare, E.W.; Wu, J. Quantitative characterization of the vegetation red edge reflectance 1. An inverted-Gaussian reflectance model. Int. J. Remote Sens. 1990, 11, 1755–1773.
  29. Brando, V.E.; Dekker, A.G. Satellite Hyperspectral Remote Sensing for Estimating Estuarine and Coastal Water Quality. IEEE Trans. Geosci. Remote Sens. 2003, 41, 1378–1387.
  30. Xue, J.R.; Su, B.F. Significant Remote Sensing Vegetation Indices: A Review of Developments and Applications. J. Sens. 2017, 2017, 1353691.
  31. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
  32. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation indices. Remote Sens. Environ. 1996, 55, 95–107.
  33. Wang, J.; Xu, R.; Ma, Y.; Miao, L.; Cai, R.; Chen, Y. The research of air pollution based on spectral features in leaf surface of Ficus microcarpa in Guangzhou, China. Environ. Monit. Assess. 2007, 142, 73–83.
  34. Tucker, C.J. Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sens. Environ. 1979, 8, 127–150.
  35. Yuan, Y.; Xing, H.; Zeng, W.; Xu, J.; Sun, X.Z. Genome-wide association and differential expression analysis of salt tolerance in Gossypium hirsutum L at the germination stage. BMC Plant Biol. 2019, 19, 394.
  36. Yang, T.; Li, S.M.; Huang, Y.J.; Ren, D.; Cui, J.X.; Pang, B.; Gao, W.W. Comprehensive Evaluation of Natural Compound Salt Stress of Sea-Island Cotton Resources. J. Nucl. Agric. Sci. 2021, 35, 1507–1521.
  37. Jiang, M.H.; Sun, F.L.; Yang, Y.; Wang, Y.Y.; Qu, Y.Y.; Chen, Q.J. Identification and evaluation of drought resistance of upland-island recombination inbred line population at blossoming and boll-forming stages. Arid Zone Res. 2020, 37, 1635–1643.
  38. Liu, Z.W.; Zhang, P.; Wang, R. Effects of soil progressive drought during the flowering and boll-forming stage on gas exchange parameters and chlorophyll fluorescence characteristics of the subtending leaf to cotton boll. Ying Yong Sheng Tai Xue Bao 2014, 25, 3533–3539.
  39. Sun, F.; Chen, Q.; Chen, Q. Screening of Key Drought Tolerance Indices for Cotton at the Flowering and Boll Setting Stage Using the Dimension Reduction Method. Front. Plant Sci. 2021, 12, 619926.
  40. Sun, F.; Chen, Q.; Chen, Q. Yield-based drought tolerance index evaluates the drought tolerance of cotton germplasm lines under field conditions. PeerJ 2023, 11, e14367.
  41. Wang, W.; Sun, N.; Zhao, K.; Song, J.K.; Fang, H.; Fan, G.Q.; Gao, Y.H.; Huang, T.R.; Ding, Y.D. Genome-wide association analysis of wheat stem traits using 55K microarrays. Front. Plant Sci. 2025, 16, 1635721.
  42. Xie, M.; Wang, Z.; Huete, A.; Brown, L.A.; Wang, H.; Xie, Q.; Xu, X.; Ding, Y. Estimating Peanut Leaf Chlorophyll Content with Dorsiventral Leaf Adjusted Indices: Minimizing the Impact of Spectral Differences between Adaxial and Abaxial Leaf Surfaces. Remote Sens. 2019, 11, 2148.
  43. Wang, W.; Sun, N.; Bai, B.; Wu, H.; Cheng, Y.K.; Geng, H.W.; Song, J.K.; Zhou, J.P.; Pang, Z.Y.; Qian, S.T.; et al. Prediction of wheat SPAD using integrated multispectral and support vector machines. Front. Plant Sci. 2024, 15, 1405068.
  44. Wang, W.; Gao, X.; Cheng, Y.K.; Ren, Y.; Zhang, Z.H.; Wang, R.; Cao, J.M.; Geng, H.W. QTL Mapping of Leaf Area Index and Chlorophyll Content Based on UAV Remote Sensing in Wheat. Agriculture 2022, 12, 595.
  45. Wang, W.; Cheng, Y.K.; Ren, Y.; Zhang, Z.H.; Geng, H.W. Prediction of Chlorophyll Content in Multi-Temporal Winter Wheat Based on Multispectral and Machine Learning. Front. Plant Sci. 2022, 13, 896408.
  46. Liu, K.; Harrison, M.T.; Yan, H.L.; Liu, D.L.; Meinke, H.; Hoogenboom, G.; Wang, B.; Peng, B.; Guan, K.Y.; Jaegermeyr, J.; et al. Silver lining to a climate crisis in multiple prospects for alleviating crop waterlogging under future climates. Nat. Commun. 2023, 14, 765.
Figure 1. Schematic overview of the end-to-end workflow for UAV-based prediction of cotton drought tolerance. (A) Workflow for manual data collection and UAV-based multispectral vegetation index extraction for drought-tolerant traits in cotton; (B) Feature Analysis, Construction and Validation Process of the D-value Prediction Model for Cotton Drought Tolerance.
Figure 2. Overview of the study area and the machine learning–based model development workflow. (A) Spatial Location Map of the Study Area and Field Trial Sites; (B) Process Framework for Multispectral Phenotyping Data Processing and Model Construction; (C) An Optimal Model Selection Framework Based on Multi-Machine Learning Algorithms.
Figure 3. Selection Process of Vegetation Indices. (A) Band composition of multispectral imagery and schematic diagram for extraction of 16 vegetation indices; (B) Significant Classification of Vegetation Indices and Radar Chart Analysis of Contribution.
Figure 4. Model Prediction Results. (A) Scatter distribution of predicted D-value vs. true D-value under the LR model, along with evaluation metrics for the training/test set; (B) Scatter distribution of predicted D-value vs. true D-value under the KNN model, along with evaluation metrics for the training/test set; (C) Scatter distribution of predicted D-value vs. true D-value under the LGBM model, along with evaluation metrics for the training/test set; (D) Scatter distribution of predicted D-value vs. true D-value under the XGB model, along with evaluation metrics for the training/test set; (E) Comparison of R2, RMSE, and MAE metrics across LR, KNN, LGBM, and XGB models on the training and test sets.
Figure 5. Prediction Results of Different Subclusters in the Cluster Model. (A–F) Linear fitting of Classes I, II, III, IV, V, and VI under the XGB model; (G–L) Line plots of Classes I, II, III, IV, V, and VI under the XGB model.
Table 1. Parameters of UAV multispectral image acquisition.

| Parameter | Value |
| --- | --- |
| Flight altitude | 30 m |
| Flight speed | 5.4 km/h |
| Course overlap ratio | 80% |
| Lateral overlap ratio | 70% |
| Spectral bands | Green, Red, Red_edge, and NIR |
Table 2. Vegetation indices and their calculation formulas.

| Vegetation Index | Formula | Reference |
| --- | --- | --- |
| NGRVI | $NGRVI = (R_{Green} - R_{Red}) / (R_{Green} + R_{Red})$ | [19] |
| NDVI | $NDVI = (R_{Nir} - R_{Red}) / (R_{Nir} + R_{Red})$ | [20] |
| GNDVI | $GNDVI = (R_{Nir} - R_{Green}) / (R_{Nir} + R_{Green})$ | [21] |
| WDRVI | $WDRVI = (0.1 R_{Nir} - R_{Red}) / (0.1 R_{Nir} + R_{Red})$ | [22] |
| LCI | $LCI = R_{Nir} / R_{Red\_edge} - 1$ | [23] |
| MSAVI | $MSAVI = \left[ 2R_{Nir} + 1 - \sqrt{(2R_{Nir} + 1)^2 - 8(R_{Nir} - R_{Red})} \right] / 2$ | [24] |
| IPVI | $IPVI = \frac{R_{Nir}}{R_{Nir} + R_{Red}}$ | [25] |
| NLI | $NLI = \frac{R_{Nir}^2 - R_{Red}}{R_{Nir}^2 + R_{Red}}$ | [26] |
| TDVI | $TDVI = 1.5 \times \frac{R_{Nir} - R_{Green}}{\sqrt{R_{Nir}^2 + R_{Red} + 0.5}}$ | [27] |
| MSRI | $MSRI = \frac{R_{Red\_edge}}{R_{Red}} - 1$ | [28] |
| NDRE | $NDRE = \frac{R_{Nir} - R_{Red\_edge}}{R_{Nir} + R_{Red\_edge}}$ | [29] |
| RERDVI | $RERDVI = (R_{Nir} - R_{Red\_edge}) / \sqrt{R_{Nir} + R_{Red\_edge}}$ | [30] |
| SAVI | $SAVI = 2.5 \times (R_{Nir} - R_{Red}) / (R_{Nir} + R_{Red} + 0.5)$ | [31] |
| OSAVI | $OSAVI = (R_{Nir} - R_{Red}) / (R_{Nir} + R_{Red} + 0.16)$ | [32] |
| RVI | $RVI = R_{Nir} / R_{Red}$ | [33] |
| DVI | $DVI = R_{Nir} - R_{Red}$ | [34] |

Note: $R_{Green}$, $R_{Red}$, $R_{Red\_edge}$, and $R_{Nir}$ denote the reflectance of the green, red, red edge, and near-infrared bands, respectively.
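Table 3. Phenotypic Analysis of Gossypium hirsutum.

| Traits | Env | Min | Max | Mean | CV (%) | Sig |
| --- | --- | --- | --- | --- | --- | --- |
| PH (cm) | CK | 41.20 | 80.50 | 57.16 | 11.95 | *** |
| | DS | 36.90 | 72.20 | 50.71 | 13.13 | |
| FBN | CK | 4.40 | 10.30 | 7.19 | 11.18 | *** |
| | DS | 2.50 | 8.70 | 5.98 | 15.65 | |
| Non-FBN | CK | 0.20 | 5.90 | 1.85 | 36.47 | *** |
| | DS | 0.80 | 4.00 | 2.23 | 26.13 | |
| BN | CK | 3.10 | 9.00 | 5.52 | 17.57 | *** |
| | DS | 1.80 | 8.70 | 3.73 | 23.28 | |
| SY (kg) | CK | 0.80 | 3.00 | 1.70 | 20.27 | *** |
| | DS | 0.50 | 1.80 | 1.16 | 19.28 | |
| BW (g) | CK | 3.10 | 7.60 | 5.44 | 10.46 | *** |
| | DS | 3.80 | 6.60 | 5.19 | 9.43 | |
| LP (%) | CK | 26.50 | 50.40 | 41.84 | 9.11 | *** |
| | DS | 22.30 | 48.10 | 40.39 | 9.47 | |
| SI (g) | CK | 7.00 | 15.20 | 10.42 | 11.58 | * |
| | DS | 7.90 | 14.60 | 10.29 | 10.29 | |
| HFNFB (cm) | CK | 13.40 | 30.80 | 20.22 | 13.60 | *** |
| | DS | 13.50 | 33.90 | 21.60 | 14.87 | |
| FNFB | CK | 2.00 | 6.70 | 4.76 | 12.27 | *** |
| | DS | 2.50 | 8.90 | 4.93 | 15.05 | |
| FL (mm) | CK | 24.00 | 31.40 | 27.70 | 5.17 | *** |
| | DS | 23.30 | 29.90 | 26.58 | 4.56 | |
| FU (%) | CK | 80.40 | 87.60 | 84.21 | 1.56 | *** |
| | DS | 79.60 | 86.40 | 83.07 | 1.52 | |
| FM | CK | 2.70 | 5.50 | 4.53 | 9.56 | *** |
| | DS | 3.00 | 6.20 | 5.07 | 8.00 | |
| FS (cN/tex) | CK | 24.30 | 41.60 | 31.06 | 8.70 | *** |
| | DS | 24.40 | 37.50 | 30.00 | 9.10 | |
| FE (%) | CK | 6.30 | 6.90 | 6.62 | 1.82 | *** |
| | DS | 6.20 | 6.80 | 6.52 | 1.81 | |

* and *** indicate significant differences at p < 0.05 and p < 0.001, respectively.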
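As a rough worked example of how Table 3 relates to the drought resistance coefficients (DRCs) mentioned in the abstract, the sketch below divides the drought-stress (DS) mean by the well-watered (CK) mean for a few traits. The ratio definition DRC = DS value / CK value is an assumption here, since the formula is not stated in this section, and the study presumably computes DRCs per accession rather than from population means. The function name is hypothetical.

```python
# Population means copied from Table 3: trait -> (CK mean, DS mean).
TABLE3_MEANS = {
    "PH": (57.16, 50.71),
    "BN": (5.52, 3.73),
    "SY": (1.70, 1.16),
    "Non-FBN": (1.85, 2.23),
    "FM": (4.53, 5.07),
}


def drought_resistance_coefficient(ck_value, ds_value):
    """Assumed definition: trait value under drought divided by control value."""
    return ds_value / ck_value


drc = {
    trait: drought_resistance_coefficient(ck, ds)
    for trait, (ck, ds) in TABLE3_MEANS.items()
}
for trait, value in drc.items():
    print(f"{trait}: DRC = {value:.3f}")
```

Under this assumed definition, coefficients below 1 (PH, BN, SY) correspond to traits that declined under drought, while coefficients above 1 (Non-FBN, FM) correspond to the traits that increased.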
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, F.; Yang, T.; Wang, W.; Han, W.; Wang, G.; Qiao, J.; Kong, X.; Liu, L.; Si, A.; Wang, F.; et al. High-Throughput Evaluation of Cotton Drought Tolerance Using UAV Multispectral Imagery and XGBoost-Based Machine Learning. Agronomy 2026, 16, 526. https://doi.org/10.3390/agronomy16050526
