Developing an Uncrewed Aerial Vehicle (UAV)-Based Prediction Model for the Rice Harvest Index Using Machine Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The research article has been formulated in a proper manner. This article contains new and significant information adequate to justify the publication. However, the reviewer has some suggestions, as listed below.
- The contributions should be given more clearly in the introduction. The authors should clearly point out the major contributions of this paper by using 3 to 5 brief bullet points.
- Use past tense in the conclusion section.
- The short intro about machine learning is missing in the Introduction section. It would be better if the authors can refer and cite the following papers to strengthen the content of machine learning in the Introduction section: "Machine learning models based on hyperspectral imaging for pre-harvest tomato fruit quality monitoring", "An Integrated Approach Based on Fuzzy Logic and Machine Learning Techniques for Reliable Wine Quality Prediction", "Using spectral vegetation indices and machine learning models for predicting the yield of sugar beet (Beta vulgaris L.) under different irrigation treatments", and "Augmented machine learning towards smart self-powered sensing systems".
- After the literature review, highlight in 9-15 lines what overall technical gaps are observed in existing works that led to the design of the proposed methodology.
Some sentences have grammatical errors. So, the authors need to check the grammatical errors throughout the paper.
Author Response
Dear reviewer:
Thank you for your valuable and specific suggestions. We fully agree that these suggestions will help further improve the clarity and rigor of the paper. We have revised the manuscript accordingly based on your comments. The specific response is as follows:
Comments 1: [The contributions should be given more clearly in the introduction. The authors should point out the major contributions of this paper by using 3 to 5 brief bullet points.]
Response 1: [Thank you for pointing this out. We agree with this comment and have made the following changes. The following section was added at the end of the Introduction: Addressing the limitations of traditional destructive and retrospective methods for determining harvest index (HI), this study introduces and validates an innovative framework enabling non-destructive, high-accuracy, early-season HI prediction in rice. Key innovations underpin this framework:
Firstly, it fuses multi-source temporal UAV remote sensing data, integrating Structure-from-Motion (SfM)-derived canopy height models (CHM; structural traits) with critical multispectral vegetation indices (e.g., TCARI, MTCI; physiological traits) into a comprehensive feature space.
Secondly, a robust four-stage feature selection cascade (Pearson-RFE-Lasso-XGBoost) was implemented to mitigate data redundancy, successfully identifying four pivotal predictors (TCARI, GRVI, MTCI, TO) from 26 initial variables and reducing dimensionality by 84.6%.
Thirdly, incorporating data across key phenological stages (tillering, heading, maturity) captured essential temporal dynamics, enhancing predictive accuracy by 23% over single-time-point models.
Finally, a Stacking ensemble model yielded high prediction performance (R²=0.88), with SHAP analysis confirming the pronounced influence of physiological indices like MTCI and TCARI.
The primary contribution is an integrated methodology combining multi-modal sensing, stringent feature selection, and ensemble machine learning, thereby offering a novel approach for real-time, non-invasive HI assessment critical for advancing high-throughput phenotyping and precision agriculture applications.]
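As a minimal sketch of how the spectral indices named in this response are commonly computed from band reflectances, the functions below use widely published formulations of GRVI, MTCI, and TCARI. The band assignments (green near 550 nm, red near 670 nm, red edge near 700 nm, NIR near 750 nm or beyond) and the sample reflectance values are assumptions for illustration only; this record does not state the manuscript's exact band definitions, and the TO feature is not defined here, so it is omitted.

```python
def grvi(green: float, red: float) -> float:
    """Green-Red Vegetation Index: (G - R) / (G + R)."""
    return (green - red) / (green + red)


def mtci(nir: float, red_edge: float, red: float) -> float:
    """MERIS Terrestrial Chlorophyll Index: (NIR - RE) / (RE - R)."""
    return (nir - red_edge) / (red_edge - red)


def tcari(green: float, red: float, red_edge: float) -> float:
    """TCARI: 3 * [(RE - R) - 0.2 * (RE - G) * (RE / R)]."""
    return 3.0 * ((red_edge - red) - 0.2 * (red_edge - green) * (red_edge / red))


# Hypothetical plot-mean reflectances on a 0-1 scale (not measured data).
bands = {"green": 0.08, "red": 0.04, "red_edge": 0.20, "nir": 0.45}

indices = {
    "GRVI": grvi(bands["green"], bands["red"]),
    "MTCI": mtci(bands["nir"], bands["red_edge"], bands["red"]),
    "TCARI": tcari(bands["green"], bands["red"], bands["red_edge"]),
}

for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

In practice the same functions would be applied to the mean reflectance of each plot at each growth stage, giving one row of features per plot.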
Comments 2: [Use past tense in the conclusion section.]
Response 2: [Thank you for pointing this out. We agree with this comment. We have thoroughly revised the conclusion section so that all verbs describing the study’s completed actions and findings are in the past tense. Examples of key adjustments include: Original: "breaking through the limitation..." → Revised: "broke through the limitation..."; Original: "provides an innovative solution..." → Revised: "provided an innovative solution..."; Original: "Research shows that..." → Revised: "Research showed that...". The updated conclusion reads: This study investigated the application of UAV remote sensing technology for predicting the crop harvest index (HI), overcoming the traditional constraint that HI could only be measured post-harvest. By extracting spectral features, including digital surface elevation and vegetation indices (e.g., TCARI, GRVI, MTCI, and TO) from UAV imagery, significant correlations were established between these variables and HI as well as aboveground biomass. During model development, multiple machine learning algorithms were systematically evaluated, and the Stacking ensemble learning model achieved superior predictive accuracy (R² = 0.88), significantly outperforming single-algorithm approaches. The integration of multi-source remote sensing features with multi-algorithm optimization enhanced prediction robustness, which underscored the practical potential of combining UAV-derived data with machine learning for agricultural remote sensing. This work provided a technical framework for screening crop varieties with high HI potential based on vegetation indices and offered actionable insights for precision agricultural practices, including fertilization, irrigation, and yield forecasting. These advancements contributed to the foundational knowledge required for intelligent and sustainable agricultural systems.]
Comments 3: [The short intro about machine learning is missing in the Introduction section. It would be better if the authors can refer and cite the following papers to strengthen the content of machine learning in the Introduction section: "Machine learning models based on hyperspectral imaging for pre-harvest tomato fruit quality monitoring", "An Integrated Approach Based on Fuzzy Logic and Machine Learning Techniques for Reliable Wine Quality Prediction", "Using spectral vegetation indices and machine learning models for predicting the yield of sugar beet (Beta vulgaris L.) under different irrigation treatments", and "Augmented machine learning towards smart self-powered sensing systems".]
Response 3: [Thank you for pointing this out. We agree with this comment. Because the Introduction lacked a brief overview of machine learning, we added one and cited two of the suggested articles: "Using spectral vegetation indices and machine learning models for predicting the yield of sugar beet (Beta vulgaris L.) under different irrigation treatments" and "Machine learning models based on hyperspectral imaging for pre-harvest tomato fruit quality monitoring". The added text reads: Machine learning algorithms excel at handling high-dimensional, non-linear data, effectively mining complex patterns within UAV remote sensing imagery and their correlation with crop growth status. One study predicted sugar beet yield under varied irrigation using vegetation indices (OSAVI, SAVI, NDVI) and machine learning, with kNN models achieving high accuracy (testing R² up to 0.65)[9]. Another study developed a handheld hyperspectral camera integrated with machine learning algorithms to non-destructively evaluate seven critical tomato quality parameters using five optimized spectral bands, achieving efficient and cost-effective pre-harvest quality assessment[11].]
Comments 4: [After the literature review, highlight in 9-15 lines what overall technical gaps are observed in existing works that led to the design of the proposed methodology.]
Response 4: [Thank you very much for this suggestion. To explain the motivation and methodological design of this study more clearly, we have summarized the key technical gaps observed in existing research in a paragraph (limited to 9-15 lines) placed after the literature review and before the objectives of this study. We have added the following summary:
While Unmanned Aerial Vehicle (UAV) remote sensing technology has demonstrated significant utility in domains such as crop yield estimation[13] and biomass assessment[7], its application to Harvest Index (HI) prediction remains underdeveloped. Firstly, at the feature engineering level, the selection of vegetation indices often lacks systematic optimization strategies. For instance, researchers frequently compute a large number of potential vegetation indices from high-dimensional remote sensing data[14]. However, failing to adequately address the potential high collinearity (redundancy) among these indices during subsequent modeling may compromise model robustness. Secondly, in the dimension of data fusion, the synergistic interaction mechanisms between canopy three-dimensional (3D) structural parameters and spectral characteristics in influencing HI variability have not been fully elucidated. Furthermore, in terms of temporal modeling, conventional methods predominantly rely on post-harvest measurements, which limits the in-depth understanding and in situ characterization of the dynamic physiological mechanisms governing HI formation. These limitations underscore the necessity for developing novel approaches. Consequently, this study aims to design a novel method that integrates multi-source remote sensing features with optimized machine learning strategies to achieve timely and accurate prediction of rice HI.
Addressing the limitations of traditional destructive and retrospective methods for determining harvest index (HI), this study introduces and validates an innovative framework enabling non-destructive, high-accuracy, early-season HI prediction in rice. Key innovations underpin this framework:
Firstly, it fuses multi-source temporal UAV remote sensing data, integrating Structure-from-Motion (SfM)-derived canopy height models (CHM; structural traits) with critical multispectral vegetation indices (e.g., TCARI, MTCI; physiological traits) into a comprehensive feature space.
Secondly, a robust four-stage feature selection cascade (Pearson-RFE-Lasso-XGBoost) was implemented to mitigate data redundancy, successfully identifying four pivotal predictors (TCARI, GRVI, MTCI, TO) from 26 initial variables and reducing dimensionality by 84.6%.
Thirdly, incorporating data across key phenological stages (tillering, heading, maturity) captured essential temporal dynamics, enhancing predictive accuracy by 23% over single-time-point models.
Finally, a Stacking ensemble model yielded high prediction performance (R²=0.88), with SHAP analysis confirming the pronounced influence of physiological indices like MTCI and TCARI.
The primary contribution is an integrated methodology combining multi-modal sensing, stringent feature selection, and ensemble machine learning, thereby offering a novel approach for real-time, non-invasive HI assessment critical for advancing high-throughput phenotyping and precision agriculture applications.
Modification location: This paragraph has been added after the literature review in the Introduction section as a transition to introduce the specific objectives and method design ideas of this study.]
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
This study presents an uncrewed aerial vehicle (UAV)-based machine learning model to predict the rice harvest index (HI) during the growth period, aiming to replace traditional post-harvest methods. Using multispectral and visible light images, the researchers extracted key features such as TCARI, GRVI, MTCI, and TO, which showed strong correlations with HI. They applied multiple machine learning algorithms and found that the Stacking ensemble model achieved the best performance with an R² of 0.88. The method offers a promising approach for high-throughput phenotyping, precision agriculture, and the selection of high-HI rice varieties. The topic is important in the area of UAV technique.
I recommend publication subject to some revisions, as detailed later in this review.
Here are some comments/queries.
- The study uses data from a single location and growing season. Would the model perform equally well in different climates, soil types, or under varied agronomic practices? Is the dataset size and environmental diversity sufficient to generalize the model across regions and seasons?
- Although Stacking showed the best performance among tested models, it’s unclear how it compares to state-of-the-art deep learning techniques, which may handle spatial and spectral complexities better. How well does the Stacking model perform compared to deep learning models (e.g., CNNs, Transformers) that are more commonly used in remote sensing?
- The study uses k-fold cross-validation, but would independent datasets from other trials help better assess generalization? Were any external validation datasets used to confirm model robustness beyond the internal test set?
- The authors mention filtering and normalization but don’t deeply address how UAV-specific noise was treated. How does this impact model accuracy? How were outliers and noise in UAV data (e.g., due to cloud cover, shadows, or plant occlusion) handled during preprocessing?
- What is the economic feasibility of implementing this UAV-based HI prediction at the farm scale?
- Despite using techniques like Lasso and RFE, was the sample size (63 plots) statistically adequate to train high-capacity models like XGBoost and CatBoost? Is there a risk of overfitting given the high dimensionality of input features vs. relatively small sample size?
Author Response
Dear reviewer:
Thank you for your valuable and specific suggestions. We fully agree that these suggestions will help further improve the clarity and rigor of the paper. We have revised the manuscript accordingly based on your comments. The specific response is as follows:
Comments 1: [ The study uses data from a single location and growing season. Would the model perform equally well in different climates, soil types, or under varied agronomic practices? Is the dataset size and environmental diversity sufficient to generalize the model across regions and seasons?]
Response 1 : [ Thank you very much for your comment. We acknowledge that data collection for this study was limited to a single location (Rice Research Institute Base, Baiyun District, Guangzhou City, Guangdong Province) and a single growing season (2024). This is indeed a limitation of the current study. The performance of the model may vary under different climatic conditions, soil types, and agronomic management practices, which is an inevitable result of the impact of environmental factors on crop physiological and spectral characteristics.
Considerations of the current study: Despite the single location, we selected 7 rice varieties with different genetic backgrounds and harvest index ranges (covering high, medium, and low HI representatives) and performed 3 replications under a randomized block design, aiming to capture a certain degree of genetic variability and provide a differentiated data basis for model construction. The genetic stability of rice germplasm resources significantly reduces the impact of environmental variation on phenotypic plasticity (G×E effect), allowing core agronomic traits to maintain high heritability (h²>0.75) in heterogeneous habitats (Huang, X., et al., 2010), which provides a biological basis for building environmentally robust prediction models. The high-standard farmland conditions and uniform routine water and fertilizer management at the study site help reduce environmental disturbances and more clearly reveal the intrinsic relationship between remote sensing characteristics and HI, which is necessary for the initial validation of the method.
In the revised manuscript, we will clearly point out this limitation in the "Discussion" section and suggest that the results should be extrapolated to other regions with caution. The revised text is as follows: Although the rice harvest index prediction method proposed herein demonstrates high predictive accuracy, several limitations remain. Firstly, the acquisition of remote sensing data is inherently subject to constraints imposed by meteorological conditions and sensor resolution capabilities, which can potentially impact data integrity and temporal consistency. Secondly, concerning model generalizability, the current study incorporated seven rice varieties exhibiting significant genetic diversity (encompassing Indica and Japonica subspecies, and representative high, medium, and low HI lines) and utilized a randomized block design with triplicate replication, aiming to systematically analyze the genetic and phenotypic interplay between canopy characteristics and HI. The substantial heritability of key agronomic traits in rice[46] also partially mitigates the confounding influence of environment within this single setting. Nevertheless, the foundational dataset was derived exclusively from a single geographical locale (Baiyun District, Guangzhou) during one specific growing season (2024 early rice). This restricted spatio-temporal scope inherently limits the validated generalizability of the current model across diverse agroecological contexts (encompassing varying climates, soil types, management practices, and phenological cycles).
Consequently, future research should prioritize the enhancement of the model's generalization capacity and robustness. A principal objective involves substantially expanding the spatiotemporal and environmental diversity of the data sampling. Subsequent studies are planned to broaden the experimental scope through multi-locational and multi-seasonal trials encompassing diverse climatic zones (e.g., including tropical-subtropical transition regions), multiple growing seasons (e.g., incorporating early and late rice cycles), and a range of agronomic management practices (e.g., varying fertilization and irrigation regimes). Such expansion is anticipated not only to augment the dataset size but, more crucially, to facilitate the rigorous evaluation and enhancement of the model's adaptability to variations in geographical provenance, climatic conditions, and cultivation systems, thereby bolstering its broader applicability.
Furthermore, advancements on the technological front, such as the incorporation of higher-resolution hyperspectral data, hold significant promise for capturing more nuanced spectral information. Integrating these data with sophisticated deep learning algorithms (e.g., CNNs, Transformers) may potentially unlock more intricate patterns, thereby offering expanded scope for enhancing predictive accuracy. Concurrently, the exploration and development of multi-scale feature extraction frameworks, designed to synergistically integrate remote sensing information across diverse spatial and temporal resolutions, constitutes another pertinent avenue for achieving a more holistic characterization of crop physiological status.]
Huang, X., et al. Genome-wide association studies of 14 agronomic traits in rice landraces. 2010.
Comments 2: [Although Stacking showed the best performance among tested models, it’s unclear how it compares to state-of-the-art deep learning techniques, which may handle spatial and spectral complexities better. How well does the Stacking model perform compared to deep learning models (e.g., CNNs, Transformers) that are more commonly used in remote sensing?]
Response 2: [We thank the reviewer for this insightful comment regarding the comparison of our Stacking model with state-of-the-art deep learning (DL) techniques, such as Convolutional Neural Networks (CNNs) and Transformers. We concur that DL models possess significant capabilities, particularly in automatically extracting complex spatial and spectral features directly from remote sensing imagery and capturing non-linear spatial patterns.
Our modeling approach in this study was predicated on utilizing pre-extracted, plot-level features, including specific spectral indices (TCARI, GRVI, MTCI, TO) and canopy height, resulting in structured tabular data. For this specific data format, ensemble machine learning methods, such as the Stacking architecture employed, are often highly effective and were strategically chosen for several key reasons:
(1) Proven Performance on Tabular Data: Ensemble models frequently demonstrate state-of-the-art performance on structured, tabular datasets commonly encountered after feature engineering in remote sensing applications.
(2) Robustness with Moderate Sample Sizes: They generally exhibit greater robustness compared to data-hungry DL models when dealing with moderate sample sizes, mitigating the risk of overfitting, which can be a concern in field-based studies.
(3) Interpretability: These models offer enhanced interpretability (e.g., via SHAP values utilized in our study). This was valuable for identifying the specific contributions of key spectral (MTCI, TCARI) and structural features to HI prediction, thereby providing actionable agronomic insights.
Indeed, the implemented Stacking model, integrating the strengths of diverse base learners, achieved excellent predictive performance (R²=0.88) on our dataset. This validates its efficacy for this specific HI prediction task based on derived features.
We fully agree with the reviewer that exploring end-to-end DL models represents a valuable future direction for HI prediction. Future investigations could involve directly inputting UAV image patches (multispectral or hyperspectral) into architectures like CNNs or Transformers, potentially incorporating attention mechanisms, to enable automatic feature learning from raw pixel data. A direct and rigorous comparison between such DL approaches and the feature-based Stacking model presented here would be highly informative for determining optimal modeling strategies across different data modalities and application requirements.]
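To make the idea of a Stacking ensemble on plot-level tabular features concrete, the following is a minimal sketch using scikit-learn's `StackingRegressor`. It is not the authors' implementation: the base learners here (`GradientBoostingRegressor`, `RandomForestRegressor`) are stand-ins for XGBoost and CatBoost, the `RidgeCV` meta-learner is an assumption, and the data are synthetic (63 rows, 4 features) rather than the study's measurements.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for 63 plots x 4 selected features (not real data).
rng = np.random.default_rng(42)
X = rng.normal(size=(63, 4))
y = (
    0.45
    + 0.04 * X[:, 0]
    - 0.03 * X[:, 1]
    + 0.02 * X[:, 2]
    + 0.01 * X[:, 3]
    + rng.normal(scale=0.01, size=63)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Base learners produce out-of-fold predictions (cv=5) that the
# meta-learner (RidgeCV) then combines into the final estimate.
stack = StackingRegressor(
    estimators=[
        ("gbr", GradientBoostingRegressor(n_estimators=50, max_depth=2, random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=RidgeCV(),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

print(f"Held-out R^2 on synthetic data: {r2_score(y_test, y_pred):.2f}")
```

The score printed here reflects only the synthetic data and says nothing about the R² reported in the manuscript.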
Comments 3: [The study uses k-fold cross-validation, but would independent datasets from other trials help better assess generalization? Were any external validation datasets used to confirm model robustness beyond the internal test set?]
Response 3: [ We thank the reviewer for raising this important point regarding the assessment of model generalization and the value of independent external validation. In the present study, we primarily employed k-fold cross-validation (specifically, 10-fold CV) as the principal method for internal model validation. This technique was chosen as it is a standard and effective approach for estimating the performance stability and mitigating potential bias arising from a single data partition, particularly when working with limited datasets like ours. K-fold CV provides a reasonably robust estimate of the model's likely performance on unseen data drawn from the same underlying distribution as the training set. We fully concur with the reviewer that external validation using entirely independent datasets (e.g., from different trial years, geographical locations, or distinct management regimes) provides a more rigorous and definitive assessment of a model's true generalization capability across diverse conditions. At the current stage of this study, we have not yet obtained a suitable independent external dataset for validation.
We acknowledge that external validation datasets were not available for the present study. Therefore, the current assessment relies solely on internal cross-validation results. As mentioned in our response to Comment 1, our future research plans include collecting data across multiple environments. A key objective of this expanded data collection effort is precisely to enable rigorous external validation in subsequent studies. We intend to utilize data from different growing seasons and/or geographical locations as independent test sets to comprehensively evaluate the robustness and practical applicability of the developed model under conditions distinct from the initial training data.]
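The 10-fold cross-validation procedure described in this response can be sketched as follows. This is an illustrative example under stated assumptions: the data are synthetic, the model is an ordinary least-squares fit rather than the Stacking model, and the helper name `kfold_rmse` is hypothetical. It shows only how the plots are partitioned and how per-fold RMSE and its spread are obtained.

```python
import numpy as np


def kfold_rmse(X, y, k=10, seed=0):
    """Return the held-out RMSE of a linear model for each of k folds."""
    rng = np.random.default_rng(seed)
    # Shuffle the sample indices once, then cut them into k nearly equal folds.
    folds = np.array_split(rng.permutation(len(y)), k)
    rmses = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Fit intercept + slopes on the training folds only.
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        # Evaluate on the fold that was held out.
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        resid = y[test_idx] - A_test @ coef
        rmses.append(float(np.sqrt(np.mean(resid ** 2))))
    return np.array(rmses)


# Synthetic stand-in for 63 plots x 4 features (not the study's data).
rng = np.random.default_rng(1)
X = rng.normal(size=(63, 4))
y = 0.45 + X @ np.array([0.04, -0.03, 0.02, 0.01]) + rng.normal(scale=0.01, size=63)

fold_rmse = kfold_rmse(X, y, k=10)
print(f"RMSE per fold: mean={fold_rmse.mean():.4f}, std={fold_rmse.std(ddof=1):.4f}")
```

Every fold here is drawn from the same synthetic distribution, which mirrors the point made above: internal cross-validation estimates performance on similar data, whereas an independent season or site would be needed for external validation.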
Comments 4: [ The authors mention filtering and normalization, but don’t deeply address how UAV-specific noise was treated. How does this impact model accuracy? How were outliers and noise in UAV data (e.g., due to cloud cover, shadows, or plant occlusion) handled during preprocessing?]
Response 4: [ We thank the reviewer for emphasizing the need for clarity regarding the handling of UAV-specific noise and its impact on model accuracy. We concur that effectively addressing such noise is crucial for robust modeling. Our preprocessing workflow incorporated the following targeted measures:
(1) Minimized Illumination Variability, Cloud Effects, and Shadows: Optimized flight conditions were employed, specifically scheduling flights at solar noon (12:00-14:00) under clear skies to minimize shadow length and ensure stable illumination, thereby reducing shadow-related effects. Rigorous radiometric correction (using an onboard sensor and reflectance panel) converted DNs to surface reflectance, normalizing against temporal illumination changes and ensuring data comparability, thus yielding spectrally reliable features indicative of true canopy properties.
(2) Ensured Spatial Accuracy: Precise geometric correction utilizing RTK-GPS and GCPs delivered high spatial accuracy for orthomosaics and CHMs, critical for reliable feature extraction from accurately delineated Regions of Interest (ROIs).
(3) Mitigated Localized Noise, Minor Shadows, and Occlusion: Spatial aggregation via averaging valid pixel values (spectral, CHM) within ROIs inherently smoothed pixel-level noise and mitigated the influence of localized anomalies (e.g., minor shadows, occlusion, specular reflections), providing robust plot-level feature representations less sensitive to fine-scale variations.
(4) Identified and Addressed Feature-Level Outliers: Following aggregation, potential outliers in the plot-level features were identified using statistical methods, including Z-score analysis (±3σ threshold), the Interquartile Range (IQR) rule, and examination of box plots, cross-referenced with field records noting biological anomalies. Samples flagged as significant outliers, potentially stemming from residual noise or abnormalities, were subsequently removed to prevent disproportionate impacts on model training and evaluation.
Collectively, these steps systematically reduced UAV-specific noise and outlier effects, enhancing feature reliability and contributing directly to the model's high predictive accuracy (R²=0.88). We acknowledge potential residual noise impacts (e.g., from deeper shadows or complex canopy occlusion).
Future research will explore advanced techniques, including specific shadow detection/masking algorithms (e.g., the planned HSV-based approach), time-series filtering, and potentially end-to-end deep learning models (CNNs, Transformers) for direct image analysis.]
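Steps (3) and (4) above, averaging valid pixels within a region of interest and then screening plot-level features with the Z-score and IQR rules, can be sketched as below. The function names `roi_mean` and `flag_outliers`, the 1.5 × IQR fence multiplier, and the sample values are assumptions for illustration; the response does not state how the two rules were combined, so the sketch simply reports both flags.

```python
import numpy as np


def roi_mean(pixels):
    """Mean of the valid (finite) pixel values inside one region of interest."""
    pixels = np.asarray(pixels, dtype=float)
    valid = pixels[np.isfinite(pixels)]
    return float(valid.mean()) if valid.size else float("nan")


def flag_outliers(values, z_thresh=3.0, iqr_k=1.5):
    """Return boolean masks for the Z-score rule and the IQR rule."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std(ddof=1)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    z_flag = np.abs(z) > z_thresh
    iqr_flag = (values < q1 - iqr_k * iqr) | (values > q3 + iqr_k * iqr)
    return z_flag, iqr_flag


# A tiny ROI with one masked (NaN) pixel, e.g. a shadowed pixel removed upstream.
roi = np.array([[0.4, np.nan], [0.6, 0.5]])
print("ROI mean:", roi_mean(roi))

# Hypothetical plot-level feature values: 20 typical plots plus one anomaly.
plot_values = [0.50 + 0.005 * ((i % 5) - 2) for i in range(20)] + [0.95]
z_flag, iqr_flag = flag_outliers(plot_values)
print("Flagged by Z-score:", np.flatnonzero(z_flag))
print("Flagged by IQR:", np.flatnonzero(iqr_flag))
```

Flagged plots would still need to be cross-checked against field records before removal, as the response describes.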
Comments 5: [ What is the economic feasibility of implementing this UAV-based HI prediction at the farm scale?]
Response 5 : [ We thank the reviewer for this pertinent question regarding the economic feasibility of implementing the proposed UAV-based HI prediction method at the farm scale. Evaluating the practical applicability and economic viability is indeed crucial for technology adoption. The feasibility hinges on a careful consideration of the associated costs versus the potential benefits:
Cost factors: Key cost factors include the capital outlay for appropriate UAV hardware (specifically, platforms equipped with multispectral sensors and RTK capabilities for accurate positioning) and licensing for specialized data processing software (e.g., photogrammetry suites like DJI Terra, Pix4D, and potentially GIS/remote sensing software like ArcGIS or ENVI). Depending on existing expertise, costs associated with personnel training or hiring professional services for flight operations and data analysis may also be incurred. It is important to note that the costs associated with UAV technology and related software have been generally declining, potentially improving future affordability.
Potential benefits: The method proposed in this study can predict the harvest index non-destructively and quickly during the crop growth period. The benefits apply to: (a) Breeding projects: it can greatly accelerate the screening process of high HI and high-yield potential varieties, reduce the manpower and material costs of traditional destructive sampling, and improve breeding efficiency. (b) Precision agricultural management: early knowledge of HI potential can provide decision support for subsequent field management (such as variable fertilization and water resource regulation), optimize inputs, and may lead to yield increases or resource savings. (c) Yield prediction: Combined with biomass estimation, more accurate HI prediction can help improve the accuracy of final yield prediction at the regional or field level.
For large-scale commercial farms, especially high-value crops or seed production, the potential benefits brought by early, accurate prediction and management optimization may justify the investment in this technology. For breeding institutions, its efficiency improvement in high-throughput phenotyping may have significant economic value. As the technology matures and costs are further reduced, its economic feasibility for application in a wider range of farms will gradually increase.]
Comments 6: [ Despite using techniques like Lasso and RFE, was the sample size (63 plots) statistically adequate to train high-capacity models like XGBoost and CatBoost? Is there a risk of overfitting given the high dimensionality of input features vs. relatively small sample size?]
Response 6: [ We thank the reviewer for raising this pertinent concern regarding the statistical adequacy of the sample size (N=63 plots) for training potentially high-capacity models like XGBoost and CatBoost, and the associated risk of overfitting, particularly given the initial high dimensionality of remote sensing features. This is a critical consideration in machine learning modeling, and we implemented several strategies throughout our study design and analysis pipeline specifically to mitigate this risk:
(1) Rigorous, Multi-Stage Feature Selection: This was a crucial step to reduce input dimensionality before model training. We did not input all initially derived spectral indices (>20) directly into the models. Instead, a systematic four-stage feature selection process was employed (Pearson correlation → Recursive Feature Elimination (RFE) → Lasso Regression → Variance Inflation Factor (VIF) test) to identify a parsimonious set of four core features (MTCI, TCARI, GRVI, TO) exhibiting high relevance to HI and low multicollinearity. This drastic dimensionality reduction significantly lowered the feature-to-sample ratio, making the sample size of N=63 statistically more viable for the subsequent modeling and substantially reducing the risk of the curse of dimensionality and overfitting.
(2) Utilization of Regularized Models: The chosen base learners for the Stacking model, namely XGBoost and CatBoost, possess inherent regularization mechanisms. These include L1/L2 penalties on weights, controls on tree complexity (e.g., max depth, min child weight), subsampling of data and features, and learning rate shrinkage (eta). These built-in features were leveraged during model tuning to actively control model complexity and prevent overfitting. Furthermore, Lasso regression, used in the feature selection pipeline, also contributes through its L1 regularization.
(3) Robust Performance Evaluation via Cross-Validation: As detailed previously, 10-fold cross-validation was used for model training and evaluation. This provides a more robust estimate of generalization performance than a single train-test split and is effective in detecting overfitting (i.e., identifying models with high training performance but poor performance on held-out folds). The observed consistency in performance metrics (e.g., RMSE stability across folds was ±0.002) during cross-validation provides empirical evidence against significant overfitting.
(4) Benefits of Stacking Ensemble: The Stacking architecture itself contributes to robustness. By combining predictions from multiple diverse base models, Stacking often yields improved generalization and reduced variance compared to relying on a single complex model, thereby offering an additional layer of protection against overfitting.
In conclusion, although the initial sample size is relatively limited, the effective feature dimension is greatly reduced through systematic feature engineering, and combined with cross-validation and the regularization mechanism inherent in the model, we aim to minimize the risk of overfitting. The consistency of the model in cross-validation and the high prediction accuracy (R²=0.88) achieved on the test set indicate that there is a strong potential relationship between the selected features and HI, and the model has good generalization ability within the current data range. However, we will further validate and train the model with a larger sample size in the future, which will be more conducive to ensuring its stability and reliability under a wider range of conditions.]
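The four-stage selection sequence described in item (1) of this response (Pearson correlation, RFE, Lasso, VIF) can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the thresholds (`r_min`, `n_rfe`, `vif_max`), the use of `LinearRegression` inside RFE, the helper names, and the synthetic 63 × 26 dataset with one deliberately collinear column are all assumptions.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.preprocessing import StandardScaler


def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R^2_j)."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        r2 = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
        out.append(1.0 / max(1.0 - r2, 1e-12))
    return np.array(out)


def select_features(X, y, names, r_min=0.3, n_rfe=8, vif_max=10.0):
    Xs = StandardScaler().fit_transform(X)
    keep = np.arange(X.shape[1])
    # Stage 1: keep features whose |Pearson r| with the target is large enough.
    r = np.array([np.corrcoef(Xs[:, j], y)[0, 1] for j in keep])
    keep = keep[np.abs(r) >= r_min]
    # Stage 2: recursive feature elimination with a linear model.
    rfe = RFE(LinearRegression(), n_features_to_select=min(n_rfe, len(keep)))
    keep = keep[rfe.fit(Xs[:, keep], y).support_]
    # Stage 3: keep features with non-zero Lasso coefficients.
    lasso = LassoCV(cv=5, random_state=0).fit(Xs[:, keep], y)
    keep = keep[lasso.coef_ != 0]
    # Stage 4: drop the highest-VIF feature until all VIFs are acceptable.
    while len(keep) > 1:
        v = vif(Xs[:, keep])
        if v.max() <= vif_max:
            break
        keep = np.delete(keep, v.argmax())
    return [names[j] for j in keep]


# Synthetic data: 63 samples, 26 candidate features (hypothetical names).
rng = np.random.default_rng(0)
X = rng.normal(size=(63, 26))
X[:, 4] = X[:, 0] + rng.normal(scale=0.05, size=63)  # near-duplicate of x0
y = (
    1.0 * X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2] + 0.5 * X[:, 3]
    + rng.normal(scale=0.1, size=63)
)
names = [f"x{j}" for j in range(26)]

selected = select_features(X, y, names)
print("Selected features:", selected)
```

To avoid optimistic performance estimates, a selection routine like this would normally be run inside each cross-validation training fold rather than once on the full dataset.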
Author Response File: Author Response.docx