Non-Destructive Yield Prediction in Common Bean Using UAV-Based Spectral and Structural Metrics: Implications for Sustainable Crop Management
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript presents a well-designed and highly relevant study aimed at developing a non-destructive method for early yield prediction in common bean at the individual plant level by fusing UAV-based spectral data with ground-based canopy structure metrics. The primary strength of the paper lies in its innovative integration of these two distinct data sources to overcome the saturation issue of traditional vegetation indices (e.g., NDVI) in dense canopies. The specific suggestions for revision are as follows.
- The TRAC system is a ground-based instrument requiring manual measurements by encircling each plant individually. This contrasts sharply with the high-throughput data acquisition capabilities of UAVs. While the manuscript positions its framework within "high-throughput phenotyping" and "precision agriculture," the scalability of the current data collection method appears to be a bottleneck.
- In the Discussion section, could the authors elaborate on the future potential of using purely UAV-based technologies (e.g., combining multispectral imagery with LiDAR or Structure from Motion (SfM) point clouds) to derive similar canopy structural metrics? This would enable a truly high-throughput workflow.
- This study was conducted at a single site, with a single cultivar, and under idealized "non-limiting nitrogen" conditions. Although the authors acknowledge these limitations in Section 4.5, the discussion of their potential impact is somewhat insufficient. For instance, under nitrogen stress, the sensitivity of spectral indices like NDVI might become more pronounced, potentially leading to a fundamental shift in the ranking of variable importance within the model.
- The title of Table 3, "Performance comparison of predictive models," seems inaccurate. A more appropriate title would be "Coefficients and Significance of the Final Parsimonious Model" or something similar.
- Please ensure that all abbreviations are defined in full upon their first appearance in the main text, even if they are already listed in the Abbreviations section at the end of the manuscript.
- How was the "canopy height" metric measured? Was it measured manually in the field with a ruler? Please clarify this in the Methods section.
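To illustrate the purely UAV-based route raised in the comments above: canopy height is commonly derived from SfM or LiDAR products as a canopy height model (CHM), the per-pixel difference between a digital surface model (DSM) and a digital terrain model (DTM). The following is a minimal sketch under stated assumptions, not the authors' method. The function names, the per-plant boolean mask, the percentile summary, and the small elevation grids are all hypothetical.

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """Per-pixel canopy height: surface elevation minus bare-ground elevation."""
    chm = np.asarray(dsm, dtype=float) - np.asarray(dtm, dtype=float)
    # Small negative values are reconstruction noise; clip them to zero.
    return np.clip(chm, 0.0, None)

def plant_height(chm, plant_mask, percentile=95.0):
    """Summarize canopy height inside one plant's footprint (hypothetical mask)."""
    return float(np.percentile(chm[plant_mask], percentile))

# Made-up 3 x 3 elevation grids in metres, for illustration only.
dsm = np.array([[100.0, 100.4, 100.5],
                [100.0, 100.6, 100.5],
                [ 99.9, 100.0, 100.0]])
dtm = np.full((3, 3), 100.0)
plant_mask = np.array([[False, True, True],
                       [False, True, True],
                       [False, False, False]])

chm = canopy_height_model(dsm, dtm)
print("plant height (m):", plant_height(chm, plant_mask))
```

A high percentile rather than the maximum is one way to reduce sensitivity to isolated noisy pixels; the appropriate summary would need to be validated against field measurements.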
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
In tropical agricultural systems, early yield prediction for common beans (Phaseolus vulgaris L.) is crucial for enhancing productivity. The authors integrated canopy structure indices obtained via TRAC, drone-based multispectral measurements (NDVI, projected canopy area), and phenological variables collected from the R6 to R8 stages under non-limiting nitrogen conditions. They applied exploratory analysis (correlation analysis, variance inflation factor), dimensionality reduction (principal component analysis), and regularized regression (elastic net/Least Absolute Shrinkage and Selection Operator) combined with bootstrap stability selection to identify a concise and robust set of predictors. The final model comprised six variables explaining approximately 72% of the variability in plant-level grain yield, with acceptable error margins (root mean square error ~10.67 g; mean absolute error ~7.91 g). Results indicate that integrating early growth vigor, light interception, and canopy structure provides complementary insights beyond simple spectral indices. This non-destructive framework offers an efficient model for early yield estimation and supports site-specific management decisions with high spatial resolution in common bean. This approach contributes to resource-efficient crop management and supports sustainability goals in tropical agricultural systems. The article demonstrates both scientific rigor and practical applicability.
- The references in the introduction are insufficient; please add more.
- The authors randomly divided the data into 7:3 groups. Are the observed results overly coincidental? Please revise or address this issue.
- Provide the specific formulas for R², RMSE, and MAE.
- Indent the first line of row 240 by 2 characters.
- The conclusion is too long; please shorten it.
- Section 4.1 should incorporate references from other authors for support.
- Sections 4.5 and 4.6 can be merged into one section and streamlined.
- Consider incorporating additional vegetation indices and advanced algorithms.
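The requests above for explicit metric formulas and for a check on whether a single random 7:3 split could be coincidental can be sketched together. The code below defines R², RMSE, and MAE in their standard forms and repeats the 70/30 split many times to show how much the scores vary across splits. It is an illustrative sketch only: the data are synthetic stand-ins (101 rows, six predictors), the model is plain least squares rather than the authors' pipeline, and all function names are hypothetical.

```python
import numpy as np

def r2_score(y, y_hat):
    """R^2 = 1 - sum((y - y_hat)^2) / sum((y - mean(y))^2)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y, y_hat):
    """RMSE = sqrt(mean((y - y_hat)^2))."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    """MAE = mean(|y - y_hat|)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat)))

def repeated_holdout(X, y, n_repeats=200, test_frac=0.3, seed=0):
    """Refit least squares on many random train/test splits; one row per split."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    rows = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        design = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)
        y_hat = np.column_stack([np.ones(len(test)), X[test]]) @ coef
        rows.append((r2_score(y[test], y_hat),
                     rmse(y[test], y_hat),
                     mae(y[test], y_hat)))
    return np.array(rows)

# Synthetic stand-in data; NOT the manuscript's measurements.
rng = np.random.default_rng(42)
X = rng.normal(size=(101, 6))
y = X @ np.array([8.0, 5.0, 3.0, 0.0, 0.0, 2.0]) + rng.normal(scale=6.0, size=101)

scores = repeated_holdout(X, y)
for name, col in zip(("R2", "RMSE", "MAE"), scores.T):
    print(f"{name}: mean={col.mean():.3f}, sd={col.std():.3f}")
```

Reporting the mean and spread over repeated splits, rather than one split, is one way to show that the headline figures do not depend on a fortunate partition.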
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
1- The Introduction section is too short and seems incomplete; it provides very little literature review. It must be improved.
2- Moreover, regarding the paragraph starting with "This study was conducted under a controlled experimental design...", the authors need to provide more information about this "controlled experimental design".
3- In Sections 2.1 and 2.2, no weather information is available during R6–R8 (radiation/temperature/VPD) that could affect radiometry and TRAC. This Section provides only climatology.
4- The purpose of Figure 2 is not clear to the reviewer. Why was it added? Its purpose must be clearly stated in the caption and in the text.
5- Section 2.2.1: Unusually wide spacing (1.0 × 1.5 m) is justified by the authors for monitoring, but the text does not discuss implications for canopy closure/generalizability.
6- In the same section, very high N (300 kg N ha⁻¹) was noted, but the fertilizer source/split/timing were omitted by the authors. Potential confounder; authors should add detail.
7- Section 2.2.2: The camera (MAPIR Survey2) has only red and NIR bands, so only NDVI is available. This limitation (no red-edge band) isn’t acknowledged here by the authors.
8- Georeferencing specifics are missing: the number/location of GCPs and the XY/Z RMSE are not reported (the section only states platform, overlaps, GSD, and panel). Moreover, the illumination window is broad (10:00–14:00), with no sun-angle/irradiance logging or BRDF handling described.
9- Section 2.2.4: Inconsistency in predictor counts. The authors define 30 predictors (10 metrics × 3 stages) here, yet later they state that the set was “reduced from 33 to 21.” Why?
10- Mixed evaluation protocols are described together. Both the 10-fold CV and the 70/30 split are stated without a single, consistent primary protocol. Lastly, the number of stability selection iterations conflicts with the Results: here it says 1000 bootstraps; the Results say 100. Why these discrepancies?
11- In the Results Section:
- VIF values are extreme (e.g., PAI_E_R7≈1951; K_MEAN_R7≈1782), yet early OLS interpretations elsewhere risk being over-read. Emphasize penalized/nested modeling and avoid any OLS inference pre-filter.
- Typo/repetition: “values (VIF), whose extreme values (VIF)” line duplication.
- Two different performances for PCA_lm_k5 due to protocol change. Repeated CV gives R²≈0.59; 70/30 framework yields RMSE≈11.46 g. This is confusing unless authors explicitly anchor all models to one protocol first.
- Apparent vs. generalization performance conflated. You retrain on all data (n=101) and report R²≈0.72—this is an apparent fit; show out-of-sample for the fixed 6-feature set.
- Table label/content mismatch. “Table 3. Performance comparison of predictive models” actually lists standardized coefficients and p-values; authors should rename it (e.g., “Standardized coefficients for lm_top6”) and fix the comma in “FAPAR, R8”.
12- Sensor limitation not discussed. The MAPIR Survey2’s lack of red-edge/hyperspectral bands (stated in Methods) isn’t reflected in limitations/implications. Also, no quantification of each data source’s marginal value. A simple ablation (UAV-only vs. TRAC-only vs. combined) would support claims in this section. (Model comparison list is present but not ablation by source.)
13- Figures & Tables:
- Figures 1 and 2 are okay, but the authors should add scale bars/north arrow/GCP dots to improve reproducibility (site and processing are central).
- Figure 3 claims “GPS reference points,” but the text doesn’t quantify GCP accuracy elsewhere (authors should tie these together).
- Table 1 VIF extreme values merit a stronger cautionary note about interpreting OLS coefficients; currently split across sections with some duplication.
14- The authors should add the missing UAV/geo-radiometric details (GCP count/accuracy, sun angle/irradiance logging, BRDF note) and TRAC acquisition parameters.
15- The authors should publish the data/code, or at least a data dictionary, to strengthen reproducibility.
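Two of the points above, apparent versus out-of-sample fit for a fixed feature set and an ablation by data source, can be sketched in a few lines. The example fits ordinary least squares to a fixed set of columns, reports the apparent R² (fit and scored on all rows) next to a pooled 10-fold cross-validated R², and repeats this for UAV-only, TRAC-only, and combined column groups. Everything here is an assumption for illustration: the data are synthetic, and the assignment of columns to "UAV" and "TRAC" is hypothetical rather than taken from the manuscript.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients with an intercept term."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(X.shape[0]), X]) @ coef

def r2(y, y_hat):
    return float(1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))

def apparent_r2(X, y):
    """R^2 when the model is scored on the same rows it was fitted to."""
    return r2(y, predict(fit_ols(X, y), X))

def cv_r2(X, y, k=10, seed=0):
    """R^2 from pooled out-of-fold predictions (k-fold cross-validation)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    y_hat = np.empty(len(y), dtype=float)
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        y_hat[test] = predict(fit_ols(X[train], y[train]), X[test])
    return r2(y, y_hat)

# Synthetic stand-in data; column groups are hypothetical labels.
rng = np.random.default_rng(7)
X = rng.normal(size=(101, 6))
y = X @ np.array([6.0, 2.0, 5.0, 3.0, 0.0, 1.0]) + rng.normal(scale=6.0, size=101)
groups = {"UAV-only": [0, 1], "TRAC-only": [2, 3, 4, 5], "combined": [0, 1, 2, 3, 4, 5]}

results = {name: (apparent_r2(X[:, cols], y), cv_r2(X[:, cols], y))
           for name, cols in groups.items()}
for name, (app, cv) in results.items():
    print(f"{name:10s} apparent R2={app:.3f}  10-fold CV R2={cv:.3f}")
```

The gap between the two columns is the optimism of the apparent fit, and the difference between the combined row and each single-source row is a simple measure of each source's marginal value.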
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
The paper estimates common bean yield based on UAV and TRAC data during key phenological periods. It has abundant research data, incorporates a large number of vegetation structure parameters, and shows a degree of innovation. Below are some minor revision suggestions:
In the Introduction section, it is necessary to supplement the discussion on the phenological periods of common bean and explain why these specific phenological periods were selected for yield estimation. Lines 60-65 can be appropriately condensed, and the research significance should be stated in one or two sentences.
In the Methodology section, it is recommended to add an introduction to the TRAC instrument and to supplement the introductions to the LASSO and Elastic Net algorithms. It is also suggested to add a research flow chart in the Methodology section.
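For reference, the kind of introduction requested for LASSO and Elastic Net usually centres on the penalized least-squares objective. The block below gives it in one common parametrization (the one used by glmnet); the symbols are assumptions, since the manuscript's own notation is not shown in this report.

```latex
\[
(\hat{\beta}_0, \hat{\beta}) =
\arg\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^{2}
+ \lambda\left[\alpha\,\lVert\beta\rVert_{1}
+ \frac{1-\alpha}{2}\,\lVert\beta\rVert_{2}^{2}\right]
\]
```

Here $y_i$ is the yield of plant $i$, $x_i$ its vector of standardized predictors, $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mixing parameter: $\alpha = 1$ gives the LASSO, $\alpha = 0$ gives ridge regression, and intermediate values give the Elastic Net.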
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 5 Report
Comments and Suggestions for Authors
Congratulations to the authors on a good and extensive piece of work.
The article focuses on the development and validation of a parsimonious regression model to predict the individual yield of common bean plants using biophysical and structural canopy metrics at stages R6–R8. The new model proposes a reduced set of robust predictors applicable to phenotyping and precision agriculture.
The Introduction section is short and concise, and includes topics like: early yield prediction at the plant scale, UAV-mounted remote sensors, the combination of structural and optical metrics, the relationship between spectral indices and yield.
The research was conducted in a village of Colombia, as described in section 2.1.
The research methods comprise exploratory analyses (descriptive statistics, distribution assessments, and Pearson correlations with yield), dimensionality reduction (principal component analysis, with the first components physiologically interpreted to complement the regularization analysis), and regularized regression (Elastic Net/LASSO). Regularized regression was combined with bootstrap stability selection to identify a parsimonious subset of robust predictors and, finally, to develop parsimonious and more stable predictive models.
The results show that combining early vigor, radiation interception, and canopy architecture provides complementary information beyond simple spectral indices. The correlation analysis revealed strong dependence among leaf density metrics.
Based on the stability analysis, a parsimonious model was defined using six predictors. This set captures signals of early vigor, photosynthetic efficiency during development, and structural attributes at maturity.
Finally, the bootstrap analysis (1000 replications) confirmed statistical robustness.
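The bootstrap stability selection summarized above can be sketched as follows: refit a LASSO on many bootstrap resamples and record how often each predictor receives a non-zero coefficient. This is a minimal illustration, not the authors' pipeline. The LASSO is a small coordinate-descent implementation written for the example, and the penalty `lam`, the number of resamples, and the synthetic data are all assumptions.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=100):
    """LASSO by cyclic coordinate descent.

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 for centred y
    and standardized columns of X.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()          # residual for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]      # remove predictor j's contribution
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]      # add the updated contribution back
    return beta

def selection_frequency(X, y, lam, n_boot=100, seed=0):
    """Share of bootstrap resamples in which each predictor is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)    # sample rows with replacement
        Xb, yb = X[idx], y[idx]
        sd = Xb.std(axis=0)
        sd[sd == 0.0] = 1.0
        Xb = (Xb - Xb.mean(axis=0)) / sd
        yb = yb - yb.mean()
        counts += lasso_cd(Xb, yb, lam) != 0.0
    return counts / n_boot

# Synthetic stand-in data: only predictors 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=100)

freq = selection_frequency(X, y, lam=0.3)
print("selection frequency per predictor:", np.round(freq, 2))
```

Predictors whose selection frequency exceeds a chosen threshold would be retained; the threshold and the penalty are tuning choices that should be reported alongside the number of resamples.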
The subsequent section discusses the phenological and structural drivers of yield, the relative importance of spectral and structural metrics, the physiological interpretation of the predictors, the methodological contribution and robustness of the approach, its limitations and generalization, and the practical implications and future perspectives.
The conclusions show that the parsimonious model developed here stands out as a statistically robust and physiologically consistent tool for yield prediction, and they are fully supported by the findings.
The study includes a moderate number of specialized references (28), most of them very recent (85% from the last 5 years) and well chosen from high-impact research journals.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 6 Report
Comments and Suggestions for Authors
The report developed and validated a parsimonious regression model to predict the yield of common bean plants in the Montenegro region of Colombia. The methods were discussed in great detail, including data collection, data cleaning, modeling procedures, and especially the PCA analysis. The report is well organized and well written, and its conclusions are well supported by the data.
It is ready for publication, except for two minor comments:
- Line 130: How many flights per day were conducted? And what is the total number of flights?
- Line 261: The first component is associated with canopy vigor and closure during R6–R7, which is very interesting. Were any similar observations reported elsewhere, or is this unique to this study?
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The responses show that the authors have understood the review comments well and have made substantial and insightful revisions in the manuscript that have greatly improved the quality of the paper.
Author Response
Thank you for your suggestion.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have addressed all the comments raised previously and the article may be accepted for publication.
Author Response
Thank you for your suggestion.
Reviewer 3 Report
Comments and Suggestions for Authors
No more comments.
Author Response
Thank you for your suggestion.

