Spectral Index-Based Estimation of Total Nitrogen in Forage Maize: A Comparative Analysis of Machine Learning Algorithms
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
I have carefully evaluated the manuscript titled: Comparison of Machine Learning Algorithms for Estimating Total Nitrogen for Sustainable Forage Maize (Zea mays L.) Management in Northern Mexico. While the study addresses an important topic and presents valuable insights into the use of machine learning algorithms for estimating nitrogen levels in maize, several aspects require attention and improvement before it can be considered suitable for publication. Below are my criticisms, opinions, recommendations for improvement, and identification of weak aspects:
Title (Line 2):
Regarding the suggestion to change the title of the manuscript, it's crucial to consider how spectral indices play a significant role in the study's context and findings. While the current title emphasizes the comparison of machine learning algorithms for estimating total nitrogen in forage maize management, integrating spectral indices into the title can enhance its relevance and specificity.
For example: "Spectral Index-Based Estimation of Total Nitrogen in Forage Maize: A Comparative Analysis of Machine Learning Algorithms"
Line 102. Figure 1: it is important to highlight several deficiencies that hinder its effectiveness as a map representation. Firstly, the absence of essential elements such as a legend, north arrow, and scale significantly impairs the interpretability of the map. These components are vital for providing context and aiding readers in understanding the spatial distribution of the data. Furthermore, the visualization of coordinates appears to be inaccurate or unclear. Precise and correctly displayed coordinates are essential for georeferencing and spatial analysis.
A critical comment regarding Figures 2, 3, and 4 is that the number of samples (n=32) should be indicated in each figure. Including the sample size provides important context for the analysis and allows readers to assess the robustness and generalizability of the findings.
The visual clarity of figures 2-8 appears to be compromised, likely due to low resolution or inadequate image rendering. As a result, details in the plots may be difficult to discern, hindering the reader's ability to interpret the data accurately.
Regarding Figure 9, it is noted that the figure cannot be displayed correctly, indicating potential issues with font size and overall figure dimensions. Increasing the size of both the font and the figure itself can improve readability and ensure that all elements are legible, even when viewed at a larger scale.
Line 309-326. The manuscript lacks a comprehensive literature review. It would greatly benefit from a thorough discussion of previous studies on the use of machine learning algorithms for estimating nitrogen levels in crops, particularly maize. This would provide context for the current research and highlight its contribution to the existing body of knowledge.
The methodology section requires further elaboration to ensure reproducibility and transparency. Details such as the specific parameters selected for the machine learning algorithms and the criteria for selecting spectral indices should be provided. Additionally, information on the sample size (line 118), data collection process (line 113), and validation methods is essential for assessing the robustness of the study.
Line 212: Enumerating equations, such as labeling them as Equation 1, Equation 2, etc., serves as a fundamental navigational aid for readers, enabling them to refer to specific equations easily throughout the text.
Section 2.7 (line 170) lacks clarity in elucidating the specific parameters and configurations employed for both ANN and RF models. Details such as the number of hidden layers, activation functions, and optimization algorithms utilized in the ANN model, as well as the number of decision trees and splitting criteria used in the RF model, are essential for understanding the modeling process comprehensively. Without this information, readers may struggle to assess the robustness and reproducibility of the models.
While the use of k-fold cross-validation for model evaluation is commendable (line 199), the section could benefit from a more explicit explanation of the rationale behind this approach. Although k-fold cross-validation is widely recognized as a standard practice for model validation, providing a justification for its selection and discussing potential alternatives or limitations would enhance the transparency and credibility of the methodology.
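To make the rationale concrete, the following is a minimal sketch of how k-fold cross-validation partitions a small dataset. It assumes the n=32 samples mentioned above and a hypothetical choice of k=5; the function names and the random seed are illustrative and are not taken from the manuscript.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds

def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Assumed values: 32 samples (as noted for Figures 2-4) and k = 5 (hypothetical).
folds = k_fold_indices(n_samples=32, k=5)
fold_sizes = [len(f) for f in folds]
splits = list(train_test_splits(folds))
```

Every sample is held out exactly once and used for training in the remaining folds, which is the usual argument for preferring k-fold over a single train/test split when the dataset is this small. Leave-one-out cross-validation is the limiting case where k equals the number of samples.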
Additionally, the description of hyperparameter tuning could be further elaborated to provide clarity on the specific hyperparameters being optimized, the range of values tested, and the criteria used for selecting the optimal combination. Furthermore, addressing the potential challenges, such as overfitting, associated with hyperparameter tuning and the measures taken to mitigate these issues would enrich the discussion and provide valuable insights into the model optimization process.
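As an illustration of the level of detail requested, the sketch below runs a grid search scored by k-fold cross-validation and records the hyperparameter, the tested values, and the selection criterion. It is not the authors' procedure: a one-feature ridge regression on synthetic data stands in for the RF and ANN models, and the grid values, k=5, and all names are assumptions chosen for illustration.

```python
import random

def make_folds(n, k, seed=0):
    """Shuffle indices and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_ridge(xs, ys, alpha):
    """Ridge regression through the origin: slope = sum(x*y) / (sum(x*x) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def rmse(slope, xs, ys):
    """Root mean squared error of the fitted line on the given points."""
    return (sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def cv_rmse(xs, ys, alpha, folds):
    """Mean held-out RMSE across folds for one hyperparameter value."""
    scores = []
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        slope = fit_ridge([xs[j] for j in train], [ys[j] for j in train], alpha)
        scores.append(rmse(slope, [xs[j] for j in test], [ys[j] for j in test]))
    return sum(scores) / len(scores)

# Synthetic stand-in data (not from the manuscript): 32 index/nitrogen pairs.
rng = random.Random(42)
xs = [rng.uniform(0.2, 0.9) for _ in range(32)]
ys = [3.0 * x + rng.gauss(0, 0.2) for x in xs]

alpha_grid = [0.0, 0.01, 0.1, 1.0, 10.0]   # range of values tested (assumed)
folds = make_folds(len(xs), k=5)            # same folds for every candidate
results = {alpha: cv_rmse(xs, ys, alpha, folds) for alpha in alpha_grid}
best_alpha = min(results, key=results.get)  # criterion: lowest mean CV RMSE
```

Reusing the same folds for every candidate keeps the comparison fair. Because the winning score is selected on the same folds that produced it, it is an optimistic estimate of performance; a separate held-out set or nested cross-validation is the usual safeguard against this form of overfitting.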
The interpretation of results should be more thorough and nuanced. Rather than solely focusing on the percentage of variance explained by the models, the authors should discuss the practical implications of their findings and potential limitations. Additionally, a comparison of the performance of Random Forest and Neural Network algorithms in estimating nitrogen levels and the other agronomic aspects would provide valuable insights.
For example:
Line 327: The application of Random Forest algorithms in our study, complemented by insights from previous research [49, 50, 51], proved instrumental in accurately estimating total nitrogen levels in forage maize. Our findings align with similar investigations in related agricultural contexts, where Random Forest models demonstrated efficacy in predicting key agricultural parameters [49, 52]. Additionally, the incorporation of Partial Least Squares-Discriminant Analysis (PLS-DA) and the Debiased Sparse Partial Correlation (DSPC) algorithm, as described by Rodriguez et al. [52] and Rey et al. [53] respectively, offers promising avenues for further refining our predictive models. Our study underscores the interdisciplinary nature of agricultural research and highlights the potential for leveraging advanced statistical techniques to enhance sustainability and productivity in crop management.
[49]Olivares, B.O.; Vega, A.; Rueda Calderón, M.A.; Montenegro-Gracia, E.; Araya-Almán, M.; Marys, E. Prediction of Banana Production Using Epidemiological Parameters of Black Sigatoka: An Application with Random Forest. Sustainability 2022, 14, 14123. https://doi.org/10.3390/su142114123
[50]Rodríguez-Yzquierdo, G.; Campos, B.O.; Silva-Escobar, O.; González-Ulloa, A.; Soto-Suarez, M.; Betancourt-Vásquez, M. Mapping of the Susceptibility of Colombian Musaceae Lands to a Deadly Disease: Fusarium oxysporum f. sp. cubense Tropical Race 4. Horticulturae 2023, 9, 757. https://doi.org/10.3390/horticulturae9070757
[51]Vega, A.; Calderón, M.A.R.; Rey, J.C.; Lobo, D.; Gómez, J.A.; Landa, B.B. Identification of Soil Properties Associated with the Incidence of Banana Wilt Using Supervised Methods. Plants 2022, 11, 2070. https://doi.org/10.3390/plants11152070
[52]Rodríguez-Yzquierdo, G.; González-Ulloa, A.; León-Pacheco, R.; Gómez-Correa, J.C.; Yacomelo-Hernández, M.; Carrascal-Pérez, F.; Florez-Cordero, E.; Soto-Suárez, M.; Dita, M.; et al. Soil Predisposing Factors to Fusarium oxysporum f.sp Cubense Tropical Race 4 on Banana Crops of La Guajira, Colombia. Agronomy 2023, 13, 2588. https://doi.org/10.3390/agronomy13102588
[53]Rey, J.C.; Perichi, G.; Lobo, D. Relationship of Microbial Activity with Soil Properties in Banana Plantations in Venezuela. Sustainability 2022, 14, 13531. https://doi.org/10.3390/su14201353
Line 340: The conclusion is concise but could be strengthened by summarizing the key findings and their implications for sustainable maize management. Recommendations for future research directions, such as conducting analyses by phenological stage and increasing field data, should be expanded upon to guide future studies in this area.
Author Response
REVIEWER 1
Title (Line 2): Regarding the suggestion to change the title of the manuscript, it's crucial to consider how spectral indices play a significant role in the study's context and findings. While the current title emphasizes the comparison of machine learning algorithms for estimating total nitrogen in forage maize management, integrating spectral indices into the title can enhance its relevance and specificity. For example: "Spectral Index-Based Estimation of Total Nitrogen in Forage Maize: A Comparative Analysis of Machine Learning Algorithms"
Authors: The title was changed as the reviewer suggested. The authors agree with the change, which gives the study greater clarity and coherence.
REVIEWER 1: Line 102. Figure 1: it is important to highlight several deficiencies that hinder its effectiveness as a map representation. Firstly, the absence of essential elements such as a legend, north arrow, and scale significantly impairs the interpretability of the map. These components are vital for providing context and aiding readers in understanding the spatial distribution of the data. Furthermore, the visualization of coordinates appears to be inaccurate or unclear. Precise and correctly displayed coordinates are essential for georeferencing and spatial analysis.
Authors: The map in Figure 1 was corrected, and the necessary information was added to orient the reader spatially and geographically. This suggestion was important and was fully addressed.
REVIEWER 1: A critical comment regarding Figures 2, 3, and 4 is that the number of samples (n=32) should be indicated in each figure. Including the sample size provides important context for the analysis and allows readers to assess the robustness and generalizability of the findings.
Authors: The figures were edited, and the sample size was added to each figure in which the samples were analyzed.
REVIEWER 1: The visual clarity of figures 2-8 appears to be compromised, likely due to low resolution or inadequate image rendering. As a result, details in the plots may be difficult to discern, hindering the reader's ability to interpret the data accurately.
Authors: The quality of the figures was improved.
REVIEWER 1: Regarding Figure 9, it is noted that the figure cannot be displayed correctly, indicating potential issues with font size and overall figure dimensions. Increasing the size of both the font and the figure itself can improve readability and ensure that all elements are legible, even when viewed at a larger scale.
Authors: Figure 9 was enlarged and its quality was improved to make the information clearer.
REVIEWER 1: Line 309-326. The manuscript lacks a comprehensive literature review. It would greatly benefit from a thorough discussion of previous studies on the use of machine learning algorithms for estimating nitrogen levels in crops, particularly maize. This would provide context for the current research and highlight its contribution to the existing body of knowledge.
Authors: A literature review on crop nitrogen estimation using machine learning algorithms was performed and included (lines 343-361).
REVIEWER 1: The methodology section requires further elaboration to ensure reproducibility and transparency. Details such as the specific parameters selected of machine learning algorithms used, and criteria for selecting spectral indices should be provided. Additionally, information on the sample size (line 118), data collection process (Line 113), and validation methods is essential for assessing the robustness of the study.
Authors: The machine learning algorithms subsection was improved. The parameters were included (lines 221-235), along with the sample size (lines 134-142), the data acquisition process (lines 131-133), and the validation of the models (lines 267-270).
REVIEWER 1: Line 212: Enumerating equations, such as labeling them as Equation 1, Equation 2, etc., serves as a fundamental navigational aid for readers, enabling them to refer to specific equations easily throughout the text.
Authors: The equations are now numbered for the reader's reference.
REVIEWER 1: Section 2.7 (line 170) lacks clarity in elucidating the specific parameters and configurations employed for both ANN and RF models. Details such as the number of hidden layers, activation functions, and optimization algorithms utilized in the ANN model, as well as the number of decision trees and splitting criteria used in the RF model, are essential for understanding the modeling process comprehensively. Without this information, readers may struggle to assess the robustness and reproducibility of the models.
Authors: The model parameters were included as the reviewer suggested (lines 221-235).
REVIEWER 1: While the use of k-fold cross-validation for model evaluation is commendable (line 199), the section could benefit from a more explicit explanation of the rationale behind this approach. Although k-fold cross-validation is widely recognized as a standard practice for model validation, providing a justification for its selection and discussing potential alternatives or limitations would enhance the transparency and credibility of the methodology.
Authors: The k-fold cross validation section was improved (Lines 236-255).
REVIEWER 1: Additionally, the description of hyperparameter tuning could be further elaborated to provide clarity on the specific hyperparameters being optimized, the range of values tested, and the criteria used for selecting the optimal combination. Furthermore, addressing the potential challenges, such as overfitting, associated with hyperparameter tuning and the measures taken to mitigate these issues would enrich the discussion and provide valuable insights into the model optimization process.
Authors: The description of the hyperparameters of the algorithms has been improved (lines 221-235).
REVIEWER 1: The interpretation of results should be more thorough and nuanced. Rather than solely focusing on the percentage of variance explained by the models, the authors should discuss the practical implications of their findings and potential limitations. Additionally, a comparison of the performance of Random Forest and Neural Network algorithms in estimating nitrogen levels and the other agronomic aspects would provide valuable insights.
Authors: The discussion was expanded (lines 349-361).
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript nitrogen-3004864 entitled "Comparison of Machine Learning Algorithms for Estimating Total Nitrogen for the sustainable forage Maize (Zea mays L.) management in Northern Mexico" submitted by Aldo Rafael Martínez-Sifuente et al. presents an interesting experimental activity that assesses the effect of UAV and ML integration on the estimation of plant N concentration.
The experimental activity and the manuscript are very interesting. In particular, the manuscript is well written.
In my opinion, only the M&M section needs some revision. In this regard, I suggest 1) adding a detailed description of the soils and agronomic management of the two experimental sites; 2) accurately describing how the ML approaches were implemented: models, software used, scripts.
Author Response
REVIEWER 2
In my opinion, only the M&M section needs some revision. In this regard, I suggest 1) adding a detailed description of the soils and agronomic management of the two experimental sites; 2) accurately describing how the ML approaches were implemented: models, software used, scripts.
Authors:
The materials and methods section was improved as suggested by the reviewer. A detailed description of the soil in the study plots was integrated, as well as the agronomic management (lines 104-125). The ML models were included (lines 221-251), including the description of hyperparameters and k-fold cross-validation, essential for replication of the study in other sites.
Author Response File: Author Response.docx