Article
Peer-Review Record

Simulating Maize Response to Split-Nitrogen Fertilization Using Easy-to-Collect Local Features

Nitrogen 2023, 4(4), 331-349; https://doi.org/10.3390/nitrogen4040024
by Léon Etienne Parent 1,* and Gabriel Deslauriers 2
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 13 June 2023 / Revised: 30 October 2023 / Accepted: 3 November 2023 / Published: 9 November 2023
(This article belongs to the Special Issue Optimizing Fertilizer Nitrogen Use on Crops)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Comments for nitrogen-2476981

The authors presented an investigation of the maize response to nitrogen fertilization. The study combines simulation using machine learning and historical datasets, as well as verification using field experiments. The major objective of the current manuscript is to determine the N response of maize by leveraging machine learning and available data. After reading the manuscript, I felt the following items need to be addressed/clarified before the manuscript can be considered for publication.

1.      Introduction: Basically, the introduction write-up needs to include the following contents: (1) what is the significance of the major research question, and what is the current stage of this question; (2) what is the major gap in the related field and how your research fills the gap; and (3) what are the specific objectives of your research.  In the current introduction, I cannot tell (1) why selecting machine learning over other methods and (2) why selecting two specific algorithms rather than others.

 2.      Methodology: The maize records from 1992 to 2022 were applied in the machine learning process. However, I am not sure how potential temporal effects were handled in the model development. It is a well-accepted fact that atmospheric temperature has continuously increased over the past decades and that precipitation patterns have also changed. I am not sure how results trained on observations from all time periods remain valid for current and future situations.

 3.      Methodology: Following the previous comment, I would like to see the summary statistics for management practices (e.g., tillage, fertilization, irrigation, and others). Specifically, I am also wondering if these practices have changed over time.

 4.      Methodology: From Figure 2 and Table 2, it is found that a large number of missing values exist in the training datasets. Most of the selected variables don’t have records for every observation. In this case, how do the authors handle the missing data during the model training? What would be the potential impacts on the final results?

 5.      Results: The authors didn’t provide information about the variable selection. Was every variable used for both machine learning algorithms? If variables were selected, please report the selection results. In addition, please provide a sensitivity analysis for the trained model. Otherwise, it is not possible to figure out the most important driving factors.

 6.      Results: Judging from Figure 4, obvious spatial effects can be found in the universality tests. The trained model underestimated grain yield for sites 1, 2, 3, and 6, but overestimated grain yield for sites 4 and 5. What is the explanation for this observation?

 Overall, I suggest a major revision for the current manuscript. 

 

Comments on the Quality of English Language

Language is easy to follow. Minor errors can be improved if the manuscript is considered for publication. 

Author Response

Reviewer no. 1

Open Review

( ) I would not like to sign my review report
(x) I would like to sign my review report

Quality of English Language

( ) I am not qualified to assess the quality of English in this paper
( ) English very difficult to understand/incomprehensible
( ) Extensive editing of English language required
( ) Moderate editing of English language required
(x) Minor editing of English language required
( ) English language fine. No issues detected

 

 

 

Does the introduction provide sufficient background and include all relevant references? Can be improved
Are all the cited references relevant to the research? Can be improved
Is the research design appropriate? Can be improved
Are the methods adequately described? Must be improved
Are the results clearly presented? Must be improved
Are the conclusions supported by the results? Can be improved

Comments and Suggestions for Authors

Comments for nitrogen-2476981

The authors presented an investigation of the maize response to nitrogen fertilization. The study combines simulation using machine learning and historical datasets, as well as verification using field experiments. The major objective of the current manuscript is to determine the N response of maize by leveraging machine learning and available data. After reading the manuscript, I felt the following items need to be addressed/clarified before the manuscript can be considered for publication.

We thank reviewer #1 for his constructive comments.

  1. Introduction: Basically, the introduction write-up needs to include the following contents: (1) what is the significance of the major research question, and what is the current stage of this question; (2) what is the major gap in the related field and how your research fills the gap; and (3) what are the specific objectives of your research.  In the current introduction, I cannot tell (1) why selecting machine learning over other methods and (2) why selecting two specific algorithms rather than others.

Answer. We agree that the introduction needed a rewrite. More justification has been included in the introduction for the five points raised by reviewer #1.

(1) what is the significance of the major research question, and what is the current stage of this question;

l. 31-36: Compared to official N recommendations of 120 to 170 kg N ha-1 in eastern Canada, EONRs were found to vary widely between 0 and 240 kg N ha-1 [3,4]. Not considering such large variation results not only in economic losses but also in nitrate leaching that impacts water quality, and in emissions of nitrous oxide (N2O), a potent greenhouse gas (GHG) [5-7] that also depletes stratospheric ozone [8]. Fertilization decisions to tackle N2O emissions may offer greater leverage than soil C sequestration in tackling climate change [9].

 

(2) what is the major gap in the related field and how your research fills the gap;

l. 37-47: The decision to apply the “right” N rate can be assisted by predictive models. To capture the complexity of crop behavior, models must be calibrated and validated after partitioning the database into training and testing sets, and universality tests must be conducted to verify the model’s ability to generalize to unseen cases [10]. Model validation and universality tests are crucial to measure model accuracy. The present N models used to recommend the “right” N rate rely on N budgets, mechanistic-empirical concepts, parametric statistics, and non-parametric machine learning (ML) approaches. The EONR evaluated at each experimental site is the target variable in most models. However, EONRs may vary widely even among trials conducted under similar conditions, due in part to differences in experimental setups and in the selected non-linear response functions. Alternatively, the results of several field trials conducted under similar conditions can be aggregated to generate typical response patterns reflecting a specified combination of yield-impacting features. The EONRs are derived from the aggregated response patterns and then verified for generalization through unseen on-farm universality tests.

(3) what are the specific objectives of your research. 

The title has been changed to fit better with the objective.

l. 87-91: We hypothesized that (1) maize yields are predicted accurately by random forest and XGBoost using easy-to-collect yield-impacting features, and (2) the ML prediction of maize response to added N, given a combination of key features, reflects the actual response pattern observed in on-farm universality tests. Our objective was to develop reliable ML models of N response patterns of maize crops under local conditions as evidence-based tools helping maize growers reduce the detrimental effects of N overfertilization.

 

In the current introduction, I cannot tell (1) why selecting machine learning over other methods

l. 73-78: The ML decision trees are non-parametric models; they have very few parameters, scale well, and can detect multivariate interacting effects among numerous variables in high-dimensional databases [33]. Adding interactions to ML models greatly increases the number of features at the expense of model parsimony, with no gain in accuracy [34]. The ML models can capture nonlinear interactions [35] and do not require defining expected yield, N offtake, and N credits a priori. Extreme gradient boosting (XGBoost) and random forest (RF) have proved to be efficient ML methods to predict crop performance [35], EONRs [18,19,34,36,37], and maize yield [38].

(2) why selecting two specific algorithms rather than others.

We need models accurate enough to make reliable predictions and contributing to avoid excessive N fertilization that leads to environmental damage.

l. 77-79: Extreme gradient boosting (XGBoost) and random forest (RF) have proved to be efficient ML methods to predict crop performance [35], EONRs [18,19,34,36,37], and maize yield [38]. However, missing data can limit and potentially bias posterior analyses in ML models [39].
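As context for the choice of tree ensembles, the comparison of two ensemble regressors can be sketched on synthetic data. This is not the authors' pipeline: all features, coefficients, and the N × rainfall interaction below are invented, and scikit-learn's GradientBoostingRegressor stands in for XGBoost.

```python
# Sketch only: comparing two tree-ensemble regressors on synthetic
# yield-like data with a quadratic N response and an interaction term.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
n_rate = rng.uniform(0, 240, n)      # hypothetical N rate, kg N/ha
rain = rng.uniform(200, 600, n)      # hypothetical seasonal rainfall, mm
som = rng.uniform(2, 8, n)           # hypothetical soil organic matter, %

# Invented response surface: quadratic in N, plus an N x rainfall interaction
yield_t = (6.0 + 0.03 * n_rate - 5e-5 * n_rate**2 + 0.004 * rain
           + 1e-5 * n_rate * rain + 0.2 * som + rng.normal(0.0, 0.5, n))

X = np.column_stack([n_rate, rain, som])
X_tr, X_te, y_tr, y_te = train_test_split(X, yield_t, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
gb = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print("RF R2:", round(r2_score(y_te, rf.predict(X_te)), 3))
print("GB R2:", round(r2_score(y_te, gb.predict(X_te)), 3))
```

Both ensembles recover the nonlinear surface, including the interaction, without it being specified a priori, which is the property the response above appeals to.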
2. Methodology: The maize records from 1992 to 2022 were applied in the machine learning process. However, I am not sure how potential temporal effects were handled in the model development. It is a well-accepted fact that atmospheric temperature has continuously increased over the past decades and that precipitation patterns have also changed. I am not sure how results trained on observations from all time periods remain valid for current and future situations.

We replaced geographic coordinates and year of experimentation, which are merely observable features, with meteorological data that make the model transferable across periods. Indeed, the ML predictions were improved in the universality tests.

3. Methodology: Following the previous comment, I would like to see the summary statistics for management practices (e.g., tillage, fertilization, irrigation, and others). Specifically, I am also wondering if these practices have changed over time.

l. 31-32: Compared to official N recommendations of 120 to 170 kg N ha-1, EONRs were found to vary widely between 0 and 240 kg N ha-1 in eastern Canada [3,4].

l. 104-112: The yield variable and the associated features used for modeling purposes are presented in Table 1. Before 2013, tillage practice was mainly conventional (ploughing and harrowing). From 2013 to 2019, 39-52% of maize areas in Quebec were under conservation tillage (reduced tillage or no-till), with an objective of ≥70% by 2030 [43]. There were 1894 trials under reduced tillage, 1707 under no-till, and 5043 under conventional tillage. Maize was scarcely irrigated except in some coarse-textured soils. Previous crops were mostly soybean in recent trials and maize in older trials. We documented 3923 trials with soybean as the previous crop, 2692 with maize, 957 with small grains, 375 with forage crops, and 89 with other crops. The database indicated that 2192 trials received organic amendments and 3537 trials did not. The managerial features other than those listed were assumed to have been addressed adequately by growers.

 

4. Methodology: From Figure 2 and Table 2, it is found that a large number of missing values exist in the training datasets. Most of the selected variables do not have records for every observation. In this case, how did the authors handle the missing data during model training? What would be the potential impacts on the final results?

Figure 2 (now Figure 1) indicates the counts of trials conducted between 1992 and 2022; there are no missing data there.

Missing data are attributable to the differential information provided by various research teams that contributed to the historic database.

l. 137-145: There were 31% missing data in the database, primarily data not missing at random in the historic dataset, owing to different objectives and levels of financial support among research projects. Missing data can limit and potentially bias posterior analyses and can thus affect statistical inference [39]. To address missing data, data preprocessing could “rebalance” the database. The features selected in Table 1 were easy to collect and present in large numbers. Soil test results other than pH and soil organic matter have already been addressed by local recommendation guidelines and were thus eliminated. The remaining missing data were imputed. However, imputing compositional data such as soil genetic fuzzy scores and particle-size distributions directly is inappropriate because the sum of components may no longer be bounded to one or 100%. This is why compositional data were ilr-transformed before conducting the imputation, as described below. As suggested by [39,40], we used random forest imputation, which is suitable for large proportions of missing data (30-50%).
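The ilr-then-impute scheme described above can be sketched as follows. This is not the authors' code: the three-part compositions, the correlated covariate, and the 30% missingness are simulated, and scikit-learn's IterativeImputer with a random forest estimator stands in for the random forest imputation cited in the response.

```python
# Sketch: ilr-transform compositions, impute missing values with a
# random-forest-based iterative imputer, then back-transform.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

def ilr_basis(D):
    """Orthonormal (Helmert-type) contrast basis for a D-part composition."""
    V = np.zeros((D, D - 1))
    for j in range(D - 1):
        V[: j + 1, j] = 1.0 / (j + 1)
        V[j + 1, j] = -1.0
        V[:, j] *= np.sqrt((j + 1) / (j + 2))
    return V

def ilr(x, V):
    """Isometric log-ratio transform of closed compositions (rows sum to 1)."""
    return np.log(x) @ V

def ilr_inv(z, V):
    """Back-transform ilr coordinates; rows are re-closed to sum to 1."""
    e = np.exp(z @ V.T)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
comp = rng.dirichlet([4, 3, 2], size=300)          # e.g., sand/silt/clay shares
V = ilr_basis(3)
Z = ilr(comp, V)
covar = 2.0 * Z[:, 0] + rng.normal(0.0, 0.1, 300)  # a correlated covariate
data = np.column_stack([Z, covar])

mask = rng.random(data.shape) < 0.3                # knock out ~30% of entries
data_missing = np.where(mask, np.nan, data)

imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    max_iter=5, random_state=0)
filled = imputer.fit_transform(data_missing)
comp_back = ilr_inv(filled[:, :2], V)              # valid compositions again
```

Imputing in ilr space and back-transforming guarantees that every imputed composition still sums to one, which direct imputation of the raw parts does not.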

 

5. Results: The authors didn’t provide information about the variable selection. Was every variable used for both machine learning algorithms? If variables were selected, please report the selection results. In addition, please provide a sensitivity analysis for the trained model. Otherwise, it is not possible to figure out the most important driving factors.

l. 182-185: The RReliefF ranks predictors according to their relevance to the target variable in problems with strong dependencies [56]. In regression problems, RReliefF computes a difference between actual and predicted values based on the nearest-neighbor paradigm and operates in a non-myopic manner (considering feature interactions). We used the RReliefF algorithm as supplied by the Orange Data Mining freeware v. 3.34.0.

Figure 3: results of RReliefF.

Table 1: sensitivity analysis. It is noteworthy that scenario #3 can predict yield accurately using meteorological data collected before stage V6, an exceptionally good result. The split-N rate can then be computed as the difference between the total N requirement and the N rate applied at seeding.
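A minimal numpy sketch of the RReliefF idea may help readers unfamiliar with the algorithm (the actual analysis used the Orange Data Mining implementation). The data, the two features, and the simplifications (Manhattan distance, uniform neighbour weights) are all assumptions for illustration.

```python
# Simplified RReliefF (after Robnik-Sikonja & Kononenko): a feature is
# relevant when its differences track target differences among neighbours.
import numpy as np

def rrelieff(X, y, k=10, n_samples=None, rng=None):
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    Xr = (X - X.min(0)) / (np.ptp(X, axis=0) + 1e-12)   # scale to [0, 1]
    yr = (y - y.min()) / (np.ptp(y) + 1e-12)
    idx = rng.choice(n, size=min(n_samples or n, n), replace=False)
    n_dc, n_da, n_dcda = 0.0, np.zeros(p), np.zeros(p)
    for i in idx:
        d = np.abs(Xr - Xr[i]).sum(axis=1)              # Manhattan distance
        d[i] = np.inf                                   # exclude the instance
        nn = np.argsort(d)[:k]                          # k nearest neighbours
        dy = np.abs(yr[nn] - yr[i])                     # target differences
        da = np.abs(Xr[nn] - Xr[i])                     # feature differences
        n_dc += dy.sum()
        n_da += da.sum(axis=0)
        n_dcda += (dy[:, None] * da).sum(axis=0)
    total = len(idx) * k
    return n_dcda / n_dc - (n_da - n_dcda) / (total - n_dc)

rng = np.random.default_rng(7)
x_rel = rng.uniform(0, 1, 400)        # informative feature
x_noise = rng.uniform(0, 1, 400)      # irrelevant feature
y = 3.0 * x_rel + rng.normal(0, 0.1, 400)
w = rrelieff(np.column_stack([x_rel, x_noise]), y)
print("RReliefF weights:", np.round(w, 3))
```

The informative feature receives a clearly higher weight than the noise feature, illustrating how the ranking in Figure 3 can be read.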

6. Results: Judging from Figure 4, obvious spatial effects can be found in the universality tests. The trained model underestimated grain yield for sites 1, 2, 3, and 6, but overestimated grain yield for sites 4 and 5. What is the explanation for this observation?

In this paper, we look for the EONR as the slope of the response curve. Maize yields may vary among models, but the slopes are similar. In two cases, growers’ results did not match the models’ outcomes.

l. 247-252: While yields predicted by random forest and XGBoost were close to actual yields in most universality tests and within RMSE or MAE values, actual yields at sites #3 and #10 were lower than predicted. At site #3, there was evidence of spatial variability among replicates that may require delineating more soil management zones. At site #10, features such as drainage, land levelling, slope, and a compacted subsurface layer could have limited yield but were insufficiently documented for inclusion in the ML models. In such situations, the decision should rely on additional on-site investigation.
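The notion of EONR as the point where the slope of an aggregated response pattern equals the fertilizer-to-grain price ratio can be illustrated with a hypothetical quadratic fit. The yields, N rates, and prices below are invented for illustration only.

```python
# Sketch: EONR from a quadratic yield response y = b0 + b1*N + b2*N^2,
# solving dy/dN = (price of N) / (price of grain).
import numpy as np

n_rates = np.array([0.0, 60.0, 120.0, 180.0, 240.0])   # kg N/ha (assumed)
yields = np.array([7.1, 9.4, 10.8, 11.3, 11.4])        # t/ha (assumed)

b2, b1, b0 = np.polyfit(n_rates, yields, 2)   # quadratic response curve
price_ratio = 1.5 / 250.0                     # $/kg N over $/t grain (assumed)
eonr = (price_ratio - b1) / (2.0 * b2)        # marginal gain = marginal cost
print(f"EONR = {eonr:.0f} kg N/ha")
```

Because the EONR depends only on the derivative of the fitted curve, a constant offset in the intercept leaves it unchanged, which is the point made in the response above.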

 

Overall, I suggest a major revision for the current manuscript. 

The manuscript has been revised thoroughly.

 

Comments on the Quality of English Language

Language is easy to follow. Minor errors can be improved if the manuscript is considered for publication. 

The former version was examined by the MDPI editing service for the quality of the English. However, the major revision implied reshuffling sentences and adding new material. We hope to meet the requirements for the quality of the English.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors,

please see comments in the attachment

Comments for author File: Comments.pdf

Author Response

Reviewer #2

Submission Date

13 June 2023

Date of this review

20 Jun 2023 21:46:12

 

Reviewer no. 2

Open Review

(x) I would not like to sign my review report
( ) I would like to sign my review report

Quality of English Language

(x) I am not qualified to assess the quality of English in this paper
( ) English very difficult to understand/incomprehensible
( ) Extensive editing of English language required
( ) Moderate editing of English language required
( ) Minor editing of English language required
( ) English language fine. No issues detected

 

 

 

Does the introduction provide sufficient background and include all relevant references? Can be improved
Are all the cited references relevant to the research? Can be improved
Is the research design appropriate? Yes
Are the methods adequately described? Must be improved
Are the results clearly presented? Can be improved
Are the conclusions supported by the results? Must be improved

Comments and Suggestions for Authors

Dear authors,

please see comments in the attachment

General answer: interactions among features can be addressed by parametric or non-parametric methods, depending on the objective of the research. Our objective was to develop a useful and accurate model to assist growers in making wise decisions on N fertilization rates. The important issue is to extract information from the features available in the database. We elaborated ML models using data that are easy to collect and little impacted by interactions. The RReliefF algorithm was instrumental in selecting important features while accounting for interactions. Random forest imputation is an efficient method to impute missing-not-at-random data (not measured by researchers) in large databases.


peer-review-30230339.v1.pdf

Review of Site-specific maize (Zea Mays) response to nitrogen fertilization

The manuscript presents a comparison of two ML approaches in the analysis of a large set of 461 trials with 8909 observations. Environmental covariates and treatment-specific fertilization amounts were used in a training dataset to predict yield response in a test dataset. Additionally, N responses were predicted for seven universality trials. The data are unbalanced, as most environmental information was incomplete. The authors showed that XGBoost is the preferred ML algorithm. The paper is based on a large and potentially interesting dataset. ML approaches are easy-to-apply black-box approaches; in particular, random forest does not necessarily find the best model at all. While the data are interesting and the work itself is not too bad, the published work lacks deeper interpretation and lacks some baseline model to show that ML has any advantage at all. Details of the ML approaches remain unclear.

Answer: Reviewer #2 requires major changes in the modelling process. We thank him for his great comments. We indeed had to manage multi-environment trials (METs) and decided to apply non-parametric ML methods to solve this complex system. Our objective was to gain as much information as possible from easy-to-collect features in a database made of 461 trials conducted over 30 years by different research teams. We thus retained both random forest and XGBoost as baseline models for the universality tests. Comparing parametric to non-parametric models is another interesting objective that could be addressed in future studies using balanced designs such as those of Qin et al. (2017), with 47 US trials, and Ransom et al. (2018), with 49 US field trials conducted over 3 years.

 

 Major remark:

The title is misleading. The paper estimates site-specific means AND a response to N fertilization. This is not the same as a site-specific response at all, as the model does not allow for different responses to N depending on the site.

Answer:

The title has been changed to fit better with the objective.

l. 37-47: The decision to apply the “right” N rate can be assisted by predictive models. To capture the complexity of crop behavior, models must be calibrated and validated after partitioning the database into training and testing sets, and universality tests must be conducted to verify the model’s ability to generalize to unseen cases [10]. Model validation and universality tests are crucial to measure model accuracy. The present N models used to recommend the “right” N rate rely on N budgets, mechanistic-empirical concepts, parametric statistics, and non-parametric machine learning (ML) approaches. The EONR evaluated at each experimental site is the target variable in most models. However, EONRs may vary widely even among trials conducted under similar conditions, due in part to differences in experimental setups and in the selected non-linear response functions. Alternatively, the results of several field trials conducted under similar conditions can be aggregated to generate typical response patterns reflecting a specified combination of yield-impacting features. The EONRs are derived from the aggregated response patterns and then verified for generalization through unseen on-farm universality tests.

l. 87-91: We hypothesized that (1) maize yields are predicted accurately by random forest and XGBoost using easy-to-collect yield-impacting features, and (2) the ML prediction of maize response to added N, given a combination of key features, reflects the actual response pattern observed in on-farm universality tests. Our objective was to develop reliable ML models of N response patterns of maize crops under local conditions as evidence-based tools helping maize growers reduce the detrimental effects of N overfertilization.

 

This has two major consequences. First, add a baseline model: the authors fit two ML approaches. Independent of the result, this cannot prove that ML is a preferred method that should be used for the analysis. It only shows that ML gives a certain accuracy, and that one method is better than the other (but both can work fine or not). There may be much better approaches. The state-of-the-art analysis of this type of data is a mixed model that can be extended by environmental covariates. There is lots of recent work on this topic, e.g., https://doi.org/10.1007/s00122-022-04186-w (there are many more papers on this topic; no need to cite this manuscript itself). The advantage of mixed models is that they can potentially exploit all information (in contrast to RF, which is limited to a given number of variables). The authors need to fit a baseline mixed model, a common MET data analysis, ignoring all environmental covariates but adding in-trial covariates. This model should be treated as a benchmark and can be extended by environmental covariates (to predict the intercept and/or the N response), or can be replaced by any ML approach. The gain of ML approaches should be discussed relative to those alternatives. The current ML-only analysis tells nothing.

Answer:

The reviewer suggests that mixed models should be viewed as baseline models including environmental interactions. Recent papers on N fertilization issues analyzed the data using machine learning models. In general, ML accuracies exceeded 80%, within the range of the best models used in the agronomic and clinical sciences. We could not find a MET-applied mixed model for comparison with ML models; this would be a gigantic task. Environmental interactions may include not only meteorological data at a given level of aggregation but also managerial and soil data. Interactions may include 2-way, 3-way, 4-way, ... interactions, generating myriads of variables that are difficult to interpret in the realm of complex agroecosystems. The significance level selected to reject interactions may dismiss crucial effects (Amrhein et al., 2019). In comparison, ML models can capture nonlinear interactions [35] (l. 75-76). The interaction issue was also addressed by Ransom et al. (2018), who found little effect on RF accuracy after adding interactions.

l. 82-86: Treatment × environment interactions and variance heterogeneity among genomic trials may be addressed parametrically using mixed models for balanced databases of multi-environment trials (MET) that include interactions [41]. However, random forest used to predict maize EONR largely outperformed ridge, lasso, principal component, and partial least squares regression models, even after adding interactions [34]. Moreover, using statistical significance to support conclusions in parametric approaches may dismiss crucial effects [42].
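For readers unfamiliar with the kind of baseline the reviewer describes, a within-site (fixed-effects) regression is a simplified stand-in for a full mixed model: site-level intercepts are removed by demeaning per site before the pooled N slope is estimated. All numbers below are simulated, not from the study.

```python
# Sketch: within-site ("fixed effects") estimate of the N slope,
# a simplified stand-in for a mixed-model MET baseline.
import numpy as np

rng = np.random.default_rng(3)
n_sites, per_site = 8, 40
site = np.repeat(np.arange(n_sites), per_site)
site_effect = rng.normal(0.0, 1.5, n_sites)     # hypothetical site intercepts
n_rate = rng.uniform(0, 240, n_sites * per_site)
yield_t = (8.0 + site_effect[site] + 0.012 * n_rate
           + rng.normal(0.0, 0.5, site.size))

def demean_by_group(v, g):
    """Subtract each group's mean (the 'within' transformation)."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

y_w = demean_by_group(yield_t, site)
x_w = demean_by_group(n_rate, site)
slope = (x_w @ y_w) / (x_w @ x_w)   # pooled within-site N slope
print(f"within-site N slope: {slope:.4f} t/ha per kg N (simulated truth 0.012)")
```

Such a benchmark isolates the N response from site-level intercept differences; whether it is worth fitting alongside the ML models is precisely the point under debate here.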

 

Second, there are two types of covariates, both used as input for the ML approaches: variables that describe the trial location and year, and variables that describe in-trial variance. The effects of both should be separated, as the first estimates the correct intercept while the second estimates the slope. Interactions of N and site-specific covariates would allow what the authors claimed they had done: a site-specific response to N.

Answer:

Trial location and year were replaced by meteorological features (weekly precipitations and CHU). To make the models more flexible in addressing climate variations, we used meteorological data as random variables instead of geographic coordinates and year of experimentation as fixed variables. The ML models analyze combinations of all features, and the response pattern is obtained from the regression. We aim to reach high gains of information by adding variables, not to test variables and their interactions for significance. Model reliability is secured not only by model accuracy but also by the model's ability to generalize to unseen cases in growers' fields (universality tests).

 

The presented results on the universality trials showed that the general relationship between yield and N can be explored well, but the intercept varies from expectation. No information on site-specific response is given. The revision should separate the two types of effects and present accuracy information for both.

More complete information on site features is provided in Table 2. It is not possible to compute a reliable intercept because all response patterns start with a minimum N fertilization called “starter fertilizer” at seeding. The EONR is the slope of the response pattern, hence the intercept is of no interest. It is common to verify predictive models within a given range of N rates to avoid too many treatments of little interest to the grower, as shown by Kyveryga et al. (2007, 2011).

 

Further comments:

Missing discussion of causal influences: The major disadvantage of ML is that there is no causal relationship explaining which variables influence which process in yield development. In RF there is an option that shows the influence of variables, but the results are not shown in the manuscript. The title of the manuscript furthermore reads as if there were proof of a causal relationship between N and yield, but ML in general does not allow proving causal relationships. Note that the ML approaches predict higher yields for 50 than for 75 kg N. This is not plausible, but plausibility is not part of ML prediction. It would be interesting to see how the baseline model handles this artefact.

Answer:

It is well known that maize responds to N fertilization. The RReliefF results in Figure 3 present the relative importance of features for maize yield, all of which have already been documented as impacting maize yield. The sensitivity analysis in Table 1 shows that some features contribute little to model accuracy. The N dosage is indeed an important feature. Our fertilizer trials generally comprised 5 N doses, often with different spacings, compared to 8 for Qin et al. (2017) and Ransom et al. (2018), who computed site-by-site EONR values. The number of N rates must be high to draw a response curve. This can be achieved by aggregating field trials conducted under similar conditions, but it realistically results in discontinuous response trends that do not necessarily increase monotonically, especially close to the yield plateau.

We could also have varied meteorological features to predict maize yields under climate change, but this would be a different objective. An important aspect of the present model is that meteorological features up to V6 suffice to reach high model accuracy, allowing the split-N rate to be recommended as the difference between the total N requirement and the N already applied at seeding. Our very objective is to avoid excessive N fertilizer applications in a convincing, evidence-based manner for growers, who are the decision makers.

 

Survey data: The data were taken from several experiments. The site-specific covariates were not part of the randomisation process and may therefore show patterns typical of survey data; e.g., planting date and longitude are probably related to each other. Furthermore, N, P, and K are often given in a certain ratio. In experiments, independence of factors can be guaranteed; in the authors' data, it cannot. This issue should be discussed. The authors stated that they assumed that all nutrients except N were adequately available. If this is the case, regressions on K and P should not be explored.

Answer:

In complex systems, all features are interrelated and impact crop yield to some degree. Features could be selected a priori from common knowledge of inter-relationships. This is why we eliminated soil tests that provide recommendations for nutrients other than N (there are no a priori ratios among elements). Soil tests and nutrient recommendations for nutrients other than N are thus related and aim to provide those nutrients in sufficient amounts. The RReliefF ranks features in order of importance for the target variable and may thus help in feature selection. Features could also be removed if they contribute little to model accuracy; this was the case for PSNT and bulk density. The features retained for modelling purposes are presented in Table 2.

 

Missing data:

There is no information given on how the authors handle the missing-data problem. In RF there are imputation approaches that vary in calculation time. The authors should add the missing information and should discuss the consequences, especially whether the missing-data pattern is (completely) at random or not.

We eliminated features that showed too many missing data and retained those presented in Table 1. This is normal where data are assembled from different sources (different objectives and differential financial support). We used RF imputation (l. 137-145) as the most appropriate method where missing data made up 30-50% of the dataset, as suggested by:

Petrazzini, B.O.; Naya, H.; Lopez-Bello, F.; Vazquez, G.; Spangenberg, L. Evaluation of different approaches for missing data imputation on features associated to genomic data. BioData Mining 2021, 14, 44. https://doi.org/10.1186/s13040-021-00274-7

Kokla, M.; Virtanen, I.; Kolehmainen, M.; Paananen, J.; Hanhineva, K. Random forest-based imputation outperforms other methods for imputing LC-MS metabolomics data: a comparative study. BMC Bioinformatics 2019, 20, 492. https://doi.org/10.1186/s12859-019-3110-0

 

Minor remarks:

The authors used 40 trees in RF but showed no evidence that 40 is satisfactory. The R² seems to relate to a regression model of actual on predicted yield. Note that the model includes an intercept; I cannot see a logical interpretation for it. It further shows a bias for lower values, as all values are below the curve. Please give a justification for the intercept. Results from validation showed that, depending on the site, predicted values are much higher or lower, and therefore do not fit a correct intercept.

Answer:

We reran the models using meteorological features rather than geographic coordinates and year of experimentation (Figure 2). The relationship between predicted and actual yields is an outcome of the model given the specified features. Sources of error are experimental error and features that limited yields but were not included in the model. For sites like #3 and #10, soil management zones should be better delineated, and the missing yield-limiting features should be identified by field visits. Finally, the EONR is the slope of the model close to maximum yield; hence, the intercept has no impact.
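The point about the intercept can be illustrated numerically: with a quadratic yield response, the economic optimum N rate (EONR) depends only on the slope of the curve relative to the price ratio, so a vertical shift of the whole curve (a different intercept) leaves it unchanged. The yields and prices below are invented for the sketch, not taken from the study.

```python
import numpy as np

# Hypothetical quadratic yield response: yield (Mg/ha) vs N rate (kg/ha)
n = np.array([0, 50, 100, 150, 200, 250], float)
yld = np.array([6.0, 8.5, 10.2, 11.2, 11.5, 11.4])

# Fit yield = a*N^2 + b*N + c
a, b, c = np.polyfit(n, yld, 2)

# EONR: marginal yield value equals fertilizer cost,
# i.e. dY/dN = 2a*N + b = price_N / price_grain
price_ratio = 5.6 / 1000  # e.g. 5.6 $/kg N, 1000 $/Mg grain -> Mg per kg
eonr = (price_ratio - b) / (2 * a)

# Changing the intercept c shifts the curve vertically but leaves
# dY/dN, and hence the EONR, unchanged.
```

Here `eonr` falls somewhat below the N rate that maximizes yield, as expected for a concave response.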

 

Final remarks:

I know that MDPI revision is normally due in 3 to 7 days. From my experience, running the models suggested will take much longer.

 

Sorry for the delay.

 

Summary of the review:

The manuscript has the potential to be a valuable paper after a careful revision that adds a baseline model with and without environmental covariates. Without that extension, the manuscript is uninteresting to the audience. Note that whenever you compare two approaches, one is better than the other, but this does not tell anything about the quality of either model.

Answer:

Our objective was to build an accurate and useful parsimonious model from easy-to-collect data, not to test for interactions among features, which are already handled by ML models. There are myriad interactions in the real world that cannot be entirely handled by any model. For example, the genotype × environment interactions in MET are parametrically limited to meteorological features aggregated in a certain way, and do not consider other important features such as soil quality, previous crops, and tillage practices. The paper may not be of great interest to a certain audience of modelers but is of great importance in challenging present N fertilization practices that contribute to climate change. As George Box said: “All models are wrong, but some are useful”.

 

Submission Date

13 June 2023

Date of this review

26 Jun 2023 10:13:33

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

From the response to the reviewer I cannot see that my comments were accounted for. The authors explained what was done but ignored what should be changed.

A comparison with mixed models as a baseline approach (or any other non-ML approach that can serve as a baseline model) is mandatory to show that ML is reasonable at all. An R² of 0.8 tells nothing. Such a comparison is certainly not limited to balanced cases and was made, e.g., in

https://link.springer.com/article/10.1007/s10260-022-00658-x

Mixed model approaches were listed in the introduction of

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6540942/

which is a nice paper on a similar topic. There is much more literature on mixed models for MET; I really do not understand that the authors claim they found nothing ("We could not find MET-applied mixed model for comparison with ML models."). Many things that were done in the second paper can and need to be done in the current paper; see also the comments about this in the conclusion. Note that there are many more papers on MET analysis and ML approaches showing limited gain of ML approaches compared to mixed models for small data but advantages for very large data sets. The two papers given can be used as a good starting point for a literature review.

Also, the second point was completely ignored. ML approaches are black-box approaches that do not allow drawing causal relationships. My comment was that the authors used covariates (as ML input) on two different levels: covariates that describe sites and covariates that describe the N response. As the paper tells that the latter was the aim, it is mandatory to clearly distinguish between a good prediction of the average site mean and a good prediction of the N response within a site. This is still ignored.

Once again: I like the data and the work made so far. But the current version has serious omissions on points that were already solved and published in recent years. I see no argument why the authors should not be able to include the progress made over the last years into their work.

For more details please see my first review

 

Author Response

From the response to the reviewer I cannot see that my comments were accounted for. The authors explained what was done but ignored what should be changed.

Answer:

Our objective differed from the reviewer’s. The reviewer’s objective was apparently to run statistical tests such as those conducted by genotype-developing companies to compare their genotypes tested in different environments. This is fine to convince growers to adopt new technologies backed by conventional statistical significance tests, but it was not the objective of our paper. We needed an easy-to-run decision-support model to abate on-farm GHG emissions, based on known causal relationships between the target variable and easy-to-collect, key informative features.

We added references, as suggested by the reviewer, in the introduction, as well as some more to stress the importance of abating GHG emissions through wise split N applications that avoid speculating on insurance N. Note that, beyond model performance, the most important issue for facilitating adoption is the model’s ability to generalize to unseen cases. While no model is perfect and each must rely on the information available, we could reduce the N rate by 50 kg N ha-1 in most universality tests (quality control of model outcomes at the farm gate) compared to growers’ auto-diagnoses. This is comparable to Adapt-N, the empirical-mechanistic model commonly used in the Midwest.

We made several minor additions and adjustments, so a re-reading will be necessary. We still hope that this new version of the manuscript will meet the reviewer’s satisfaction. In any event, we thank the reviewer sincerely for the debate.

 

A comparison with mixed models as a baseline approach (or any other non-ML approach that can serve as a baseline model) is mandatory to show that ML is reasonable at all. An R² of 0.8 tells nothing. Such a comparison is certainly not limited to balanced cases and was made, e.g., in

https://link.springer.com/article/10.1007/s10260-022-00658-x

Mixed model approaches were listed in the introduction of

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6540942/

which is a nice paper on a similar topic. There is much more literature on mixed models for MET; I really do not understand that the authors claim they found nothing ("We could not find MET-applied mixed model for comparison with ML models."). Many things that were done in the second paper can and need to be done in the current paper; see also the comments about this in the conclusion. Note that there are many more papers on MET analysis and ML approaches showing limited gain of ML approaches compared to mixed models for small data but advantages for very large data sets. The two papers given can be used as a good starting point for a literature review.

Answer:

We should not be too dogmatic about the choice of the model or of some baseline model. As indicated by Hu et al. (2022), ‘the nature of data is of primary importance rather than the learning technique’. An effective model should decrypt the information in the database in relation to the objective. Our database comprised fertilizer METs, with the objective to extract maize N response patterns from a combination of specified features that are easy for growers to collect. Our objective differed from classifying genotypes for the genomic sector using statistical tests across environmental conditions. Surprisingly, the MET papers on the genotype × environment interaction do not account for soil texture, SOM, pH, soil genesis, tillage practice, N dosage and the previous crop. The environment is often defined by location and year or by large timesteps for meteorological data, while we used a narrow weekly timestep that is convenient for crop management (N, irrigation, …). With detailed environmental information, the number of potential interactions would become astronomical and require long computing times to run mixed models. Nevertheless, we ran a parsimonious mixed model, but it returned disappointing results (R2 = 0.487, RMSE = 1.996, and MAE = 1.529) compared to the ML models. The mixed model was thus discarded for its low performance.
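For clarity, the three metrics used in this comparison are the standard ones and can be computed as follows; the actual and predicted yields below are placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Placeholder actual vs predicted maize yields (Mg/ha)
actual = np.array([8.2, 9.5, 10.1, 7.4, 11.0, 9.8])
predicted = np.array([8.0, 9.9, 9.7, 7.9, 10.5, 9.6])

r2 = r2_score(actual, predicted)                       # variance explained
rmse = np.sqrt(mean_squared_error(actual, predicted))  # penalises large errors
mae = mean_absolute_error(actual, predicted)           # average absolute error

print(f"R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

A model with R2 = 0.487 explains roughly half of the yield variance, which is why the higher-R2, lower-RMSE and lower-MAE models were preferred for prediction.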

Note that ML models are also fully convenient to analyze the genotype × environment interaction, and did not require comparison with the mixed model, see:

Westhues, C.C.; Simianer, H.; Beissinger, T.M. learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trials data. G3 2022. 12(11), jkac226. https://doi.org/10.1093/g3journal/jkac226

Feature interactions can be handled by ML models, see:

Huynh-Thu, V.A.; Geurts, P. Unsupervised Gene Network Inference with Decision Trees and Random Forests. In: Sanguinetti, G., Huynh-Thu, V. (eds) Gene Regulatory Networks. Methods in Molecular Biology 2019, 1883. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8882-2_8

The importance of every feature can be ranked using RReliefF (as required by reviewer #1), see:

Robnik-Šikonja, M.; Kononenko, I. Theoretical and Empirical Analysis of ReliefF and RReliefF. Machine Learning 2003, 53, 23-69.

RReliefF is enough for feature selection. We used R2, RMSE and MAE for comparisons among models. The model’s ability to generalize was validated using universality tests for the selected range of N doses, as required by Kyveryga et al. (2007), see:

Kyveryga, P.M.; Blackmer, A.M.; Morris, T.F. Alternative Benchmarks for Economically Optimal Rates of Nitrogen Fertilization for Corn. Agronomy Journal 2007, 99, 1057–1065. doi:10.2134/agronj2006.0340

 

Also, the second point was completely ignored. ML approaches are black-box approaches that do not allow drawing causal relationships. My comment was that the authors used covariates (as ML input) on two different levels: covariates that describe sites and covariates that describe the N response. As the paper tells that the latter was the aim, it is mandatory to clearly distinguish between a good prediction of the average site mean and a good prediction of the N response within a site. This is still ignored.

Answer:

Indeed, ML models are claimed to be black-box models, but they are widely used to solve complex problems in health science, agriculture, ecology, etc. In our case, we needed an easy-to-run decision-support model based on already documented causal relationships between the target variable and key specified features that are familiar to stakeholders. This is no different from other papers using ML models for fertilizer METs, see:

Wang, X.; Miao, Y.; Dong, R.; Zha, H.; Xia, T.; Chen, Z.; Kusnierek, K.; Mi, G.; Sun, H.; Li, M. Machine learning-based in-season nitrogen status diagnosis and side-dress nitrogen recommendation for maize. European Journal of Agronomy 2021, 123, 126193.

Ransom, C.J.; Kitchen, N.R.; Camberato, J.J.; Carter, P.R.; Ferguson, R.B.; Fernandez, F.G.; Franzen, D.W.; Laboski, C.A.M.; Myers, D.B.; Nafziger, E.D.; Sawyer, J.E.; Shanahan, J.F. Statistical and machine learning methods evaluated for incorporating soil and weather into maize nitrogen recommendations. Computers and Electronics in Agriculture 2019, 164, 104872.

Coulibali, Z.; Cambouris A.N.; Parent S.-É. Site-specific machine learning predictive fertilization models for potato crops in Eastern Canada. PLoS ONE 2020, 15(8), e0230888. https://doi.org/10.1371/journal.pone.0230888

Qin, Z.; Myers, D.B.; Ransom, C.J.; Kitchen, N.R.; Liang, S.Z.; Camberato, J.J.; Carter, P.R.; Ferguson, R.B.; Fernandez, F.G.; Franzen, D.W.; Laboski, C.A.M.; Malone, B.D.; Nafziger, E.D.; Sawyer, J.E.; Shanahan, J.F. Application of Machine Learning Methodologies for Predicting Maize Economic Optimal Nitrogen Rate. Agronomy Journal 2018, 110, 2596–2607. doi:10.2134/agronj2018.03.0222

Correndo, A.A.; Tremblay, N.; Coulter, J.A.; Ruiz-Diaz, D.; Franzen, D.; Nafziger, E.; Prasad, V.; Moro Rosso, L.H.; Steinke, K.; Du, J.; Messina, C.D.; Ciampitti, I.A. Unraveling uncertainty drivers of the maize yield response to nitrogen: A Bayesian and machine learning approach. Agricultural and Forest Meteorology 2021, 311, 108668.

Padarian, J.; Minasny, B.; McBratney, A.B. Machine learning and soil sciences: a review aided by machine learning tools. SOIL 2020, 6, 35–52. https://doi.org/10.5194/soil-6-35-2020

On the other hand, other ML models for fertilizer MET use the economic optimum N rate (EONR) of each individual site as the target variable. Such an approach is subject to biases and large variation in EONR values: non-linear response models computed site by site vary widely with the selected model despite high R2 values (see Cerrato et al., 1990; Bachmaier, 2012). We do not need to derive the EONR site by site. We averaged response patterns for the specified combination of features, hence obtaining smoother response patterns. The universality test results are compared to the average response pattern for the specified combination of features to ascertain the model’s ability to generalize to unseen cases.

Once again: I like the data and the work made so far. But the current version has serious omissions on points that were already solved and published in recent years. I see no argument why the authors should not be able to include the progress made over the last years into their work.

Answer:

Machine learning models are competitive with the linear reaction norm approach and tend to outperform it as the training set size increases (Westhues et al., 2022). In our case, ML models outperformed the mixed model, and datasets keep growing as results from precision agriculture studies accumulate. Other methods that deserve further examination combine mixed-effects models and tree methods (Fu and Simonoff 2015; Loh and Zheng 2013; Eo and Cho 2014). An extended comparison with more recently developed machine learning methods, such as deep learning, would also be of interest. However, it was not our intention to compare a large number of models; this could be done in a separate paper, and the results would still depend on the nature of the database. We wanted here to develop a reliable decision-support model to abate GHG emissions in the maize production system, based on facts.

For more details please see my first review

Author Response File: Author Response.docx
