Open Access Article

Peer-Review Record

Assessing Soil Prediction Distributions for Forest Management Using Digital Soil Mapping

Soil Syst. 2024, 8(2), 55; https://doi.org/10.3390/soilsystems8020055

by Gonzalo Gavilán-Acuna¹,*, Nicholas C. Coops¹, Guillermo F. Olmedo², Piotr Tompalski³, Dominik Roeser¹ and Andrés Varhola¹

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Prahlad Jat

Reviewer 4: Anonymous

Reviewer 5: Ali Keshavarzi

Reviewer 6: Anonymous

Submission received: 1 February 2024 / Revised: 29 April 2024 / Accepted: 1 May 2024 / Published: 16 May 2024

(This article belongs to the Special Issue Contemporary Applications of Geostatistics to Soil Studies)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors present a digital soil mapping paper using data from LiDAR and Landsat and two different models, SCORPAN and a regression model (QRF). The paper is very well explained and written, and the research was designed properly, resulting in a very interesting paper for the scientific community.

There are a few minor changes, mainly related to citations, that the authors should address.

Please change citations according to the journal citation style in lines 43, 45, 55, 66, 87, 89, 120, 148, 175, 215, 256, 257, 292, 299, 335, 345, 366, 543, 556, 568, 582.

Lines 163-168: Figure 1: Clarify the four selected sites described in Table 6. Table 6 provides the accuracy of the modelled soil properties. Also, the soil pit and auger information is shown in black instead of red; please correct.

Author Response

We appreciate your feedback and the time spent reading through this article, which resulted in modifications that increased the value of this paper. To clarify this research, we have made changes to both the citations and Figure 1. We addressed all the specific comments, which are described in the section below.

Reviewer 1 Specific comments (original draft line)

(new draft version with track changes version line number)

Please change citations according to the journal citation style in lines 43, 45, 55, 66, 87, 89, 120, 148, 175, 215, 256, 257, 292, 299, 335, 345, 366, 543, 556, 568, 582.

Thank you for this comment. Citations changed according to the guidelines. Lines: 44, 46, 56, 67, 89, 90, 121, 157, 187, 231, 281, 282, 323, 330, 405, 420, 581, 624, 636, 650.

Thank you for this comment. We revised the caption of Figure 1 to specify that readers should refer to Figure 5, and ensured that the color of the figure caption matches the picture description. Line 174-178.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors present an application of how quantile regression forests (QRF) can be used to model five soil properties as a function of several climate, topographical, vegetative, and morphology variables to quantify predictions over a forest with diverse spatial domains. The methods are applied to a forest in Chile. The manuscript is structured logically and while some improvement to the organization of the methods could be made, was easy to follow. The introduction in particular is well written, and the development of a probability map is highly encouraging and an effective way to communicate prediction uncertainty. Nevertheless, I do have major concerns regarding the methods.

Major Comments:

Highly concerning is the misuse of the word “mean” to refer to the 50th percentile of the prediction distribution. The authors should be well aware that this is instead the “median”, a critical concept for understanding the output of QRF. While in practice there will be little difference between the two estimates for symmetric distributions (such as the Normal distribution, which is used to create the probability map), it is incorrect to use the two interchangeably. This further means that the prediction uncertainty developed using Eq. (1) (line 336) is incorrect.
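The distinction can be made concrete with a small numerical sketch. The values below are invented for illustration (they are not data from the manuscript) and stand in for a right-skewed predictive sample at a single pixel; only Python's standard library is used.

```python
import statistics

# Hypothetical, right-skewed predictive sample for one pixel (e.g. SOM in %).
# These numbers are illustrative only and do not come from the manuscript.
pixel_predictions = [2.1, 2.4, 2.6, 2.9, 3.0, 3.2, 3.5, 4.1, 6.8, 9.4]

mean_value = statistics.mean(pixel_predictions)
median_value = statistics.median(pixel_predictions)  # the 50th percentile

# The two summaries coincide only when the distribution is symmetric.
print(f"mean   = {mean_value:.2f}")
print(f"median = {median_value:.2f}")
```

For this sample the long upper tail pulls the mean above the median, which is why the two terms cannot be swapped when reporting QRF output.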

In Section 4.5.3, the authors’ method for quantifying prediction uncertainty relies on an assessment of Normality for the 50th percentile estimate, and then uses the Normal cumulative distribution function (CDF) to generate a probability map. I have multiple concerns with this method. First, this method would generate what would be similar to a confidence interval (as opposed to a prediction interval) around the median, ignoring the uncertainty associated with the 50th percentile. Second, while I expect the median to follow a Normal distribution over the spatial domain (which the authors’ results also show), the actual distribution at each spatial point (e.g., pixel) over all possible percentiles may not follow Normality, and that is the prediction distribution that should be showcased. Lastly, using the Normal CDF includes an unnecessary added assumption (creating a parametric model from a non-parametric algorithm), and does not take advantage of the benefit of using a QRF approach. The authors instead should use the distribution provided by the QRF algorithm, particularly because it is extremely simple to do within the ‘quantregForest’ package. The authors should look at the help documentation for the ‘quantregForest()’ function, paying close attention to the ‘predict’ calls when the ‘what’ flag uses a ‘function(x) sample(.)’ or ‘edf(x)(.)’ call, as well as “out-of-bag predictions”. The resulting distributions are indeed the prediction distributions, as noted by the original Quantile Regression Forest paper (Meinshausen, 2006).
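The third concern can be illustrated with a minimal sketch that compares a threshold probability computed from a Normal approximation with one computed directly from an empirical predictive sample. The sample, the threshold and the helper name `exceedance_probability` are assumptions made for illustration; this does not reproduce the ‘quantregForest’ interface or the authors' data.

```python
from statistics import NormalDist, mean, stdev


def exceedance_probability(samples, threshold):
    """Empirical P(value > threshold) taken directly from a predictive sample."""
    return sum(1 for v in samples if v > threshold) / len(samples)


# Hypothetical skewed predictive sample for one pixel (illustrative values only).
samples = [2.1, 2.4, 2.6, 2.9, 3.0, 3.2, 3.5, 4.1, 6.8, 9.4]
threshold = 5.0

# Non-parametric: count how much of the sample lies above the threshold.
empirical = exceedance_probability(samples, threshold)

# Parametric: force a Normal shape onto the same sample, then use its CDF.
normal = 1.0 - NormalDist(mean(samples), stdev(samples)).cdf(threshold)

print(f"empirical P(value > {threshold}) = {empirical:.3f}")
print(f"Normal    P(value > {threshold}) = {normal:.3f}")
```

When the predictive sample is skewed, the two probabilities can differ noticeably, which is the practical cost of the added Normality assumption.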

I suggest including the R²/RMSE/MAPE values for both the training and test sets. This should show both how well the model fits the training data and how well it performs in prediction. It will also give more meaning to the actual values of RMSE/MAPE.
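A side-by-side report of this kind can be sketched as follows. The formulas are the usual textbook definitions (R² as one minus the ratio of residual to total sum of squares, MAPE in percent); whether the manuscript uses exactly these definitions is an assumption, and the observed and predicted values are invented.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the units of the soil property."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent; assumes no observed value is zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def report(name, observed, predicted):
    """Print the three metrics for one data split."""
    print(f"{name:<6} R2={r_squared(observed, predicted):.3f} "
          f"RMSE={rmse(observed, predicted):.3f} MAPE={mape(observed, predicted):.1f}%")


# Hypothetical observed/predicted pairs for the two splits (illustrative only).
train_obs, train_pred = [2.0, 4.0, 6.0, 8.0], [2.1, 3.9, 6.2, 7.8]
test_obs, test_pred = [3.0, 5.0, 7.0], [3.6, 4.2, 8.0]

report("train", train_obs, train_pred)
report("test", test_obs, test_pred)
```

A large gap between the training and test rows would point to overfitting, which is the information the reviewer is asking the authors to expose.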

Key details are missing from Section 4.3.3. What type of cross-validation was used? I would assume 10-fold cross-validation, but no detail is mentioned (except in Section 4.6). The authors also mentioned later the use of the ‘autoKrige.cv’ function from the R package “automap”. Was the covariance function also automatically selected? What covariance functions were considered? Further, are RMSE and r calculated on the cross-validation sets?

It is unclear what the novelty of the approach is in applying quantile regression forests for DSM compared to previous studies (such as Vaysse and Lagacherie, 2017; Nikou and Tziachris, 2022). I suspect that, for the soil sciences, the addition of a probabilistic map (Figure 5) would be highly useful for forest management. This should be highlighted more in the introduction.

Minor/Specific Comments:

Line 18 – Suggest “practical soil examples” instead of “practical examples”.

Lines 23-25 – Please check for consistency in spacing around mathematical notation, notably around the equals signs.

Line 55 – The reference Gavilan-Acuna et al., 2021 is listed in full as opposed to a number along the lines of the rest of the manuscript. Please check throughout the manuscript for consistency in references.

Line 93 – The term “local error” suggests local specific error, similar to the nugget effect in geostatistical models and generally interpreted as measurement error or small-scale variability. Please check you have used the term correctly as I could not find its usage in the given citation. Do you mean prediction uncertainty/error?

Line 100 – “estimating local uncertainty”, see point raised for line 93.

Line 104 – Provide references using random forests for digital soil mapping.

Line 120-121 – Consistency in citation format. This also occurs several times throughout the text. Please proofread manuscript for consistency in citations.

Line 125 – “there remains a gap in terms of its use.” What is that gap? Is it just applied examples of QRFs for soil mapping? Or is it how understanding the quantified uncertainty provided by QRFs is useful for managerial decisions in the soil domain? It is unclear what gap the authors are referring to.

Line 156 – Define ALS in the text (e.g. – “Airborne laser scanning (ALS)”) as was done in the caption in Table 1.

Line 163-166 – Figure 1 caption. Sites are showing in white in my copy, not pink as the caption states. Also reference location on the globe is lower right, not lower left.

Line 204 – Please define DEM.

Line 221-222 – Suggest naming Recursive Feature Elimination here as the method used for variable selection. Additionally, was variable selection done prior to fitting the 5 QRF models or done while model fitting (e.g., were there multiple iterations of fitting QRFs with fewer and fewer variables)?

Line 223 – “A quantile regression forest model was employed…” makes it sound like only one QRF was fit, while in reality the authors fit 5 different models. Suggest rephrasing.

Line 231-237 – Is there a need for this added step classifying deep soils from shallow soils? More justification for a multi-tiered approach is needed.

Line 234 – What validation set are the authors referring to? Is this the same as the testing set?

Line 238 – Figure 2 needs a closed bracket after CIREN.

Line 301-302 – What criteria was used to determine the final set of variables used for the model?

Line 304-306 – A note, the quantiles are not the prediction distribution, but can be used to generate the prediction distribution. Suggest rephrasing “also known as the prediction distribution.”

Line 309 – As stated in the major comments, the 50^th percentile is not the mean, but rather the median.

Line 314-315 – Suggest expanding more on the method for SoD, namely the combined classification RF with QRF for <180 cm soils.

Line 315 – Why does this line end with “QRF”?

Line 325 – 326 – Are the R² and RMSE values calculated on the test set? How is R² calculated for SoD given the modeling approach is 2-tiered?

Line 336 – Equation (1) is written as R-code. While per my major comments, this equation is unnecessary, equations in general should be written in proper mathematical notation instead of a programming language.

Line 363 – The package used is “automap”, suggest mentioning that here instead of just the R function.

Line 368 – Suggest deleting “mtry” as the option name in the function is not needed for the reader.

Line 374 – “The cross-validation revealed that KED….” is awkwardly phrased. Suggest rephrasing.

Line 379 – An r value of 0.99 is extremely high and suggests near perfect interpolation. Is this r value from the cross-validation set?

Line 391 - Table 4 – Suggest including the order of importance as well, which could be done via numbering instead of checkmarks.

Line 383-384 – Please use accurate numbers instead of saying “around 60%” and “less than 5%”. Total should add up to 100% subject to rounding.

Line 428 – Figure 3 – Why not include the uncertainty plot for SoD as well? Certainly, even if the standard deviation could be included for soils below 180cm, this is still useful information for the reader.

Line 436-437 – Do you mean to reference Figure 4B instead, noting uncertainty is mentioned? Additionally, this paragraph could expand more the results for not only SoM, but sand, silt, and clay as well.

Line 456 – Table 7 – Suggest changing this into a plot (either line plots grouped by site with x-axis representing percentile and y-axis representing predicted soil property, or boxplots with x-axis the different sites, y-axis soil properties) showing the entire distribution of each soil property at each site. A table with numbers is difficult for the reader to understand the change in variation over the different sites.

Line 466 – Why was 80% probability the mark chosen? This also should be mentioned in the methods section as well.

Line 471 – Make sure you have the correct number in the caption (5 instead of 6).

Line 479 – Capitalize “figure”

Line 482 – “Figure 6B….”

Line 510-512 – The manuscript did not present results comparing other methods, only the methods outlined in the manuscript. As such, I do not believe significant improvement can be claimed without (minimal) additional information/comparison of performance.

Line 526 – Change “standard deviation around the mean” to “mean standard deviation”.

Line 532 – Do you mean Table 7?

Line 534-539 – Suggest adding references to site numbers as well.

Line 534-555 – It wasn’t clear why 5% SOM and 80% probability were used for the probability map. Please expand on this here, as well as in relevant previous portions of the text.

Line 560 – Capitalize “figure”.

Author Response

Answer:

We appreciate your feedback and the effort spent reviewing this article, which has led to changes that enhanced the paper's value. We have addressed all the specific comments, including both major and minor ones.

We have replaced 'mean' with 'median' throughout the manuscript when referring to the 50th percentile. Additionally, in Section 5.4.3, which develops a probabilistic map to determine areas to fertilize based on the established threshold of Soil Organic Matter (SOM) content, we have shifted from using the 50th percentile and standard deviation to using the entire distribution of SOM, from the 1st to the 99th percentile (0.01 to 0.99). Line 365-385.

We have also added R², RMSE, and MAPE values for both the training and test sets, as suggested. In Section 4.3.3, we have included missing details regarding the type of cross-validation, the covariate function, and how RMSE and R were calculated. Line 480.

Finally, we have added a new paragraph in the introduction to clearly explain the novelty of this research (Line 130-137). Details on minor comments are explained in the section below.

Minor/Specific Comments:

Reviewer 2 Specific comments (original draft line)	(new draft version with track changes version line number)
Line 18 – Suggest “practical soil examples” instead of “practical examples”.	Thanks for this comment. We have followed your suggestion. Line 19.
Lines 23-25 – Please check for consistency in spacing around mathematical notation, notably around the equals signs.	Thank you for this observation. We have checked the consistency in spacing and made the respective modifications in lines 23-26.
Line 55 – The reference Gavilan-Acuna et al., 2021 is listed in full as opposed to a number along the lines of the rest of the manuscript. Please check throughout the manuscript for consistency in references.	Thank you for this comment. Citation changed according to the guidelines. Line 56
Line 93 – The term “local error” suggests local specific error, similar to the nugget effect in geostatistical models and generally interpreted as measurement error or small-scale variability. Please check you have used the term correctly as I could not find its usage in the given citation. Do you mean prediction uncertainty/error?	Thank you for this observation. We have changed the term local error for uncertainty error. Line 93-94
Line 100 – “estimating local uncertainty”, see point raised for line 93.	Thank you for this observation.
Line 104 – Provide references using random forests for digital soil mapping.	Thank you for this observation. We have added a reference for line 106.
Line 120-121 – Consistency in citation format. This also occurs several times throughout the text. Please proofread manuscript for consistency in citations.	Thank you for this comment. Citation changed according to guideline. Line 121
Line 125 – “there remains a gap in terms of its use.” What is that gap? Is it just applied examples of QRFs for soil mapping? Or is it how understanding the quantified uncertainty provided by QRFs is useful for managerial decisions in the soil domain? It is unclear what gap the authors are referring to.	Thank you for this comment. We have clarified the gap referred to in line 126
Line 156 – Define ALS in the text (e.g. – “Airborne laser scanning (ALS)”) as was done in the caption in Table 1.	Thank you for this comment. We added the definition of ALS. Line 165
Line 163-166 – Figure 1 caption. Sites are showing in white in my copy, not pink as the caption states. Also reference location on the globe is lower right, not lower left.	Thank you for this observation. We revised the caption of Figure 1 to specify that readers should refer to Figure 5, and ensured that the color of the figure caption matches the picture description. Line 174-178.
Line 204 – Please define DEM.	Thank you for this comment. We added the definition of DEM. Line 220.
Line 221-222 – Suggest naming Recursive Feature Elimination here as the method used for variable selection. Additionally, was variable selection done prior to fitting the 5 QRF models or done while model fitting (e.g., were there multiple iterations of fitting QRFs with fewer and fewer variables)?	Thank you for this comment. We have referred to the method as RFE in this section, which was implemented prior to the QRF as specified in line 237.
Line 223 – “A quantile regression forest model was employed…” makes it sound like only one QRF was fit, while in reality the authors fit 5 different models. Suggest rephrasing.	Thank you for this observation. We have rephrased this sentence. Line 239
Line 231-237 – Is there a need for this added step classifying deep soils from shallow soils? More justification for a multi-tiered approach is needed.	Thank you for this suggestion. More justification details regarding the extra step were added into the text. Line 253-256.
Line 234 – What validation set are the authors referring to? Is this the same as the testing set?	Thank you for this comment. We have changed “validation” to “testing dataset”. Line 256.
Line 238 – Figure 2 needs a closed bracket after CIREN.	Thank you for this observation. We have added the closed bracket in Figure 2. Line 263
Line 301-302 – What criteria was used to determine the final set of variables used for the model?	Thank you for this comment. We have added the criteria used in RFE for variables selection (Percent Increase in Mean Squared Error). Lines 324-327.
Line 304-306 – A note, the quantiles are not the prediction distribution, but can be used to generate the prediction distribution. Suggest rephrasing “also known as the prediction distribution.”	Thank you for this observation. We have rephrased this sentence. Line 339.
Line 309 – As stated in the major comments, the 50th percentile is not the mean, but rather the median.	Thanks for this comment. We have replaced “mean” with “median” throughout the manuscript.
Line 314-315 – Suggest expanding more on the method for SoD, namely the combined classification RF with QRF for <180 cm soils.	Thank you for this suggestion. More justification details regarding the extra step were added into the text. Line 253-256.
Line 315 – Why does this line end with “QRF”?	Thank you for this observation. We have removed QRF from this line. Line 352
Line 325 – 326 – Are the R² and RMSE values calculated on the test set? How is R² calculated for SoD given the modeling approach is 2-tiered?	Thank you for this comment. We have clarified that we used the testing dataset for validation and that the two-tiered SoD (values only below 180, developed with QRF) was validated using the same approach. Line 363-364.
Line 336 – Equation (1) is written as R-code. While per my major comments, this equation is unnecessary, equations in general should be written in proper mathematical notation instead of a programming language.	Thank you for this comment. We have changed the equation 1, to be shown in a mathematical notation. Line 381.
Line 363 – The package used is “automap”, suggest mentioning that here instead of just the R function.	Thank you for this comment. We have added “automap” package in this section. Line 417.
Line 368 – Suggest deleting “mtry” as the option name in the function is not needed for the reader.	Thank you for this comment. We have deleted “mtry” from the manuscript.
Line 374 – “The cross-validation revealed that KED….” is awkwardly phrased. Suggest rephrasing.	Thank you for this comment. We have rephrased that sentence to “Cross-validation results demonstrated that KED downscaling provides accurate estimates of climatic variables.” Line 428.
Line 379 – An r value of 0.99 is extremely high and suggests near perfect interpolation. Is this r value from the cross-validation set?	Thank you for this comment. We have clarified that the r value is from the cross-validation process. Line 435.
Line 391 - Table 4 – Suggest including the order of importance as well, which could be done via numbering instead of checkmarks.	Thanks for the comment. However, we have decided to retain the current format with checkmarks. This decision was made to ensure that the table remains as clear and simple as possible for readers. We believe that presenting the data in this minimalist format helps avoid overcomplication.
Line 383-384 – Please use accurate numbers instead of saying “around 60%” and “less than 5%”. Total should add up to 100% subject to rounding.	Thank you for this comment. We have removed 'around' and replaced it with a fixed number. Line 439-440.
Line 428 – Figure 3 – Why not include the uncertainty plot for SoD as well? Certainly, even if the standard deviation could be included for soils below 180cm, this is still useful information for the reader.	Thank you for this comment. We decided not to include the standard deviation for SoD, as only areas classified as less than 180 cm were analysed with QRF. Therefore, only part of the map will have a predicted standard deviation, which could confuse the readers.
Line 436-437 – Do you mean to reference Figure 4B instead, noting uncertainty is mentioned? Additionally, this paragraph could expand more the results for not only SoM, but sand, silt, and clay as well.	Thank you for this observation. We have changed from 4A to 4B, and have added more details regarding the results for sand, silt, and clay. Line 504-505.
Line 456 – Table 7 – Suggest changing this into a plot (either line plots grouped by site with x-axis representing percentile and y-axis representing predicted soil property, or boxplots with x-axis the different sites, y-axis soil properties) showing the entire distribution of each soil property at each site. A table with numbers is difficult for the reader to understand the change in variation over the different sites.	Thank you for this suggestion. We have changed the Table for a Figure instead (Figure 5). Line 519.
Line 466 – Why was 80% probability the mark chosen? This also should be mentioned in the methods section as well.	Thank you for this comment. We have added a justification for choosing an 80% mark in the methods section. Lines 386-392.
Line 471 – Make sure you have the correct number in the caption (5 instead of 6).	Thank you for this observation. We have changed the caption numbers. Line 538.
Line 479 – Capitalize “figure”	Thank you for this observation. We have capitalized “figure”. Line 546.
Line 482 – “Figure 6B….”	Thank you for this observation. We added “Figure” in line 550.
Line 510-512 – The manuscript did not present results comparing other methods, only the methods outlined in the manuscript. As such, I do not believe significant improvement can be claimed without (minimal) additional information/comparison of performance.	Thank you for this comment. We have rephrased this sentence to explain that the predicted soil properties are spatially well-represented when compared to other studies instead. Line 574.
Line 526 – Change “standard deviation around the mean” to “mean standard deviation”.	Thank you for this comment. We followed your suggestion, and now it reads as “mean standard deviation”. Line 593.
Line 532 – Do you mean Table 7?	Thank you for this comment. It was changed to Figure 5. Line 599.
Line 534-539 – Suggest adding references to site numbers as well.	Thank you for this comment. We have added a reference to this paragraph as well. Line 615
Line 534-555 – It wasn’t clear why 5% SOM and 80% probability were used for the probability map. Please expand on this here, as well as in relevant previous portions of the text.	Thank you for this comment. We have added a justification in the methods section for choosing an 80% threshold, as well as an explanation for why a 5% SOM threshold was chosen.
Line 560 – Capitalize “figure”.	Thank you for this observation. We have capitalized “figure”. Line 628.
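The 5% SOM and 80% probability thresholds discussed in the responses above can be read as a per-pixel decision rule applied to the predicted percentiles (1st to 99th). The sketch below is one possible reading, not the authors' implementation: it assumes the probability of interest is that SOM falls below the threshold (the record does not state the direction), and the function names and pixel values are hypothetical.

```python
def probability_below(percentile_values, threshold):
    """Approximate P(value < threshold) as the share of predicted percentiles below it."""
    below = sum(1 for v in percentile_values if v < threshold)
    return below / len(percentile_values)


def flag_pixel(percentile_values, threshold=5.0, min_probability=0.80):
    """Flag a pixel when the probability of being below the threshold is high enough."""
    return probability_below(percentile_values, threshold) >= min_probability


# Two hypothetical pixels, each described by 99 predicted percentiles (0.01 ... 0.99).
# The linear spacing is purely illustrative.
pixel_a = [3.01 + 0.025 * i for i in range(99)]  # mostly below 5% SOM
pixel_b = [4.50 + 0.030 * i for i in range(99)]  # mostly above 5% SOM

for name, pixel in (("pixel_a", pixel_a), ("pixel_b", pixel_b)):
    p = probability_below(pixel, 5.0)
    print(f"{name}: P(SOM < 5%) = {p:.2f}, flagged = {flag_pixel(pixel)}")
```

Raising `min_probability` makes the rule more conservative, which is the kind of trade-off a justification of the 80% mark would need to address.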

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript is well-crafted and demonstrates a robust structure. I commend the significant contribution of the paper in evaluating spatial uncertainty regarding soil depth, texture, and organic matter estimations. Assessing uncertainty and incorporating it into decision-making processes is crucial, especially for variable fertilization rates in forestry and agriculture. Considering its substantial value, I believe this paper deserves publication.

I just have a very minor comment:

Line: 156 - Define abbreviation ALS (as defined in the title of table 1) before using the term in paragraph.

Author Response

We appreciate your helpful comment and your positive approach toward this paper. We have addressed the comment and defined ALS abbreviation, as detailed in the section below.

Reviewer 3 Specific comments (original draft line)

(new draft version with track changes version line number)

Line: 156 - Define abbreviation ALS (as defined in the title of table 1) before using the term in paragraph.

Thank you for this comment. We added the definition of ALS. Line 165.

Reviewer 4 Report

Comments and Suggestions for Authors

Repeated error in notation of citations Line: 43, 45, 55, 66, 87, 89, 120,121,148,175,215,256,257,292,299,335,345,352,366,513,543, 556

Line 223: Can you clarify what was the spatial distribution of the training and test data?

Figure 4) The letters in the pictures are barely visible - I recommend enlarging them. (The size in Figure 5 is easy to read.)

Figure 5) In the description of the picture there is a link to picture 6A and 6B, it should be 5A and 5B.

Author Response

We appreciate your feedback and the effort spent going through this article, which resulted in changes that improved the paper's value. We addressed all of the specific comments, which are detailed in the section below.

Reviewer 4 Specific comments (original draft line)	(new draft version with track changes version line number)
Repeated error in notation of citations Line: 43, 45, 55, 66, 87, 89, 120,121,148,175,215,256,257,292,299,335,345,352,366,513,543, 556	Thank you for this comment. Citations changed according to the guidelines. Lines: 44, 46, 56, 67, 89, 90, 121, 157, 187, 231, 281, 282, 323, 330, 405, 420, 581, 624, 636, 650.
Line 223: Can you clarify what was the spatial distribution of the training and test data?	Thank you for this comment. Both Training and testing set were spatially included in Figure 1, and referenced in line 240.
Figure 4) The letters in the pictures are barely visible - I recommend enlarging them. (The size in Figure 5 is easy to read.)	Thank you for this comment. We modified Figure 5, and now the letters are easier to read. Line 514.
Figure 5) In the description of the picture there is a link to picture 6A and 6B, it should be 5A and 5B.	Thank you for this comment. The description of picture 5, now Figure 6 has changed to 6A and 6B. Line 538-540

Reviewer 5 Report

Comments and Suggestions for Authors

Dear authors/editor,

1. The title of the paper is unclear and needs to be changed.

2. Please plot the error indices with actual Vs. predicted values graphs.

Comments on the Quality of English Language

Minor editing of English language required

Author Response

We appreciate your feedback and the effort you put into reviewing this article, which resulted in changes that enhanced the paper's value. We have addressed all of the specific comments, which are detailed in the section below. Additionally, minor revisions regarding the English language have been made throughout the manuscript.

Reviewer 5 Specific comments (original draft line)	(new draft version with track changes version line number)
The title of the paper is unclear and needs to be changed.	Thank you for this suggestion. We have changed the title of the paper to “Assessing Soil Prediction Distribution for Forest Management Using Digital Soil Mapping”. Line 1-2
Please plot the error indices with actual Vs. predicted values graphs.	Thank you for this comment. Plots of predicted and observed values have been added to the Appendix. Line 737-743.

Reviewer 6 Report

Comments and Suggestions for Authors

The manuscript entitled “Characterizing Prediction Distribution in Digital Soil Mapping for Forest Management”, submitted to Soil Systems (MDPI), is a good work. The general comments are listed below:

1- Prepared land use map of the study area should be presented.

2- The framework of quantile regression forest (QRF) model must be explained more in method and material.

3- Climate data must be explained more.

4- The novelty must be explained clearly

Overall, I think the manuscript is suitable for publication in Soil Systems (MDPI) after minor changes.

Author Response

Reviewer 6 Specific comments (original draft line)	(new draft version with track changes version line number)
Prepared land use map of the study area should be presented.	Thank you for this comment. The land use map has been added to the appendix section and is referenced in line 162.
The framework of quantile regression forest (QRF) model must be explained more in method and material.	Thank you for this comment. More information about the QRF framework has been added to material and method lines 339-342
Climate data must be explained more.	Thank you for this comment. Climate data has been explained more in the data section. Lines 202-206
The novelty must be explained clearly	Thank you again for this comment. A new paragraph has been added to the introduction to emphasize the importance and novelty of this research approach. Lines 130-137.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

I thank the authors for their thorough revision and for addressing my previous comments. In particular, the authors have addressed the major concern I previously had on quantifying the probabilistic map of SoM, moving from interpolation of the Normal distribution to using the distribution provided by the QRF algorithm. The authors have addressed most of my concerns, and I only have a few extra comments.

Lines 245-254: The explanation of the two-tiered classification/regression approach for soil depth is well described, but I'm still unsure why it is necessary (and admittedly this may be beyond my expertise in the domain). Why is 180 cm the chosen mark? Is it just that there isn't much value in understanding the exact depths of soils above 180 cm? Certainly (in theory) a random forest approach can handle the full range of soil depth, even if 60% of the data is deeper soils. This raises the question of the distribution of soil depths both below 180 cm and combined.

Line 292-298: Please add the citation for the 'automap' package here. While the added information is welcome, my original comment was not addressed specifically. The default covariance functions currently considered in the automap package are the Spherical, Exponential, Gaussian, and Matérn. Suggest rephrasing this along the following lines:

"This process was implemented using 10-fold cross-validation with the covariance function automatically chosen using the R package 'automap' [64]. The covariance function was selected automatically from either a Spherical, Exponential, Gaussian, or Matérn function by minimizing the prediction error (RMSE) and the coefficient of correlation."
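The general idea of choosing among candidates by k-fold cross-validated RMSE can be sketched in a few lines. This is not the 'automap' implementation: the two candidate fitters below (a constant mean and a straight line) are deliberately simple stand-ins for covariance functions, and all names and data are assumptions made for illustration.

```python
import math
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


def cv_rmse(xs, ys, fit, k=10, seed=0):
    """RMSE pooled over all held-out predictions from k-fold cross-validation."""
    squared_errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        squared_errors.extend((ys[i] - model(xs[i])) ** 2 for i in test_idx)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_mean(train_x, train_y):
    """Stand-in candidate 1: always predict the training mean."""
    m = sum(train_y) / len(train_y)
    return lambda x: m


def fit_line(train_x, train_y):
    """Stand-in candidate 2: ordinary least-squares straight line."""
    n = len(train_x)
    mx, my = sum(train_x) / n, sum(train_y) / n
    sxx = sum((x - mx) ** 2 for x in train_x)
    slope = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) / sxx
    return lambda x: my + slope * (x - mx)


# Hypothetical data with an exact linear trend (illustrative only).
xs = [float(i) for i in range(30)]
ys = [2.0 * x + 1.0 for x in xs]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: cv_rmse(xs, ys, fit) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

Because every prediction is made for a sample the model did not see during fitting, the resulting RMSE reflects interpolation skill rather than goodness of fit, which is why the reviewer asks whether the reported r comes from the cross-validation sets.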

Line 370 - Percentile instead of quantile.

Line 378 - Ensure new line for new subsection

Figure 3 - I understand the authors' concern about including the uncertainty for SoD, but I believe the authors are not giving the reader enough credit here (provided my concerns about the 180 cm cutoff above are addressed). I believe readers would not be confused by missing pixels for standard deviation. I encourage the authors to consider including another subplot showing the standard deviations for the <180 cm pixels, particularly given the spatial clustering of deep soils.

Figure 5 - Thank you for changing this from a table to a figure. This is much easier for the reader to comprehend the distribution of the 4 soil properties. Minor request: suggest increasing the resolution and potentially changing the color scheme of the lines.

Author Response

We appreciate your feedback and the effort you put into reviewing this article again, which has led to changes that have enhanced the value of the paper. We have addressed your latest comments in the following section.

Reviewer 2 specific comments (original draft line numbers)	Author response (line numbers refer to the tracked-changes version of the new draft)
Lines 245-254: The explanation of the two-tiered classification/regression approach for soil depth is well described, but I'm still unsure why it is necessary (and admittedly this may be beyond my expertise in the domain). Why is 180 cm the chosen mark? Is it just that there isn't much value in understanding the exact depths of soils above 180 cm? Certainly (in theory) a random forest approach can handle the full range of soil depth, even if 60% of the data is deeper soils. This raises the question of the distribution of soil depths both below 180 cm and combined.	Thank you for this comment. We implemented SoD as a two-tiered classification/regression approach for two main reasons. First, soil depths below 180 cm are important for forest management, as this is where the roots interact with water and nutrient content, which is precisely what we aim to represent. Second, the records at a depth of 180 cm are censored data, meaning these values represent depths of 180 cm or deeper, which a regression using RF cannot appropriately represent. Initially, we attempted a Survival RF analysis to handle the censored data; however, this approach was not successful.
Lines 292-298: Please add the citation for the 'automap' package here. While the added information is welcome, my original comment was not addressed specifically. The default covariance functions currently considered in the automap package are the Spherical, Exponential, Gaussian, and Matérn. I suggest rephrasing this along the following lines: "This process was implemented using 10-fold cross-validation, with the covariance function chosen automatically using the R package 'automap' [64]. The covariance function was selected automatically from a Spherical, Exponential, Gaussian, or Matérn function by minimizing the prediction error (RMSE) and the coefficient of correlation."	Thank you for this suggestion. We have rephrased the paragraph accordingly.
Line 370 - Percentile instead of quantile.	Thank you for this observation. We have replaced "quantile" with "percentile".
Line 378 - Ensure new line for new subsection	Thank you for this observation. A new line has been added.
Figure 3 - I understand the authors' concern about including the uncertainty for SoD, but I believe the authors are not giving the reader enough credit here (provided my concerns about the 180 cm cutoff above are addressed). I believe readers would not be confused by missing pixels for standard deviation. I encourage the authors to consider including another subplot showing the standard deviations for the <180 cm pixels, particularly given the spatial clustering of deep soils.	Thank you for this comment. We decided not to include the standard deviation for SoD, as only areas classified as shallower than 180 cm were analyzed using QRF. Therefore, only part of the map would have a predicted standard deviation. Additionally, the areas with soils shallower than 180 cm constitute only a small portion of the study area, so the standard deviation in this case would not provide much useful information.
Figure 5 - Thank you for changing this from a table to a figure. This is much easier for the reader to comprehend the distribution of the 4 soil properties. Minor request: suggest increasing the resolution and potentially changing the color scheme of the lines.	Thank you for this suggestion. We have changed the resolution and the color scheme.
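
The censoring argument in the response on Lines 245-254 can be illustrated with a minimal sketch. It is not the authors' implementation: the depths are simulated from a hypothetical uniform distribution (chosen so that roughly 60% of records reach the 180 cm limit), and the names `simulate_depths`, `observe`, and `split_tiers` are illustrative assumptions. It only shows why a record of 180 cm means "180 cm or deeper", how that pulls a naive summary of the recorded depths below the true value, and how the data would be split into a classification target (deep or not) and a regression target (uncensored depths only).

```python
import random
import statistics

CENSOR_CM = 180.0  # assumed sampling limit: a record of 180 means "180 cm or deeper"


def simulate_depths(n, seed=0):
    """Hypothetical true soil depths in cm (illustrative only, not the study data)."""
    rng = random.Random(seed)
    return [rng.uniform(40.0, 400.0) for _ in range(n)]


def observe(true_depths):
    """Field records are right-censored: anything deeper is logged as CENSOR_CM."""
    return [min(d, CENSOR_CM) for d in true_depths]


def split_tiers(observed):
    """Tier 1 target: is the record censored (deep soil)?
    Tier 2 target: the exact depth, kept only for uncensored (shallow) records."""
    is_deep = [d >= CENSOR_CM for d in observed]
    shallow = [d for d, deep in zip(observed, is_deep) if not deep]
    return is_deep, shallow


true_depths = simulate_depths(1000)
observed = observe(true_depths)
is_deep, shallow = split_tiers(observed)

# Treating censored records as exact depths underestimates the true mean depth.
naive_mean = statistics.mean(observed)
true_mean = statistics.mean(true_depths)
share_deep = sum(is_deep) / len(is_deep)

print(f"share of censored records: {share_deep:.2f}")
print(f"mean of recorded depths:   {naive_mean:.1f} cm")
print(f"mean of true depths:       {true_mean:.1f} cm")
```

In a two-tiered setup of this kind, a classifier would be trained on `is_deep` and a regression model only on `shallow`, so the regression never sees the censored value as if it were an exact depth.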
