Article
Peer-Review Record

Assessment of the Accuracy of ISRIC and ESDAC Soil Texture Data Compared to the Soil Map of Greece: A Statistical and Spatial Approach to Identify Sources of Differences

Soil Syst. 2025, 9(4), 133; https://doi.org/10.3390/soilsystems9040133
by Stylianos Gerontidis, Konstantinos X. Soulis *, Alexandros Stavropoulos, Evangelos Nikitakis, Dionissios P. Kalivas, Orestis Kairis, Dimitrios Kopanelis, Xenofon K. Soulis and Stergia Palli-Gravani
Reviewer 1: Anonymous
Reviewer 3:
Submission received: 7 September 2025 / Revised: 7 November 2025 / Accepted: 16 November 2025 / Published: 25 November 2025
(This article belongs to the Special Issue Use of Modern Statistical Methods in Soil Science)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper presents a comparison of national and global soil products. The paper is well written and well structured. The presented results are clear. My suggestions are mostly for the "Results" part:

 

Abstract.

Everything is clear. No suggestions.

 

  1. Introduction

Lines 43-45. Please specify the method of producing this map – conventional or DSM?

49. ISRIC soil digital maps based on WOSIS dataset. https://doi.org/10.5194/essd-9-1-2017

Please keep in mind.

 

47 and 50. Please clarify what you mean by “datasets”: georeferenced soil datasets with measurements, or digital maps?

 

56. Why not the new version, SoilGrids 2.0? https://doi.org/10.5194/soil-7-217-2021

 

  2. Materials and Methods

2.2.1 section. Greek Soil map. Please specify the spatial resolution of digital maps here.

 

283. Please specify sampling depth.

 

320. Depth?

 

330. How was the 0-30 cm depth of the maps achieved for SoilGrids? OK, I see it further at 387.

 

Table 1. step 4. “Compute differences between observed and predicted values for each soil property. “ – specify dataset for observed values.

 

366. Please specify sampling depth.

 

373. Please specify a SoilGrids version.

 

  3. Results

Fig. 3. I am a little bit confused: are these soil measurements from the Soil Map of Greece or raster values?

 

  4. Discussion

Everything is clear. No suggestions.

Author Response

Comment 1: The paper presents a comparison of national and global soil products. The paper is well written and well structured. The presented results are clear. My suggestions are mostly for the "Results" part:

Response 1: We would like to thank the reviewer for the insightful comments that helped us further improve our work.

 

 

Comment 2: Abstract.

Everything is clear. No suggestions.

Response 2: We thank the reviewer for the positive feedback and are glad that the abstract was found to be clear.

 

 

Comment 3: 1. Introduction

Lines 43-45. Please specify the method of producing this map – conventional or DSM?

Response 3: The Soil Map of Greece was produced with conventional methods. We clarified in the manuscript that “The Soil Map of Greece is a georeferenced vector (point shapefile) dataset” (see lines 46-47). We modified the text as follows: “The Soil Map of Greece is a georeferenced vector (point) dataset that was created with conventional methods. Recording of sampling points was made with the use of Global Navigation Satellite System (GNSS) devices.”

 

Comment 4: 49. ISRIC soil digital maps based on WOSIS dataset. https://doi.org/10.5194/essd-9-1-2017

Please keep in mind.

Response 4: Thank you, we added the reference to WOSIS dataset [15] and we modified lines 68-69 as follows: “Another prominent dataset in the field of global soil property prediction is Soil Grids, developed using the WOSIS dataset [15] and machine learning techniques.”

 

Comment 5: 47 and 50. Please clarify what you mean by “datasets”: georeferenced soil datasets with measurements, or digital maps?

Response 5: The Soil Map of Greece is a georeferenced vector dataset, while the ISRIC and ESDAC datasets are raster datasets. We corrected the above-mentioned lines, and the relevant paragraph now reads as follows: “The Soil Map of Greece is a georeferenced vector (point) dataset that was created with conventional methods. Recording of sampling points was made with the use of Global Navigation Satellite System (GNSS) devices. However, this soil map covers only a fraction of the country’s surface as it mostly focuses on agricultural areas. This limits its use in many applications [9], necessitating the use of additional spatial datasets coming from international organizations like the EU Joint Research Center (JRC) and the International Soil Reference and Information Centre (ISRIC).”

 

Comment 6: 56. Why not the new version, SoilGrids 2.0? https://doi.org/10.5194/soil-7-217-2021

Response 6: Thank you for noticing this. The newest version of the SoilGrids raster datasets was used. We clarified this in the relevant manuscript lines.

 

Comment 7: 2. Materials and Methods

2.2.1 section. Greek Soil map. Please specify the spatial resolution of digital maps here.

Response 7: The Greek soil map is in vector point file format. The locations of the sampling points have been mapped with coordinates based on GNSS measurements.

 

Comment 8: 283. Please specify sampling depth.

Response 8: The sampling depth is now specified as 0-30 cm wherever needed, corresponding to the topsoil layer.

 

Comment 9: 320. Depth?

Response 9: The analysis was conducted on samples collected from the topsoil layer.

 

Comment 10: 330. How was the 0-30 cm depth of the maps achieved for SoilGrids? OK, I see it further at 387.

Response 10: From the multidimensional raster dataset of ISRIC, the values corresponding to each depth range (0-5 cm, 5-15 cm, 15-30 cm) were extracted. To generate a single representation for the 0-30 cm depth range, a weighted average was calculated following the methodology outlined in the ISRIC data user manual.
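The depth aggregation described in Response 10 can be illustrated with a minimal sketch. The response does not state the weights, so the sketch assumes that each layer is weighted by its thickness (5, 10 and 15 cm out of 30 cm); the function name `topsoil_mean` and the clay values are hypothetical and used only for illustration.

```python
# Thickness-weighted mean of the three SoilGrids depth intervals that
# make up the 0-30 cm layer. Assumption: each layer value represents the
# mean over its interval, so the weight of a layer is its thickness.
LAYERS_CM = [(0, 5), (5, 15), (15, 30)]


def topsoil_mean(values, layers=LAYERS_CM):
    """Return the thickness-weighted mean of per-layer values."""
    if len(values) != len(layers):
        raise ValueError("one value per depth interval is required")
    thickness = [bottom - top for top, bottom in layers]
    total = sum(thickness)
    return sum(v * t for v, t in zip(values, thickness)) / total


# Hypothetical clay contents (%) for the three intervals at one pixel.
clay_0_30 = topsoil_mean([20.0, 26.0, 30.0])
```

With these made-up inputs the deepest interval contributes half of the total weight, which is why the result lies closer to the 15-30 cm value than a simple arithmetic mean would.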

 

Comment 11: Table 1. step 4. “Compute differences between observed and predicted values for each soil property. “ – specify dataset for observed values.

Response 11: The observed values correspond to those derived from the Greek Soil Map, whereas the predicted values originate from the international soil databases (ESDAC and ISRIC). This was clarified in the manuscript.

 

Comment 12: 366. Please specify sampling depth.

Response 12: The analysis was conducted on samples collected from the topsoil layer (0-30cm). This was clarified in the manuscript.

 

Comment 13: 373. Please specify a SoilGrids version.

Response 13: As mentioned above, the newest version was used: SoilGrids250m version 2.0, released in May 2020. This was clarified throughout the manuscript.

 

Comment 14: 3. Results

Fig. 3. I am a little bit confused: are these soil measurements from the Soil Map of Greece or raster values?

Response 14: We appreciate the reviewer’s observation and understand the source of confusion. The sand content values shown in Figure 3 represent data derived from the Soil Map of Greece (field-based observations) as well as the corresponding raster values extracted from the ISRIC and ESDAC global datasets. We have now clarified this in the figure caption to ensure transparency and avoid misunderstanding. The new caption is: “Figure 3: Smoothed frequency distribution (kernel density plot) of Sand content values (%) for the observed point values coming from the Soil Map of Greece (GR) (red line), and the corresponding values at the same positions sampled (using the “raster sampling” GIS function) from ISRIC (blue line), and ESDAC (green line) raster datasets.”

 

Comment 15: 4. Discussion

Everything is clear. No suggestions.

Response 15: We sincerely thank the reviewer for recognizing the clarity of the discussion section and for the positive feedback.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors,

Your study is very interesting and useful. It certainly shows that global / continental models are not appropriate at local / regional or even country scale. However, it should be stressed in your article that such models were not designed for local-scale applications, mainly because local factors could not be accounted for. They remain useful for global, continental and, in some cases, country evaluations, but their application at finer scales should be performed with caution.

So the failure of the models to capture the specific variability of particle size fractions and texture in Greece is mainly the effect of local factors (like extreme textures). Did you analyze the correlations between models after eliminating the areas where such local factors manifest? It would be useful to complete your study with such an analysis.

Finally, I consider it useful and necessary to present in your study the maps of particle size fractions and texture of GR, ISRIC and ESDAC.

Please find some more comments in my annotated manuscript.    

Comments for author File: Comments.pdf

Author Response

Comment 1: Your study is very interesting and useful. It certainly shows that global / continental models are not appropriate at local / regional or even country scale. However, it should be stressed in your article that such models were not designed for local-scale applications, mainly because local factors could not be accounted for. They remain useful for global, continental and, in some cases, country evaluations, but their application at finer scales should be performed with caution.

Response 1: Thank you for the insightful comments that helped us to significantly improve our manuscript.

 

Comment 2: So the failure of the models to capture the specific variability of particle size fractions and texture in Greece is mainly the effect of local factors (like extreme textures). Did you analyze the correlations between models after eliminating the areas where such local factors manifest? It would be useful to complete your study with such an analysis.

Response 2: In the revised version of our manuscript we made a more thorough evaluation of the effect of local factors. Furthermore, the analysis of error clustering revealed that errors are clustered in many areas with specific characteristics. Still, the inherent spatial variability of Greece, as in most Mediterranean countries, makes such areas with local factors predominant. This means that setting these areas aside would give a false picture of the models’ performance. Apart from this, in our study we investigated the performance of the global datasets using additional observed datasets covering different conditions and areas to validate our results. The obtained results indicated very similar performance irrespective of the observed dataset used or the region.

 

Comment 3: Finally, I consider it useful and necessary to present in your study the maps of particle size fractions and texture of GR, ISRIC and ESDAC.

Response 3: You are right. We included maps for all the datasets and all the studied properties. As the number of resulting maps for all datasets and properties is large, we included them in the “Appendix” and cited them in the respective places in the text. The maps of particle size fractions and texture are in figures A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12.

 

Comment 4: l.147: please let us know if the methods used for particle size distribution measurement are the same or if they differ among the 3 approaches

Response 4:  Following the reviewer’s suggestion, we have revised the text to clarify this point. The production of the SoilGrids 2.0 maps relied on an extensive collection of high-quality, laboratory-verified measurements (WOSIS) of soil texture from around the world. As mentioned above, these data were obtained from numerous countries and laboratories, where standard particle-size analysis methods were applied. The pipette method and the hydrometer method (with appropriate sample pretreatment and dispersion) form the basis for many samples, while laser diffraction has also been employed in several modern measurements.

However, due to the vast number and diversity of samples included, it is not possible to determine with certainty which analytical method was used more frequently.

The LUCAS 2009 topsoil samples were air-dried and sieved to <2 mm before analysis. Particle-size distribution (sand, silt, clay) was determined by the conventional sieve-and-sedimentation (pipette) method.

For the Greek National Soil Map, the laboratory measurements of sand, silt and clay content in the collected 0-30 cm samples were performed using classical sedimentation procedures. For the particle size fraction determination, the hydrometer method was utilized. We revised the manuscript at the relevant lines to include all these details.

 

Comment 5: l.150: which year? please specify the publication year of the soil map and the interval in which the sampling and lab analysis was performed

Response 5: The reviewer is absolutely right; we did not include this information in the manuscript. The Soil Map of Greece was published in 2015. According to one of the map’s creators, the samples were collected from 2012 to 2014 and were analyzed in a period ranging from one week to one month. The manuscript was revised as follows (line 158): “The Soil Map of Greece (published in 2015) is a comprehensive digital soil mapping product covering the entire Greek territory, developed by OPEKEPE [5] in collaboration with the Aristotle University of Thessaloniki [30].” We also mentioned the period of the sampling campaign in line 175.

 

Comment 6: l.251: As far as I know it is 20 cm

Response 6: We corrected this. We understand the confusion and thank you for stating the correct depth range of the samples. The LUCAS database was based on soil sampling from 0 – 20 cm of depth. Upon further inspection we concluded that we mistyped the correct depth during the process of constructing the manuscript.

 

Comment 7: l.255: LUCAS is basically a point database.  The raster maps and covariates belong to applications based on LUCAS. So to say that the aim of LUCAS was to produce continuous raster maps is not correct

Response 7: LUCAS is indeed a point database. By referring to the program we meant the goal of the ESDAC dataset. We agree with the reviewer that we should make clear that we mean the ESDAC dataset and not the LUCAS database on which the ESDAC rasters were based. The clarification was included in line 285 as follows: “The ESDAC raster products are modelled and spatially interpolated layers derived from these LUCAS point measurements, mapping key soil properties. The aim of the ESDAC application was to produce continuous surface maps (rasters) for the entire European territory.”

 

Comment 8: l.345: which literature? please provide some references

Response 8: In response to the reviewer’s comment, we added the requested literature [39] in lines 329 and 379.

 

Comment 9: l.390: if you refer to Ballabio et al (2016) study , it is the 0-20 cm the topsoil layer . The same as in LUCAS database. Please correct

Response 9: Thank you. It was corrected.

 

Comment 10: l.614: maps of particle size fractions (sand, silt, clay) and textural classes are necessary to be presented for each of the 3 approaches

Response 10: We completely agree, and we appreciate the reviewer for drawing our attention to this. We added the maps of particle size fraction and textural classes at the “Appendix” (figures: A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12) and mentioned them in the text.

 

Comment 11: l.752: Indeed, it is extremely difficult for a global or continental model to capture local variability, mainly because such local factors were not considered in the models. But this is not such a big problem, because it is expected for a global / continental model not to perform well at local scales and therefore they should not be used at these scales

Response 11: Thank you for your comment. We agree, and with this phrase we wanted to highlight exactly this point.

 

Comment 12: l.876: are there any non-significant points? I can't see them on the maps

Response 12: We fully agree with the reviewer’s observation, and we acknowledge that in the original manuscript the non-significant points were not easily distinguishable, as they were presented in grey. In the revised version of the manuscript, these points have been rendered in black, so that they are clearly distinguishable from the remaining points (Figure 8).

 

Comment 13: l.1013: I had the impression it was the other way around. ISRIC and ESDAC overestimate silt and underestimate sand and clay

Response 13: Thank you very much for noticing this. It was corrected.

 

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

You can find my comments and recommendations for your paper below.

1. Main Research Question
The main research question addressed is whether two major global/pan-European digital soil databases (ISRIC-SoilGrids and ESDAC) can reliably reproduce the texture distribution of soils in Greece, as compared with a national field-based dataset.
The research therefore seeks to quantify the accuracy, identify spatial patterns of error, and evaluate the potential causes of discrepancies. This is indeed a relevant and important question, particularly for Mediterranean regions with complex lithology and topography.

2. Originality and Identified Gap

The study’s novelty lies in applying an extensive national dataset (>10 000 samples) to test global soil texture models and in exploring spatial error clustering.
The existing literature (e.g., Poggio et al. 2021; Ballabio et al. 2020) rarely focuses on regional, heterogeneous terrains such as Greece.
However, the manuscript does not clearly articulate what gap in knowledge it fills beyond confirming the limited transferability of global models.
To strengthen the originality, the authors should explicitly emphasize that this is the first quantitative validation for Greece and one of the few studies combining statistical and geostatistical (Gi*) error analysis linked with parent-material categories.

3. Added Value Compared with Previous Studies

Compared with other publications, this paper provides valuable empirical evidence showing the limited explanatory power (R² < 0.2) of global datasets at the national scale.
The inclusion of both ISRIC and ESDAC together, under identical reference conditions, allows a rare direct comparison.
Yet, the paper’s contribution would be more meaningful if the authors used their results to propose a framework for local recalibration (e.g., integrating Greek samples into SoilGrids training data) rather than only stating that “global datasets have limitations.”
A brief comparison with similar evaluations for Croatia, Spain, or Turkey would help contextualize the relative performance.

4. Methodology
The methodology is thorough but has several weaknesses that limit reproducibility and scientific rigor:
 - The "best-pixel selection" algorithm, which is employed to minimise residuals, introduces a bias due to its utilisation of observed data for the selection of the predicted cell. This should be removed or only used for sensitivity analysis.
 - Sand, silt, and clay fractions are compositional variables; standard RMSE and regression on raw percentages are statistically invalid. The authors should apply an isometric log-ratio (ilr) transformation or Aitchison distance for error computation.
 - No uncertainty validation is performed despite SoilGrids 2.0 providing prediction intervals. A coverage test of the 90 % confidence band would reveal model calibration quality.

 - The hot-spot (Gi*) analysis lacks parameter transparency: the neighborhood definition, distance threshold, and multiple-comparison correction should be reported and justified.
 - The influence of laboratory analytical procedures (e.g., pipette vs hydrometer, pre-treatment) must be discussed, since methodological differences can explain systematic shifts in texture fractions.
Adding these controls would make the evaluation statistically robust and reproducible.


5. Consistency of Conclusions with Evidence
The conclusions state that the global models "underrepresent extreme textures" and "fail to capture local variability." While in general this is supported by low R² and high RMSE values, the statement that errors are "systematically clustered due to parent materials" is not demonstrated clearly; the analysis of lithological groups lacks statistical testing and sample numbers.
Also, the analysis does not quantify the extent to which each of the explanatory variables (climate, geology, or terrain) explains the overall bias.
Therefore, the conclusions slightly surpass the evidence given.
The main questions, accuracy, spatial pattern, and causes, were partly addressed, but the causal link between model covariates and local conditions is hypothetical.

6. References
The reference list is generally relevant but somewhat dated. Also, the authors should clearly cite the versions and access dates of the ISRIC and ESDAC datasets to ensure reproducibility.


Lastly, the manuscript is based on solid data and a relevant scientific question but requires significant methodological refinement and clearer interpretation before it can meet Soil Systems’ standards.
If the authors address the above issues, particularly the compositional-data treatment, bias correction, uncertainty analysis, and more critical discussion of causal mechanisms, the study could become a strong and widely cited reference for regional validation of global soil datasets.

Author Response

Comment 1:

  1. Main Research Question

The main research question addressed is whether two major global/pan-European digital soil databases (ISRIC-SoilGrids and ESDAC) can reliably reproduce the texture distribution of soils in Greece, as compared with a national field-based dataset. The research therefore seeks to quantify the accuracy, identify spatial patterns of error, and evaluate the potential causes of discrepancies. This is indeed a relevant and important question, particularly for Mediterranean regions with complex lithology and topography.

Response 1: We sincerely thank you for these encouraging and thoughtful remarks. We are pleased that the relevance and importance of our research question have been recognized, particularly in the context of Mediterranean regions characterized by complex lithology and topography. Our intention was precisely to address this gap by providing an evidence-based assessment of the reliability of global and pan-European soil datasets under such challenging environmental conditions. We appreciate your positive evaluation, which reinforces the motivation and significance of this study.

Furthermore, we would like to thank you for the insightful comments that helped us improve our study and gave us insights for our future work.

 

Comment 2:

  2. Originality and Identified Gap

The study’s novelty lies in applying an extensive national dataset (>10 000 samples) to test global soil texture models and in exploring spatial error clustering.
The existing literature (e.g., Poggio et al. 2021; Ballabio et al. 2020) rarely focuses on regional, heterogeneous terrains such as Greece.
However, the manuscript does not clearly articulate what gap in knowledge it fills beyond confirming the limited transferability of global models.
To strengthen the originality, the authors should explicitly emphasize that this is the first quantitative validation for Greece and one of the few studies combining statistical and geostatistical (Gi*) error analysis linked with parent-material categories.

Response 2: Your feedback on strengthening the originality of our study is greatly appreciated. We have made the relevant improvements to the last part of the “Introduction” section of the manuscript (lines 120 to 125).

 

Comment 3:

  3. Added Value Compared with Previous Studies

Compared with other publications, this paper provides valuable empirical evidence showing the limited explanatory power (R² < 0.2) of global datasets at the national scale.
The inclusion of both ISRIC and ESDAC together, under identical reference conditions, allows a rare direct comparison.
Yet, the paper’s contribution would be more meaningful if the authors used their results to propose a framework for local recalibration (e.g., integrating Greek samples into SoilGrids training data) rather than only stating that “global datasets have limitations.”

Response 3: Thank you for your kind remarks. We have added more insights on the direction of future research (lines 1426 to 1436).

In fact, this is the first step in our work to improve and downscale the global datasets to spatial resolutions more relevant to the complex environment of Mediterranean countries.

 

Comment 4: A brief comparison with similar evaluations for Croatia, Spain, or Turkey would help contextualize the relative performance.

Response 4: We greatly appreciate your insightful comment. Although we conducted an extensive literature search, we did not identify studies similar to ours for Turkey or Spain. However, we found relevant and thus comparable studies for Croatia, India, Norway, and Africa ([17], [18], [21], [22], respectively). We have revised the manuscript accordingly, as per your suggestion, in lines 1322-1337, as shown below: “In Croatia, SoilGrids predictions for sand and silt explained virtually no correlation (R² ≈ 0.04) and even for clay R² value was only ~0.27, with normalized RMSE values up to ~2.5% (i.e. 250% of the observed range) for sand (Radocaj et al., 2023). It should be noted that, in contrast to our study, the assessment of Radocaj et al. (2023) involved a very small dataset of observed values with very low total variance. In an arid region of India, SoilGrids 2.0 was found to severely misestimate texture fractions, under-predicting sand by ~28% and over-predicting silt and clay by ~14% [18]. In Norway, independent evaluations of SoilGrids [21] have likewise found that predicted texture fractions bear almost no relationship to field observations. In one analysis using Norwegian forest soil profiles, the coefficient of determination for SoilGrids sand, silt and clay was essentially zero (R² on the order of 0.01–0.06). Similarly, continental-scale assessments in Africa [22] report comparable magnitude errors for texture. For instance, the AfSoilGrids 250m maps (a 250 m-resolution soil map for Africa) show RMSE values on the order of 8–16% for texture fractions (≈13.7% for clay, 8.3% for silt, 15.9% for sand). In all of these regions (Greece, Norway, India, Africa), SoilGrids tended to misestimate texture in a systematic way.”

 

Comment 5:

  4. Methodology

The methodology is thorough but has several weaknesses that limit reproducibility and scientific rigor:

The "best-pixel selection" algorithm, which is employed to minimise residuals, introduces a bias due to its utilisation of observed data for the selection of the predicted cell. This should be removed or only used for sensitivity analysis.

Response 5: Thank you; you are absolutely correct. This algorithm was used to test the sensitivity of the results to spatial misalignment. Your comment showed that we had to make this clearer in the text, so we modified the titles of the relevant sections (2.3.10 and 3.8) to reflect this.

 

Comment 6: Sand, silt, and clay fractions are compositional variables; standard RMSE and regression on raw percentages are statistically invalid. The authors should apply an isometric log-ratio (ilr) transformation or Aitchison distance for error computation.

Response 6: Thank you for this wonderful comment and suggestion. We have taken steps to include statistically proper metrics, the details of which are included in this revision. The additional methods were added in lines 481 to 507. The corresponding results text, tables and graphs were included throughout sections 3.1 and 3.2. Overall per-fraction error metrics such as standard RMSE can be misleading even if individual samples respect compositionality; however, for the sake of direct comparison with previous equivalent works, we still report the standard RMSE alongside the statistically valid metrics. Whole segments of sections 3.1, 3.2, and 3.8 have been reworked or added to include proper statistical metrics alongside those originally reported. The relevant tables have been amended.

As for regression, it was performed only to extract relative relationship metrics between the fractions and topographic factors, in an exploratory vein; we will make this clearer in our text.
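For readers unfamiliar with the compositional metrics requested in Comment 6, the following is a minimal sketch of an isometric log-ratio (ilr) transform and the Aitchison distance for a three-part (sand, silt, clay) composition. The record does not state which ilr basis or implementation the authors used, so the basis below is one common choice and an assumption; the function names and the sample values are hypothetical.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so that they sum to 1."""
    if any(p <= 0 for p in parts):
        raise ValueError("log-ratios need strictly positive parts")
    total = sum(parts)
    return [p / total for p in parts]


def ilr(parts):
    """ilr coordinates of a 3-part composition (one common basis, assumed)."""
    x1, x2, x3 = closure(parts)
    z1 = math.sqrt(1 / 2) * math.log(x1 / x2)
    z2 = math.sqrt(2 / 3) * math.log(math.sqrt(x1 * x2) / x3)
    return (z1, z2)


def aitchison_distance(a, b):
    """Aitchison distance = Euclidean distance between ilr coordinates."""
    return math.dist(ilr(a), ilr(b))


# Hypothetical observed and predicted (sand, silt, clay) percentages.
observed = [40.0, 40.0, 20.0]
predicted = [20.0, 50.0, 30.0]
error = aitchison_distance(observed, predicted)
```

Because the distance depends only on ratios between the parts, it gives the same result whether the fractions are expressed in percent or as proportions. Zero fractions need a replacement strategy before the transform, which this sketch does not cover.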

 

Comment 7: No uncertainty validation is performed despite SoilGrids 2.0 providing prediction intervals. A coverage test of the 90% confidence band would reveal model calibration quality.

Response 7: You are absolutely correct with this remark. We thank the reviewer for the suggestion. SoilGrids 2.0 provides uncertainty rasters alongside predicted soil property rasters, but these are not formal confidence intervals. Therefore, a 90% coverage test cannot be directly performed. Instead, we assessed the realism of the reported uncertainties by comparing the absolute error at each sampling point to the corresponding predicted uncertainty. Observations with absolute errors smaller than the predicted uncertainty were considered consistent with the reported uncertainty. Only 1962 (20%) clay, 895 (9.1%) sand, and 827 (8.4%) silt observations met this criterion, indicating that the predicted uncertainties generally underestimate actual prediction errors, particularly for sand and silt. Following your recommendation we have added a new section 3.10 to the manuscript addressing this.
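The consistency check described in Response 7 can be sketched as follows. This is only an illustration of the stated criterion (absolute error smaller than the reported uncertainty), assuming the uncertainty value is expressed in the same units as the soil property; the function name and the sample values are hypothetical and are not the authors' data.

```python
def share_within_uncertainty(observed, predicted, uncertainty):
    """Fraction of points whose absolute error is below the reported uncertainty."""
    if not (len(observed) == len(predicted) == len(uncertainty)):
        raise ValueError("all inputs must have the same length")
    if not observed:
        raise ValueError("at least one point is required")
    hits = sum(
        1
        for obs, pred, unc in zip(observed, predicted, uncertainty)
        if abs(obs - pred) < unc
    )
    return hits / len(observed)


# Hypothetical sand contents (%) and uncertainty values at four points.
obs_sand = [30.0, 50.0, 10.0, 45.0]
pred_sand = [35.0, 40.0, 12.0, 45.0]
unc_sand = [6.0, 5.0, 1.0, 2.0]
share = share_within_uncertainty(obs_sand, pred_sand, unc_sand)
```

A low share, as reported in the response for sand and silt, suggests that the mapped uncertainty is smaller than the errors actually observed at the sampling points.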

 

Comment 8: The hot-spot (Gi*) analysis lacks parameter transparency: the neighborhood definition, distance threshold, and multiple-comparison correction should be reported and justified.

Response 8: Thank you for this helpful suggestion. In the revised “Materials and Methods” section we have clarified the Gi* settings: the spatial weights were defined using a K-Nearest Neighbors (KNN) approach with k = 15 (i.e. each point’s 15 nearest neighbors), and therefore no fixed distance threshold was specified. We have also explicitly noted that no FDR correction was applied; significance is based on the raw Gi* z-scores and p-values. A brief justification has been added: k = 15 was chosen to match the data’s heterogeneity and sampling density, and omitting FDR preserves the sensitivity and interpretability of the detected hot and cold spots. We hope these clarifications fully address the reviewer’s concern. A whole paragraph explaining the above was added (lines 551-561): “The Getis-Ord Gi* index was operationalized within the ArcGIS Pro 3.2.0 environment. Spatial relationships were defined by a K-Nearest Neighbors (KNN) conceptualization with K = 15 (i.e. each sample point’s 15 closest neighbors were included in the analysis). This choice of K = 15 reflects the heterogeneous soil sampling: it provides a consistent neighborhood size that captures local variability without over-smoothing sparse areas [53]. Because KNN fixes the number of neighbors, no fixed distance band was applied. For significance testing, no False Discovery Rate (FDR) correction was used: the reported z-scores and p-values thus reflect standard (uncorrected) significance levels [54]. We omitted FDR (which would otherwise tighten p-value thresholds to correct for multiple testing and spatial dependence) [54] in order to preserve sensitivity and straightforward interpretability of the hot-spot results.”
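To make the Gi* settings above concrete, here is a minimal pure-Python sketch of the Getis-Ord Gi* z-score with binary k-nearest-neighbour weights. It is not the ArcGIS Pro implementation: it assumes each neighbourhood consists of the point itself plus its k nearest neighbours, all with weight 1, and the function name and toy data are hypothetical.

```python
import math


def gi_star_knn(coords, values, k=15):
    """Getis-Ord Gi* z-scores with binary k-nearest-neighbour weights.

    Assumption: each neighbourhood holds the point itself plus its k
    nearest neighbours, all with weight 1.
    """
    n = len(values)
    if len(coords) != n:
        raise ValueError("coords and values must have the same length")
    if not 1 <= k <= n - 2:
        raise ValueError("k must satisfy 1 <= k <= n - 2")
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean * mean
    if variance <= 0:
        raise ValueError("values must not all be identical")
    s = math.sqrt(variance)
    m = k + 1  # neighbourhood size, including the point itself
    # With binary weights, sum(w) = sum(w^2) = m, so the denominator
    # is the same for every point.
    denom = s * math.sqrt((n * m - m * m) / (n - 1))
    scores = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        hood = [i] + others[:k]
        local_sum = sum(values[j] for j in hood)
        scores.append((local_sum - mean * m) / denom)
    return scores


# Toy example: ten points on a line, large errors at the first three.
toy_coords = [(float(x), 0.0) for x in range(10)]
toy_errors = [10.0, 10.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
z_scores = gi_star_knn(toy_coords, toy_errors, k=2)
```

Positive z-scores mark clusters of high values (hot spots) and negative ones clusters of low values (cold spots). Without a multiple-testing correction, as in the response, each z-score is compared directly with the standard normal thresholds. The brute-force neighbour search is quadratic in the number of points, so a spatial index would be preferable for datasets with thousands of samples.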

 

Comment 9- The influence of laboratory analytical procedures (e.g., pipette vs hydrometer, pre-treatment) must be discussed, since methodological differences can explain systematic shifts in texture fractions.

Response 9: Following the reviewer’s suggestion, we have revised the text to clarify this point. The production of the SoilGrids 2.0 maps relied on an extensive collection of high-quality, laboratory-verified soil texture measurements (WoSIS) from around the world. As mentioned above, these data were obtained from numerous countries and laboratories, where standard particle-size analysis methods were applied. The pipette and hydrometer methods (with appropriate sample pretreatment and dispersion) were used for many samples, while laser diffraction has also been employed in several more recent measurements.

However, due to the vast number and diversity of samples included, it is not possible to determine with certainty which analytical method was used most frequently.

The LUCAS 2009 topsoil samples were air‑dried and sieved to <2 mm before analysis. Particle‐size distribution (sand, silt, clay) was determined by the conventional sieve‑and‑sedimentation (pipette) method.

For the Greek National Soil Map, the laboratory measurements of sand, silt, and clay content in the collected 0–30 cm samples were performed using classical sedimentation procedures; the particle size fractions were determined with the hydrometer method. We revised the manuscript and included the corresponding information in the description of each dataset.

 

Comment 10:

  1. Consistency of Conclusions with Evidence

The conclusions state that the global models "underrepresent extreme textures" and "fail to capture local variability." While this is generally supported by the low R² and high RMSE values, the statement that errors are "systematically clustered due to parent materials" is not clearly demonstrated: the analysis of lithological groups lacks statistical testing and sample numbers.

Response 10: Thank you for your comment. To increase the robustness of this claim, we have included two new tables (Tables 8 and 10) reporting the results of pair-wise t-tests that show the effect of parent materials reported to exhibit extreme textures on clay and sand contents in the GR, ESDAC, and ISRIC datasets. However, we do not understand what is meant by “lacking sample numbers”: the number of samples attributed to each parent material group is presented in Tables 7 and 9 (in parentheses in the first column). To strengthen the claim that the international datasets fail to predict extreme texture values for parent materials where such values are expected, we have added ternary plots for the points located within specific parent materials (Figures 10 and 11).

 

Comment 11: Also, the analysis does not quantify the extent to which each of the explanatory variables (climate, geology, or terrain) explains the overall bias.
Therefore, the conclusions slightly surpass the evidence given.

Response 11: You are making a fair point, and we hope the additional statistical tests and metrics provided in the revised Section 3.9 address your concerns regarding the conclusions about inaccuracies due to parent material. Our assessment of the correlation of errors with topographic/hydrological factors was purely exploratory and does not allow us to draw any conclusive remarks about these correlations. In our opinion, applying more rigorous analysis methods to these factors is best reserved for future dedicated papers.

 

Comment 12: The main questions, accuracy, spatial pattern, and causes, were partly addressed, but the causal link between model covariates and local conditions is hypothetical.

Response 12: This is a fair point. We hope that by addressing your previous comments this has been somewhat alleviated.

 

Comment 13:

  1. References

The reference list is generally relevant but somewhat dated. Also, the authors should clearly cite the versions and access dates of the ISRIC and ESDAC datasets to ensure reproducibility.

Response 13: This suggestion was very helpful, and we have now added the versions and access dates to the manuscript.

 

Comment 14: Lastly, the manuscript is based on solid data and a relevant scientific question but requires significant methodological refinement and clearer interpretation before it can meet Soil Systems’ standards.
If the authors address the above issues, particularly the compositional-data treatment, bias correction, uncertainty analysis, and more critical discussion of causal mechanisms, the study could become a strong and widely cited reference for regional validation of global soil datasets.

Response 14: We greatly appreciate your constructive criticism and hope the relevant amendments to our manuscript addressed your concerns.

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you for the corrections.
