Soil Organic Carbon Mapping Through Remote Sensing and In Situ Data with Random Forest by Using Google Earth Engine: A Case Study in Southern Africa
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study provides significant advances in the spatial modeling of soil organic carbon (SOC) by using an RF model and various environmental variables. My questions are as follows:
- Could you list the objectives in several points?
- How did you collect the soil samples in detail?
- You can list only one scale in the left figure in Fig. 1.
- The font size is too small in Fig. 3.
- Give the reasons why you selected these VIs in section 2.3.3.
- There are too many decimal places in Table 4.
- How did you optimize the hyperparameters of the RF model?
Author Response
Dear reviewer,
First of all, thank you for your comments and for helping me improve my work. Your comments and my responses are attached here. I invite you to view the entire revised document, as many aspects of the manuscript have changed.
Best regards
Javier
Reviewer 1.
Could you list the objectives in several points?
The primary objective of this manuscript is to explore the effectiveness and accuracy of DSM techniques, particularly Random Forest models within the Google Earth Engine environment, to predict SOC in southern Africa. The manuscript pursues the following specific objectives:
- Test and validate the potential of the Random Forest algorithm to develop raster products of soil organic carbon over large areas.
- Evaluate the use of the Google Earth Engine cloud platform to facilitate model processing, and share the script code as open source for the scientific and general user community.
- Generate new digital information for these countries, where data availability is scarce, contributing to land and resource management.
How did you collect the soil samples in detail?
At a more local scale, important data came from Namibia’s efforts under the Land Degradation Neutrality (LDN) programme. As a pilot country, Namibia developed SOC baselines for the Otjozondjupa and Omusati regions, collecting more than 219 field samples according to Digital Soil Mapping methodologies, using a directed stratified sampling design. Each location contained four sub-plots: one central sub-plot and three peripheral sub-plots equally distributed along the perimeter of a circle with a radius of 5 to 6 m [36]. Soil was then extracted from the first 30 cm using a soil auger. Soil samples were analyzed for dry mass and SOC using the Walkley-Black method [36].
As in the studies carried out in Otjozondjupa and Omusati, several replicates were taken at each point to improve its representativeness, sampling the first 30 cm of soil. The Walkley-Black method was used as the laboratory method to estimate SOC.
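As an illustration of the plot layout described above, the following minimal Python sketch computes the positions of the central sub-plot and three evenly spaced peripheral sub-plots relative to the plot centre. The function name, the 5 m default radius, and the starting bearing are assumptions made for illustration only; the response does not state how the sub-plots were oriented in the field.

```python
import math


def subplot_offsets(radius_m=5.0, n_peripheral=3, start_bearing_deg=0.0):
    """Return (east, north) offsets in metres for one central sub-plot and
    n_peripheral sub-plots spaced evenly on a circle of the given radius.

    start_bearing_deg is a hypothetical orientation (clockwise from north);
    the actual field orientation is not given in the text.
    """
    offsets = [(0.0, 0.0)]  # central sub-plot at the plot centre
    step_deg = 360.0 / n_peripheral  # 120 degrees for three sub-plots
    for k in range(n_peripheral):
        bearing = math.radians(start_bearing_deg + k * step_deg)
        east = radius_m * math.sin(bearing)
        north = radius_m * math.cos(bearing)
        offsets.append((east, north))
    return offsets


if __name__ == "__main__":
    for east, north in subplot_offsets(radius_m=5.0):
        print(f"east={east:6.2f} m, north={north:6.2f} m")
```

The offsets could be added to the projected coordinates of each sampling location to recover approximate sub-plot positions, assuming a metric coordinate system.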
You can list only one scale in the left figure in Fig. 1.
I removed the other scales; the updated figure is in the revised document.
The font size is too small in Fig. 3.
The size of the subcaption is equal to the other figures and tables.
Give the reasons why you selected these VIs in section 2.3.3.
This set of spectral variables (Table 3) was carefully chosen according to the intrinsic relationship between each variable and the ecological processes of carbon sequestration and storage. In addition, the selection is based on models and results from other studies [17,41,45].
“Furthermore, In the previous text explain the relationship between this kind of indexes and the [SOC]”
There are too many decimal places in Table 4.
I modified the table.
How did you optimize the hyperparameters of the RF model?
We performed an out-of-bag (OOB) error analysis, which showed that the Random Forest model performed well with between 240 and 580 trees (depending on the region), minimizing the error. In this case, we selected ntrees = 580, where the error is at its minimum. We also varied bagFraction between 0.5 and 1 until the minimum RMSE and MAE were obtained while avoiding overfitting. We decided to use bagFraction = 0.8.
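The selection procedure described in this response can be sketched in a few lines of Python. This is only a minimal illustration of picking the setting with the lowest RMSE (ties broken by MAE) from a set of candidate predictions; it is not the authors' Google Earth Engine script. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between two equally long sequences."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def select_best(candidates, observed):
    """Pick the hyperparameter setting with the lowest (RMSE, MAE).

    candidates maps a (n_trees, bag_fraction) tuple to the predictions
    obtained with that setting on held-out or out-of-bag samples.
    """
    scores = {
        params: (rmse(observed, preds), mae(observed, preds))
        for params, preds in candidates.items()
    }
    best = min(scores, key=lambda params: scores[params])
    return best, scores[best]


if __name__ == "__main__":
    # Toy values for illustration only; NOT data from the manuscript.
    observed = [1.0, 2.0, 3.0]
    candidates = {
        (240, 0.5): [1.5, 2.5, 3.5],
        (580, 0.8): [1.1, 2.1, 2.9],
    }
    print(select_best(candidates, observed))
```

If the model is built with Earth Engine's `ee.Classifier.smileRandomForest` (an assumption, since the response does not name the classifier), the two settings correspond to its `numberOfTrees` and `bagFraction` arguments.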
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
There are some problems that can be corrected, and the following is suggested:
- All area figures and the southern African region must show geographic coordinates.
- Country names must be on the regional map.
- The source of the information must be cited, even if it is prepared by the authors.
- All text must be impersonal, in the third person plural or singular and in the past tense.
- Usually, the caption of the tables is positioned above them. In this case it is not like that. It is suggested to correct it.
However, there are serious problems such as:
- The images in Figures 5, 6, 7 and 8 are overlapping, it is not possible to see them.
- There are no tables 1 and 2, only the captions.
In view of these aspects, it is impossible to evaluate the work, being necessary to resubmit it.
Comments for author File: Comments.pdf
Author Response
Dear reviewer,
First of all, thank you for your comments and for helping me improve my work. Your comments and my responses are attached here. I invite you to view the entire revised document, as many aspects of the manuscript have changed.
Best regards
Javier
Reviewer 2.
Abstract
Streamlined model description without sacrificing important findings.
I restructured the entire Abstract.
For clarity, a data challenge (Chobe) was stated, and context was added to R2.
I modified this. Please review the new document.
Enhanced originality by setting it apart from earlier studies by utilizing Sentinel-2 real-time data.
The originality is highlighted through the use of GEE for model configuration in southern African countries, with the entire script released as open source.
Introduction
Clearly state the research gap (for example, the shortcomings of the current SOC mapping in southern Africa) and the contribution of the study (for example, the integration of GEE with data from several sources). As an example, "Despite these global challenges, southern Africa’s unique climatic and soil conditions remain underexplored in high-resolution SOC mapping." This statement connects the regional focus to global trends.
The relationship between soil properties and the electromagnetic spectrum is well-established, forming the foundation for remote sensing applications in soil characterisation [25, 26]. Digital Soil Mapping (DSM) builds upon this by integrating field and laboratory soil observations with satellite data and machine learning techniques to predict soil attributes across spatial scales [27]. DSM enables the production of accurate and timely soil information, essential for sustainable land management and evidence-based policymaking. However, in regions such as southern Africa, high-resolution SOC mapping remains limited due to data scarcity, inconsistent sampling, and underuse of advanced remote sensing capabilities [28, 29]. Large areas, especially in the western parts of countries like Namibia and South Africa, suffer from a lack of reference data, which makes it difficult to spatially quantify uncertainties in SOC estimation [Venter et al. 2021]. While remote sensing holds considerable promise for monitoring ecosystem services, its potential for improving soil-related decision-making has yet to be fully realised. Further research is needed to validate scalable DSM approaches and enhance strategies for soil carbon storage and agricultural productivity [30, 31].
To make the literature review stronger and demonstrate currency, include one or two recent studies on SOC mapping (after 2022).
I have included these new references:
Venter, Z. S., Hawkins, H., Cramer, M. D., & Mills, A. J. (2021). Mapping soil organic carbon stocks and trends with satellite-driven high resolution maps over South Africa. The Science of the Total Environment, 771, 145384. https://doi.org/10.1016/j.scitotenv.2021.145384
Nenkam, A. M., Wadoux, A. M., Minasny, B., Silatsa, F. B., Yemefack, M., Ugbaje, S. U., Akpa, S., Van Zijl, G., Bouasria, A., Bouslihim, Y., Chabala, L. M., Ali, A., & McBratney, A. B. (2024). Applications and challenges of digital soil mapping in Africa. Geoderma, 449, 117007. https://doi.org/10.1016/j.geoderma.2024.117007
Radočaj, D., Gašparović, M., & Jurišić, M. (2024). Open Remote Sensing Data in Digital Soil Organic Carbon Mapping: A Review. Agriculture, 14(7), 1005. https://doi.org/10.3390/agriculture14071005
Pouladi, N., Gholizadeh, A., Khosravi, V., & Borůvka, L. (2023). Digital mapping of soil organic carbon using remote sensing data: A systematic review. CATENA, 232, 107409. https://doi.org/10.1016/j.catena.2023.107409
Duarte, E., Zagal, E., Barrera, J. A., Dube, F., Casco, F., & Hernández, A. J. (2022). Digital mapping of soil organic carbon stocks in the forest lands of Dominican Republic. European Journal of Remote Sensing, 55(1), 213–231. https://doi.org/10.1080/22797254.2022.2045226
Materials and Methods
Include a table or paragraph that lists the sample sizes by region and the sampling procedures (such as 0–20 cm depth and random vs. stratified sampling).
Please, check the new table generated. I didn’t specify the depth of the sampling because it is variable, depending on the data source.
Reduce story duplication and compile study area details into a table.
I have created a new table that summarizes all the information on the case studies.
References to other research (e.g., [17, 52]) should be used to support the variable selection process.
I added references in the input variables section, for instance: Mitran, T. et al. (2024). Digital Soil Mapping: A Tool for Sustainable Soil Management. In: Rahman, M.M., Biswas, J.C., Meena, R.S. (eds) Climate Change and Soil-Water-Plant Nexus. Springer, Singapore. https://doi.org/10.1007/978-981-97-6635-2_3
Relative humidity should be excluded for statistical or ecological reasons.
This variable was not used in the modelling part (it does not appear in the variable importance plots). Lines 312-321 state that this variable is not included in the modelling.
To improve methodological rigor, include a brief sensitivity analysis or citation that backs up the RF parameter selections (e.g., [44]).
We performed an out-of-bag (OOB) error analysis, which showed that the Random Forest model performed well with between 240 and 580 trees (depending on the region), minimizing the error. In this case, we selected ntrees = 580, where the error is at its minimum. We also varied bagFraction between 0.5 and 1 until the minimum RMSE and MAE were obtained.
Results
Include a paragraph explaining the findings, such as how Chobe's difficulties with data distribution and Otjozondjupa's success are related to their sampling strategy (as suggested in the discussion).
I added this information: “In this region, the sampling point distribution was limited and inadequate for digital soil mapping purposes. Although many points existed, their poor spatial distribution left many areas without coverage. Moreover, the quality of the data is questionable, seemingly inaccurate and coming from somewhat old databases, which diminishes their reliability. As previously described, Otjozondjupa benefited from a well-structured soil sampling design, which aligned effectively with digital soil mapping using machine learning algorithms. These findings confirmed that Otjozondjupa had the most favourable data conditions for SOC prediction in the context of this study.”
Make sure no material is shortened and include the complete definition of the RF parameter.
I have included it.
Give a narrative or synthesis table that summarizes the model's performance in each region (e.g., average R2 or RMSE by model).
All the previous tables were merged into a single table (Table 5) that summarizes all the performance indicators.
To supplement SOC distribution maps, include standard deviation maps or a quantitative summary of uncertainty.
Standard deviation (STD) maps are included in the figures, with a color bar indicating the range.
Discussion
Move specific data to Results and concentrate the discussion on interpretation to cut down on repetitious sections.
I have changed these sections. Please review them and let me know if you have further comments.
To measure predictor contributions, include variable importance ratings or percentages (for example, from Figures 9–12).
I have changed these figures. Please check them.
With the help of variable significance data, propose a particular group of predictors (such as elevation, B11, and precipitation) for future simplification.
Moreover, based on our findings, we recommend a set of predictor variables that balance model accuracy and ecological interpretability. The optimal combination includes topographic variables such as elevation and the Topographic Wetness Index (TWI); spectral indices from Sentinel-2—particularly B11 (SWIR), B8 (NIR), GNDVI, NDMI, and BI; and climatic variables like land surface temperature (LST) and annual precipitation. These variables consistently ranked highest in importance across model configurations and are supported by previous studies for their strong associations with soil formation processes, organic matter dynamics, and vegetation productivity
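Two of the spectral indices recommended above, GNDVI and NDMI, are normalized differences of Sentinel-2 bands. The following minimal Python sketch shows the arithmetic on single reflectance values; it assumes the common definitions GNDVI = (NIR - Green) / (NIR + Green) and NDMI = (NIR - SWIR) / (NIR + SWIR), with B8 as NIR, B3 as green, and B11 as SWIR. The manuscript's Table 3 may define the indices differently, and the function names are hypothetical.

```python
def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b).

    Assumption: where both reflectances are zero the index is set to 0.0
    instead of raising a division error.
    """
    total = band_a + band_b
    if total == 0:
        return 0.0
    return (band_a - band_b) / total


def gndvi(nir_b8, green_b3):
    """Green Normalized Difference Vegetation Index (assumed B8 and B3)."""
    return normalized_difference(nir_b8, green_b3)


def ndmi(nir_b8, swir_b11):
    """Normalized Difference Moisture Index (assumed B8 and B11)."""
    return normalized_difference(nir_b8, swir_b11)


if __name__ == "__main__":
    # Illustrative surface reflectance values, not taken from the study.
    print(gndvi(nir_b8=0.4, green_b3=0.1))
    print(ndmi(nir_b8=0.3, swir_b11=0.3))
```

In Google Earth Engine the same calculation is usually applied per pixel to whole images, for example with `ee.Image.normalizedDifference`.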
Conclusions
Include a line that contrasts the GEE-RF methodology used in this work with earlier DSM initiatives (for example, "Unlike SoilGrids [41], this study leverages real-time Sentinel-2 data for improved resolution").
I have updated the Conclusions section
Give specific recommendations for future research, such as "To improve SOC predictions in data-scarce regions, future research could explore Sentinel-3 data or climate downscaling models.".
I have updated the Conclusions section and included sentences such as:
To further enhance SOC predictions, especially in poorly sampled areas, future research could explore the integration of Sentinel-1 radar data, Sentinel-3 biophysical products, or downscaled climate models. Additionally, upcoming Earth observation missions from the European Space Agency, such as BIOMASS (focused on forest carbon stocks) and FLEX (targeting vegetation fluorescence and photosynthetic activity), may offer valuable inputs to improve spatial and temporal accuracy of SOC estimates.
References
Make sure that every reference is required; to make the list shorter, eliminate any that are only passingly relevant.
I updated all the references, fixed some issues with the DOI URLs, reordered the list, and removed the references that were not used.
References cited in the text must appear in the list of references.
Please check the references section; I hope all the issues are now fixed.
You will find some new related references, which should be added to the literature review.
I have added the following references:
IUSS Working Group WRB. 2022. World Reference Base for Soil Resources. International soil classification system for naming soils and creating legends for soil maps. 4th edition. International Union of Soil Sciences (IUSS), Vienna, Austria.
Venter, Z. S., Hawkins, H., Cramer, M. D., & Mills, A. J. (2021). Mapping soil organic carbon stocks and trends with satellite-driven high resolution maps over South Africa. The Science of the Total Environment, 771, 145384. https://doi.org/10.1016/j.scitotenv.2021.145384
Nenkam, A. M., Wadoux, A. M., Minasny, B., Silatsa, F. B., Yemefack, M., Ugbaje, S. U., Akpa, S., Van Zijl, G., Bouasria, A., Bouslihim, Y., Chabala, L. M., Ali, A., & McBratney, A. B. (2024). Applications and challenges of digital soil mapping in Africa. Geoderma, 449, 117007. https://doi.org/10.1016/j.geoderma.2024.117007
Radočaj, D., Gašparović, M., & Jurišić, M. (2024). Open Remote Sensing Data in Digital Soil Organic Carbon Mapping: A Review. Agriculture, 14(7), 1005. https://doi.org/10.3390/agriculture14071005
Pouladi, N., Gholizadeh, A., Khosravi, V., & Borůvka, L. (2023). Digital mapping of soil organic carbon using remote sensing data: A systematic review. CATENA, 232, 107409. https://doi.org/10.1016/j.catena.2023.107409
Mitran, T. et al. (2024). Digital Soil Mapping: A Tool for Sustainable Soil Management. In: Rahman, M.M., Biswas, J.C., Meena, R.S. (eds) Climate Change and Soil-Water-Plant Nexus. Springer, Singapore. https://doi.org/10.1007/978-981-97-6635-2_3
Digital Object Identifier (DOI) for the references should be added.
I revised all the DOI links; some DOIs had broken links. Please check the references.
Some References are cited in the body, but their bibliographic information is missing. Kindly provide its bibliographic information in the list.
I fixed the whole section. Please review the updated version.
Author Response File: Author Response.docx
Reviewer 3 Report
Comments and Suggestions for Authors
Land
Soil organic carbon mapping through remote sensing and in-situ data with Random Forest by using Google Earth Engine: a case study in Southern Africa.
Manuscript ID: land-3709467
For Author
This paper is titled “Soil organic carbon mapping through remote sensing and in-situ data with Random Forest by using Google Earth Engine: a case study in Southern Africa”. While your study addresses an important topic, the manuscript in its current form has several critical deficiencies that need attention and revision before it can be considered for publication.
Comments:
Abstract
- streamlined model description without sacrificing important findings.
- For clarity, a data challenge (Chobe) was stated and context was added to R2.
- Enhanced originality by setting it apart from earlier studies by utilizing Sentinel-2 real-time data.
Introduction
- Clearly state the research gap (for example, the shortcomings of the current SOC mapping in southern Africa) and the contribution of the study (for example, the integration of GEE with data from several sources).
- As an example, "Despite these global challenges, southern Africa’s unique climatic and soil conditions remain underexplored in high-resolution SOC mapping." This statement connects the regional focus to global trends.
- To make the literature review stronger and demonstrate currency, include one or two recent studies on SOC mapping (after 2022).
Materials and Methods
- Include a table or paragraph that lists the sample sizes by region and the sampling procedures (such as 0–20 cm depth and random vs. stratified sampling).
- Reduce story duplication and compile study area details into a table.
- References to other research (e.g., [17, 52]) should be used to support the variable selection process. Relative humidity should be excluded for statistical or ecological reasons.
- To improve methodological rigor, include a brief sensitivity analysis or citation that backs up the RF parameter selections (e.g., [44]).
Results
- Include a paragraph explaining the findings, such as how Chobe's difficulties with data distribution and Otjozondjupa's success are related to their sampling strategy (as suggested in the discussion).
- Make sure no material is shortened and include the complete definition of the RF parameter.
- Give a narrative or synthesis table that summarizes the model's performance in each region (e.g., average R2 or RMSE by model).
- To supplement SOC distribution maps, include standard deviation maps or a quantitative summary of uncertainty.
Discussion
- Move specific data to Results and concentrate the discussion on interpretation to cut down on repetitious sections.
- To measure predictor contributions, include variable importance ratings or percentages (for example, from Figures 9–12).
- With the help of variable significance data, propose a particular group of predictors (such as elevation, B11, and precipitation) for future simplification.
- Include a brief table or statement that contrasts RF to Cubist/QRF, along with references (for example, "RF achieved a 10% higher R2 than QRF in Otjozondjupa").
Conclusions
- Include a line that contrasts the GEE-RF methodology used in this work with earlier DSM initiatives (for example, "Unlike SoilGrids [41], this study leverages real-time Sentinel-2 data for improved resolution").
- Give specific recommendations for future research, such as "To improve SOC predictions in data-scarce regions, future research could explore Sentinel-3 data or climate downscaling models.".
References
- Make sure that every reference is required; to make the list shorter, eliminate any that are only passingly relevant.
- References cited in the text must appear in the list of references.
- You will find some new related references, which should be added to the literature review.
- Digital Object Identifier (DOI) for the references should be added.
- Some References are cited in the body but their bibliographic information is missing. Kindly provide its bibliographic information in the list.
Comments for author File: Comments.pdf
Author Response
Dear reviewer,
First of all, thank you for your comments and for helping me improve my work. Your comments and my responses are attached here. I invite you to view the entire revised document, as many aspects of the manuscript have changed.
Best regards
Javier
Reviewer 3.
All area figures and the southern African region must show geographic coordinates.
I have done it. Check the new figures.
Country names must be on the regional map.
I have done it. Please check the resubmitted document.
The source of the information must be cited, even if it is prepared by the authors.
I have done it. Please check the new subcaption.
All text must be impersonal, in the third person plural or singular and in the past tense.
I have checked all the sentences and rewritten them in impersonal form.
Usually, the caption of the tables is positioned above them. In this case it is not like that. It is suggested to correct it.
I reviewed all the subcaptions.
The images in Figures 5, 6, 7 and 8 are overlapping, it is not possible to see them.
The problem was solved.
There are no tables 1 and 2, only the captions.
There was a problem: I uploaded the wrong document. Tables 1 and 2 are now in the body of the manuscript.
Author Response File: Author Response.docx
Reviewer 4 Report
Comments and Suggestions for Authors
In this manuscript, the authors explore the potential of Digital Soil Mapping (DSM) to predict Soil Organic Carbon (SOC) in southern Africa, using a dataset that combines spectral, topographical, and climatic variables from global sources with regional soil sampling.
The introduction provides sufficient information and the objectives are sufficiently explicit. In general, the manuscript presents solid and well-structured research, with a clear methodology and relevant results for soil organic carbon mapping. This is relevant because the generated SOC maps provide essential insights for soil conservation and land-use planning in savanna ecosystems. But some paragraphs could be improved as suggested below.
Between lines 40-45, a more explicit sentence could be added regarding the specific challenges of SOC mapping in Sub-Saharan Africa (landscape heterogeneity, limited data, etc.), which would further justify the proposed approach.
In lines 70-75, the specific "gap" that this study fills could be clarified a bit more, beyond it being a case study. What kind of information about SOC or what specific method is less explored in this region that your study addresses? For example, if there are few studies that combine these data sources or if your study is one of the first to use GEE at this scale in the region.
Line 83. "Further research is necessary to validate alternative soil management approaches and optimize carbon storage and agricultural productivity [30,31]." In my opinion, this sentence is not in the right place. Please move it.
Between lines 95-100, if there are any distinctive pedological characteristics or specific sampling challenges in these regions that are relevant to SOC mapping, it would be useful to mention them briefly.
To foster the reproducibility of this work, between lines 180-185, the exact parameters used for the Random Forest model should be specified. Additionally, in lines 190-194, mention should be made of how the validation was performed.
The font size is practically illegible in some figures. Please improve.
When mentioning soil types, it is necessary to indicate the classification system. Obviously, in this case study, it is the FAO WRB, but it is also necessary to indicate the version (I assume the 2015 or the recent 2022 version). For example, line 182 says "Soils in this region are often classified as Arenosols and Cambisols", and line 190 mentions Regosols and Leptosols.
Line 320. To predict SOC, we derived a set of spectral indices categorized into three functional groups based on their ecological relevance. This is key, so I would ask for a little more depth.
Table 3 cannot appear like this; it is completely illegible.
The discussion is a good start, but it can be expanded to delve deeper into the interpretation and implications.
Comments on the Quality of English Language
In my opinion, it is acceptable.
Author Response
Dear reviewer,
First of all, thank you for your comments and for helping me improve my work. Your comments and my responses are attached here. I invite you to view the entire revised document, as many aspects of the manuscript have changed.
Best regards
Javier
Reviewer 4.
A more explicit sentence could be added regarding the specific challenges of SOC mapping in Sub-Saharan Africa (landscape heterogeneity, limited data, etc.), which would further justify the proposed approach.
This paragraph was modified in response to other reviewers as well; I revised it according to all the comments. Please review lines 74–90, approximately.
In lines 70-75, the specific "gap" that this study fills could be clarified a bit more, beyond it being a case study. What kind of information about SOC or what specific method is less explored in this region that your study addresses? For example, if there are few studies that combine these data sources or if your study is one of the first to use GEE at this scale in the region.
I have added the following line: “While remote sensing holds considerable promise for monitoring ecosystem services, its potential for improving soil-related decision-making has yet to be fully realised. Moreover, the facility of Google Earth Engine platform and the script developed here will serve as an easy way to test and run Random Forest models for DSM purposes, especially in southern Africa”
Line 83. "Further research is necessary to validate alternative soil management approaches and optimize carbon storage and agricultural productivity [30,31]." In my opinion, this sentence is not in the right place. Please move it.
I have restructured the paragraph. Please review the new document.
Between lines 95-100, if there are any distinctive pedological characteristics or specific sampling challenges in these regions that are relevant to SOC mapping, it would be useful to mention them briefly.
I added some new text in the Soil sampling data section. Check the text in red in the three last paragraphs.
To foster the reproducibility of this work, between lines 180-185, the exact parameters used for the Random Forest model should be specified. Additionally, in lines 190-194, mention should be made of how the validation was performed.
The hyperparameter tuning of the Random Forest model is described in the responses to another reviewer's comments.
The font size is practically illegible in some figures. Please improve.
I have checked all the font sizes; they are set to 10 pt Palatino Linotype.
When mentioning soil types, it is necessary to indicate the classification system. Obviously, in this case study, it is the FAO WRB, but it is also necessary to indicate the version (I assume the 2015 or the recent 2022 version). For example, line 182 says "Soils in this region are often classified as Arenosols and Cambisols", and line 190 mentions Regosols and Leptosols.
I have checked the text related to these comments and added the following reference:
IUSS Working Group WRB. 2022. World Reference Base for Soil Resources. International soil classification system for naming soils and creating legends for soil maps. 4th edition. International Union of Soil Sciences (IUSS), Vienna, Austria.
Line 320. To predict SOC, we derived a set of spectral indices categorized into three functional groups based on their ecological relevance. This is key, so I would ask for a little more depth.
I changed the sentence by adding the following:
For SOC prediction, we derived a set of spectral indices categorized into three functional groups based on their ecological relevance and common uses in the remote sensing field.
After these lines, three separate paragraphs explain the groups of remote sensing indices (vegetation, brightness, and moisture indices) and their environmental implications.
Table 3 cannot appear like this; it is completely illegible.
I moved the table to the annex, on a landscape page. Please review the new file.
The discussion is a good start, but it can be expanded to delve deeper into the interpretation and implications.
I changed the Discussion and conclusion section. Please, review the updated one.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
No comments.
Reviewer 2 Report
Comments and Suggestions for Authors
The requested corrections and suggestions were met.
Comments on the Quality of English Language
The requested corrections and suggestions were met.
Reviewer 3 Report
Comments and Suggestions for Authors
I wish success to the authors in their study.
With Best Regards