Soil Organic Carbon Assessment Using Remote-Sensing Data and Machine Learning: A Systematic Literature Review
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Soil Organic Carbon (SOC) plays two roles: it is essential to fertility when incorporated as soil organic matter and problematic when released into the atmosphere in gaseous form. Laboratory methods to measure SOC are expensive and time-consuming. This systematic literature review (SLR) aims to identify techniques and alternative ways to estimate SOC using Remote Sensing (RS) spectral data and computational tools to process these data. The findings underscore the potential of combining RS and advanced artificial intelligence techniques for efficient and scalable SOC monitoring. However, several issues remain to be resolved:
1. Some paragraphs in the article are overly numerous. It is recommended to combine paragraphs that address similar topics. For example, the second and third paragraphs in the introduction discuss the same issue and can be merged. Similar situations appear in many other sections of the article and should be revised accordingly.
2. The selection of keywords appears excessive. Terms like "artificial intelligence" and "satellite images" have limited relevance to the article and are suggested to be removed.
3. In section 3.13, there are repetitive explanations of certain technical terms. It is advised to streamline and simplify this content.
4. Regarding the content from lines 474 to 479, the conclusions mentioned there need clarification about their source or basis.
5. The manuscript discussed the advantages of deep learning models. However, the discussion on complex models such as Concrete Autoencoder - Deep neural network and CNN - LSTM is not in-depth. What are their specific advantages in SOC data processing?
6. In terms of multi-source data (such as satellite data, field spectrometer data, UAV data, etc.) fusion, the fusion method should be strengthened to further explore the accuracy and effect of SOC monitoring.
7. The samples selected in the study involve 13 countries, of which 12 are in the northern hemisphere and only South Africa is in the southern hemisphere. The regional distribution is unbalanced. Will it have an impact on the research results of global SOC assessment? How did the author consider this?
Author Response
Comments 1: Some paragraphs in the article are overly numerous. It is recommended to combine paragraphs that address similar topics. For example, the second and third paragraphs in the introduction discuss the same issue and can be merged. Similar situations appear in many other sections of the article and should be revised accordingly.
Response 1: We appreciate your suggestion. The paper was reviewed to condense the text and improve the clarity of ideas, for example, Lines 37, 40-41, 43, 57, 63-64, 85-86, ...
Comments 2: The selection of keywords appears excessive. Terms like "artificial intelligence" and "satellite images" have limited relevance to the article and are suggested to be removed.
Response 2: Thank you for your comment. This was done. Line 27.
Comments 3: In section 3.13, there are repetitive explanations of certain technical terms. It is advised to streamline and simplify this content.
Response 3: Thank you for your comment. The sentences have been summarized to improve text clarity and simplify its content. Lines 387-404.
Comments 4: Regarding the content from lines 474 to 479, the conclusions mentioned there need clarification about their source or basis.
Response 4: Thank you for the observation. This has been corrected in the revised version. Lines 560 to 563.
Comments 5: The manuscript discussed the advantages of deep learning models. However, the discussion on complex models such as Concrete Autoencoder - Deep neural network and CNN - LSTM is not in-depth. What are their specific advantages in SOC data processing?
Response 5: Thank you for the relevant comment. We made some changes to the text in section 4.4, where we discussed these methods (CAE-DNN, CNN-LSTM) used by the authors, to make it clearer and more in-depth. The specific advantage of the Concrete Autoencoder – Deep Neural Network is that it reduces the number of features in order to find the optimal number and the most relevant ones. This is important in estimating SOC, as soil properties are influenced by many variables that are not equally relevant for predictive models. The CNN-LSTM also optimizes the SOC assessment, as the CNN extracts spatial features from environmental variables, and the LSTM extracts information from time-series data. Lines 568-569, 580-581 and 585.
Comments 6: In terms of multi-source data (such as satellite data, field spectrometer data, UAV data, etc.) fusion, the fusion method should be strengthened to further explore the accuracy and effect of SOC monitoring.
Response 6: Thank you for your relevant observation. To strengthen the importance of spectral data, regardless of the type (satellite, field spectrometer,...), the texts of section 3.1.2. and Figure 6 were reworked, so it is now observed that remote sensing data are the most widely used for SOC density prediction studies. Lines 332-345.
Comments 7: The samples selected in the study involve 13 countries, of which 12 are in the northern hemisphere and only South Africa is in the southern hemisphere. The regional distribution is unbalanced. Will it have an impact on the research results of global SOC assessment? How did the author consider this?
Response 7: We appreciate your suggestion. The article is a systematic review paper that aims to evaluate or analyze the strength of the methodology used (type of spectral data or type of AI model), regardless of the region or country. For this reason, some statistical descriptors that better predict C are shown (R2, Fig 14). We have not done the exercise of analyzing the results of the SOC value for each region or country, but it would be interesting to do so in future research.
Reviewer 2 Report
Comments and Suggestions for Authors
It is meaningful work to review the study of SOC using remote sensing techniques. Several comments are listed below:
1. Line 9, 12, 14, 16…: “con-cludes”, “da-tasets”, “el-evation”, “scala-ble”…. Please provide the correct forms of these words.
2. Line 23: “Organic Soil Carbon” should be “Soil Organic Carbon”
3. Line 173: Please provide the complete content of PRISMA the first time it appears.
4. Please clarify the meaning of the x- and y-coordinates, as shown in Figure 12. It is unclear whether the total of the three different spatial resolutions adds up to 100%. The sum of the three values exceeds 100%. Are there any papers that discuss the use of different resolutions?
5. Section 2.3. The search term “deep learning OR neural network” was used, which may exclude some papers if specialized methods were applied. Is it necessary to include papers using specific methods? If only the strings listed in lines 116-125 are used, would papers that employ specific methods be overlooked?
6. Line 129: time range is 2021-2023; why not include 2024?
Author Response
Comments 1: Line 9, 12, 14, 16…: “con-cludes”, “da-tasets”, “el-evation”, “scala-ble”…. Please provide the correct forms of these words.
Response 1: Thank you for your comment. This was done. Lines 12-26.
Comments 2: Line 23: “Organic Soil Carbon” should be “Soil Organic Carbon”
Response 2: Thank you for your comment. This was done. Line 36.
Comments 3: Line 173: Please provide the complete content of PRISMA the first time it appears.
Response 3: Thank you for your comment. This was done. Lines 194-195.
Comments 4: Please clarify the meaning of the x- and y-coordinates, as shown in Figure 12. It is unclear whether the total of the three different spatial resolutions adds up to 100%. The sum of the three values exceeds 100%. Are there any papers that discuss the use of different resolutions?
Response 4: Thank you for your relevant observation. The raw data were reviewed, and we found that some works use more than one resolution, so they are counted in more than one category and the percentages sum to more than 100%. To clarify this, the text now describes this simultaneous use of data (lines 439-442).
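The effect described in this response can be illustrated with a minimal sketch. The study list and the resolution labels (`high`, `medium`, `low`) are hypothetical and are not taken from the reviewed papers; the point is only that when each study may be counted in several categories, the per-category percentages need not sum to 100%.

```python
from collections import Counter

# Hypothetical studies (not the reviewed data); each set lists every
# spatial-resolution class that the study used.
studies = [
    {"high"},
    {"medium"},
    {"medium", "low"},   # counted in two categories
    {"high", "medium"},  # counted in two categories
]

# Count how many studies used each resolution class.
counts = Counter(label for used in studies for label in used)

# Share of studies per class, as a percentage of all studies.
shares = {label: 100 * n / len(studies) for label, n in counts.items()}

# The shares overlap, so their sum can exceed 100.
total = sum(shares.values())
print(shares)
print(total)
```

The shares would sum to exactly 100% only if every study were assigned to a single category.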
Comments 5: Section 2.3. The search term “deep learning OR neural network” was used, which may exclude some papers if specialized methods were applied. Is it necessary to include papers using specific methods? If only the strings listed in lines 116-125 are used, would papers that employ specific methods be overlooked?
Response 5: In principle, this shouldn't happen because the specific methods are within the groups that the strings "deep learning OR neural network" represent, so it's very unlikely. Furthermore, listing all specific methods by name would be unfeasible, as authors can use names created by themselves, as happens in some cases in this work, for example, where two methods were merged.
Comments 6: Line 129: time range is 2021-2023; why not include 2024?
Response 6: As the idea for this work was conceived at the end of 2023, it was not possible to include publications from 2024 that were not yet available online. However, as the number of authors interested in this topic increases every year, we could expand this in the future.
Reviewer 3 Report
Comments and Suggestions for Authors
This paper provides a very detailed discussion on the research progress of monitoring soil organic carbon from the aspects of research area, sample number and density, data sources, models, etc. The method selection is reasonable, the analysis is in-depth and thorough, and the author has their own thinking and understanding, providing readers with a lot of valuable information.
1. What are the essential differences between spectroscopic, satellite imagery and RS data?
Has the standard deviation reached 6115.8?
2. Line 273. It is recommended to analyze the reasons for the sample density reaching 393 per km2.
3. Lines 296 to 300. In theory, satellite image includes spectrophotometry, airborne image, and UAV Image, and vegetation indices are also calculated based on image data. Therefore, the categories in Figure 6 should not be duplicated and defined. Therefore, the author needs to make a clear distinction between these concepts and check whether the relevant theories and data are accurate.
4. Line 352. In fact, radiation resolution expresses the ability of sensors to distinguish subtle changes in the radiation energy of ground objects.
5. Lines 341 to 354. The 5 different resolutions are all common knowledge, just briefly describe them.
6. Sections 3.1.2 and 4.2. Soil type, soil particle size, and soil temperature all have significant impacts on soil organic carbon content, but the author did not provide detailed explanations.
7. Section 4.5. The discussion is too redundant and it is recommended to delete it. Many of the discussions have already been discussed in the previous section, and only need to be summarized in the conclusion section.
Author Response
Comments 1: What are the essential differences between spectroscopic, satellite imagery and RS data? Has the standard deviation reached 6115.8?
Response 1: These are spectroscopic data, i.e. reflectance data from the Earth's surface. The difference could be in the acquisition range, i.e. visible (VIS), infrared (IR), thermal (TIR), etc., or in the type of instrument with which the data are obtained (with a field spectrophotometer, a drone, a satellite, etc.) or whether the data are continuous or discontinuous (hyper- or multispectral). Lines 332-345.
Since there is an article with 37,540 soil samples, while the mean is around 1,513, the standard deviation is very high. Due to the impact that this article had, we decided not to remove it from the study, although it is clearly an outlier. Lines 282-285.
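To illustrate why a single very large study inflates the standard deviation, the sketch below compares the sample standard deviation of a list of counts with and without one extreme value. Only the value 37,540 comes from the response; the other sample counts are hypothetical and do not reproduce the reviewed data or the reported statistics.

```python
import statistics

# Hypothetical per-study sample counts (not the reviewed data).
typical = [120, 250, 300, 480, 650, 900]

# Add the one very large study mentioned in the response.
with_outlier = typical + [37540]

# Sample standard deviation before and after adding the outlier.
sd_typical = statistics.stdev(typical)
sd_with_outlier = statistics.stdev(with_outlier)

# The outlier also pulls the mean above every other value in the list.
mean_with_outlier = statistics.mean(with_outlier)

print(f"SD without outlier: {sd_typical:.1f}")
print(f"SD with outlier:    {sd_with_outlier:.1f}")
print(f"Mean with outlier:  {mean_with_outlier:.1f}")
```

A robust summary such as the median (`statistics.median`) is much less affected by one extreme value, which is one common way to report such data alongside the mean.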
Comments 2: Line 273. It is recommended to analyze the reasons for the sample density reaching 393 per km2.
Response 2: Thank you for your comment. Following the referee's recommendation, the value was verified and indeed corresponds to 1,121 soil samples in 2.85 km2, obtained with a robotic machine, which is a very high sampling density but is difficult for other authors to reproduce. Lines 299-307.
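As a quick check of the figure quoted in this response, the sampling density is the number of samples divided by the surveyed area. The sketch below uses only the two values given above; the function name `sampling_density` is ours, chosen for illustration.

```python
def sampling_density(n_samples: int, area_km2: float) -> float:
    """Return the number of soil samples per square kilometre."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return n_samples / area_km2


# Values quoted in the response: 1,121 samples over 2.85 km2.
density = sampling_density(1121, 2.85)
print(round(density))  # 393
```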
Comments 3: Lines 296 to 300. In theory, satellite image includes spectrophotometry, airborne image, and UAV Image, and vegetation indices are also calculated based on image data. Therefore, the categories in Figure 6 should not be duplicated and defined. Therefore, the author needs to make a clear distinction between these concepts and check whether the relevant theories and data are accurate.
Response 3: The referee is right. As the data were presented in Figure 6, it appears that they are different data types. To resolve this, all spectroscopic data were included in a single category (satellite image, indices, field spectrophotometers, airborne, and UAV). Lines 332-345.
Comments 4: Line 352. In fact, radiation resolution expresses the ability of sensors to distinguish subtle changes in the radiation energy of ground objects.
Response 4: Thank you for your comment. This was done. Lines 402-404.
Comments 5: Lines 341 to 354. The 5 different resolutions are all common knowledge, just briefly describe them.
Response 5: Thank you for your comment. The sentences have been summarized to improve text clarity and simplify its content. Lines 388-404.
Comments 6: Sections 3.1.2 and 4.2. Soil type, soil particle size, and soil temperature all have significant impacts on soil organic carbon content, but the author did not provide detailed explanations.
Response 6: As reported in the paper (line 511), mean annual temperature and precipitation directly influence SOC content [46,62,73,78]. Global climate information can be obtained from satellite data, such as MODIS, which captures land surface temperature data. To obtain information on soil properties or soil type, due to soils being covered by vegetation, it is advisable to use regional soil mapping that has been developed from reliable field data. However, this review shows that these types of field data (soil cover, soil properties, soil type, or geology) are the least used by users worldwide (Figure 6).
Comments 7: Section 4.5. The discussion is too redundant and it is recommended to delete it. Many of the discussions have already been discussed in the previous section, and only need to be summarized in the conclusion section.
Response 7: Thank you for your comment. Following the referee's recommendation, this section has been removed, and we have verified that its removal does not affect reading comprehension. Lines 697-733.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
In Figure 2, the color coding of the Taiwan Province of the People’s Republic of China is inconsistent with that of mainland China. Please correct this discrepancy.
Author Response
Comments 1: In Figure 2, the color coding of the Taiwan Province of the People’s Republic of China is inconsistent with that of mainland China. Please correct this discrepancy.
Response 1: Thank you for the comment. We have corrected the coloring of the map as suggested.
Reviewer 2 Report
Comments and Suggestions for Authors
Regarding comment 3: “PRISMA” first appears in line 19, while the explanation is provided in line 163. Please confirm whether it is acceptable that the term is not explained in the abstract. Generally, explanations are located at the first instance where the term appears.
The comment 4 is unclear to me in the revised manuscript. I understand the authors' intent in their response to the reviews, but there is an inconsistency between the revised manuscript and the response. Could you please confirm whether “(lines 439-442)” is correct or incorrect?
Author Response
Comments 1: Regarding comment 3: “PRISMA” first appears in line 19, while the explanation is provided in line 163. Please confirm whether it is acceptable that the term is not explained in the abstract. Generally, explanations are located at the first instance where the term appears.
Response 1: Thank you for the comment. We have inserted the explanation of “PRISMA” in the abstract as suggested. Lines 19-20 of the Word file or 8-9 of the PDF.
Comments 2: The comment 4 is unclear to me in the revised manuscript. I understand the authors' intent in their response to the reviews, but there is an inconsistency between the revised manuscript and the response. Could you please confirm whether “(lines 439-442)” is correct or incorrect?
Response 2: Thank you for your comment. We indicated the wrong lines, and we apologize for the error. The explanation is in lines 349-352 of the Word file or 307-310 of the PDF, just above Figure 12. We also made a minor addition to the text to improve understanding.