Article
Peer-Review Record

Landslide Susceptibility Mapping in Guangdong Province, China, Using Random Forest Model and Considering Sample Type and Balance

Sustainability 2023, 15(11), 9024; https://doi.org/10.3390/su15119024
by Li Zhuo 1,2,3, Yupu Huang 1,2, Jing Zheng 4,*, Jingjing Cao 1,2 and Donghu Guo 1,5
Submission received: 13 May 2023 / Revised: 31 May 2023 / Accepted: 1 June 2023 / Published: 2 June 2023

Round 1

Reviewer 1 Report

This study proposes an improved random forest (RF)-based LSM model and applies it to Guangdong Province, China. Although the literature on landslide susceptibility prediction, including the use of random forest to generate susceptibility maps, is extensive, the authors offer a valuable perspective. With minor revisions, the manuscript should be suitable for publication. Suggestions and questions for clarification follow:

 

1.      The authors should state clearly which landslide type (shallow landslides, deep-seated landslides, rockfall, or another type) was used for interpretation and for building the landslide susceptibility map.

2.      The manuscript does not state how the landslide type or the sliding extent of slopes was determined from the 335 historical records of landslide events reported in Guangdong Province from 2015 to 2019. Describing the landslide types observed and the range of slope movements considered would make the mapping process easier to interpret and the results more robust and applicable.

3.      Regarding the 586 landslides obtained through remote sensing interpretation of the GaoFen-2 images, is there a procedure to assess the accuracy of the landslide inventory? What is the accuracy in the interpretation from the remote sensing images?

4.      It would be beneficial for the author to explain why satellite images were not utilized to build landslide inventories after typhoon events for landslide susceptibility assessment, especially considering that many landslides are attributed to typhoon events. Satellite images can provide valuable information for identifying and mapping landslides, particularly in areas affected by natural disasters. Clarifying the reasons for not incorporating satellite imagery after typhoon events in the landslide inventory process would enhance the understanding of the methodology and potentially offer insights into the limitations of the study.

5.      It would be helpful for the author to clarify whether soil erosion was considered as part of the landslide inventory. If soil erosion was not included, it is important to explain how the distinction between soil erosion and shallow landslides was made during the interpretation process. Distinguishing between these two phenomena is crucial for accurately assessing landslide susceptibility and understanding the underlying factors contributing to slope instability. Providing insights into the methodology used to differentiate between soil erosion and shallow landslides would enhance the robustness of the study and the interpretation of the results.

6.      The manuscript would benefit from providing a stronger basis for the selection of different factors and a more thorough explanation of the reasoning behind their inclusion in the modeling process. It is recommended to reference relevant literature that supports the choice of specific factors and their significance in landslide susceptibility assessment. Additionally, it is suggested to consider listing other potential factors that were not included in the study and provide a clear rationale for their exclusion. This would enhance the transparency and comprehensiveness of the research methodology and allow readers to better understand the factors considered and their relative importance in the landslide susceptibility modeling process.

7.      The authors did not provide clear information about the attributes of the different factors used in their study. It is important to clarify whether the data for each factor is continuous or categorical, as this can have an impact on the analysis results. Continuous data implies a numerical scale, while categorical data consists of distinct categories or classes. Understanding the attribute of each factor is crucial for selecting appropriate analysis methods and interpreting the results accurately. Therefore, it is recommended that the authors provide detailed information regarding the attribute of each factor and discuss how it may influence the analysis outcomes.

8.      The authors mention min-max normalization for pre-processing the factors, but do not explain how aspect, lithology, land cover type, and global human modification were handled. Aspect and lithology are commonly represented as categorical data, with each aspect or lithological class assigned a code or label; the coding scheme used should be described. Land cover type can likewise be represented as categorical data, and the categories used and how they were assigned should be specified. Global human modification is a more complex factor that may draw on data about human activities such as urbanization, infrastructure development, and land use change; the methods and data sources used to capture and represent it should be explained. Detailed information on the encoding or representation of these factors will improve the transparency and reproducibility of the study.

9.      It is important to provide a stronger justification for using data with different resolutions for different factors in the analysis. This can be supported by referencing relevant literature that explains the rationale behind such an approach. The manuscript would benefit from discussing the advantages and limitations of using data with varying resolutions and how it can affect the accuracy and reliability of the results. Additionally, it would be helpful to provide explanations for why specific resolutions were chosen for each factor and how this choice aligns with the objectives of the study. This will enhance the scientific rigor of the research and address any potential concerns regarding the use of data with different resolutions in the analysis.

10.  The rationale behind selecting average annual rainfall as the rainfall factor in the study should be clearly explained. It is important to discuss why average annual rainfall was chosen over other rainfall characteristics such as intensity, duration, or cumulative rainfall. The author should provide a justification for this choice, considering the specific objectives and scope of the research. It would be helpful to discuss how average annual rainfall is relevant to landslide susceptibility and its potential relationship with other rainfall characteristics. Additionally, it would be valuable to review and reference relevant literature that supports the use of average annual rainfall as a suitable indicator for landslide susceptibility assessment in the study area or similar regions.

11.  It is important to clarify whether the accuracy of the landslide susceptibility map was validated using landslides triggered by different typhoon events. The author should provide information on whether the landslide susceptibility map was evaluated using independent landslide events that were not included in the dataset used for model development. If landslides triggered by different typhoon events were used for validation, it would be beneficial to provide details on the performance metrics used to assess the accuracy of the susceptibility map, such as area under the receiver operating characteristic curve (AUC-ROC), success rate, prediction rate, or other relevant measures. This information would enhance the robustness and reliability of the landslide susceptibility map and provide insights into the model's performance in different triggering conditions.

12.  There are grammatical errors in the text that need to be corrected. For example, in line 19, there is a mismatch in the verb form before and after "and."

13.  The following references should be added:

[1] Random Forest-Based Landslide Susceptibility Mapping in Coastal Regions of Artvin, Turkey, ISPRS Int. J. Geo-Inf., 9(9), 553, 2020.
[2] Assessment of Rainfall-Induced Landslide Susceptibility Using GIS-based Slope Unit Approach, Journal of Performance of Constructed Facilities, ASCE, 31, 2017.
[3] Probability of Road Interruption due to Landslides under Different Rainfall-Return Periods Using Remote Sensing Techniques, Journal of Performance of Constructed Facilities, ASCE, 30, 2016.
[4] Landslide susceptibility mapping using random forest and boosted tree models in Pyeong-Chang, Korea, Geocarto International, 33, 2018.
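To make the distinction raised in comments 7 and 8 concrete, the following is a minimal sketch of how continuous and categorical factors are commonly pre-processed differently. It is not the authors' pipeline: the function names `min_max` and `one_hot` and the sample values are hypothetical, and one-hot encoding is only one of several possible schemes for class-valued factors such as lithology.

```python
def min_max(values):
    """Rescale a list of numbers linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant factor carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return (sorted class names, one 0/1 indicator row per sample)."""
    classes = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in classes] for label in labels]
    return classes, rows


# Hypothetical sample values, not taken from the manuscript.
slope_deg = [5.0, 20.0, 35.0]                    # continuous factor
lithology = ["granite", "sandstone", "granite"]  # categorical factor

scaled_slope = min_max(slope_deg)
lith_classes, lith_rows = one_hot(lithology)
```

Applying min-max scaling directly to arbitrary integer class codes would impose an ordering and spacing on the classes that the data do not have, which is why the encoding of categorical factors needs to be reported.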

Author Response

Please see the attachment.

Reviewer 2 Report

This paper proposes an improved LSM model based on the random forest algorithm. It collects sample data on rainfall-induced landslides and 13 candidate environmental factors in Guangdong, determines the best positive-to-negative sample ratio and training-to-test sample ratio, and builds a random forest LSM model to predict whether landslides will occur. A comparison with three other machine learning models demonstrates the model's performance advantages.

There are some shortcomings in this paper:

1. This paper selects 6 types of influencing factors, comprising 13 environmental factors in total. Please explain why these 6 types of influencing factors were selected.

2. This paper uses the random forest algorithm to build the LSM model, but does not describe its hyperparameter settings or the training process, such as the number of decision trees, the maximum depth of the trees, and the number of iterations. It is recommended to add this information.

3. In section 2.3.1, formula (2) is written incorrectly. The article selects 13 influencing factors, so calculating the VIF value of one influencing factor requires calculating the correlation coefficient between that factor and the other 12 influencing factors and stacking them. It is recommended to replace R with .

4. The paper compares the random forest algorithm with three other machine learning algorithms. Please explain why SVM, MLP, and LR were chosen, or briefly introduce these three algorithms.

5. The paper uses AUC as the indicator for comparing the performance of the four algorithms. Please explain how AUC is calculated.

6. Image 11 has low resolution, and its key information is blurry.

7. Reference format: Some references lack valid information, such as 7, 20, and 40; some references have inconsistent author formats, such as 3 and 10.
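For readers following comment 3, the sketch below computes the variance inflation factor under its standard definition, VIF_i = 1 / (1 - R_i^2), where R_i^2 is the coefficient of determination obtained by regressing factor i on all remaining factors. Whether this matches the manuscript's formula (2) cannot be determined from this record. The helper names `solve`, `r_squared`, and `vif` and the sample data are hypothetical, and the code assumes no factor is perfectly collinear with the others.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def r_squared(y, xs):
    """R^2 of a least-squares fit of y on the columns in xs plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + xs  # design-matrix columns, intercept first
    k = len(cols)
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def vif(factors):
    """Return VIF_i = 1 / (1 - R_i^2) for each factor (list of equal-length lists)."""
    out = []
    for i, y in enumerate(factors):
        others = factors[:i] + factors[i + 1:]
        out.append(1.0 / (1.0 - r_squared(y, others)))
    return out


# Hypothetical data: factor_c nearly duplicates factor_a; factor_b is unrelated.
factor_a = [1.0, 2.0, 3.0, 4.0]
factor_b = [1.0, -1.0, -1.0, 1.0]
factor_c = [1.1, 1.9, 3.2, 3.8]

vif_values = vif([factor_a, factor_b, factor_c])
vif_uncorrelated = vif([factor_a, factor_b])
```

In this toy data the near-duplicate pair receives large VIF values while the unrelated factor stays close to 1, which is the pattern a collinearity screen is meant to expose.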

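Regarding the request in comment 5 for the AUC calculation method, one common formulation is sketched below: AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. This is a generic illustration, not the authors' implementation; the function name `auc` and the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise positive/negative comparison."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = landslide, 0 = non-landslide) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
example_auc = auc(labels, scores)
```

Here 8 of the 9 positive-negative pairs are ranked correctly, so the example evaluates to 8/9. The pairwise loop is quadratic in sample count; a rank-based or threshold-sweep computation gives the same value more efficiently on large datasets.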

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The manuscript can be accepted in its current version.
