Assessment of Landslide Susceptibility Using Different Machine Learning Methods in Longnan City, China
Round 1
Reviewer 1 Report
The paper is not clearly written and is not easy to read.
The article simulates landslide susceptibility using several methodologies and obtains very good statistical results, as shown by the ROC analysis. The paper considers many triggering factors. However, it does not explain well how anthropogenic activities, in particular roads (a factor that is taken into consideration and has much influence in the simulations), act as triggers of the phenomenon. The materials and methods need considerable improvement, especially the physical description of the study area. I also think the discussion needs to be expanded, particularly by making comparisons with similar work. In addition, the abstract seems to be set in a different font. The paper is interesting but definitely needs improvement.
Here are some more detailed comments and corrigenda:
Line 13-15: This sentence is not really clear. Do you mean that determining landslide susceptibility is useful for land use management planning?
Line 48-50: These sentences need references, such as the following or others:
• An, K.; Kim, S.; Chae, T.; Park, D. Developing an accessible landslide susceptibility model using open-source resources. Sustainability 2018, 10, 293.
• Moresi, F.V.; Maesano, M.; Collalti, A.; Sidle, R.C.; Matteucci, G.; Scarascia Mugnozza, G. Mapping Landslide Prediction through a GIS-Based Model: A Case Study in a Catchment in Southern Italy. Geosciences 2020, 10, 309.
• Pal, S.C.; Chowdhuri, I. GIS-based spatial prediction of landslide susceptibility using frequency ratio model of Lachung River basin, North Sikkim, India. SN Appl. Sci. 2019, 1, 416.
Line 87-89: This sentence either needs a reference or should be expanded to better explain what is meant by human action and to quantify this phenomenon.
Line 100-114: From the paragraph, the area appears to be very large. Are the coordinates you give those of the center of the area?
Also, where did the climate data come from? I think the reference should be included. Moreover, this is a paper on landslides, yet nothing is said about the lithologies present, the land cover, or the main slopes. I think the description of the study site should be substantially improved.
Line 124: A reference should be added for this sentence. I think it is the following, but this needs to be verified:
• Analysis was conducted using R (R Core Team, 2020), RStudio (RStudio Team, 2020), and the tidyverse package (Wickham, 2017).
Line 199: This sentence needs a reference.
Line 223: Already in the introduction, the authors assert that human activities increase the probability of landslide events, but which types of activities? From what is mentioned later, they are considering roads, but in what sense can a road trigger a landslide? In my opinion, this part needs to be rewritten more clearly and in more detail.
Fig 6: The figure appears out of focus
Fig. 7/9/11: In the figure I suggest including contour lines to get a morphological perception of areas susceptible to landslides
Author Response
Dear reviewer:
Thank you for your letter on our manuscript entitled “Assessment of Landslide Susceptibility Using Different Machine Learning Methods in Longnan City, China” (ID: sustainability-2056606). The comments are very helpful for revising and improving our paper and provide important guidance for our other research. We have studied the comments carefully and made corrections, which we hope meet with approval. The main corrections are in the revised manuscript (modifications are highlighted in red), and our responses to the reviewers’ comments are as follows (responses are highlighted in blue).
- Lines 49-50: landslide susceptibility assessment has attracted increasing attention in the geotechnical and geological engineering community in recent years, such as ‘Slope stability prediction using ensemble learning techniques: A case study in Yunyang County’ and ‘Transfer learning improves landslide susceptibility assessment’. Some latest research can also be reviewed in the introduction section.
Response: Thank you for the comment. We have read the two papers you suggested and found them helpful for our review of the recent literature, so we now cite both in the article. Please see lines 37-39 and 556-558 of the revised manuscript.
- Lines 85-86: ‘It is necessary to select appropriate factors for landslide susceptibility assessment based on the actual situation of the study area.’ Compared with previous studies, what are the site-specific factors for Longnan City?
Response: Many thanks to the reviewer for the suggestion. First, the study area is largely mountainous and has a dense road network. Second, previous studies have shown that terrain and roads have a great impact on landslides. Please see lines 83-84 of the revised manuscript.
- Lines 140-141: ‘The same numbers of non-landslide points were randomly selected from areas that were not prone to landslides.’ Although the ratio (1:1) was applied in this study, it is suggested to further investigate the influences of the ratio on the landslide susceptibility assessment.
Response: Many thanks to the reviewer for the suggestion. We randomly selected an equal number of non-landslide points in areas that were not prone to landslides. These non-landslide points are distributed approximately evenly over the whole study area. We adopted a 1:1 ratio to build the landslide inventory and then divided it into a training set and a test set, as in many other studies [9, 25, 23, 27].
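The sampling step discussed in this exchange can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `ratio` parameter, the seed, and the 30% test fraction are all assumptions introduced here, and the points are treated as opaque records. Exposing the ratio as a parameter is one simple way to run the sensitivity check the reviewer asks for.

```python
import random


def build_balanced_inventory(landslide_points, candidate_points, ratio=1.0, seed=0):
    """Label landslide points 1 and randomly drawn non-landslide points 0.

    `candidate_points` stands for locations already judged not prone to
    landslides (hypothetical input). `ratio` is the number of non-landslide
    points per landslide point; 1.0 reproduces a 1:1 inventory.
    """
    rng = random.Random(seed)
    n_negative = round(len(landslide_points) * ratio)
    if n_negative > len(candidate_points):
        raise ValueError("not enough candidate non-landslide points")
    negatives = rng.sample(candidate_points, n_negative)
    samples = [(p, 1) for p in landslide_points] + [(p, 0) for p in negatives]
    rng.shuffle(samples)
    return samples


def split_train_test(samples, test_fraction=0.3, seed=0):
    """Shuffle and split labelled samples; test_fraction is an assumed value."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Calling `build_balanced_inventory` with several values of `ratio` (for example 1.0, 2.0, 5.0) and refitting the models on each inventory would show how strongly the susceptibility results depend on the sampling ratio.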
- Line 359: ‘LSM of LR, DT, and RF models are compared and analyzed.’ Please tabulate the model parameters of these three algorithms calibrated in this study.
Response: Many thanks to the reviewer for the advice. To make our model evaluation more rigorous, we added two accuracy measures, sensitivity and specificity, to Table 6, following a large number of previous studies [27,41]. Please see Table 6 and formulas (7) and (8) of the revised manuscript.
- It is well recognized that machine learning algorithms generally contain several hyperparameters; thus, hyperparameter optimization plays an important role in machine learning applications. Recently, several hyperparameter optimization techniques have been proposed, such as the Bayesian optimization in the literature (Doi:10.1016/j.gsf.2020.03.007). Please specify the hyperparameter optimization method used in this study for the RF.
Response: Many thanks to the reviewer for the advice. Following the suggestion, we now report the hyperparameter that we optimized for the random forest model, namely the number of decision trees (n_estimators). Please see lines 394-395 of the revised manuscript.
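The response names the tuned hyperparameter but not the search procedure. As a hedged illustration only, the sketch below shows one common choice, a grid search with k-fold cross-validation; it is not claimed to be the authors' method. The `evaluate` callable is hypothetical: it is assumed to train a random forest with the given number of trees on the training indices and return a validation score where higher is better.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def grid_search_n_estimators(candidates, evaluate, n_samples, k=5, seed=0):
    """Return the candidate tree count with the best mean validation score.

    evaluate(n_estimators, train_idx, val_idx) is a user-supplied
    (hypothetical) function returning a score such as AUC.
    """
    scores = {}
    for n_estimators in candidates:
        fold_scores = [
            evaluate(n_estimators, train_idx, val_idx)
            for train_idx, val_idx in k_fold_indices(n_samples, k, seed)
        ]
        scores[n_estimators] = mean(fold_scores)
    best = max(scores, key=scores.get)
    return best, scores
```

Reporting the candidate list, the number of folds, and the scoring metric alongside the selected value would answer the reviewer's request more fully than reporting the selected value alone.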
Author Response File: Author Response.docx
Reviewer 2 Report
This manuscript presented an interesting study on the assessment of landslide susceptibility using different machine learning methods in Longnan City, China. In general, the manuscript is well-organized and revision is suggested. Some technical issues can be addressed or clarified to improve the quality of the manuscript:
(1) Lines 49-50: landslide susceptibility assessment has attracted increasing attention in the geotechnical and geological engineering community in recent years, such as ‘Slope stability prediction using ensemble learning techniques: A case study in Yunyang County’ and ‘Transfer learning improves landslide susceptibility assessment’. Some latest research can also be reviewed in the introduction section.
(2) Lines 85-86: ‘It is necessary to select appropriate factors for landslide susceptibility assessment based on the actual situation of the study area.’ Compared with previous studies, what are the site-specific factors for Longnan City?
(3) Lines 140-141: ‘The same numbers of non-landslide points were randomly selected from areas that were not prone to landslides.’ Although the ratio (1:1) was applied in this study, it is suggested to further investigate the influences of the ratio on the landslide susceptibility assessment.
(4) Line 359: ‘LSM of LR, DT, and RF models are compared and analyzed.’ Please tabulate the model parameters of these three algorithms calibrated in this study.
(5) It is well recognized that machine learning algorithms generally contain several hyperparameters; thus, hyperparameter optimization plays an important role in machine learning applications. Recently, several hyperparameter optimization techniques have been proposed, such as the Bayesian optimization in the literature (Doi:10.1016/j.gsf.2020.03.007). Please specify the hyperparameter optimization method used in this study for the RF.
Author Response
Dear reviewer:
Thank you for your letter on our manuscript entitled “Assessment of Landslide Susceptibility Using Different Machine Learning Methods in Longnan City, China” (ID: sustainability-2056606). The comments are very helpful for revising and improving our paper and provide important guidance for our other research. We have studied the comments carefully and made corrections, which we hope meet with approval. The main corrections are in the revised manuscript (modifications are highlighted in red), and our responses to the reviewers’ comments are as follows.
- Line 13-15: This sentence is not really clear. Do you mean that determining landslide susceptibility is useful for land use management planning?
Response: Thank you for the comment. In urban (land use) planning, residential areas and industrial and agricultural production bases that have a greater impact on the national economy should not be sited in landslide-prone areas. This can reduce the impact of landslide hazards. In China, many cities carry out flood and geological hazard risk assessments before urban development planning.
- Line 48-50: these sentences need some references like this or others.
Response: Thank you for the comment. Following the reviewer's suggestion, we cited three references to support this sentence. Please see lines 37-38 of the revised manuscript.
- Line 87-89: This sentence either needs a reference or should be expanded to better explain what is meant by human action and to quantify this phenomenon.
Response: Thank you for the comment. Following the reviewer's suggestion, we cited a reference to support this sentence. Please see lines 70-71 of the revised manuscript.
- Line 100-114: From the paragraph, the area appears to be very large. Are the coordinates you give those of the center of the area? Also, where did the climate data come from? I think the reference should be included. Moreover, this is a paper on landslides, yet nothing is said about the lithologies present, the land cover, or the main slopes. I think the description of the study site should be substantially improved.
Response: Many thanks to the reviewer for the advice.
(1) The coordinates refer to the whole area of Longnan City, that is, the whole study area, as shown in Figure 1 (b).
(2) Reference [29] also takes Longnan City as its study area, and our climate data come from that article. Following the reviewer's suggestion, we now cite it. Please see lines 103-113 of the revised manuscript.
(3) Following the reviewer's comments, we have added detailed descriptions of the topography, lithology, river distribution, soil types, and vegetation types of the study area. Please see lines 113-126 of the revised manuscript.
- Line 124: A reference should be added for this sentence. I think it is the following, but this needs to be verified: Analysis was conducted using R (R Core Team, 2020), RStudio (RStudio Team, 2020), and the tidyverse package (Wickham, 2017).
Response: Many thanks to the reviewer for the advice. Following the reviewer's comments, and based on the R package actually used in our study, we cited a reference. Please see lines 133-134 of the revised manuscript.
- Line 199: This sentence needs a reference.
Response: Many thanks to the reviewer for the advice. Following the reviewer's suggestion, we cited two references on plan curvature and profile curvature. Please see lines 196 and 201 of the revised manuscript.
- Line 223: Already in the introduction, the authors assert that human activities increase the probability of landslide events, but which types of activities? From what is mentioned later, they are considering roads, but in what sense can a road trigger a landslide? In my opinion, this part needs to be rewritten more clearly and in more detail.
Response: Many thanks to the reviewer for the advice. Road construction breaks the integrity of the ecosystem, destroys vegetation, and creates many unstable slopes. Landslides then occur on these slopes under rainfall and other triggering factors. Please see lines 243-244 of the revised manuscript.
- Fig 6: The figure appears out of focus.
Response: Many thanks to the reviewer for the advice. Following the reviewer's suggestion, we have adjusted Figure 6, and it is now centered. Please see Figure 6 of the revised manuscript.
- Fig. 7/9/11: In the figure I suggest including contour lines to get a morphological perception of areas susceptible to landslides.
Response: Thank you for the comment. Following the reviewer's suggestion, we tried adding the contour lines of the study area, using Figure 11 as an example. However, because the terrain is steep and the contour lines are dense in some parts of the study area, the contour lines obscured the distribution of the landslide susceptibility classes. Therefore, we did not add contour lines. Because rivers flow through low-lying areas, they also indicate the terrain elevation to some extent. We therefore added the rivers in the study area to Figures 7, 9, and 11 to reflect the topography. Please see Figures 7, 9, and 11 of the revised manuscript.
Figure 11. The landslide susceptibility map of the RF model.
Author Response File: Author Response.docx
Reviewer 3 Report
The Authors presented three landslide susceptibility maps of the Longnan City area, Gansu Province (central part of China), based on three different models (i.e., decision tree, DT; logistic regression, LR; and random forest, RF). The study area is large (27,000 km²), a landslide inventory of 1656 historical landslides was available, and 12 landslide influencing factors were considered in the proposed analyses (Elevation, Slope, Aspect, Profile Curvature, Plan Curvature, Distance to faults, Normalized Difference Vegetation Index, Land Cover, Rainfall, Distance to rivers, Soil types, Distance to roads). A frequency ratio (FR) model was used to analyse the relationship between each influencing factor and the occurrence of landslides and helped the Authors select the relevant factors.
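For readers unfamiliar with the frequency ratio mentioned above, the sketch below uses its commonly cited definition: for each class of a factor, the share of landslides falling in that class divided by the share of the study area in that class. The manuscript's exact formulation is not shown in this review record, so this is an assumption, and the function name and the sample counts are hypothetical.

```python
def frequency_ratio(landslides_per_class, cells_per_class):
    """Return the frequency ratio of each class of one influencing factor.

    landslides_per_class: landslide count per class (hypothetical input).
    cells_per_class: raster cell count (area) per class (hypothetical input).
    A value above 1 indicates that landslides are over-represented in the
    class relative to its share of the area.
    """
    total_landslides = sum(landslides_per_class.values())
    total_cells = sum(cells_per_class.values())
    if total_landslides == 0 or total_cells == 0:
        raise ValueError("totals must be positive")
    ratios = {}
    for cls, cells in cells_per_class.items():
        landslide_share = landslides_per_class.get(cls, 0) / total_landslides
        area_share = cells / total_cells
        ratios[cls] = landslide_share / area_share if area_share > 0 else 0.0
    return ratios


if __name__ == "__main__":
    # Illustrative slope classes with made-up counts, not data from the paper.
    example = frequency_ratio(
        {"0-15": 10, "15-30": 60, ">30": 30},
        {"0-15": 500, "15-30": 300, ">30": 200},
    )
    for cls, value in example.items():
        print(f"{cls}: FR = {value:.2f}")
```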
The use of machine learning techniques for the study of landslide hazard of large areas is a promising tool in helping researchers and authorities to make rational hazard maps. However, this kind of analysis is a sort of “black box” and care is needed in presenting the methods and validating the results obtained. In this perspective I have found the paper very accurate, all the aspects of the different methods were clearly introduced and briefly explained. When machine learning methods are applied to landslide hazard this clarity is rare, so I appreciated the work of the Authors.
To improve the paper, I suggest that the Authors address the following points:
1. The quality of the inventory landslide map is crucial for the reliability of the applied methods and a few more words about it would be useful: for example, is the entire territory mapped with the same accuracy? Are there parts where the inventory is lacking? How different accuracy in the inventory map affects the machine learning methods? For example, I wonder if the relevance of the distance from the roads could be influenced by the greater number of landslides detected when they interfere with infrastructures instead of when they occur in a forest.
2. A comment on some false negatives might be helpful. In other words, what happened where a landslide is mapped in the inventory but the models do not predict a high landslide risk area?
3. The conclusions are particularly concise: I suggest adding some words about the limits and difficulties of the applied methodology in obtaining reliable hazard maps; some promising directions for improving this kind of study could also be outlined.
Also, some suggestions related to the readability of the paper:
4. Fig. 5 is very important, but the legend and labels are too small to be read, so please redraw it at an appropriate size.
5. The paper contains a large number of acronyms, so please add a glossary at the end of the paper where all acronyms and abbreviations are explained.
Author Response
Dear reviewer:
Thank you for your letter on our manuscript entitled “Assessment of Landslide Susceptibility Using Different Machine Learning Methods in Longnan City, China” (ID: sustainability-2056606). The comments are very helpful for revising and improving our paper and provide important guidance for our other research. We have studied the comments carefully and made corrections, which we hope meet with approval. The main corrections are in the revised manuscript (modifications are highlighted in red), and our responses to the reviewers’ comments are as follows (responses are highlighted in blue).
- The quality of the inventory landslide map is crucial for the reliability of the applied methods and a few more words about it would be useful: for example, is the entire territory mapped with the same accuracy? Are there parts where the inventory is lacking? How different accuracy in the inventory map affects the machine learning methods? For example, I wonder if the relevance of the distance from the roads could be influenced by the greater number of landslides detected when they interfere with infrastructures instead of when they occur in a forest.
Is the entire territory mapped with the same accuracy?
Response: Thank you for the comment.
(1) By comparison, 1656 historical landslides is a large number. In addition, we randomly generated the same number of non-landslide points, which were evenly distributed throughout the study area (as shown in the figure below), so that the accuracy of landslide prediction is basically consistent throughout the study area. To ensure the accuracy of the landslide points, we also conducted a field survey. Please see lines 151-152 of the revised manuscript.
Are there parts where the inventory is lacking?
(2) We apologize for the gaps in the inventory, which we did not take into account. We believe that if historical landslide sites, such as the inventory of recent years, can be collected, the areas where landslide disasters are most likely to occur under current climate change and human activities can be better identified. Therefore, in follow-up studies, we will also collect the inventory of recent years.
How different accuracy in the inventory map affects the machine learning methods?
(3) As for the relationship between roads and landslide points, according to our statistics, the shortest distance from a landslide point to a road is 1.6 meters, and 23 landslide points lie less than 20 meters from a road. Compared with the total of 1656 landslide points, this has very little impact on the accuracy of the machine learning models.
- A comment on some false negatives might be helpful. In other words, what happened where a landslide is mapped in the inventory but the models do not predict a high landslide risk area?
Response: Many thanks to the reviewer for the advice. A false negative is a point that is an actual landslide but is predicted as a non-landslide point; this can be called an omission. Following the reviewer's suggestion, we compiled the confusion matrices of the three models and calculated the false negative rate of each model. Please see lines 427-428 of the revised manuscript.
Table. The confusion matrices of the three models.
| Model | LR (1) | LR (0) | DT (1) | DT (0) | RF (1) | RF (0) |
|---|---|---|---|---|---|---|
| 1 | 392 | 141 | 418 | 172 | 412 | 143 |
| 0 | 114 | 348 | 88 | 317 | 94 | 346 |
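The false negative rate mentioned in the response can be derived from the counts reported above. The sketch below is illustrative and rests on an assumption: the table does not label its axes, so rows are taken as predicted classes and columns as observed classes (1 = landslide, 0 = non-landslide). This reading is at least consistent with the column totals, which are identical for all three models (506 and 489), as expected when the models share one test set. The function and variable names are hypothetical.

```python
# Counts copied from the confusion-matrix table in this response.
# ASSUMPTION: rows = predicted class, columns = observed class; the source
# table does not label its axes, so tp/fp/fn/tn follow from that reading.
COUNTS = {
    "LR": {"tp": 392, "fp": 141, "fn": 114, "tn": 348},
    "DT": {"tp": 418, "fp": 172, "fn": 88, "tn": 317},
    "RF": {"tp": 412, "fp": 143, "fn": 94, "tn": 346},
}


def rates(c):
    """Sensitivity, specificity, and false negative rate from one matrix."""
    sensitivity = c["tp"] / (c["tp"] + c["fn"])
    specificity = c["tn"] / (c["tn"] + c["fp"])
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": c["fn"] / (c["tp"] + c["fn"]),
    }


def summarize(counts=COUNTS):
    """Return the three rates for every model."""
    return {model: rates(c) for model, c in counts.items()}


if __name__ == "__main__":
    for model, r in summarize().items():
        print(model, {name: round(value, 3) for name, value in r.items()})
```

If the axes are in fact the other way round, the `fp` and `fn` entries swap and the rates change accordingly, which is why labelling the table axes in the manuscript matters.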
- The conclusions are particularly concise: I suggest adding some words about the limits and difficulties of the applied methodology in obtaining reliable hazard maps; some promising directions for improving this kind of study could also be outlined. Also, some suggestions related to the readability of the paper:
Response: Thank you for the comment.
(1) Regarding the limitations and difficulties of the applied methodology in obtaining reliable hazard maps: the models used in this paper are all single models, and the accuracy of the results needs to be further improved. A coupled model might be a better approach. This paper identifies the areas with high landslide susceptibility, which has guiding significance for future urban development planning in the study area.
(2) As for the readability suggestion, the internal mechanism of landslides is little studied in this paper. Follow-up studies need to investigate and sample typical landslides in the area and use structural equation models to explore the internal mechanism of landslide occurrence at the micro scale. Please see conclusion 4 of the revised manuscript (lines 520-527).
- Fig. 5 is very important, but the legend and labels are too small to be read, so please redraw it at an appropriate size.
Response: Many thanks to the reviewer for the advice.
Following the reviewer's comments, we adjusted all the subfigures in Figure 5. Please see Figure 5 of the revised manuscript.
- The paper contains a large number of acronyms, so please add a glossary at the end of the paper where all acronyms and abbreviations are explained.
Response: Many thanks to the reviewer for the advice.
Following the reviewer's suggestion, we have explained all the abbreviations used in the paper and listed them in a table at the end of the paper. Please see the table at the end of the article.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The article has been thoroughly revised. I also noticed the corrections made in response to the other reviewers. I think the article can be published in its present form.
Reviewer 2 Report
Thanks to the contribution of all the authors, the manuscript has been carefully revised according to the reviewer comments.