Article
Peer-Review Record

Slope Failure Prediction Using Random Forest Machine Learning and LiDAR in an Eroded Folded Mountain Belt

Remote Sens. 2020, 12(3), 486; https://doi.org/10.3390/rs12030486
by Aaron E. Maxwell 1,*, Maneesh Sharma 2, James S. Kite 1, Kurt A. Donaldson 2, James A. Thompson 3, Matthew L. Bell 1 and Shannon M. Maynard 2
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 10 January 2020 / Revised: 28 January 2020 / Accepted: 30 January 2020 / Published: 3 February 2020
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)

Round 1

Reviewer 1 Report

The manuscript entitled "Large-Area slope failure prediction using random forest machine learning and LiDAR: findings and recommendations" by Maxwell et al. aims to provide guidance on the selection of methods, parameters, and predictors for slope failure prediction using random forest and well-selected training datasets, through a case study in the US. The introduction is very comprehensive and leads the reader towards the specific results obtained in the study, which add a building block to previous studies. I really appreciated the fact that the authors bring the community's efforts to light in a concise way that is understandable for a wider audience. Assertions are generally very well justified and explained in a convincing way. The graphical support is well chosen, but some items may need more explanation to make them more understandable. Below are some minor comments that should help improve the readability of this good manuscript for a wide audience. Congratulations on this well-written manuscript!

Fig 3: explain the meaning of the circle sizes.

Table 5: explain the colored steps.

Section 3.2: some insight into Fig 6 is needed, with not only a description but also an interpretation of it.

l.66: the introduction is well written; however, the questions are rather technical compared to the rest, and as such it is difficult to understand question 1 without reading l.223 first. I would suggest discussing it before introducing the objective.

l.284: please clarify what "mtry" means.

l.310: is the technical information on using the combination of scripts in Python and R needed here (without referencing the specific piece of code)? I would suggest removing it, or properly describing the scripts in a supplement.

Author Response

"Please see the attachment"

Author Response File: Author Response.pdf

Reviewer 2 Report

Dear Authors, your manuscript is well written and addresses an interesting topic (although not an entirely new one). The paper tackles an important issue (prediction of landslides) using known and previously tested methods. The method used in this study is not new; in fact, it has been extensively applied in many regions worldwide. Moreover, a recent paper in Remote Sensing, Dou et al. (2019), used various machine learning approaches and LiDAR-derived DEM products for landslide prediction.

The title is inadequate because the area of interest is not mentioned, whereas the paper concentrates on a certain area (an MLRA within the state of West Virginia). Also, the LiDAR data used in the current study are available for the United States through the 3D Elevation Program (3DEP), so the authors should mention the specific study area.

Introduction Section

The Introduction section should be improved, and the topic should be presented better and in a more general context so that it can be generalized to other areas of the world.

In lines 31-32, the authors use estimated statistics on fatalities and damage in dollars from USGS Fact Sheet 2004-3072 (July 2004). I suggest using more recent information from 2019.

In lines 37-39, the authors state that “there is a need to monitor and predict slope failure occurrence across large spatial extents using consistent and reproducible methods”; however, reproducing and transferring the current work to other areas is challenging, as LiDAR technologies are expensive and open data are not available for other regions.

In lines 42-44, the authors discuss the more recent use of machine learning for mapping, predicting, and modeling slope failure. I suggest also mentioning the deep learning approaches used more recently in landslide susceptibility studies (Sameen et al. 2020; Wang et al. 2019).

In lines 44-45, the English is not clear: “Generally, machine learning methods have seen many applications in the geospatial sciences…..”

In line 60, the authors should give the full form of an abbreviation when it is first used in the text, such as Major Land Resource Area (MLRA).

Mapping slope failures and susceptibility

In lines 80-81, the authors mention several studies related to landslide mapping and susceptibility. I suggest adding more literature and discussing recent developments in approaches such as convolutional neural networks for landslide detection and susceptibility assessment (Dao et al. 2020; Ghorbanzadeh et al. 2019a; Ghorbanzadeh et al. 2019b; Jin et al. 2019; Lei et al. 2019a; Lei et al. 2019b; Reichenbach et al. 2018; Sameen et al. 2020; Wang et al. 2019).

Random Forest for Spatial Predictive Modeling

There is a typo in line 98 in the spelling of random forest; please correct it. The authors have not explained why they chose random forest for spatial predictive modeling of landslides and left out other machine learning algorithms.

Random Forest for Spatial Predictive Modeling

Please also mention the limitations of using LiDAR-derived terrain variables in the literature review.

Methods

The authors used the head scarps of landslides as point data for training the models, but using point data at the scarp region leads to under-representation of landslides, which can cover large polygonal areas. How did the authors select the factors influencing landslides, and why?

Results

In much of the good literature, both training data and validation data are used to validate the performance of the model. I suggest calculating the user and prediction accuracy of the implemented models. In Figures 1, 2, and 5, please use consistent scale bar and north arrow styles; different types of scale bars are used throughout the paper. Which method did you use to classify the landslide susceptibility maps into high likelihood and low likelihood in Figure 5 (i.e., natural breaks, manual, or equal interval)?
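To make the classification question concrete, here is a minimal sketch of one of the options named above, equal-interval classification of predicted likelihoods. It does not represent the method the authors used, which the manuscript under review does not state here; the function names `equal_interval_breaks` and `classify` and the sample values are hypothetical.

```python
def equal_interval_breaks(values, n_classes):
    """Return the interior class boundaries that split the data range evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    # n_classes classes need n_classes - 1 interior boundaries.
    return [lo + width * i for i in range(1, n_classes)]


def classify(value, breaks):
    """Return the 0-based class index of value, given ascending breaks."""
    # A value belongs to the class numbered by how many breaks it reaches.
    return sum(value >= b for b in breaks)


# Hypothetical predicted slope-failure likelihoods for six map cells.
likelihoods = [0.0, 0.1, 0.35, 0.5, 0.8, 1.0]

two_class_breaks = equal_interval_breaks(likelihoods, 2)
labels = [classify(v, two_class_breaks) for v in likelihoods]  # 0 = low, 1 = high
```

With two classes the single break falls at the midpoint of the observed range, whereas natural breaks or a manual threshold could place it elsewhere, which is why the choice affects how the map reads.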

 

I will recommend publication after the comments have been addressed.

Dao DV et al. (2020) A spatially explicit deep learning neural network model for the prediction of landslide susceptibility. CATENA 188:104451. doi:https://doi.org/10.1016/j.catena.2019.104451
Dou J et al. (2019) Evaluating GIS-Based Multiple Statistical Models and Data Mining for Earthquake and Rainfall-Induced Landslide Susceptibility Using the LiDAR DEM. Remote Sensing 11:638
Ghorbanzadeh O, Blaschke T, Gholamnia K, Meena SR, Tiede D, Aryal J (2019a) Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. Remote Sensing 11:196
Ghorbanzadeh O, Meena SR, Blaschke T, Aryal J (2019b) UAV-Based Slope Failure Detection Using Deep-Learning Convolutional Neural Networks. Remote Sensing 11:2046
Jin B, Ye P, Zhang X, Song W, Li S (2019) Object-Oriented Method Combined with Deep Convolutional Neural Networks for Land-Use-Type Classification of Remote Sensing Images. Journal of the Indian Society of Remote Sensing 47:951-965. doi:10.1007/s12524-019-00945-3
Lei T, Zhang Q, Xue D, Chen T, Meng H, Nandi AK (2019a) End-to-end Change Detection Using a Symmetric Fully Convolutional Network for Landslide Mapping. In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 12-17 May 2019, pp 3027-3031. doi:10.1109/ICASSP.2019.8682802
Lei T, Zhang Y, Lv Z, Li S, Liu S, Nandi AK (2019b) Landslide Inventory Mapping From Bitemporal Images Using Deep Convolutional Neural Networks. IEEE Geoscience and Remote Sensing Letters 16:982-986. doi:10.1109/LGRS.2018.2889307
Reichenbach P, Rossi M, Malamud BD, Mihir M, Guzzetti F (2018) A review of statistically-based landslide susceptibility models. Earth-Science Reviews 180:60-91. doi:https://doi.org/10.1016/j.earscirev.2018.03.001
Sameen MI, Pradhan B, Lee S (2020) Application of convolutional neural networks featuring Bayesian optimization for landslide susceptibility assessment. CATENA 186:104249. doi:https://doi.org/10.1016/j.catena.2019.104249
Wang Y, Fang Z, Hong H (2019) Comparison of convolutional neural networks for landslide susceptibility mapping in Yanshan County, China. Science of The Total Environment 666:975-993. doi:https://doi.org/10.1016/j.scitotenv.2019.02.263

Author Response

"Please see the attachment."

Author Response File: Author Response.pdf

Reviewer 3 Report

This is a well written and easy to follow manuscript. The paper only examined random forest machine learning classifiers for predicting large-area slope failure using a set of environmental predictors. Here are some comments:

--This study applied only the random forest classifier. It is possible to obtain better accuracy using even a simple logistic regression. More classifiers, such as SVM and KNN, should be added and compared with RF.

--AUC was the only evaluation metric in this study. AUC has been criticized in the literature when the number of samples varies significantly between categories (Lobo, J. M., Jiménez‐Valverde, A., & Real, R. (2008). AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography, 17(2), 145-151). I would suggest incorporating comprehensive accuracy assessments on the TEST dataset. For binary classification models, in addition to AUC, the confusion matrix and its derivatives (sensitivity, specificity, overall accuracy, kappa statistic) could be evaluated. I believe data division could end up with imbalanced categories if the data are randomly split, in which case the F1 score and precision-recall curve are more informative.

--It is not clear whether data division was performed only on your training + validation data and not on your test data.

--It is likely that the model overfitted to the training dataset, as almost 16% of the sample data were considered for the test dataset and the features are highly correlated, which undermines the prediction accuracy on the test dataset (this has not been investigated).

--I would also have expected a comprehensive hyperparameter tuning of RF.

--Which of the findings are actionable? How would you take them to the next step in terms of implications for slope failure management?
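As a hedged illustration of the confusion-matrix derivatives named in the comments above (sensitivity, specificity, overall accuracy, kappa, precision, and F1), the sketch below computes them from four counts. The function name `binary_metrics` and the counts are hypothetical and are not taken from the manuscript; the example simply assumes an imbalanced test set in which slope failures are the rare class.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive common accuracy measures from a 2x2 confusion matrix.

    tp / fn: reference slope failures predicted as failure / non-failure.
    fp / tn: reference non-failures predicted as failure / non-failure.
    Assumes every row and column total is non-zero.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)  # recall of the failure class
    specificity = tn / (tn + fp)  # recall of the non-failure class
    precision = tp / (tp + fp)    # share of predicted failures that are correct
    overall = (tp + tn) / total
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: agreement beyond what the marginal totals yield by chance.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "overall_accuracy": overall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical imbalanced test set: 50 failures among 1000 samples.
metrics = binary_metrics(tp=40, fp=60, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up case the overall accuracy is 0.93 while precision is only 0.40 and kappa is 0.50, which shows how a single headline figure can hide weak performance on the rare class.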

Author Response

"Please see the attachment."

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Thank you for addressing the comments.

Reviewer 3 Report

The authors addressed some main comments.
