Open Access Article

On the Importance of Train–Test Split Ratio of Datasets in Automatic Landslide Detection by Supervised Classification

Institute of Geodesy and Geoinformatics, Wroclaw University of Environmental and Life Sciences, 50-375 Wroclaw, Poland
* Author to whom correspondence should be addressed.
Remote Sens. 2020, 12(18), 3054; https://doi.org/10.3390/rs12183054
Received: 31 July 2020 / Revised: 14 September 2020 / Accepted: 15 September 2020 / Published: 18 September 2020
(This article belongs to the Special Issue Remote Sensing of Natural Hazards)
Many automatic landslide detection algorithms rely on supervised classification of remote sensing (RS) data, particularly satellite images and digital elevation models (DEMs) derived from Light Detection and Ranging (LiDAR). Machine learning methods require both training and testing data to produce and evaluate classification results. Collecting good-quality landslide ground truth to train classifiers and detect landslides in other regions is a challenge that significantly affects classification accuracy. This raises the following research question: what training–testing dataset split ratio allows supervised classification to detect landslides effectively in a testing area based on DEMs? We investigated this question for both the pixel-based approach (PBA) and object-based image analysis (OBIA), using random forest (RF) classification in both. The experiments were performed in the most landslide-affected area in Poland, the vicinity of Rożnów Lake in the Outer Carpathians. Based on the accuracy assessment, we found that the training area should be of a similar size to the testing area. We also found that OBIA performs slightly better than PBA when the number of training samples is significantly lower than the number of testing samples. To increase detection performance, we intersected the OBIA and PBA results and applied median filtering and the removal of small elongated objects. This yielded an overall accuracy (OA) of 80% and an F1 score of 0.50. The results are compared with those of other landslide detection studies and discussed.
Keywords: automatic landslide detection; OBIA; PBA; random forests; supervised classification
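The abstract reports both an overall accuracy and an F1 score for the intersected OBIA and PBA result. The following is a minimal sketch, in plain Python, of how two binary landslide masks could be intersected and how OA and F1 follow from the confusion counts. The function names, the tiny masks, and the confusion counts are hypothetical illustrations and are not taken from the article; median filtering and the removal of small elongated objects are not shown.

```python
def intersect_masks(mask_a, mask_b):
    """Mark a pixel as landslide (1) only where both masks mark it."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


def confusion_counts(predicted, reference):
    """Return (tp, fp, fn, tn) for two binary masks of equal shape."""
    tp = fp = fn = tn = 0
    for row_p, row_r in zip(predicted, reference):
        for p, r in zip(row_p, row_r):
            if p and r:
                tp += 1
            elif p and not r:
                fp += 1
            elif r:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def overall_accuracy(tp, fp, fn, tn):
    """Share of all pixels that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the landslide class."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical 2 x 3 masks (1 = landslide, 0 = non-landslide).
pba_mask = [[1, 1, 0],
            [0, 1, 0]]
obia_mask = [[1, 0, 0],
             [0, 1, 1]]
reference = [[1, 0, 0],
             [1, 1, 0]]

combined = intersect_masks(pba_mask, obia_mask)
counts = confusion_counts(combined, reference)

# Hypothetical counts chosen only to show that OA and F1 can diverge
# when non-landslide pixels dominate the scene.
oa_example = overall_accuracy(tp=10, fp=10, fn=10, tn=70)  # 0.8
f1_example = f1_score(tp=10, fp=10, fn=10)                 # 0.5
```

With these illustrative counts, OA is 0.80 while F1 is 0.50, because the many correctly classified non-landslide pixels raise OA but do not enter the F1 score. This is one way an imbalanced scene can produce a pair of values like those reported above; the article's actual confusion matrix is not reproduced here.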
MDPI and ACS Style

Pawluszek-Filipiak, K.; Borkowski, A. On the Importance of Train–Test Split Ratio of Datasets in Automatic Landslide Detection by Supervised Classification. Remote Sens. 2020, 12, 3054. https://doi.org/10.3390/rs12183054

AMA Style

Pawluszek-Filipiak K, Borkowski A. On the Importance of Train–Test Split Ratio of Datasets in Automatic Landslide Detection by Supervised Classification. Remote Sensing. 2020; 12(18):3054. https://doi.org/10.3390/rs12183054

Chicago/Turabian Style

Pawluszek-Filipiak, Kamila; Borkowski, Andrzej. 2020. "On the Importance of Train–Test Split Ratio of Datasets in Automatic Landslide Detection by Supervised Classification" Remote Sens. 12, no. 18: 3054. https://doi.org/10.3390/rs12183054
