Optimized Neural Architecture for Automatic Landslide Detection from High-Resolution Airborne Laser Scanning Data

An accurate inventory map is a prerequisite for the analysis of landslide susceptibility, hazard, and risk. Field survey, optical remote sensing, and synthetic aperture radar are traditionally used for landslide detection in tropical regions. However, such techniques are time consuming and costly. In addition, the dense vegetation of tropical forests complicates the generation of an accurate landslide inventory map for these regions. Given its ability to penetrate vegetation cover, high-resolution airborne light detection and ranging (LiDAR) has been used to generate accurate landslide maps. This study proposes the use of recurrent neural networks (RNN) and multi-layer perceptron neural networks (MLP-NN) in landslide detection. These efficient neural architectures require little or no prior knowledge compared with traditional classification methods. The proposed methods were tested in the Cameron Highlands, Malaysia. Segmentation parameters and feature selection were respectively optimized using a supervised approach and correlation-based feature selection. The hyper-parameters of the network architecture were defined based on a systematic grid search. The accuracies of the RNN and MLP-NN models in the analysis area were 83.33% and 78.38%, respectively. The accuracies of the RNN and MLP-NN models in the test area were 81.11% and 74.56%, respectively. These results indicated that the proposed models with optimized hyper-parameters produced the most accurate classification results. LiDAR-derived data, orthophotos, and textural features significantly affected the classification results. Therefore, the results indicated that the proposed methods have the potential to produce accurate and appropriate landslide inventories in tropical regions such as Malaysia.


Introduction
Landslides are dangerous geological disasters with catastrophic effects on human lives and properties. Landslides occur with high frequency in mountainous and hilly areas, such as the Cameron Highlands in Malaysia. Landslide incidence is related to a cluster of triggering factors, such as intense rainfall, volcanic eruptions, rapid snowmelt, elevated water levels, and earthquakes. Landslide inventory maps are crucial for measuring the magnitude and analyzing the susceptibility, hazard, and risk of earthquakes [1,2], as well as for examining distribution patterns and predicting the landscapes affected by landslides [3]. Mapping a landslide inventory in tropical areas is challenging because the dense vegetation cover in these regions obscures underlying landforms [4]. Moreover, the majority of available conventional landslide detection techniques are not rapid and accurate enough for inventory mapping given the rapid vegetation growth in tropical regions. Therefore, inventory mapping requires the use of more rapid and accurate techniques, such as light detection and ranging (LiDAR) [5], which uses active laser transmitters and receivers to acquire elevation data. In addition, LiDAR has the unique capability to penetrate densely vegetated areas [5] and provide detailed information on terrains with high point density. Moreover, it depicts ground surface features and provides useful information on topographical features in areas where landslide locations are obscured by vegetation cover [6,7].
Numerous studies have applied a multiresolution segmentation algorithm for the remote sensing of land features [8]. This algorithm requires the identification of three parameters (i.e., scale, shape, and compactness); the values of these parameters can be determined using the traditional trial-and-error method, which is very time consuming and laborious [5]. Moreover, using the algorithm to delineate the boundary of an object at different scales remains challenging [9]. Thus, optimal parameters for segmentation should be identified via semiautomatic and automatic approaches [10-12]. The automatic selection of segmentation parameters requires the use of the advanced supervised approach presented in [13].
Processing a large number of irrelevant features causes overfitting [14]. By contrast, the best classification results are obtained by selecting the most relevant features [15]. Landslide identification in a particular area can be improved by selecting the most significant features [15,16]. As shown in [2], selecting the most significant features facilitates the differentiation of landslides from non-landslides. Accuracy can be improved by decreasing the number of features, as recommended in [17]. The efficiency of feature selection techniques for landslide detection has been demonstrated in [18-20].
The neural network (NN) is effective in remote sensing applications [21], particularly in solving different image classification problems [22] specified by nonlinear mathematical fitting for function approximation. NN architectures are classified into the recurrent neural network (RNN), back-propagation neural network, probability neural network, and multilayer perceptron neural network (MLP-NN). NN-based classifiers can adapt to different types of data and inputs, and can overcome the issue of mixed pixels by providing fuzzy output and fitting multiple images [23,24]. These classifiers support parallel computation and are superior to statistical classification approaches because they are non-parametric and do not require prior knowledge of a distribution model for the input data [25]. Moreover, NN-based classifiers can evaluate non-linear relationships between the input data and desired outputs and are distinguished by their fast generalization capability [26]. NN-based classifiers have been successfully applied in function approximation, prediction, pattern recognition, landslide detection, image classification, automatic control, and landslide susceptibility [27-32]. The authors of [33] found that MLP-NN can be effectively applied in landslide detection using multi-source data. The RNN model can effectively predict landslide displacement [34]. However, these neural architectures have not been extensively used for landslide detection using only LiDAR data. This research gap motivated us to apply the RNN and MLP-NN models to landslide detection based on very high-resolution LiDAR data. To achieve this objective, we optimized multiresolution segmentation parameters via a supervised approach. Using the correlation-based feature selection (CFS) algorithm, we selected the most significant features from high-resolution airborne laser scanning data.

Study Area
This study was performed in a small section of the Cameron Highlands, which is notorious for its frequent occurrence of landslides. The study area covers 26.7 km². It is located in northern Peninsular Malaysia within the zone bounded by latitudes 4°26′3″ to 4°26′18″ and longitudes 101°23′48″ to 101°24′4″ (Figure 1). The annual average rainfall and temperature in this region are approximately 2660 mm and 24/14 °C (daytime/nighttime temperatures), respectively. Approximately 80% of its area is forested, with flat (0°) to hilly (80°) terrain. Two sites were selected to implement and test the proposed models (Figure 1). All the prerequisite considerations were taken into account during test site selection to avoid missing any land cover classes. To obtain an accurate map of the analysis and test sites, the training sample size was determined via the stratified random sampling method.

Overall Methodology and Pre-Processing
LiDAR data and landslide inventories were first pre-processed to eliminate noise and outliers. A high-resolution digital elevation model (DEM) at 0.5 m was then derived from the LiDAR point clouds to generate other LiDAR-derived products (i.e., slope, aspect, height (normalized digital surface model, nDSM), and intensity). LiDAR-derived products and orthophotos were then composited by rectifying their geometric distortions to a single coordinate system and were finally prepared in a geographic information system (GIS) for feature extraction. Suitable parameters (scale, shape, and compactness) at various levels of segmentation were obtained via a supervised approach, i.e., a fuzzy-based segmentation parameter optimizer (FbSP optimizer) [13]. The stratified random method was used to evaluate the training dataset in accordance with the procedure in [35]. The correlation-based feature selection (CFS) algorithm [36] was used to rank features from the most to the least important. The RNN and MLP-NN models were applied to detect landslide locations. The results of the models were validated using a 10-fold cross-validation method. In addition, the models were evaluated in another part of the study area (i.e., the test site). Slope and aspect layers were overlaid with the results to identify other landslide characteristics (i.e., direction and run off). The study flow is illustrated in Figure 2.

Landslide Inventory
The landslide inventory produced previously by Pradhan and Lee [39] was used to develop the proposed detection method. The study area contains 21 landslides covering a total of 3781 m² (Figure 3).

Data
LiDAR point-cloud data were collected on January 15, 2015 at a point density of 8 points/m² and a pulse rate of 25,000 Hz. The absolute accuracy of the data (root-mean-square error) was restricted to 0.15 m and 0.3 m in the vertical and horizontal axes, respectively. Orthophotos were acquired with the same system used to collect the point-cloud data. After non-ground points were removed, a DEM with a spatial resolution of 0.5 m was derived from the LiDAR point clouds using inverse distance weighting, with GDM2000/Peninsula RSO as the spatial reference. Subsequently, the LiDAR-based DEM was used to generate derived layers to facilitate the identification and characterization of landslide locations [37].
According to the authors of [38], slope directly and strongly affects landslide phenomenology. The authors of [39] also inferred that slope is the principal factor that affects landslide occurrence. The author of [40] indicated that a hillshade map provides a good image of terrain movements, thus facilitating the development of landslide maps. Texture and geometric features are crucial for improving the classification accuracy of landslide mapping [14]. Intensity and texture derived from LiDAR data affect the accuracy of landslide detection [9]. The accuracy and capacity of a DEM to represent surface features are determined by terrain morphology, sampling density, and the interpolation algorithm [41]. In this study, hillshade, height (nDSM), slope, and aspect were generated from the LiDAR-based DEM. As shown in Figure 4, landslide locations were detected using visible bands and texture features.
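The derivation of slope and aspect layers from a gridded DEM can be sketched as follows. This is a minimal illustration only: it assumes central finite differences on interior cells of a north-up grid with square cells, and the function name `slope_aspect` and the toy grids are hypothetical. The GIS software used in the study may apply a different kernel.

```python
import math


def slope_aspect(dem, cell_size, row, col):
    """Return (slope_deg, aspect_deg) at an interior cell of a DEM grid.

    dem is a list of rows, with row 0 at the northern edge (assumption);
    cell_size is the grid spacing in metres (0.5 m in the study).
    Aspect is the downslope direction in degrees clockwise from north.
    """
    # Central differences: x increases eastward, y increases northward.
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2.0 * cell_size)

    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0.0 and dz_dy == 0.0:
        return slope, None  # flat cell: aspect is undefined

    # The downslope vector is (-dz_dx, -dz_dy); azimuth = atan2(east, north).
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect
```

For a plane that rises toward the east by one cell size per cell, the sketch returns a slope of 45° and a west-facing aspect of 270°.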

Image Segmentation
The sizes and shapes of image objects [42] are determined via image segmentation, the preliminary step in object-based classification. Optimal segmentation parameters depend on the environment under analysis, the selected application, and the underlying input data [8]. Previous studies have used the multiresolution segmentation algorithm with eCognition software for image segmentation [8,9]. Three parameters (scale, shape, and compactness) are defined in this algorithm. According to [5], these parameters can be obtained via the traditional trial-and-error method, which is time consuming and laborious. Therefore, the fuzzy logic supervised approach presented in [13] was adopted in this study.

Training Sets
The authors of [35] suggested the use of the stratified random sampling method to obtain an adequately sized training dataset for every class without any bias during sample selection. Accordingly, the present study adopted stratified random sampling to evaluate training samples and achieve high performance without strong bias. Four classes with different numbers of objects were set, as shown in Table 1.
Stratified random sampling requires prior knowledge of the landslide inventory at the two sites considered. Hence, segmentation parameters were first optimized. Then, the landslide inventory was overlaid on the segmented layer for object labeling. ArcGIS 10.3 was used to construct sample sets automatically at each optimal scale. Subsequently, stratified random sampling was applied to the labeled objects. This process was performed 20 times at each optimal scale.
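The sampling step can be illustrated with a short sketch. It assumes that each segmented object already carries a class label and that the same fraction is drawn from every class; the function name `stratified_sample`, the sampling fraction, and the object counts in the usage note are hypothetical and are not taken from Table 1.

```python
import random
from collections import defaultdict


def stratified_sample(labels, fraction, seed=0):
    """Return sorted indices of a stratified random sample of labeled objects.

    labels:   one class label per segmented object
    fraction: share of objects drawn from each class (assumed equal per class)
    seed:     seed for reproducibility; vary it to repeat the draw
    """
    rng = random.Random(seed)

    # Group object indices by class so that each class is a separate stratum.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    chosen = []
    for indices in by_class.values():
        # Draw the same share from every stratum, but at least one object.
        n_draw = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, n_draw))
    return sorted(chosen)
```

Calling the function with 20 different seeds would mimic the 20 repetitions described above, for example `[stratified_sample(labels, 0.5, seed=s) for s in range(20)]`.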

Correlation-Based Feature Selection
The authors of [15] reported that the selection of only the most relevant features improves the quality of landslide identification and classification. Working with large numbers of features causes numerous problems. As reported in [43] and [14], these problems include slow algorithm run times due to the resources consumed, low accuracy when the number of features exceeds the number of observations, and overfitting when irrelevant features are used as inputs. Therefore, the most significant features should be selected to enhance the accuracy of feature extraction. In this study, relevant features were extracted using the CFS algorithm with Weka 3.7 software. The CFS algorithm was applied to all LiDAR-derived data, visible bands, and textural features, and was used to determine the feature subsets required to develop models for landslide identification. The CFS algorithm comprises two basic steps: the ranking of initial features and the elimination of the least important features through an iterative process.
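The idea behind correlation-based selection can be sketched with the commonly cited CFS merit score, which rewards features that correlate with the class and penalizes features that correlate with each other: merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. The sketch below is an assumption-laden illustration, not the Weka implementation used in the study: it uses absolute Pearson correlation and a greedy forward search, and all function and feature names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cfs_merit(features, target, subset):
    """CFS merit of a feature subset; features maps name -> list of values."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_selection(features, target):
    """Greedily add the feature that most improves the merit; stop when none does."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        merit, name = max(
            (cfs_merit(features, target, selected + [f]), f) for f in remaining
        )
        if merit <= best:
            break  # no remaining feature improves the subset
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best
```

In this formulation, a feature that is nearly a copy of one already selected lowers the merit, which is why redundant layers tend to be dropped even when each one is individually informative.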

MLP-NN
NNs are a family of biologically inspired learning models in machine learning. The NN model comprises interconnected neurons or nodes, which are structured into layers with random or full interconnections among successive layers [44]. The NN model comprises input, hidden, and output layers that are responsible for receiving, processing, and presenting results, respectively [44]. Each layer contains nodes connected by numeric weights, and the output signal of each node is a function of the sum of its inputs modified by a simple activation function [45]. The ability to learn is the most important feature that attracts researchers to NNs.
Back-propagation, which was first proposed by Paul Werbos in 1974 and independently rediscovered by Rumelhart and Parker, is the most common learning algorithm used in NNs. It aims to minimize the error function iteratively, as shown in Equation (1):
$$E = \frac{1}{2}\sum_{i=1}^{L}\left(d_i - o_i\right)^2 \quad (1)$$
where $d_i$ and $o_i$ represent the desired output and the current response of node $i$ in the output layer, respectively, and $L$ is the number of nodes in the output layer. Corrections to the weight parameters are calculated and added to the previous values iteratively, as demonstrated in Equation (2):
$$\Delta w_{i,j}(t) = -\mu\,\frac{\partial E}{\partial w_{i,j}} + \alpha\,\Delta w_{i,j}(t-1) \quad (2)$$
where $\Delta w_{i,j}$ is the delta-rule correction to the weight parameter between nodes $i$ and $j$; $\mu$ is a positive constant that controls the amount of adjustment and is referred to as the learning rate; $\alpha$ is the momentum factor, which takes a value between 0 and 1; and $t$ is the iteration number. $\alpha$ is referred to as the stabilizing factor because it smooths quick changes between weights [48].
NNs have been successfully used in remote sensing applications. However, this model has some limitations, specifically, high computational complexity and overlearning [46,47].
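The weight correction with a learning rate and a momentum factor can be sketched in a few lines. This is a minimal illustration that assumes the usual half-sum-of-squares error and a delta rule with momentum; the function names and the single-weight example are hypothetical and do not reproduce the network trained in the study.

```python
def squared_error(desired, outputs):
    """Half the sum of squared differences over the output nodes."""
    return 0.5 * sum((d - o) ** 2 for d, o in zip(desired, outputs))


def momentum_step(weights, gradients, previous_deltas, learning_rate, momentum):
    """Apply one delta-rule update with momentum.

    Each correction is -learning_rate * gradient + momentum * previous correction.
    Returns (new_weights, deltas) so the deltas can feed the next iteration.
    """
    deltas = [
        -learning_rate * g + momentum * d
        for g, d in zip(gradients, previous_deltas)
    ]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas


def fit_single_weight(x, desired, steps=100, learning_rate=0.1, momentum=0.5):
    """Toy example: one linear node o = w * x trained toward a desired output."""
    weights, deltas = [0.0], [0.0]
    for _ in range(steps):
        output = weights[0] * x
        gradient = -(desired - output) * x  # derivative of the error w.r.t. w
        weights, deltas = momentum_step(
            weights, [gradient], deltas, learning_rate, momentum
        )
    return weights[0]
```

The momentum term reuses part of the previous correction, which damps abrupt changes in the weights from one iteration to the next.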

RNN
RNNs are NNs with feedback connections that are designed to model sequences. They are computationally very powerful and are biologically more plausible than other NN techniques, which lack internal states. The feedback connections give RNNs a memory of past activations, making them suitable for learning the temporal dynamics of sequential data. RNNs are very powerful when used to map input and output sequences because they use contextual information. However, traditional RNNs face the challenge of exploding or vanishing gradients. Hochreiter and Schmidhuber [49] proposed long short-term memory (LSTM) to tackle this issue.
Hidden units in LSTM are replaced with memory blocks that contain three multiplicative units (input, output, and forget gates) and self-connected memory cells, which allow reading, writing, and resetting of the memory block and control of its behavior. A single LSTM unit is shown in Figure 5. The LSTM update at time step $t$, given inputs $x_t$, $h_{t-1}$, and $c_{t-1}$, is as reported in [50] and takes the standard form below.
Input gates:
$$i_t = \sigma\left(W_{xi}x_t + W_{hi}h_{t-1} + b_i\right)$$
Forget gates:
$$f_t = \sigma\left(W_{xf}x_t + W_{hf}h_{t-1} + b_f\right)$$
Cell units:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc}x_t + W_{hc}h_{t-1} + b_c\right)$$
Output gates:
$$o_t = \sigma\left(W_{xo}x_t + W_{ho}h_{t-1} + b_o\right)$$
The hidden activation (output of the cell) is given by a product of two terms:
$$h_t = o_t \odot \tanh\left(c_t\right)$$
where $\sigma$ and $\tanh$ are element-wise non-linearities, namely the sigmoid function and the hyperbolic tangent function, respectively; $W$ denotes a weight matrix; $x_t$ refers to the input at time step $t$; $h_{t-1}$ represents the hidden state vector of the previous time step; and $b$ denotes a bias vector. The memory cell unit $c_t$ is a sum of two terms: the previous memory cell unit $c_{t-1}$, which is modulated by $f_t$, and a function of the current input and previous hidden state, which is modulated by the input gate $i_t$. Because $i_t$ and $f_t$ are sigmoidal, their values range within [0, 1], and they can be considered as knobs with which the LSTM learns to selectively forget its previous memory or consider its current input, whilst $o_t$ is an output gate that learns how much of the memory cell to transfer to the hidden layers.
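A single time-step update of one LSTM unit can be sketched as follows. The sketch assumes the standard LSTM formulation with input, forget, and output gates and a tanh candidate; for brevity it models one unit with scalar hidden and cell states, and the parameter layout in `params` is a hypothetical choice made for this illustration.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One update of a single LSTM unit.

    x_t:    list of input features at time step t
    h_prev: previous hidden activation (scalar)
    c_prev: previous memory cell value (scalar)
    params: dict mapping 'i', 'f', 'c', 'o' to (input_weights, hidden_weight, bias)
    Returns (h_t, c_t).
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return sum(w * x for w, x in zip(w_x, x_t)) + w_h * h_prev + b

    i_t = sigmoid(pre_activation("i"))    # input gate: how much new input to write
    f_t = sigmoid(pre_activation("f"))    # forget gate: how much old memory to keep
    o_t = sigmoid(pre_activation("o"))    # output gate: how much memory to expose
    g_t = math.tanh(pre_activation("c"))  # candidate from input and previous state

    c_t = f_t * c_prev + i_t * g_t        # memory cell: sum of the two gated terms
    h_t = o_t * math.tanh(c_t)            # hidden activation (output of the cell)
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so the unit simply halves its previous memory at each step.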

RNN
An RNN handles sequence problems by adding loops to the network architecture. For example, in any layer under consideration, signals can be passed to each neuron and are subsequently forwarded to the next layer. The network output can be fed back to the network with the next input feature, and so on, as shown in Figure 7. In this study, the RNN received 10 features as inputs to differentiate landslides from other objects (cut slope, bare soil, and vegetation). The RNN consisted of an LSTM layer with 50 hidden units, two fully connected layers, a dropout layer, and a softmax layer. Back-propagation was used to train the RNN model with the Adam optimizer and a batch size of 128.
The NN learns its weights from the training dataset, so it may overfit and perform poorly when new data are inputted. To alleviate overfitting, a dropout layer was used in the RNN model. The dropout layer randomly sets some activations to zero. Dropout was applied only during training and not during testing. The fraction of activations retained by the dropout layer is controlled by a parameter referred to as the keep probability.
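The behavior of a dropout layer with a keep probability can be sketched as follows. This illustration assumes the common "inverted dropout" convention, in which retained activations are rescaled by 1/keep probability during training so that no rescaling is needed at test time; the source does not state which convention was used, and the function name `dropout` is hypothetical.

```python
import random


def dropout(activations, keep_prob, training, rng=None):
    """Apply inverted dropout to a list of activations.

    keep_prob: probability that each activation is retained (0 < keep_prob <= 1)
    training:  dropout is active only when True; at test time inputs pass through
    rng:       optional random.Random instance for reproducibility
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    if not training or keep_prob == 1.0:
        return list(activations)  # testing: use all activations unchanged

    rng = rng or random.Random()
    # Keep each activation with probability keep_prob and rescale it;
    # otherwise set it to zero.
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]
```

Because a different random subset of activations is zeroed at every training step, the network cannot rely on any single activation, which is the mechanism by which dropout reduces overfitting.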

Optimization of Model Hyper-Parameters
The hyper-parameters of the RNN and MLP-NN models were optimized via a systematic grid search in scikit-learn [51] for 100 epochs. Despite its high computational cost, the systematic grid search provides better results because it exhaustively evaluates the candidate hyper-parameter values. Each combination of parameter values was evaluated using a 10-fold cross-validation method, and the combination with the highest validation accuracy was selected. Table 2 presents the optimized parameters obtained for the models.
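The search procedure can be sketched without any library dependencies. The study used scikit-learn; the code below is a simplified, standard-library stand-in that shows the logic only: every combination in the grid is scored by k-fold cross-validation and the best-scoring combination is kept. The threshold "model" and all function and parameter names are hypothetical placeholders for the real networks and their hyper-parameters.

```python
import itertools
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k roughly equal folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, params, k=10):
    """Mean accuracy over k folds for one hyper-parameter combination."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train], **params)
        hits = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)


def grid_search(fit, predict, X, y, param_grid, k=10):
    """Evaluate every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(fit, predict, X, y, params, k)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical toy model: the only "hyper-parameter" is a decision threshold.
def fit_threshold(train_X, train_y, threshold):
    return threshold


def predict_threshold(model, x):
    return int(x > model)
```

The cost grows with the product of the number of values per hyper-parameter and the number of folds, which is the computational expense noted above.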

Supervised Approach for Optimizing Segmentation
The supervised approach was employed to optimize the parameters (i.e., scale, shape, and compactness) of the multiresolution segmentation algorithm for landslide identification and for differentiation from non-landslides (bare soil, cut slope, and vegetation). The optimized parameters increased the accuracy of classification by delineating the segmentation boundaries of the landslides. The application of optimized segmentation parameters allowed for the spatial and textural identification of features (landslides and non-landslides). In our proposed method, accurate segmentation results should be obtained before performing subsequent steps.
The selected values for the three parameters of the multiresolution segmentation algorithm are shown in Table 3. The initial segmentation parameters set in the supervised approach were 50, 0.1, and 0.1 for scale, shape, and compactness, respectively. After 100 iterations with these initial values, the optimal values obtained for scale, shape, and compactness in the analysis area were 75.52, 0.4, and 0.5, respectively. Meanwhile, the test area values were 100, 0.45, and 0.74, respectively. Figure 8a,b shows the initial and optimal segmentation results. The optimized segmentation accurately delineated landslide objects in the analysis and test areas.

Relevant Feature Subset Based on a CFS Algorithm
In this study, the feature input consisted of 39 items of LiDAR-derived data (e.g., slope, height, and intensity), texture features (e.g., GLCM StdDev and GLCM homogeneity), and visible bands. The optimal combination of features was selected via ten experiments using a CFS algorithm. Selection was run with subsets of (1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 39) of the features. The most relevant feature subsets were obtained after 100 iterations in every experiment, in line with the procedure proposed by Sameen et al. [52]. High classification accuracy was achieved when 10 of the features were applied, indicating that LiDAR-derived data, visible bands, and textural features were more effective in detecting landslide locations. Table 3 shows the most significant results of feature selection based on the CFS algorithm.

Results of Landslide Detection
Classification techniques affect the quality of the classification maps. Many classification algorithms have been established for each category, and each has its merits and demerits. In the present work, the RNN and MLP-NN models with optimized parameters were used for landslide detection with good accuracy. Figure 9 shows the classification results of the RNN and MLP-NN models in the analysis area. The qualitative assessment of the RNN model yielded high-quality results, as shown in Figure 9A. Well-defined landslide boundaries were detected and correctly differentiated from other objects (cut slope, bare soil, and vegetation). On the other hand, the qualitative assessment of MLP-NN produced low-quality results, as shown in Figure 9B.
The proposed models were evaluated using another LiDAR dataset (test site) from the Cameron Highlands. All features (all existing objects) of the test area were carefully considered. Segmentation parameters were optimized using the FbSP optimizer. A 10-fold cross-validation approach introduced by Bartels et al. [53] was used to validate the results. Environmental conditions and differences in landslide characteristics resulted in misclassification [9]. Differences in the sensors used, illumination conditions, and the spatial resolutions of images are some of the challenges faced by the proposed NN models [54]. The results of qualitative assessment indicated that the proposed NNs with optimized techniques correctly detected landslide locations in the test site, as shown in Figure 10. The qualitative assessment of the RNN model yielded high-quality results, as shown in Figure 10A,B. On the other hand, the qualitative assessment of the MLP-NN model produced low-quality results, as shown in Figure 10.
It is crucial to take measures to address the difficulty of separating landslides from bare land. The morphological characteristics of landslides differ from those of other types of land cover. For example, the shape, slope, and other characteristics (i.e., dip direction, width, and length) of the surface terrain may change after a landslide occurs. Therefore, relevant features derived from very high-resolution LiDAR data, such as texture and geometric features, can be used to separate landslides from bare land. In addition, applying different optimization techniques helped us to improve the classification accuracy in landslide detection over other landcover
classes, such as bare land, man-made, etc., as described previously by Pradhan and Mezaal [9].Their results demonstrated that using optimized techniques with very high resolution LiDAR data (0.5) enabled them to separate landslide and other types of land cover.In addition, the most relevant features in Table 4 were optimized during this study.Furthermore, authors of [16] suggested that using the object feature from LiDAR data is a suitable solution for landslide identification.The landslide detection results showed that the proposed model is robust.Optimizing the segmentation parameters, namely, scale, shape, and compactness, using the fuzzy logic supervised approach resulted in the effective differentiation of landslide from non-landslide (bare soil, cut slope and vegetation) objects.Creating accurate objects through the optimized segmentation process allowed the use of spatial, orthophoto, and textural features for feature detection.Landslides should be differentiated from non-landslides based on the accurate segmentation of spatial and textural features.The selection of relevant features in landslide detection relies on the experience of the analysts.Thus, a feature selection method is crucial for accurate and reliable landslide detection.The optimal features selected via the CFS method simplified landslide detection by the NN model.Computation time and reliance on the expert knowledge of the analyst were reduced.Moreover, the optimized parameters of the NN models improved the performance of the models, reduced the complexity of the models, and decreased overfitting in the training sample.
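To illustrate how a correlation-based feature selection ranks a candidate feature subset, the following minimal Python sketch computes the merit heuristic commonly associated with CFS: a subset scores higher when its features correlate strongly with the class and weakly with each other. This is a sketch under stated assumptions rather than the implementation used in this study; the function name `cfs_merit` and the correlation values in the usage note are hypothetical.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset under the usual CFS heuristic.

    feature_class_corr: absolute correlation of each feature with the class.
    feature_feature_corr: absolute correlation of each feature pair in the subset.
    Both inputs are illustrative; how they are estimated is not shown here.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    # Mean feature-class correlation (relevance).
    r_cf = sum(feature_class_corr) / k
    # Mean feature-feature correlation (redundancy); zero for a single feature.
    if feature_feature_corr:
        r_ff = sum(feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

For example, adding a second feature that is perfectly redundant with the first (`cfs_merit([0.6, 0.6], [1.0])`) leaves the merit unchanged relative to a single feature, whereas adding an uncorrelated feature (`cfs_merit([0.6, 0.6], [0.0])`) increases it.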

Performance of the MLP-NN and RNN
The models were implemented in Python using the open source TensorFlow deep learning framework developed by Google [26]. The accuracy of the proposed NN models was tested using a 10-fold cross-validation method, and the results are presented in Table 5. The best accuracy in the analysis area, 83.33%, was achieved by the RNN model, whereas the MLP-NN model achieved an accuracy of 78.38% in the same area. Furthermore, the RNN model outperformed the MLP-NN model in terms of the stability of accuracy across different folds of the tested dataset. In the test area, accuracies of 81.11% and 74.56% were achieved with the RNN and MLP-NN models, respectively. These results indicated that the RNN model is more accurate than the MLP-NN model in both the analysis and test areas and is highly stable in detecting the spatial distribution of landslides. However, producing neural network models such as LSTM and convolution layers with fully connected networks is a crucial task. Complex networks with more hidden units and many modules tend to overfit because they can capture any possible interaction, so the model becomes too specific to the training dataset. Thus, optimizing the network structure is crucial for avoiding overfitting. This study indicated that the hyperparameters of both models have a significant effect on their results. For example, learning rates of 0.1, 0.05, 0.01, and 0.001 were tested in landslide detection. The highest accuracy was obtained with a learning rate of 0.001, whereas increasing the learning rate to 0.1 significantly reduced the accuracy of both models. The batch size also had a significant effect on accuracy. The MLP-NN and RNN models showed high accuracies with batch sizes of 64 and 128, respectively. This indicates that the RNN model achieved higher accuracy as the batch size increased, whereas the accuracy of the MLP-NN model decreased.
Furthermore, the dropout rate had a substantial influence on the results of the RNN model. The accuracy of the RNN model increased as the dropout rate was increased and was highest at a dropout rate of 0.6.
The results of the two models (Table 5) show that the RNN model outperformed the MLP-NN model in both study areas. This is due to several reasons. For example, the MLP-NN model uses only local contexts and therefore does not capture the temporal and spatial correlation in the dataset. Meanwhile, the hidden units of the RNN model contain historical information from the previous step. This indicates that the RNN model has more information about the data structure and is more accurate than the MLP-NN model.
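The 10-fold cross-validation procedure and the notion of stability across folds can be sketched as follows. This is a minimal, standard-library illustration and not the code used in this study; `fit_and_score` is a hypothetical callback that trains a model on the training indices and returns its accuracy on the held-out indices.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Return mean accuracy and its standard deviation across the k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return statistics.mean(scores), statistics.pstdev(scores)
```

A smaller standard deviation across folds corresponds to the more stable behaviour reported for the RNN model.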

Sensitivity Analysis
The optimization of network architecture is necessary and should be preferred over the use of standard parameters [28] because network architectures are principally influenced by the analytical task and data type. Data can differ in size, in the relationships between independent and dependent variables, and in complexity. Therefore, the neural architecture of the RNN and MLP-NN networks was enhanced using a grid search implemented in Python with SciPy. The combinations of 10 parameters that best identify landslide locations in densely vegetated areas were optimized.
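A systematic grid search of this kind can be sketched in a few lines of Python. The sketch below uses only the standard library and is not the authors' implementation; `evaluate` is a hypothetical callback that would train a model with the given hyper-parameters and return its cross-validated accuracy. The learning rates and batch sizes are those mentioned in this paper, whereas the dropout candidates other than 0.6 are illustrative assumptions.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Exhaustively evaluate every combination and keep the best-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g., mean cross-validated accuracy
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Only three of the tuned hyper-parameters are shown here.
param_grid = {
    "learning_rate": [0.1, 0.05, 0.01, 0.001],
    "batch_size": [64, 128],
    "dropout_rate": [0.2, 0.4, 0.6],  # 0.2 and 0.4 are hypothetical candidates
}
```

The cost of such a search grows multiplicatively with the number of candidate values per parameter (24 combinations in this small grid), which is why the candidate lists are usually kept short.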
The Adam optimizer was the most suitable algorithm for the optimization of the two NN models. Using the Adam optimizer with default parameters (learning rate = 0.001, β1 = 0.9, ε = 1e-08, and weight decay = 0.0) yielded accuracies of 0.77 and 0.825 for the MLP-NN and RNN models, respectively, as shown in Figure 11. The RMSprop and Nadam optimizers also achieved excellent results for the two models. Overall, the Adam algorithm was the most suitable for analyzing landslide data, although better accuracy was obtained when Adadelta was used with the RNN model. Meanwhile, adding weight decay to the neural network did not affect the results.
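For reference, a single Adam update for one scalar parameter can be written as the following minimal sketch. The learning rate, β1, and ε match the defaults quoted above; the value β2 = 0.999 is the usual default and is an assumption here because it is not stated in the text. The function name `adam_step` is hypothetical.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

Because the update is normalized by the running gradient magnitude, each early step moves the parameter by roughly the learning rate, which is one reason the choice of learning rate has such a visible effect on accuracy.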
Overfitting can be avoided when dropout is controlled through the number of parameters in the RNN model. Figure 13 illustrates the sensitivity analysis of the effects of dropout rate with various keep probability parameters on the RNN model. The results showed that the appropriate dropout rate for the RNN model is 0.6. The selected dropout rate considerably affects the performance of NN models. The keep probability was selected for each dataset, and the analysis was conducted via a systematic grid search.
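The relationship between dropout rate and keep probability can be made concrete with a minimal sketch of inverted dropout. It assumes that the keep probability equals one minus the dropout rate; whether the value 0.6 reported above denotes the rate or the keep probability is not resolved by this sketch. The function name `inverted_dropout` is hypothetical.

```python
import random


def inverted_dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` and rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the activations pass through unchanged.
        return list(activations)
    keep_prob = 1.0 - rate
    # Dividing by keep_prob keeps the expected activation unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]
```

With `rate=0.6`, roughly 60% of the units are silenced in each training step, which limits co-adaptation between hidden units and thereby reduces overfitting.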

Field Investigation
The reliability of the proposed methods was validated via field investigation using a handheld Global Positioning System (GPS) device (GeoExplorer 6000) to locate landslides (Figure 14) and to produce a precise and reliable inventory map of the Cameron Highlands. More detailed information (landslide extent, source area, deposition, and volume) was obtained from in situ measurements, which demonstrated the reliability of the produced inventory map in the field. The results illustrated that the neural network techniques were able to detect true landslide locations that occurred in past years. Therefore, the results of this study verified that the proposed models can detect landslide locations and generate a reliable landslide inventory map.

Conclusions
The Cameron Highlands, Malaysia, form an ideal site for testing the feasibility of RNN and MLP-NN models for landslide detection based on high-resolution LiDAR data. The optimization of segmentation parameters is crucial for improving model performance and computational efficiency with different spatial subsets in the Cameron Highlands. Furthermore, optimization is essential for feature selection to improve the classification accuracy and the computational efficiency of the proposed methodology. The optimization of NN model parameters helped improve the performance of the models by reducing model complexity and preventing overfitting in the training sample. The RNN model exhibited better accuracy than the MLP-NN model in both the analysis and test areas. This investigation showed that network architectures based on optimized techniques, very high resolution (VHR) airborne LiDAR-derived data, and spatial features can be used to effectively identify landslide locations in tropical regions. Therefore, the proposed automatic landslide detection method is a potential geospatial solution for managing landslide hazards and conducting landslide risk assessments.
Given that the proposed RNN model is more efficient than the MLP-NN model and has the potential to process the most relevant features, further studies should be conducted to fully optimize network structures for greater flexibility and suitability for landslide detection. More theoretical work is recommended to enhance the representation of variables and data structure by the RNN model and the storage capacity of the data. Faster and more accurate NN techniques for landslide detection should be developed to overcome the limitations related to accuracy and time. In addition, the RNN model can be integrated with other NN techniques to help improve other landslide applications.

Figure 1. Location of the study area. The red boundary represents the analysis area and the yellow boundary represents the test area.

Figure 2. Overview of the proposed method. LiDAR: light detection and ranging; RNN: recurrent neural networks; MLP-NN: multi-layer perceptron neural networks; CFS: correlation-based feature selection; DEM: digital elevation model.

Figure 3. Locations of landslides in the study area.

3.9. Neural Network Models
3.9.1. MLP-NN
This study proposed the network architectures RNN and MLP-NN. Figure 6 depicts the MLP-NN model architecture, which has two hidden layers of 50 hidden units. Ten features were taken as inputs in the model to detect different types of objects, such as landslide, cut slope, bare soil, and vegetation. The MLP-NN model was trained through a back-propagation technique with the Adam optimizer and a batch size of 64. The hyper-parameters used in this NN were carefully selected through grid search and a 10-fold cross-validation process.
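The size of the described MLP-NN can be illustrated with a minimal, standard-library sketch of its forward pass. Several details are assumptions that are not stated in the text: each hidden layer is taken to have 50 units, the output layer is taken to have four units (one per object class) with a softmax, the hidden activation is taken to be ReLU, and the weights are random placeholders rather than trained values. All function names are hypothetical.

```python
import math
import random

# 10 input features, two hidden layers of 50 units, 4 output classes (assumed).
LAYER_SIZES = [10, 50, 50, 4]


def init_layers(layer_sizes, seed=0):
    """Create (weights, biases) per layer with small random placeholder weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def parameter_count(layers):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


def softmax(z):
    peak = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, x):
    """Propagate one feature vector through the network; returns class probabilities."""
    h = x
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        h = softmax(z) if is_output else [max(0.0, value) for value in z]  # ReLU assumed
    return h
```

Under these assumptions the network has 10·50 + 50 + 50·50 + 50 + 50·4 + 4 = 3304 trainable parameters, which is small enough that regularization and hyper-parameter choices, rather than raw capacity, dominate its behaviour.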

Figure 5. Structure of a memory cell in long short-term memory (LSTM)-RNN.

Figure 7. Architecture of the RNN model.

Figure 8. Parameter optimization of the multiresolution segmentation algorithm: (a) initial segmentation and (b) optimized segmentation.

Figure 9. Results of the qualitative assessment of (A) RNN and (B) MLP-NN for the analysis area.

Figure 10. Results of the qualitative assessment of (A) RNN and (B) MLP-NN for the test area.

Figure 11. Impact of the optimization algorithm on the performance of MLP-NN and RNN models; SGD: Stochastic Gradient Descent.

Figure 13. Influence of dropout rate on the performance of the RNN model.

Figure 14. Field photographs showing landslide locations during field investigation in (A) Tanah Rata and (B) Tanah Runtuh.

Table 1. Number of selected training objects in four classes.

Table 4. Correlation-based feature selection (CFS) results for the most relevant feature subset at a scale of 75.52; StdDe: standard deviation; DTM: digital terrain model; GLCM: gray level co-occurrence matrix.

Table 5. Cross-validation accuracy results of the proposed models.