Improving Landslide Detection from Airborne Laser Scanning Data Using Optimized Dempster–Shafer

A detailed and up-to-date landslide inventory map with precise landslide locations is essential for landslide susceptibility, hazard, and risk assessments. Traditional techniques for landslide detection in tropical regions include field surveys, synthetic aperture radar techniques, and optical remote sensing. However, these techniques are time consuming and costly. Furthermore, dense vegetation in tropical forests complicates the generation of accurate landslide location maps in these regions. Given its ability to penetrate vegetation cover, high-resolution airborne light detection and ranging (LiDAR) is typically employed to generate accurate landslide maps. Object-based techniques generally group many homogeneous pixels together in a meaningful way through image segmentation. In this paper, in order to address the limitations of this approach, the final decision is made using the Dempster–Shafer theory (DST) rule of combination, based on the probabilistic output of object-based support vector machine (SVM), random forest (RF), and K-nearest neighbor (KNN) classifiers. This research therefore proposes an efficient framework that combines three object-based classifiers using the DST method. An existing supervised approach (i.e., fuzzy-based segmentation parameter optimizer) was adopted to optimize multiresolution segmentation parameters such as scale, shape, and compactness. Subsequently, a correlation-based feature selection (CFS) algorithm was employed to select the relevant features. Two study sites were selected to implement and evaluate the proposed landslide detection method (subset “A” for implementation and subset “B” for testing transferability). The DST method performed well in detecting landslide locations in tropical regions such as Malaysia, with potential applications in other similarly vegetated regions.


Introduction
Landslide inventory maps provide baseline information about landslide types, distribution, location, and boundaries in landslide-prone areas. Information pertaining to slope and displacement measurements affecting failure may be deduced from landslide inventories [1]. Furthermore, landslide inventories can be used for various purposes, such as landslide susceptibility, hazard, and risk assessment, and the recording of landslide magnitude. Mapping landslide inventories in tropical areas is challenging because dense vegetation cover obscures the underlying landforms [2]. Most of the available traditional landslide detection methods, such as optical and aerial photographs, high spatial resolution multispectral images, synthetic aperture radar (SAR) images, very high resolution (VHR) satellite images, and moderate-resolution digital terrain models (DTM) [3][4][5][6], are neither quick nor accurate enough to map landslide inventories, owing to rapid vegetation growth in tropical areas. Thus, a fast and accurate approach is required for landslide inventory mapping. High-resolution light detection and ranging (LiDAR)-derived digital elevation models (DEMs) can provide highly valuable information on landslide-prone terrain shielded by dense vegetation cover [7]. A high-resolution DEM may be employed to detect landforms and provide useful information about densely vegetated and rocky areas; consequently, minimal changes in topographic features may be easily detected [8][9][10]. Generally, LiDAR data offer significant benefits owing to their ability to penetrate dense vegetation and provide valuable information on topographic conditions [2]. These benefits distinguish LiDAR data from other data sources, such as aerial photographs, for slope failure detection under dense vegetation [4,6].
Pixel-based approaches often exhibit a "salt and pepper" appearance when very high spatial resolution images are employed [11]. Object-based methods are widely utilized in landslide mapping to resolve this pixel-based limitation [12]. According to Moosavi et al. [13], it is usually assumed that a pixel is very likely to belong to the same class as its neighboring pixels. In contrast to the information obtained from individual pixels, object-based image analysis offers additional geometric and contextual information that can be derived from image objects [5,12]. In an object-based approach, one of the most challenging issues is selecting the optimum combination of segmentation parameters [13]. According to Zeng et al. [14], the Dempster-Shafer theory (DST) is a precise fusion algorithm that uses belief uncertainty intervals to represent the belief in assumptions based on the evidence of multiple observations. The algorithm utilizes reasoning, weight, and probability-driven evidence contained within the dataset [14]. The ability to handle incomplete data and an associated degree of uncertainty gives DST the strength to exploit data from many sensors in a processing chain [15]. The reasoning-based system [16] and the empirical determination of parameters in DST make the concept generally applicable without depending on satellite imagery [15]. Furthermore, it is an economical and time-effective alternative, as it decreases the time required to select training sites [17]. From the literature, it is apparent that the DST method has previously been employed for multisensor images [18]. In the current study, the DST method was applied to fuse three object-based classified maps at the feature level. DST has been tested in many other applications, such as fingerprint verification, forensics, ontologies, and the military [19][20][21]. The model has also been used in remote sensing applications such as mapping landslide susceptibility [22][23][24][25], groundwater assessment and potentials [26], and risk assessment of groundwater pollution [27]. In 2017, Mezaal et al. [28] applied the DST method for automatic detection of landslide locations using very-high-resolution LiDAR-derived data and orthophoto imagery.
Landslides vary widely in size, so data with high spatial resolution are required to produce complete event inventories [29]. Effective selection of features, such as texture, image band information, and geometric features, is needed to improve the quality of landslide inventory mapping. However, handling a large number of irrelevant features can lead to overfitting [3]. Landslide identification in a particular area may be improved by selecting the most significant features [3,6–9]. According to Van Westen et al. [8], selection of the most significant features may aid in distinguishing between landslides and non-landslides. Researchers such as Stumpf and Kerle [30] have attempted to improve the efficiency of feature selection in detecting the location of landslides. Some object-based investigations have addressed feature selection for detecting landslides using LiDAR data [3,31]. Recently, Pradhan and Mezaal [2] effectively utilized a correlation-based feature selection (CFS) method to optimize the feature selection for detecting landslides in tropical areas.
Therefore, the present study proposes an improved approach to detect landslide locations by employing a Dempster-Shafer theory (DST) fusion technique. In this technique, three object-based classified outputs are fused together (in contrast to multisensor data) to achieve more efficient and accurate results.

Materials and Methods
The investigation started with the preprocessing of LiDAR data and landslide inventories, an essential step taken before the subsequent steps to eliminate noise and outliers from the data. Subsequently, a high-resolution DEM (0.5 m) was derived from LiDAR point clouds and utilized to produce other LiDAR-derived products and landslide conditioning factors (i.e., aspect, slope, height, normalized digital surface model (nDSM), intensity, and hillshade). The LiDAR-derived products and orthophotos were combined by correcting their geometric distortions and were projected in one coordinate system. They were then prepared in GIS for feature extraction. Thereafter, a fuzzy-based segmentation parameter optimizer (FbSP optimizer), developed by Zhang et al. [32], was employed to acquire suitable parameters (i.e., shape, scale, and compactness) at differing levels of segmentation. Next, three object-based classifiers (support vector machine (SVM), random forest (RF), and K-nearest neighbor (KNN)) were applied to both the analysis and test areas. Subsequently, DST was employed to combine the outputs of these classifiers in MATLAB R2015b. Model transferability was tested on another part of the study area (i.e., test site). The results were then validated and compared based on quantitative and qualitative methods (i.e., confusion matrix and precision/recall). Figure 1 illustrates a detailed flowchart of the methodology and the overall framework.


Study Area
The present investigation was carried out in the tropical, densely vegetated rainforest of Cameron Highlands, Malaysia. The rationale for choosing this study area was the frequent occurrence of landslides there. Geographically, the area under consideration is located in the north of peninsular Malaysia (latitude 4°26′09″N–4°27′30″N and longitude 101°23′02″E–101°23′47″E), covering an area of 26.7 km². The study area records an annual average rainfall of about 2660 mm and average temperatures of 24 °C and 14 °C during day and night times, respectively. A significant portion of the study area (80%) is forested landform, ranging from flat terrain (0°) to hilly area (80°).
In the present investigation, two subsets were selected to implement the proposed method, as shown in Figure 2. A training area was employed to develop the methodology for detecting the location of landslides, and a testing site was employed for testing purposes. Meticulous care was taken in selecting the test site to avoid deficiencies in the number of classes. Moreover, the training sample size was assessed using the stratified random sampling method in order to enhance the accuracy of the subsets under consideration (i.e., training and test sites).

Generally speaking, landslide history is an important aspect of landslide detection, as it gives an idea of landslide occurrences in a particular region. In this regard, landslide history was collected from many sources such as newspapers, national reports, and technical papers. The preliminary classification for landslide inventory mapping was based on age, in accordance with visible morphologic criteria captured on aerial photographs. Landslide deposits and scars are disequilibrium landforms that evolve through morphologic stages with age [33]. Specifically, relevant information from previous investigations and landslide inventory information spanning over a decade was prepared for the whole Cameron Highlands. In an earlier paper by Pradhan and Lee [34], a database of 324 landslide incidents was prepared for 293 km² of the Cameron Highlands to assess the number of landslides and their corresponding surface area. It was observed that the landslides are shallow rotational and a few translational types. The data was put together based on historical landslide records. In March 2011, AIRSAR data was deployed to prepare the landslide inventory map. In 2014, Samy and Marghany [35] presented the landslide history of 273 landslides of different sizes collected from the archive data of the Department of Mineral and Geosciences, Malaysia. In the same year, Murakmi et al. [36] prepared a database of historical landslide occurrences in the Malaysian Peninsula. In 2015, Shahab and Hashim [37] reported a landslide history prepared from different sources, such as field surveys, published reports, and digital aerial photographs (DAP) covering a 25-year period. Landslides can therefore be characterized as old or new events by visual inspection and by overlaying the most recent landslide events on the Cameron Highlands. Due to vegetation cover, landslides which occurred between 2008 and 2010 are considered old landslides. On the other hand, new landslides are those which occurred after 2010. New landslides less than 5 years old are typically bare and can be observed by red, green, and blue (RGB) sensors. However, those older than 5 years are usually covered by vegetation and thus cannot be recognized in visible bands. Figures 3 and 4 show the landslide inventory map for both subsets (training and test sites), respectively.
Remote Sens. 2018, 10, x FOR PEER REVIEW


Data Used
The LiDAR point cloud data was captured for the Cameron Highlands region on January 15, 2015. The surveyed land area spans 26.7 km² of the Ringlet area, at an altitude of 1510 m. The point density is approximately 8 points per square meter, with a 25,000 Hz pulse rate frequency. The accuracy of the LiDAR data was restricted to conform to root-mean-square errors of 0.15 m and 0.3 m in the vertical and horizontal axes, respectively, as standardized by the Department of Survey and Mapping Malaysia (JUPEM). The orthophotos were collected with the high-resolution camera (visible bands) flown during the acquisition of the LiDAR point cloud data in the study area. A DSM was generated by interpolating the LiDAR point clouds with the inverse distance weighting (IDW) method in the ArcGIS 10.3 software (Figure 5A). A DEM with a spatial resolution of 0.5 m was interpolated from the LiDAR point clouds, after removing the nonground points, using inverse distance weighting, with GDM2000 as the spatial reference (Figure 5B). Next, the height feature, i.e., the normalized DSM (nDSM), was derived by subtracting the DEM from the DSM (Figure 5E). Subsequently, the LiDAR-based DEM was used to generate the derived layers used in identifying the location of landslides and their features [38]. Slope is a major determinant of land stability due to its role in landslide phenomenology [39]. A hillshade map gives a clear depiction of terrain movement, which aids in landslide mapping [40]. The intensity image obtained from LiDAR data also plays a significant role in differentiating between landslides and non-landslides [2]. Moreover, the quality of the landslide inventory map may be significantly improved by using texture features [2,6]. These data provide extensive information for landslide detection [28]. The accuracy and ability of a DEM to represent the surface are affected by terrain morphology, sampling density, and the interpolation algorithm [41]. In the present study, the following LiDAR-derived data were employed: DEM, DSM, intensity, height (nDSM), slope, and aspect (shown in Figure 5). Additionally, orthophoto images (i.e., visible bands) and texture features were used to detect the landslides.
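
The two raster operations described above can be illustrated with a minimal Python sketch. It is not the ArcGIS 10.3 workflow used in the study: the function names `idw` and `ndsm`, the power of 2, and the toy elevation values are all assumptions made for illustration. The nDSM is taken here as the surface model minus the terrain model, the usual definition of height above ground.

```python
import math

def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted elevation at (x, y) from (px, py, z) samples."""
    num = den = 0.0
    for px, py, z in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            return z  # the query location coincides with a sample point
        w = 1.0 / d ** power  # closer samples receive larger weights
        num += w * z
        den += w
    return num / den

def ndsm(dsm, dem):
    """Cell-by-cell height above ground: surface model minus terrain model."""
    return [[s - g for s, g in zip(srow, grow)] for srow, grow in zip(dsm, dem)]

# Hypothetical ground returns (x, y, elevation in metres); not data from the study.
ground = [(0.0, 0.0, 1500.0), (1.0, 0.0, 1502.0), (0.0, 1.0, 1501.0), (1.0, 1.0, 1503.0)]
dem_value = idw(ground, 0.5, 0.5)

# Hypothetical 1 x 2 rasters: a 10 m canopy over the first cell, bare ground in the second.
heights = ndsm([[1510.0, 1502.0]], [[1500.0, 1502.0]])
```

A production workflow would normally limit the interpolation to a search radius or a fixed number of neighbours; the sketch uses all samples for brevity.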

Multiresolution Segmentation Algorithm
Multiresolution segmentation (MRS) is a bottom-up segmentation algorithm based on a pairwise region-merging technique [12]. In this algorithm, image pixels that possess homogeneous spectral and textural characteristics are grouped [42]. Smaller objects are merged into larger ones based on criteria derived from three parameters: color, scale, and shape (i.e., smoothness and compactness). These three parameters may be determined using the traditional trial-and-error method; however, this is a time-consuming and laborious task [4]. Hence, many semiautomatic and automatic approaches have attempted to optimize the segmentation parameters [43][44][45]. Most optimization techniques consider only the scale, without taking into consideration the combination of the parameters [4]. A few powerful optimization techniques currently exist, such as the Taguchi optimization method [4] and the fuzzy-based segmentation parameter optimizer (FbSP optimizer) [32]. These are advanced methods for automatically determining the combination of segmentation parameters (i.e., scale, shape, and compactness). However, differentiating among image objects at various scales is still a challenge, and not all feature selection methods are fully utilized at a particular segmentation scale. Thus, it is suggested that an automatic method should be directly implemented.

Calculation of the Relevant Feature Selection
In an object-based approach, classification operates on segments rather than single pixels, which makes it possible to incorporate a multitude of additional information such as the texture, shape, and context associated with image objects [46]. Both objective and subjective methods may be used to select important object features in object-based classification. The subjective methods depend on user knowledge and past experience, while the use of feature selection algorithms is relatively objective [47]. According to Stumpf and Kerle [30], the texture measures used in several remote sensing applications are rotation invariant and therefore cannot capture directional patterns in the grey-value distribution. Landslide-affected areas regularly exhibit texture patterns in a downslope direction. Stumpf and Kerle [30] reported that such patterns are potential features for detecting landslide surfaces and differentiating them from texture patterns oriented along the strike of the slope. Gray-level co-occurrence matrix (GLCM) texture features can be calculated from airborne laser scanning data by using eCognition software, in which the co-occurrence texture measures can be calculated for individual landslide objects. The result comprises the Dissimilarity, Contrast, Homogeneity, and Standard Deviation co-occurrence texture measures computed on all bands.
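
To make the co-occurrence measures concrete, the following Python sketch builds a normalized, symmetric GLCM for one pixel offset and derives four common measures from it. It is an illustration only and does not reproduce the eCognition implementation: the function names, the quantization into a small number of grey levels, and the toy patch are assumptions. The offset `(dx, dy)` sets the direction along which co-occurrences are counted, which is how a directional texture pattern can be captured.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(map(sum, counts)) or 1.0  # avoid division by zero on tiny patches
    return [[v / total for v in row] for row in counts]

def texture_measures(p):
    """Haralick-style measures computed from a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "angular_second_moment": sum(v * v for i, j, v in cells),
    }

# Hypothetical 3 x 3 patch quantized to 3 grey levels (0, 1, 2).
patch = [[0, 0, 1],
         [0, 1, 2],
         [1, 2, 2]]
horizontal = texture_measures(glcm(patch, levels=3, dx=1, dy=0))
vertical = texture_measures(glcm(patch, levels=3, dx=0, dy=1))
```

Comparing the measures obtained for different offsets on the same object is one simple way to express directionality.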
In the present study, a correlation-based feature selection (CFS) algorithm was used to select the most important features for detecting the landslide locations. A best-first search was used to explore the feature space, with five consecutive, fully expanded, nonimproving subsets set as the stopping criterion in order to avoid searching the entire feature subset space. The Weka 3.8 package was used to implement this feature selection algorithm. Three groups of object features were employed, giving 82 features in total: Mean and StdDev of the LiDAR data, Mean and StdDev of the visible bands, and texture (see Table 1). The Mean and StdDev data were extracted from the airborne laser scanning data with eCognition software. As indicated in Table 2, Mean Intensity had a moderate positive correlation (p < 0.01) with GLCM Dissimilarity and GLCM Angular second moment, and a moderate negative correlation (p < 0.01) with StdDev Blue. GLCM Homogeneity was negatively correlated (p < 0.01) with GLCM Contrast and GLCM Dissimilarity, but positively correlated (p < 0.01) with GLCM Angular second moment and StdDev Blue. Mean Slope had a significant positive correlation (p < 0.01) with Mean DTM and StdDev Blue, but a significant negative correlation (p < 0.05) with StdDev DTM. Mean Red showed a strong correlation (p < 0.01) with GLCM Contrast and GLCM Dissimilarity, and a negative correlation (p < 0.05) with StdDev DTM. The correlation between Mean DTM and StdDev DTM was negative and significant (p < 0.05). GLCM Contrast had a strong, significant correlation (p < 0.01) with GLCM Dissimilarity. The correlation between GLCM Dissimilarity and StdDev Blue was negative and significant (p < 0.05).
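
The idea behind CFS can be sketched briefly: a subset of k features is scored by the merit k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation, so that relevant but mutually redundant features are penalized. The Python sketch below is a simplification and not the Weka 3.8 implementation used in the study. It assumes absolute Pearson correlation as the correlation measure and replaces the best-first search by a plain greedy forward search; the feature names and values are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def cfs_merit(features, target, subset):
    """CFS merit of a feature subset, using |Pearson r| as the correlation measure."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(features, target):
    """Forward selection: add the feature that raises the merit most, stop when none does."""
    chosen, best = [], 0.0
    while True:
        candidates = [(cfs_merit(features, target, chosen + [f]), f)
                      for f in features if f not in chosen]
        if not candidates:
            return chosen
        merit, name = max(candidates)
        if merit <= best:
            return chosen
        chosen.append(name)
        best = merit

# Hypothetical object features; target 1 = landslide, 0 = non-landslide.
target = [0, 0, 1, 1, 0, 1]
features = {
    "slope": [1, 2, 8, 9, 2, 7],        # strongly related to the class
    "slope_copy": [1, 2, 8, 9, 2, 7],   # redundant duplicate of slope
    "noise": [5, 1, 4, 2, 3, 3],        # unrelated to the class
}
selected = greedy_cfs(features, target)
```

In this toy case the duplicate adds no merit and the unrelated feature lowers it, so the search stops after one feature.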

Support Vector Machine (SVM)
An SVM is a supervised, nonparametric statistical learning technique that categorizes a dataset into groups consistent with training examples. SVMs are gaining huge popularity in various applications of remote sensing, especially in landslide mapping [13,29,48–50], due to their capability in handling small training datasets and data of unknown statistical distribution obtained in the field [51]. Huang et al. [52] observed that SVMs trained on fewer samples yielded more stable results than decision tree, maximum likelihood, and artificial neural network classifiers trained on large datasets. SVMs are binary classifiers designed to determine the boundary of the decision region that separates the dataset features into two regions in the feature space. The optimal hyperplane is the one with the maximum margin to the closest training samples; these samples are called support vectors, and they determine the margin between the classes [29]. Kernel functions are used to linearize the decision boundary by mapping the training data into a higher-dimensional space in which the two classes can be separated linearly by a hyperplane [52]. This corresponds to a nonlinear transformation of the covariates into a high-dimensional feature space [53].
In the present study, the SVM implementation in the e1071 package [54] for the R statistical computing software [55] was used. The hyperparameters were observed to determine the performance of the SVM classifier. Therefore, the selection of these parameters was optimized and their sensitivity analyzed. Three parameters were assessed in the case of SVM, namely, the penalty parameter (C), the kernel function, and the gamma parameter (γ). The most accurate prediction was attained with the radial basis function (RBF) kernel, using a gamma parameter (γ) of 0.9 and a penalty parameter of 300. This was carried out rapidly by visual inspection of the match between the results and the reference data. Seventy percent (70%) of the inventory map, along with all the features, was selected as the training set to train the SVM model.
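
The role of the gamma parameter can be seen from the RBF kernel itself, K(x, z) = exp(−γ‖x − z‖²): larger values make the similarity between two objects fall off more quickly with distance in feature space. The short Python sketch below only illustrates this kernel and the form of an SVM decision value; it does not train a model and is not the e1071 code used in the study. The support vectors, coefficients, and bias are hypothetical placeholders.

```python
import math

def rbf_kernel(x, z, gamma=0.9):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, coefficients, bias, gamma=0.9):
    """Weighted sum of kernel values; the sign of the result gives the predicted class."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coefficients)) + bias

# Hypothetical, already trained model with two support vectors in a 2-D feature space.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
coefficients = [-1.0, 1.0]   # signed coefficients (alpha_i * y_i), bounded by C in training
bias = 0.0
score = decision_value((1.9, 2.1), support_vectors, coefficients, bias)
```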

Random Forest (RF)
The RF classifier has been implemented for detecting landslides using many types of remote sensing data [3,30,56–59]. The algorithm builds upon multiple decision trees that depend on randomly selected subsets of the training dataset. RF makes use of the high variance among individual trees in a classification problem by assigning the respective classes based on majority votes. The merit of this approach lies in its performance on complex datasets, with little need for fine-tuning [30]. At each node, RF considers only a random subset of the original set of features, whereas a classification and regression tree considers all variables in each node. Users can estimate the number of variables per node as the square root of the total number of variables. Mechanisms such as sampling and the use of random variables in each node produce largely uncorrelated trees. Moreover, a large number of trees is required to capture the variability in the training data and attain accurate classification. All the trees in the forest vote on each object, and the class is then assigned on the basis of the majority vote.
The RF package [54] for the open-source statistical language R was used in this research. The parameters used in the analysis were the number of variables in the random subset at each node and the number of trees in the forest. The number of trees was set to 500, which is a typical value for the RF classifier [30]. A single randomly split variable was employed to grow the trees. Seventy percent (70%) of the inventory map, together with the other features and feature subsets, was chosen as the training set to train the RF model. The remaining 30% of the inventory was used to evaluate the accuracy of the classification. The mean and standard deviation (stdev) values of the classification accuracies were then obtained from 50 random runs.
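
The evaluation protocol described above (a 70/30 split repeated over 50 random runs, with the mean and standard deviation of the accuracies reported) can be sketched as follows. This is an illustrative Python outline, not the R code used in the study: the per-class stratification, the function names, and the `evaluate` callback (which would train a classifier on the training indices and return its accuracy on the test indices) are assumptions.

```python
import random
import statistics

def stratified_split(labels, train_fraction=0.7, rng=random):
    """Return (train_idx, test_idx), drawing train_fraction of every class at random."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        indices = indices[:]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train += indices[:cut]
        test += indices[cut:]
    return sorted(train), sorted(test)

def majority_vote(votes):
    """Class chosen by the largest number of trees."""
    return max(set(votes), key=votes.count)

def repeated_runs(labels, evaluate, runs=50, seed=0):
    """Mean and standard deviation of evaluate(train_idx, test_idx) over random splits."""
    rng = random.Random(seed)
    scores = [evaluate(*stratified_split(labels, 0.7, rng)) for _ in range(runs)]
    return statistics.mean(scores), statistics.stdev(scores)

# Hypothetical inventory of 30 objects.
labels = ["landslide"] * 10 + ["non-landslide"] * 20
train_idx, test_idx = stratified_split(labels, 0.7, random.Random(1))
```

Stratifying the split keeps the proportion of landslide and non-landslide objects the same in the training and test sets.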

K-Nearest Neighbour (KNN)
KNN is a powerful tool utilized in many object-based workflows due to its flexibility and simplicity [60,61]. It is the classifier most often used in object-based software frameworks, e.g., eCognition (Trimble Geospatial, Munich, Germany). In contrast to model-based learning, KNN allocates an object to a class based on proximity or neighborhood in the feature space, rather than learning from a model [62]. The K closest neighbors are obtained from the training set and then vote for the class of the new object. K is a tunable parameter, typically a small positive integer [5]. In this study, the optimal K was determined using cross-validation and bootstrap samples: K was varied from 1 to 10 in steps of 1, and the best value was applied to the KNN algorithm implemented in the R package "class". Seventy percent (70%) of the inventory map, together with all the features, was selected as the training set to train the KNN model.
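
A compact Python sketch of the voting rule and of the search over K = 1 to 10 is given below. It is not the R "class" implementation used in the study. As a simplifying assumption, leave-one-out accuracy stands in for the cross-validation and bootstrap search, Euclidean distance is used in feature space, and the sample objects are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority class among the k training objects closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def best_k(xs, ys, k_values=range(1, 11)):
    """Smallest K with the highest leave-one-out accuracy."""
    def loo_accuracy(k):
        hits = 0
        for i in range(len(xs)):
            rest_x = xs[:i] + xs[i + 1:]
            rest_y = ys[:i] + ys[i + 1:]
            hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
        return hits / len(xs)
    return max(k_values, key=loo_accuracy)

# Hypothetical objects described by two features.
xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
ys = ["non-landslide"] * 3 + ["landslide"] * 3
k = best_k(xs, ys)
label = knn_predict(xs, ys, (5.2, 5.1), k)
```

In practice the features would be scaled before computing distances, because KNN is sensitive to differences in feature ranges.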

Dempster-Shafer Theory
The DST is built on a frame of discernment, formally defined as the set of mutually exclusive and collectively exhaustive hypotheses, represented by Θ [62,63]. By defining two functions (plausibility, Pls, and belief, Bel), the theory seeks to model imprecision and uncertainty. The two functions are derived from a mass function (m), which is defined on every element of 2^Θ rather than exclusively on the elements of Θ. A rich and flexible modeling behavior is thus achieved, which may potentially address numerous remote sensing applications. The mass function m(T) allocates belief to each proposition T, as shown in the following equation:

$$m: 2^{\Theta} \rightarrow [0, 1], \qquad m(\varnothing) = 0, \qquad \sum_{T \subseteq \Theta} m(T) = 1, \tag{1}$$

where ∅ is the empty set. Two common evidential measures derived from the mass function are belief (Bel) and plausibility (Pls), defined as follows:

$$\mathrm{Bel}(S) = \sum_{T \subseteq S} m(T), \tag{2}$$

$$\mathrm{Pls}(S) = \sum_{T \cap S \neq \varnothing} m(T), \tag{3}$$

where, for every S ⊆ Θ, Bel(S) is a measure of the total amount of belief committed by m to the subsets of S, and Pls(S) signifies the degree to which the evidence leaves S plausible. These two functions, which are regarded as the lower and upper probabilities, respectively, have the following properties:

$$\mathrm{Bel}(S) \leq \mathrm{Pls}(S), \tag{4}$$

$$\mathrm{Pls}(S) = 1 - \mathrm{Bel}(\bar{S}), \tag{5}$$

where S̄ is the complement of S in Θ. The rule of combination in DST builds on the mathematical theory substantiating the combination of the mass functions m_i obtained from n sources of information, given in Equations (6) and (7):

$$m(S) = \frac{1}{1 - K} \sum_{T_1 \cap \cdots \cap T_n = S} \; \prod_{i=1}^{n} m_i(T_i), \qquad S \neq \varnothing, \tag{6}$$

$$m(\varnothing) = 0, \tag{7}$$

where K denotes the degree of conflict given in Equation (8):

$$K = \sum_{T_1 \cap \cdots \cap T_n = \varnothing} \; \prod_{i=1}^{n} m_i(T_i). \tag{8}$$

Various methods exist in the literature for taking a final decision with the DST technique: the maximum mass, plausibility, or belief [28]. The posterior probabilities from the probabilistic SVM, RF, and KNN classifiers are converted into the form of a mass function (m). These, in turn, are then combined using the DST method. The combination output may be considered a belief function, which defines a posterior probability measure for each thematic class.
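
As an illustration of the rule of combination, the Python sketch below fuses three mass functions defined on the frame {landslide, non-landslide} and reports the degree of conflict. It is a minimal sketch rather than the MATLAB implementation used in the study: the function names are illustrative, and the posterior probabilities assigned to the three classifiers for a single object are hypothetical values.

```python
from itertools import product

LANDSLIDE, NON_LANDSLIDE = "landslide", "non-landslide"
THETA = frozenset({LANDSLIDE, NON_LANDSLIDE})  # frame of discernment

def combine(*masses):
    """Dempster's rule for n mass functions given as {frozenset: mass} dictionaries."""
    fused, conflict = {}, 0.0
    for combo in product(*(m.items() for m in masses)):
        intersection, weight = THETA, 1.0
        for focal, mass in combo:
            intersection = intersection & focal
            weight *= mass
        if intersection:
            fused[intersection] = fused.get(intersection, 0.0) + weight
        else:
            conflict += weight  # product mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: w / (1.0 - conflict) for s, w in fused.items()}, conflict

def belief(m, s):
    """Total mass committed to subsets of s."""
    return sum(w for t, w in m.items() if t <= s)

def plausibility(m, s):
    """Total mass of focal sets that intersect s."""
    return sum(w for t, w in m.items() if t & s)

# Hypothetical posterior probabilities of one object, used directly as masses.
svm = {frozenset({LANDSLIDE}): 0.8, frozenset({NON_LANDSLIDE}): 0.2}
rf = {frozenset({LANDSLIDE}): 0.7, frozenset({NON_LANDSLIDE}): 0.3}
knn = {frozenset({LANDSLIDE}): 0.6, frozenset({NON_LANDSLIDE}): 0.4}

fused, conflict = combine(svm, rf, knn)
decision = max(fused, key=fused.get)  # decision by maximum mass
```

When all the mass sits on single classes, as here, belief and plausibility coincide; they differ only when part of the mass is assigned to the whole frame to express ignorance.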

Fusion of Three Object-Based Classifiers via Dempster-Shafer Theory (DST)
The fusion level analysis (FLA) method groups different classified features together and fuses them into a new class based on the belief confusion matrix [14]. In the present study, the DST algorithm was employed to perform the feature-based data fusion. This method comprises well-defined combination rules that are capable of combining several belief functions defined on the same frame. The theory is built on harmonizing the information from many sources [64]; the evidence is integrated in a reliable manner so that the entire body of evidence can be evaluated. DST can represent both uncertainty and imprecision through its belief and plausibility functions, while also being able to handle compound hypotheses [65].
Consequently, the results of the three object-based approaches (i.e., SVM, KNN, and RF) were combined using DST fusion on the LiDAR data. The method assigns each pixel the fused class label with the maximal belief function. Fused pixels were set to an unclassified value in the case of multiple class labels [64]. For the landslide and non-landslide classes (the frame of discernment), the belief functions were estimated using a precision function, which measures the confidence in a classifier's output. DST considers the majority of classified labels and then assigns that label to the segment. The belief masses for the labelled features resulting from the classifiers were calculated using a confusion matrix, stored as a text file, which is used to fuse the most probable feature label. Belief measurements are of four types, namely, precision, accuracy, Kappa, and recall. Due to its high performance in previous works, the precision belief function was selected in the present study to label the belief classes from the standalone classifiers.
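One plausible reading of the precision-based fusion is sketched below in Python. The confusion-matrix counts, the classifier votes, and the choice to commit the precision of the voted label to that label and the remainder to the whole frame are all assumptions made for illustration; the helper names (`matrix`, `mass_from_vote`, `fuse`) are hypothetical.

```python
import math

CLASSES = ("landslide", "non-landslide")
THETA = "theta"  # stands for the whole frame {landslide, non-landslide}


def matrix(ll, ln, nl, nn):
    """Build cm[reference][predicted] from four hypothetical object counts."""
    return {"landslide": {"landslide": ll, "non-landslide": ln},
            "non-landslide": {"landslide": nl, "non-landslide": nn}}


def precision(cm, label):
    """Share of objects predicted as `label` that truly belong to it."""
    predicted = sum(cm[ref][label] for ref in CLASSES)
    return cm[label][label] / predicted if predicted else 0.0


def mass_from_vote(cm, voted):
    """Commit the precision of the voted label to it, the rest to THETA."""
    p = precision(cm, voted)
    mass = {c: 0.0 for c in CLASSES}
    mass[voted] = p
    mass[THETA] = 1.0 - p
    return mass


def combine(m1, m2):
    """Dempster's rule on a two-class frame with focal sets {a}, {b}, THETA."""
    a, b = CLASSES
    conflict = m1[a] * m2[b] + m1[b] * m2[a]
    if conflict >= 1.0:
        raise ValueError("total conflict between classifiers")
    norm = 1.0 - conflict
    fused = {
        c: (m1[c] * m2[c] + m1[c] * m2[THETA] + m1[THETA] * m2[c]) / norm
        for c in CLASSES
    }
    fused[THETA] = m1[THETA] * m2[THETA] / norm
    return fused


def fuse(votes, matrices):
    """Combine all votes and return the label with maximal belief."""
    masses = [mass_from_vote(matrices[name], votes[name]) for name in votes]
    fused = masses[0]
    for mass in masses[1:]:
        fused = combine(fused, mass)
    a, b = CLASSES
    if math.isclose(fused[a], fused[b]):
        return "unclassified", fused  # no single label has maximal belief
    return max(CLASSES, key=fused.get), fused


matrices = {"SVM": matrix(80, 20, 10, 90),
            "RF": matrix(70, 30, 20, 80),
            "KNN": matrix(75, 25, 15, 85)}
votes = {"SVM": "landslide", "RF": "non-landslide", "KNN": "landslide"}
label, fused = fuse(votes, matrices)
```

On a two-class frame the belief of a single class equals its mass, so comparing the two class masses is enough for the maximal-belief decision.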

Optimization of Segmentation Parameters
The scale, shape, and compactness parameters of the multiresolution segmentation (MRS) algorithm were optimized with the FbSP optimizer. This optimizer is capable of delineating the boundaries of landslides and non-landslides such as bare soil, vegetation, and cut slope. In this study, the values 50, 0.1, and 0.1 were used as the initial segmentation parameters for scale, shape, and compactness, respectively. After three iterations, the optimum values obtained from the optimizer were 75.52, 0.4, and 0.5 for scale, shape, and compactness, respectively (Table 3). The initial and optimal segmentation results are illustrated in Figure 6, which also highlights the segmentation of each landslide class. The optimal segmentation parameters yielded accurate image objects in most classes. Identifying and defining these parameters is crucial for the accurate detection of landslides. However, a certain degree of under-segmentation in some landslide locations may affect the accuracy of the final classification map. Where over-segmentation or under-segmentation arises, it becomes difficult to use the contextual and spatial features in feature identification, mainly because the target features are improperly defined. Misclassification with other, similar features may therefore occur. To achieve a reliable classification, it is necessary to reduce these segmentation errors by employing robust approaches. Combination methods could therefore be viable alternatives to improve the accuracy of landslide classification.

Feature Subset Selection
To optimize the classification, CFS was used to select the most relevant features for detecting landslide locations. The feature input used in the experiment consisted of 82 features drawn from LiDAR-derived data (height, slope, and intensity), texture features (GLCM homogeneity and GLCM StdDev), and the visible bands. These features were derived using the eCognition software (see Table 4). High classification accuracy was obtained with 10 of these features, which came from the visible bands, LiDAR-derived data, and textural features. The values of these features helped to separate landslides from other land cover classes such as manmade, bare land, and vegetation, which can be attributed to the landslide characteristics in the area under consideration. This illustrates that visible bands, LiDAR-derived data, and textural features are effective in revealing landslide locations. Table 4 presents the most significant features selected using the CFS algorithm, including highly important features such as Mean Intensity, GLCM Homogeneity, Mean Slope, and GLCM Angular Second Moment. Mean Red and Texture features are also significant for improving the classification accuracy. The relevant features selected by CFS gave the best combination, which improved the discrimination between the landslide class and other land cover classes in both areas (training area and test area). Generally, selecting the most relevant features can decrease computation time, avoid the subjective requirement of expert knowledge, and improve the classification process.
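The idea behind CFS can be sketched in Python: a subset of k features is scored by the merit k·r_cf / sqrt(k + k(k − 1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. The sketch below assumes absolute Pearson correlation and a simple greedy forward search; the feature values are hypothetical, and the search strategy used in the study is not stated here.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0


def merit(features, subset, target):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = [(f, g) for i, f in enumerate(subset) for g in subset[i + 1:]]
    r_ff = (
        sum(abs(pearson(features[f], features[g])) for f, g in pairs) / len(pairs)
        if pairs else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(features, target):
    """Greedily add the feature that raises the merit most; stop when none does."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        score, name = max(
            (merit(features, selected + [f], target), f) for f in remaining
        )
        if score <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = score
    return selected, best


# Hypothetical values for eight image objects (1 = landslide, 0 = other).
target = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "mean_slope": [30, 32, 35, 28, 10, 12, 8, 11],
    "glcm_homogeneity": [4, 5, 3, 6, 9, 7, 10, 8],
    "mean_blue": [5, 1, 4, 2, 3, 5, 1, 4],
}
selected, best = forward_select(features, target)
```

In this toy data only `mean_slope` is kept: `glcm_homogeneity` is strongly correlated with it and therefore redundant, while `mean_blue` barely correlates with the class.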

Results of Object-Based SVM, RF, and KNN Classifiers in Training Area
Figure 7 shows the results of RF, SVM, and KNN, the three object-based classifiers used in the present study. The results indicate that landslides and non-landslides were accurately classified. SVM performed best, producing an accurate landslide inventory map with only a few undetected landslides. Visual assessment also found the SVM result more reliable than those of KNN and RF, and most of the landslides in the study area were identified by SVM rather than by RF or KNN. The performance of SVM results from the two optimization techniques, which tuned the segmentation parameters and selected the most relevant features for the classification process. The improvement could also be attributed to the parameters used in the classifiers, which affect the classification quality.
It is imperative to take the required measures to avoid confusing landslides with other land cover classes, such as manmade and bare soil. Such separation is possible because the morphological features of a landslide differ from those of other land cover classes. For instance, the slope, shape, and other features such as depth, width, dip direction, and length of the surface terrain can change after the occurrence of a landslide. Optimized landslide segments can be exploited in the analysis to select the significant features. According to Van Westen et al. [8] and Mezaal et al. [64], selecting relevant features is highly important in distinguishing between landslides and non-landslides. Improved classification accuracy has been observed when the segments are well-fitted to landslide shapes [66,67,68].
After optimization of the feature selection in this study, it was observed that the Intensity, GLCM Homogeneity, and Slope features are very significant for detecting landslide locations. Hence, those features were supplied to the object-based classifiers (i.e., SVM, RF, and KNN) as ancillary data in order to improve the classification results relative to the standalone LiDAR RGB-orthophoto image. The various non-landslide classes (e.g., cut slope and bare soil) differ from the landslide classes in terms of shape, slope, size, and texture. Thus, the values of relevant features such as Intensity, GLCM Homogeneity, and Slope help to improve the classification accuracy by properly separating the landslide classes from the bare soil or manmade classes. The OBIA classifiers were trained using the landslide inventory. Overall, the training samples exhibited Slope values much higher than those of bare soil (above 25 degrees) and GLCM Homogeneity values lower than those of manmade areas such as bright roofs (less than 0.06). Additionally, the Mean Intensity feature contributes greatly to identifying old landslides that have been covered by vegetation canopy after some years. Given that forest areas otherwise reflect similar intensity pulses, old landslides covered by forest have higher intensity values than the surrounding areas [69]. Since a historical inventory was used to train the applied OBIA classifiers, it was observed that Intensity values above 30,000 in forest areas indicate old landslide locations that cannot be seen in the RGB visible bands of the orthophoto image. All three classifiers then generalized from the training examples to the rest of the image, each using its own algorithm, to detect recent or old landslides among similar non-landslide objects with different accuracy.
Thus, employing the most appropriate features derived from high-resolution LiDAR data together with the texture features aids in distinguishing between landslides and non-landslides. However, misclassifications between landslide and non-landslide classes, such as the aforementioned manmade and bare soil, were observed with some classifiers, which may affect further analysis. This is linked to over-segmentation in some objects, where the structural and spatial features of landslide areas are not discriminated. In addition, this confusion is due to the degree of similarity of spatial characteristics among the aforementioned classes.
The DST method was used in the present work to combine the results of each classifier for the landslide class through belief functions. The belief functions were estimated using a precision measure that quantifies the confidence in each classifier on the basis of its resulting feature labels. The belief masses for the feature labels obtained from each classifier were computed using a confusion matrix, stored as a text file, which is used to fuse the most probable feature label. The fusion of classifiers produced a single, precise landslide inventory map that combines the information extracted from the input object-based classifications, as shown in Figure 5. The validated results and maps substantiate the notion that the proposed method is reliable for the recognition and mapping of landslide locations.

Transferability to Test Site
It is necessary to test the transferability of the developed model to other landslide-affected areas with dense vegetation cover, especially regions with less anthropogenic activity. In densely vegetated areas such as the Cameron Highlands, landslides are generally covered by a much higher scarp compared to the landslide-free areas. Furthermore, landslides share characteristics with other land cover classes such as bare soil, cut slope, and manmade classes, which makes these classes difficult to separate. Consequently, the proposed method was validated by testing its transferability in another part of the study area (testing area), as shown in Figure 6. The results indicate that SVM achieved a consistent accuracy compared with KNN and RF, while the transferability accuracy of RF was better than that of the SVM classifier. Any decline in accuracy is attributed to similarities in class characteristics as well as the variety of landslide shape and area, complex topography, and so on [6]. Therefore, this study presented a combination of object-based approaches for landslide mapping. The DST process combined the strengths of the individual classifiers to derive a more powerful classifier, as shown in Figure 6, and DST outperformed the SVM, RF, and KNN classifiers applied separately. Figure 6 shows the results of the DST fusion, indicating a satisfactory result for landslide identification and for delineation between landslides and non-landslides. The results show that the fusion of object-based classifiers in the DST method aided the identification of landslides and provided complementary information. The DST decision rule played a major role in resolving the conflicts among the object-based classifiers. It fuses the benefits of the classifiers by adding each classifier's information to that of the others. It may be inferred that DST, applied to high-resolution LiDAR and orthophoto images, offered improved results with an acceptable accuracy. In addition, the transferability results show the importance of features from high-resolution LiDAR data, visible bands, and texture features for landslide mapping, as shown in Figure 8. Furthermore, airborne LiDAR data contributed to the detection of landslide locations due to its capability to capture differences in the size and volume of landslides [70,71].

Field Investigation
A field investigation was carried out to validate the proposed method, in which landslides were identified with the aid of a handheld GPS device (GeoExplorer 6000), as illustrated in Figure 9. Information on the pattern, deposition, source area, landslide extent, run out, and volume was obtained from the field measurements, which validated the reliability of the produced landslide map. Based on the field investigation, the landslides identified by the proposed method are consistent and accurate. Therefore, this method has the potential to accurately identify landslide locations, differentiate between landslides and non-landslides, and yield a reasonable and acceptable landslide inventory map for the Cameron Highlands in Malaysia. In addition, government agencies and land use planners can use the results of this study to identify safe regions for inhabitants and update urban planning strategies. Such data can decrease the need for field surveys by agencies such as departments of surveying.

Assessing Accuracy
Many established methods of assessing the accuracy of remote sensing products are available in the research domain [72]. In this study, the accuracy assessment was conducted using quantitative and qualitative approaches (i.e., confusion matrix and precision/recall methods) to determine the classification accuracy of one or all categories. Firstly, the confusion matrix was derived from a comparison between the reference image pixels and the classified image pixels. The Kappa coefficient was extracted from the confusion matrix and is calculated as shown in Equation (9):
$$\kappa = \frac{\theta_1 - \theta_2}{1 - \theta_2} \qquad (9)$$
where θ_1 denotes the ratio of correctly classified areas, while θ_2 represents the proportion of agreement expected by chance.
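The Kappa coefficient, built from the ratio of correctly classified areas (θ_1) and the chance agreement (θ_2), together with the user's accuracy, can be computed from a confusion matrix as in the following Python sketch. The counts are hypothetical, and the layout (rows as reference, columns as classified) is an assumption.

```python
def kappa(cm):
    """Kappa from a square confusion matrix (rows: reference, columns: classified)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # theta1: observed agreement, i.e. the share of correctly classified objects.
    theta1 = sum(cm[i][i] for i in range(n)) / total
    # theta2: agreement expected by chance from the row and column totals.
    theta2 = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / total ** 2
    return (theta1 - theta2) / (1 - theta2)


def users_accuracy(cm, i):
    """Share of objects classified as class i that truly belong to class i."""
    classified_as_i = sum(row[i] for row in cm)
    return cm[i][i] / classified_as_i


# Hypothetical counts; class 0 = landslide, class 1 = non-landslide.
cm = [[45, 5],
      [10, 40]]
k = kappa(cm)
ua = users_accuracy(cm, 0)
```

For these counts the observed agreement is 0.85 and the chance agreement is 0.5, which gives a Kappa of 0.7.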
The results of the confusion matrix model were used to evaluate the pixel coverage for detecting landslides in a qualitative assessment approach. In this assessment, the detected landslide areas were compared with the real landslide areas in the field, which establishes how precisely the landslide segments were delineated. Among the standalone object-based classifiers, the SVM classifier achieved a user accuracy of up to 80.57% and delineated the exact border of the individual landslides. It was followed by the RF and KNN classifiers with 78.57% and 75.96%, respectively, for training area (A). For testing area (B), SVM, RF, and KNN achieved 79.89%, 77.27%, and 72.68%, respectively (see Tables 5 and 6). The DST fusion method improved on the qualitative accuracy of the standalone classifiers, reaching 84.6% and 83.16% for subsets A and B, respectively, as shown in Tables 5 and 6. Therefore, based on the accuracy assessment measurements, it was observed that the DST fusion enhanced the landslide detection analysis quantitatively and qualitatively by 10% and 4%, respectively.
To evaluate the pixel coverage for detecting landslides quantitatively, the confusion matrix was employed to validate the performance of the classifiers for both subsets (i.e., training area and testing area). SVM, RF, and KNN achieved 86.98%, 87.69%, and 88.70%, respectively, for training area (A) and 86.83%, 88.53%, and 89.58%, respectively, for testing area (B) (see Tables 5 and 6). The DST fusion method improved on the quantitative assessment of the standalone classifiers, with results of 90.02% and 91.1% for subsets A and B, respectively (shown in Tables 5 and 6). The accuracy assessment measurements showed that the DST fusion improved the detection of landslides. Therefore, the proposed method was shown to be effective and may be applied to other regions exhibiting conditions similar to those of the present study. This improvement is attributed to the proposed methodology, which includes the optimization techniques, the capability of LiDAR-derived data, and the texture features.
Secondly, the precision/recall method is one of the renowned methods for quantitative accuracy assessment. The proposed method was evaluated using field observations in each block according to Equations (10)-(12), where the number of correctly detected landslides is referred to as true positives (TP), while the undetected landslides are referred to as false negatives (FN). A false positive (FP) is a pixel that is falsely recognized as a landslide. The α is a non-negative scalar set to 0.5 in this regard, as suggested by Liu et al. [70]. Furthermore, the success rate is computed using another equation, which divides the number of detected segments by the total number of landslides. The precision/recall method was used to measure the accuracy of landslide detection quantitatively. The real number of landslide events was recorded by field surveying as a reference. Thereafter, each classification result was compared with the inventory map. A landslide is regarded as correctly detected if it is recognized in segments, regardless of whether the segment borders are larger or smaller than the landslide. The main idea in landslide counting is that an occurred landslide must contain at least one segment; the area of the landslide is not considered in this assessment.
The F-measure represents the overall accuracy of landslide counting and shows consistent results between the training area (subset A) and the testing area (subset B) for all of the classifiers applied. Table 7 lists the number of landslides detected and counted based on previous landslide occurrence. The RF classifier recorded the lowest accuracy in landslide detection (72% and 70% for subsets A and B, respectively), followed by the KNN classifier, which showed slightly higher accuracy than RF (75% and 73% for subsets A and B, respectively), as shown in Table 7. SVM achieved the highest accuracy among the standalone classification methods before fusion, with 77% for both study areas. The DST fusion surpassed all three applied classifiers, with a landslide counting accuracy of 88% in both subsets. These results indicate that the applied DST fusion technique can quantitatively improve the accuracy of landslide detection.
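The counting-based scores can be sketched in Python as follows. Precision and recall follow their standard definitions from TP, FP, and FN. The α-weighted F-measure form used here, (1 + α)·P·R / (α·P + R), is one common form and is an assumption rather than a confirmed reproduction of Equation (12); the counts are hypothetical.

```python
def detection_scores(tp, fp, fn, alpha=0.5):
    """Return (precision, recall, F-measure) for landslide counting.

    tp: correctly detected landslides, fp: false detections,
    fn: undetected landslides. The F-measure form is an assumed
    alpha-weighted variant with alpha = 0.5.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = (1 + alpha) * precision * recall / (alpha * precision + recall)
    return precision, recall, f_measure


# Hypothetical counts for one subset.
precision, recall, f_measure = detection_scores(tp=22, fp=4, fn=3)

# When precision and recall are equal, the F-measure equals both.
p_eq, r_eq, f_eq = detection_scores(tp=8, fp=2, fn=2)
```

With α = 0.5 this form weights precision more heavily than recall, so the score sits closer to the precision value when the two differ.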
The use of the DST technique yielded a significant improvement in the detection accuracies. Each of the existing classifiers has its own merits and demerits, and the DST fusion utilized in this research exploited them to achieve improved accuracy. Furthermore, the optimized techniques for segmentation parameters and feature selection, assisted by high-resolution LiDAR, visible bands, and texture features, contributed to simplifying the development of the current research and improving the transferability of the model [67].
The proposed method was developed in the training area and validated in another part of the study area (test area), and as a result, better accuracy was achieved.

Figure 1. Proposed methodology of the current study.

Figure 3. Landslide inventory map for the training site.

Figure 4. Landslide inventory map for the testing site.

7 km² of the Ringlet at an altitude of 1510 m. The point density is approximately 8 points per square meter, with a 25,000-Hz pulse rate frequency. The accuracy of the LiDAR data was restricted to conform to the root-mean-square errors of 0.15 m and 0.3 m in the vertical and horizontal axes, respectively, as standardized by the Department of Survey and Mapping Malaysia (JUPEM). A high-resolution camera (visible bands) used in the acquisition of LiDAR point

Figure 6. Segmentation process using the supervised approach (fuzzy-based segmentation parameter (FbSP) optimizer) to detect landslide locations. (A,B) represent the initial segmentation, while (C,D) represent the optimized segmentation.

Figure 8. The classification results of the three classifiers and their fusion for the testing area: the yellow polygon represents SVM, the red polygon represents RF, the purple polygon represents KNN, and the black polygon represents DST.

Table 1. Feature selection used in the current research.

Table 2. Correlation coefficients between the best selected features.

Table 4. Results of feature selection using the correlation-based feature selection (CFS) algorithm.

Table 5. Classification assessment (confusion matrices) on the training area.

Table 6. Classification assessment (confusion matrices) on the testing area.