Land Subsidence Susceptibility Mapping Using Bayesian, Functional, and Meta-Ensemble Machine Learning Models

To effectively prevent land subsidence over abandoned coal mines, it is necessary to quantitatively identify vulnerable areas. In this study, we evaluated the performance of predictive Bayesian, functional, and meta-ensemble machine learning models in generating land subsidence susceptibility (LSS) maps. All models were trained using half of a land subsidence inventory and validated using the other half. Model performance was evaluated by comparing the area under the receiver operating characteristic (ROC) curve of the resulting LSS map for each model. Among all models tested, logit boost, a meta-ensemble machine learning model, generated the LSS map with the highest accuracy (91.44%), exceeding that of the Bayesian and functional machine learning models: Bayes net (86.42%), naïve Bayes (85.39%), logistic (88.92%), and multilayer perceptron (86.76%). The LSS maps produced in this study can be used to mitigate subsidence risk for people and important facilities within the study area, and as a foundation for further studies in other regions.


Introduction
Coal mining was once the driving force of national industry and economic development in Korea, but this situation changed as demand for coal decreased. Gangwon Province was once Korea's largest coal mining area, but most of its mines were closed in the early 1990s. Among the environmental problems that follow mine closures, land subsidence events can threaten human life and damage property and infrastructure, including buildings, houses, railroads, and roads [1][2][3][4]. Recovery of surface structures following land subsidence is difficult and costly; therefore, it is necessary to predict land subsidence susceptibility (LSS) zones before subsidence occurs, and to implement management strategies in these zones [3].
In existing studies, probability and statistical models using geographical information systems (GIS) have been applied extensively to predict the susceptibility of geohazards, such as landslides, floods, subsidence, and rockfalls [3][14][15][16]. Recently, data mining and machine learning models for addressing nonlinear problems have been developed; these have been applied frequently, and their performances compared, in landslide susceptibility mapping [17][18][19][20]. In ground subsidence hazard mapping, hazard maps around abandoned underground coal mines (AUCMs) have been constructed by integrating the adaptive neuro-fuzzy inference system and GIS [21]. In addition, a fuzzy operator, decision trees with the CHAID and QUEST algorithms, and the frequency ratio have been applied to construct subsidence susceptibility maps at AUCMs in Korea [2,11]. In this study, we investigated the performance of several models that have not previously been applied to land subsidence prediction. To this end, we generated LSS maps for a South Korean district containing abandoned subsurface coal mines using machine learning methods, including a logit boost meta-ensemble model, two Bayesian models (Bayes net and NB models), and two functional models (logistic and multilayer perceptron models). The reliability and accuracy of all models were assessed by comparing their areas under the receiver operating characteristic (ROC) curve. Data processing was performed using WEKA 3.9.2 and ArcGIS 10.5 software to apply the five machine learning algorithms.

Land Subsidence in the Study Area
The study area, Hwajeon, is located in the city of Taebaek, South Korea (Figure 1), at 37°11′07″-37°11′07″ N, 128°56′40″-128°57′43″ E. Underground coal mining activities were carried out in Taebaek for nearly 20 years. The coal seams in this area were irregularly disturbed and inclined with various widths by reverse and thrust faults [22]. Therefore, the slant-chute block caving method was mainly used. About 10 million tons of coal were mined from the study area between 1953 and 1991 [22], and coal was transported to other areas by railroad beginning in 1973 (Figure 1). Since 1990, most of the coal mines have been closed due to reduced coal demand. However, the abandoned underground coal mines are currently causing land subsidence in the study area [11][21][22][23]. Additionally, infrastructure has been damaged by the land subsidence, as shown in photographs in a previous report [11].
Subsidence is caused by a variety of contributing factors, including geological discontinuities, presence of water, mining depth, and weak overburden [24,25]. The two forms of subsidence caused by underground coal mining are trough and sinkhole subsidence [25]. In the study area, a very irregular sinkhole occurred due to many complex underground coal mine pits excavated via slant-chute block caving, in combination with the aforementioned factors [22]. After a mine cavity is excavated, the roof becomes unstable over time due to changes in the strength and stress of the roof strata. Under such conditions, additional contributing factors can lead to the occurrence of sinkholes [25]. The Coal Industry Promotion Board [11,26] has reported 24 land subsidence events within the study area. Figure 1 shows locations S1 to S6 of a representative land subsidence event reported in 1999. Table 1 provides a description of this land subsidence. Locations S1 to S5 mainly occurred along railways and at elevations above 800 m. Location S6 occurred in a residential area at a lower elevation than S1-S5. S6 also had the greatest subsidence depth (508 mm). Some photographs providing evidence of the land subsidence have been published [11,23].

Table 1. Description of representative land subsidence in the study area [22].

Construction of Spatial Database
It is necessary to determine the factors affecting land subsidence in a coal mine area. The lithology of the overburden rocks, geological discontinuities, ground slope, scope of the mined cavity, extent and depth of mining, mechanical characteristics of the rock mass, such as the rock mass rating (RMR), and flow of groundwater are considered the main factors [11,25,27,28]. Spatial data for all of these factors may be difficult to collect and may not be available. The available spatial databases used in this study were constructed using ArcGIS 10.5.
The surface geology with cross section lines was constructed using a digital geological map at 1:50,000 scale [29] published by the Korea Institute of Geoscience and Mineral Resources (KIGAM). The geological formations include the Manhang, Jangseong, Hambaegsan, Dosagog, and Alluvium horizons (Figure 2a, Figure 3). Most of the coal was mined from the Jangseong Formation, which has a thickness of 80-150 m [22,30]. This formation includes four to five cyclothems consisting of dark-gray sandstone, black shale, and coal seams (Table 2). The land use was constructed from a digital land characteristics map at 1:5,000 scale [31] supplied by the National Geographic Information Institute (NGII). Land use for the study area was classified into 10 categories: wood land, railroad, river, field, plot, road, school, hybrid land, brook, and unclassified area (Figure 2b). The rate of land subsidence relative to the area of each category was higher in the railroad and school classes [21]. The surface slope was calculated from a digital elevation model (DEM) constructed from digital elevation contour lines at 1:5,000 scale [32] published by NGII (Figure 2c). Surface slope was considered an affecting factor because land subsidence can change surface slope, differential horizontal strain, and vertical displacement [33]. Distance from drift was calculated from a digital drift map provided by the Mine Reclamation Corporation (MIRECO) [26] (Figure 2d). The map is important because it identifies the areas of mining activity in this region. Geological discontinuities are considered to be factors affecting land subsidence, but no geological lineaments appear in the study area on the available 1:50,000 geological map. Therefore, geomorphological lineaments were visually extracted from an IKONOS satellite image by a field geologist (Figure 2e). The closer a location is to a lineament, the lower its distance-from-lineament value.
The borehole data in the study area, provided by the Mine Reclamation Corporation (MIRECO) in 1996 [26], were collected from 29 boreholes (Figure 1 and Table 3). The depths of the boreholes ranged from a minimum of 19.5 m to a maximum of 200 m. The data included hydrologic properties and rock mass information [34]. The depth of groundwater, rock mass rating (RMR), and permeability were obtained from 16, 19, and 6 boreholes, respectively (Table 3 and Figure 2f,g,h). The maximum depth of groundwater was 42.5 m. Along the railroad, the upper part had a greater groundwater depth and a lower elevation than the lower part. The RMR was classified into classes 1-5, representing very good, good, fair, poor, and very poor, respectively. In this study, the RMR ranged from 2 to 4.5. The lowest RMRs appeared in the northwest and southeast portions of the railroad. Permeability was classified into classes 1-6, representing very highly (>1 cm/s), highly (1 to 10⁻² cm/s), moderately (10⁻² to 10⁻³ cm/s), slightly (10⁻³ to 10⁻⁵ cm/s), and very slightly (10⁻⁵ to 10⁻⁷ cm/s) permeable and practically impermeable (<10⁻⁷ cm/s), respectively. In this study, the permeability grade ranged from 4 to 4.5 (slightly permeable). The groundwater data were collected from a report published in May 1996 by the Coal Industry Promotion Board. Borehole point data must be converted into raster data for spatial analysis, and the accuracy of a raster map depends on the number of data points. However, the available borehole data were limited in this study. Therefore, raster maps were constructed from the limited borehole data using the inverse distance weighting (IDW) interpolation method, which is useful for predicting values at unmeasured locations where data are insufficient [11].
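As a rough illustration of the IDW step, the sketch below estimates a value at an unmeasured location as a distance-weighted average of borehole values. It is not the ArcGIS implementation used in the study: the function name `idw_interpolate`, the power parameter of 2, and the three borehole records are assumptions made only for this example.

```python
import math

def idw_interpolate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples by IDW."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the query point coincides with a borehole
        weight = 1.0 / distance ** power  # closer boreholes weigh more
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical boreholes: (easting in m, northing in m, groundwater depth in m)
boreholes = [(0.0, 0.0, 10.0), (100.0, 0.0, 20.0), (0.0, 100.0, 30.0)]
midpoint_depth = idw_interpolate(boreholes, 50.0, 0.0)
```

Because each weight decays with distance, the estimate always lies between the smallest and largest measured values, which is one reason the method behaves predictably with sparse data.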
Eight control factors influencing land subsidence were constructed as 2 m × 2 m grid data with 775 columns and 860 rows, for a total of 666,500 cells within the study area. The 24 land subsidence areas, stored as 24 vector-type polygons, were converted to 2 m × 2 m grid data, giving a total of 3863 cells with a value of 1. To evaluate model performance, the 3863 land subsidence cells were randomly divided into training and validation sets of 50% (1931 cells) and 50% (1932 cells), respectively.

Table 2. Description of geological stratigraphy in Taebaek [30].
- Jangseong (Pj), 80-150 m: Four to five cyclothems consisting of dark-gray sandstone, black shale, and coal seams. Abundant plant fossils occur in the shale above the coal seam, the most valuable anthracite bed, of the 3rd-4th cyclothem from the bottom.
- Mainly dark-gray to black shale and dark-gray fine sandstone intercalated with dark-gray limestone lenses and two to three thin coal seams.
- Manhang (Cm), 250-300 m: Mainly purple, greenish-gray, or light-green shale and light-green to green or light-gray medium to very coarse sandstone intercalated with three to four limestone lenses. Conglomerates with a thickness of a few meters occur at the base in some places.
- In the upper part, gray to dark-gray limestone intercalated with dolomite.


Bayes Net (BN)
The BN algorithm applies Bayes' theorem to produce graphical representations of the probability distribution [35]. BN is commonly used to model complex systems [36]. BN has not yet been used to model land subsidence; however, Pham et al. (2016) [37] applied this algorithm to evaluate landslide risk. The joint probability of a subsidence event for a set of input factors can be estimated as follows:

$$P_B(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P_B\left(X_i \mid \Pi_{X_i}\right) = \prod_{i=1}^{n} \theta_{X_i \mid \Pi_{X_i}} \tag{1}$$

where $X = (X_1, X_2, \ldots, X_n)$ represents the subsidence input factors, $P_B(X_i \mid \Pi_{X_i}) = \theta_{X_i \mid \Pi_{X_i}}$ is the conditional probability distribution for input factor $X_i$ given its parent set $\Pi_{X_i}$, and $n$ is the number of subsidence input factors [37].


Naïve Bayes (NB)
The NB algorithm is a classification system that applies Bayes' theorem under the assumption of conditional independence for all attributes [10,38]. The NB classifier is easy to build, without any need for complicated iterative parameter-estimation schemes [38]. The NB algorithm estimates the posterior probability $P(y_j \mid x_i)$ for all possible output classes, and the class with the largest posterior probability is predicted as shown in Equation (2):

$$y_{NB} = \underset{y_j \in \{\text{subsidence},\ \text{no subsidence}\}}{\arg\max}\; P(y_j) \prod_{i=1}^{n} P(x_i \mid y_j) \tag{2}$$

where $x_i$ is the input factor, $y_j$ is the output class, $P(y_j)$ is the prior probability, and $P(x_i \mid y_j)$ is the conditional probability.
The conditional probability is calculated as

$$P(x_i \mid y_j) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right) \tag{3}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of $x_i$.
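The NB decision rule with Gaussian conditional probabilities can be sketched in a few lines. This is an illustration only, not the WEKA classifier used in the study; the two factors, the class priors, and the per-class means and standard deviations below are made-up values chosen for the example.

```python
import math

def gaussian_pdf(x, mean, std):
    """Gaussian density used as the conditional probability P(x_i | y_j)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2)) / (math.sqrt(2.0 * math.pi) * std)

def predict_naive_bayes(x, priors, stats):
    """Return the class maximizing log P(y_j) + sum_i log P(x_i | y_j)."""
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for value, (mean, std) in zip(x, stats[label]):
            score += math.log(gaussian_pdf(value, mean, std))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical priors and (mean, std) per factor: slope (deg), distance from drift (m)
priors = {"subsidence": 0.5, "no subsidence": 0.5}
stats = {
    "subsidence": [(10.0, 5.0), (20.0, 10.0)],
    "no subsidence": [(25.0, 8.0), (150.0, 60.0)],
}
label = predict_naive_bayes([12.0, 30.0], priors, stats)
```

Working with log probabilities replaces the product over factors with a sum, which avoids numerical underflow when many factors are combined.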

Logistic Regression (LR)
LR is a statistical technique that can accommodate several types of predictor variables [39][40][41]. LR does not require the normality assumption, which is an advantage over linear and log-linear regression. The inclusion of multiple parameters offers the user the ability to select the best predictors for use in the model [39]. The LR model is formulated as follows [42]:

$$f(x) = \operatorname{logit}(P) = \ln\left(\frac{P}{1 - P}\right) = c_0 + c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \tag{4}$$

where $x_1, x_2, \ldots, x_n$ are the input factors, $c_0$ is the model intercept, and $c_1, \ldots, c_n$ are the regression coefficients to be estimated. In this study, $P$ is the probability of subsidence occurrence and $1 - P$ is the probability that subsidence will not occur. The function $f(x)$ is represented as logit($P$).
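To make the link between the linear predictor and the probability concrete, the following sketch inverts the logit to recover P from f(x). The intercept, coefficients, and input values are hypothetical and are not fitted values from this study.

```python
import math

def logit_score(x, intercept, coefficients):
    """Linear predictor f(x) = c0 + c1*x1 + ... + cn*xn."""
    return intercept + sum(c * v for c, v in zip(coefficients, x))

def subsidence_probability(x, intercept, coefficients):
    """Invert logit(P) = f(x) to obtain P = 1 / (1 + exp(-f(x)))."""
    return 1.0 / (1.0 + math.exp(-logit_score(x, intercept, coefficients)))

# Hypothetical intercept and coefficients for two standardized factors
c0 = -1.0
coefficients = [0.8, -0.5]
p = subsidence_probability([2.0, 1.0], c0, coefficients)
```

Taking ln(p / (1 - p)) of the result returns the linear predictor, which is a quick consistency check on any implementation.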

Multilayer Perceptron (MLP)
MLP is an artificial neural network classifier that is widely used in various fields [12,43]. MLP neural nets consist of three structures: input, hidden, and output layers. In this study, the input layers represent factors that affect land subsidence, and the inputs are processed to become outputs within the hidden layers. The classification results, dividing land subsidence and non-subsidence, are shown in the output layers [12,44]. Two processes are required to train MLP neural nets: (1) forward propagation of the inputs through the hidden layers to obtain outputs, which are compared with the initial values, and (2) adjustment of the connection weights using the differences between these values to generate the best results [44,45]. In this study, $t = t_i$, $i = 1, 2, \ldots, 8$ is a vector containing eight land-subsidence conditioning factors, and $\varphi = \varphi_j$, $j = 1, 2$ represents the land subsidence and non-subsidence classes. The MLP neural net function is then determined as follows:

$$\varphi = f(t) \tag{5}$$

where $f(t)$ is an unknown function that is improved during the training process by adjustable network weights for a given network architecture.
An advantage of MLP is that the user is not required to decide the relative importance of the various input measurements; most inputs can be selected during the training process, based on weight adjustment [46].Additionally, MLP does not require assumptions about the distribution of the training dataset.

Logit Boost (LB)
LB is a well-known machine-learning algorithm introduced by Friedman et al. (2000) [47] that effectively reduces bias and variance; it is a slight modification of the most popular boosting method (AdaBoost) for handling noisy data [48], which reduces training errors and improves classification accuracy [49]. LB has been widely applied in binary classification problems [50], medical science [51], and computer science [52]; however, it has not yet been applied to land-subsidence problems [53].
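Following Friedman et al. [47], for class labels $y_i \in \{0, 1\}$, the two-class LB algorithm starts with $F(x) = 0$ and $p(x_i) = 1/2$ and repeats the following steps:
a. Compute the working response and weights:

$$r_i = \frac{y_i - p(x_i)}{p(x_i)\left(1 - p(x_i)\right)}, \qquad \omega_i = p(x_i)\left(1 - p(x_i)\right)$$

b. Fit the function $f(x)$ by weighted least-squares regression of $r_i$ to $x_i$ using weights $\omega_i$.
c. Update the function as:

$$F(x) \leftarrow F(x) + \frac{1}{2} f(x), \qquad p(x) \leftarrow \frac{e^{F(x)}}{e^{F(x)} + e^{-F(x)}}$$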
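The boosting loop can be sketched with regression stumps as the weighted least-squares base learner. This follows the standard two-class LogitBoost formulation rather than the WEKA implementation used in the study; the stump learner, the clipping of the working response, the number of rounds, and the toy data are all assumptions for illustration.

```python
import math

def fit_stump(X, r, w):
    """Weighted least-squares regression stump: (feature, threshold, left, right)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            sums = {True: [0.0, 0.0], False: [0.0, 0.0]}  # side -> [sum w, sum w*r]
            for row, ri, wi in zip(X, r, w):
                side = row[j] <= threshold
                sums[side][0] += wi
                sums[side][1] += wi * ri
            left = sums[True][1] / sums[True][0]     # weighted mean response, left leaf
            right = sums[False][1] / sums[False][0]  # weighted mean response, right leaf
            error = sum(wi * (ri - (left if row[j] <= threshold else right)) ** 2
                        for row, ri, wi in zip(X, r, w))
            if best is None or error < best[0]:
                best = (error, j, threshold, left, right)
    return best[1:]

def stump_predict(stump, row):
    j, threshold, left, right = stump
    return left if row[j] <= threshold else right

def fit_logitboost(X, y, rounds=10):
    F = [0.0] * len(X)
    p = [0.5] * len(X)
    stumps = []
    for _ in range(rounds):
        # Step a: weights and working response (clipped for numerical stability)
        w = [max(pi * (1.0 - pi), 1e-10) for pi in p]
        r = [max(-4.0, min(4.0, (yi - pi) / wi)) for yi, pi, wi in zip(y, p, w)]
        # Step b: weighted least-squares fit of r on x
        stump = fit_stump(X, r, w)
        stumps.append(stump)
        # Step c: update F and p; e^F / (e^F + e^-F) equals 1 / (1 + e^(-2F))
        F = [Fi + 0.5 * stump_predict(stump, row) for Fi, row in zip(F, X)]
        p = [1.0 / (1.0 + math.exp(-2.0 * Fi)) for Fi in F]
    return stumps

def predict_probability(stumps, row):
    F = sum(0.5 * stump_predict(s, row) for s in stumps)
    return 1.0 / (1.0 + math.exp(-2.0 * F))

# Hypothetical cells: [slope (deg), distance from drift (m)]; 1 = subsidence
X = [[5.0, 200.0], [8.0, 150.0], [12.0, 40.0], [15.0, 30.0], [20.0, 10.0], [6.0, 180.0]]
y = [0, 0, 1, 1, 1, 0]
model = fit_logitboost(X, y)
```

Each round refits the base learner to the cells that the current model predicts least confidently, which is how the ensemble gradually sharpens the decision boundary.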

Model Evaluation and Comparison
During the modeling and validation phases, model efficiency should be evaluated and compared [44]. We quantitatively evaluated and compared the efficiency of the models according to the area under the ROC curve (%). This technique has been applied to assess risk models of various hazards, including subsidence [9], landslides [54], and sinkholes [55]; it is a standard method to quantitatively evaluate the quality of probabilistic and statistical models [56]. The x and y axes of the curve are 1 − specificity (the false positive rate) and sensitivity (the true positive rate), respectively [56], and the area under the curve ranges from 0.5 to 1, with higher values indicating higher model accuracy and prediction capability.
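A minimal sketch of the area-under-the-ROC-curve calculation is given below, using trapezoidal integration over the (false positive rate, sensitivity) points obtained by lowering the decision threshold. The function name `roc_auc` and the six scored cells are hypothetical and serve only to illustrate the computation.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve by trapezoidal integration over (FPR, TPR) points."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    prev_fpr = prev_tpr = 0.0
    area = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every cell tied at this score before adding a curve point
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr, tpr = fp / negatives, tp / positives
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area

# Hypothetical susceptibility scores and observed subsidence (1) / no subsidence (0)
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

A model that ranks every subsidence cell above every non-subsidence cell scores 1, while a model that cannot separate the two classes scores 0.5.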

LSS Mapping
Figure 5 shows the LSS maps produced by the five algorithms: Bayes net (Figure 5a), NB (Figure 5b), logistic (Figure 5c), multilayer perceptron (Figure 5d), and logit boost (Figure 5e). To generate the LSS maps, we used the LSS index (LSSI) to classify susceptibility into four classes: very high (5% of total area), high (5%), moderate (5%), and low (85%). The probability of land subsidence was predicted for each class, and subsidence hazard was predicted for residential areas. The susceptibility indexes from the five algorithms were similar. The region with very high susceptibility, marked in red, extends from the western to the eastern part of the region along the railroad. In the Bayes net result, the very high susceptibility area did not appear as often as in the other models. In the middle of the region, the Bayes net result has a low index, whereas the other models have a very high or high index. Some very high indexes also appear in the northeastern part of the region, around the elementary school, but most of the region has a low susceptibility index rank for subsidence.
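The area-based classification described above can be sketched as follows: cells are ranked by LSSI and assigned to classes covering the top 5%, the next 5%, the next 5%, and the remaining 85% of cells. The function name `classify_by_area` and the synthetic index values are assumptions for illustration; the study performed this step in GIS software.

```python
def classify_by_area(lssi_values, shares=(("very high", 5), ("high", 5),
                                          ("moderate", 5), ("low", 85))):
    """Assign each cell a class so that classes cover fixed percentages of the area."""
    order = sorted(range(len(lssi_values)), key=lambda i: -lssi_values[i])
    classes = [None] * len(lssi_values)
    start = 0
    cumulative_percent = 0
    for name, percent in shares:
        cumulative_percent += percent
        end = cumulative_percent * len(order) // 100  # integer arithmetic avoids rounding drift
        for i in order[start:end]:
            classes[i] = name
        start = end
    return classes

# Synthetic LSSI values for 100 cells
lssi = [i / 100.0 for i in range(100)]
classes = classify_by_area(lssi)
```

Fixing the class areas in this way makes the maps from different models directly comparable, because each model labels the same proportion of the study area as very high, high, moderate, and low.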

However, there are some differences in the moderate-susceptibility class, marked by the green color. The area with moderate susceptibility is widely spread and has a different pattern in each model result. For example, the NB and logit boost results show that the northern part of the region is mostly covered by the moderate susceptibility class. In contrast, the multilayer perceptron shows the moderate class in the southern part of the region. Meanwhile, in the Bayes net and logistic models, the moderate class is diffusely distributed from the northern to the middle part of the study area.

Validation
The land subsidence susceptibility (LSS) analysis results were validated by comparison with 1932 land subsidence cells (i.e., 50% of the total subsidence data) that had not been used in the analysis. A quantitative comparison of the receiver operating characteristic (ROC) curves of all models is shown in Figure 6. The land subsidence susceptibility index (LSSI) values of all cells were sorted in descending order, divided into 100 classes [57], and associated with the cumulative number of subsidence events for each class (Figure 6). The model with the highest area under the ROC curve was considered to be the model with the best predictive performance. The area under the curve values for the Bayes net, naïve Bayes (NB), logistic, multilayer perceptron, and logit boost models were 0.8642, 0.8539, 0.8892, 0.8676, and 0.9144, respectively; thus, the respective LSS mapping accuracy rates were 86.42, 85.39, 88.92, 86.76, and 91.44%. Although all models had sufficient performance, the models differed in prediction performance even though they used the same training data. In particular, the logit boost model had a higher predictive accuracy (by about 2.52, 4.68, 5.02, and 6.05 percentage points, respectively) than the logistic, multilayer perceptron, Bayes net, and NB models. Therefore, model reliability followed the order logit boost > logistic > multilayer perceptron > Bayes net > NB. The percentage differences of the validation result are discussed in Section 6.
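The rank-based curve described above (cells sorted by descending LSSI, split into 100 equal-area classes, and matched to the cumulative share of validation subsidence cells) can be sketched as follows. The function names and the synthetic data are assumptions for illustration, and the trapezoidal area is computed on this cumulative curve rather than on a threshold-based ROC curve.

```python
def success_rate_curve(lssi_values, is_subsidence, n_classes=100):
    """Cumulative fraction of subsidence cells captured per descending-LSSI class."""
    order = sorted(range(len(lssi_values)), key=lambda i: -lssi_values[i])
    total_events = sum(is_subsidence)
    curve = []
    captured = 0
    start = 0
    for k in range(1, n_classes + 1):
        end = k * len(order) // n_classes  # equal-area class boundaries
        captured += sum(is_subsidence[i] for i in order[start:end])
        curve.append(captured / total_events)
        start = end
    return curve

def area_under_curve(curve):
    """Trapezoidal area under the cumulative curve, starting from the origin."""
    previous = 0.0
    area = 0.0
    for value in curve:
        area += (previous + value) / 2.0 / len(curve)
        previous = value
    return area

# Synthetic example: 200 cells, with the 20 highest-index cells being subsidence cells
lssi = [float(i) for i in range(200)]
observed = [1 if i >= 180 else 0 for i in range(200)]
curve = success_rate_curve(lssi, observed)
area = area_under_curve(curve)
```

In this synthetic case all subsidence cells fall within the top 10% of ranked cells, so the curve reaches 1 after the tenth class and the area is close to 1.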
Appl. Sci. 2018, 8, x FOR PEER REVIEW


Discussion
Recently, there has been great interest within the hazard prediction community in improving the performance of hazard susceptibility models. In various fields, machine learning techniques have been shown to be effective in terms of performance [58][59][60][61][62]. In particular, ensemble learning has improved machine learning results by combining several models [17,63,64]. The results of different models applied under the same conditions (i.e., study area, input data, and ratio of training to validation datasets) can be compared using the area under the ROC curve as a quantitative measure of predictive power. Models with similar (different) accuracy values can be said to have similar (differing) performances. Therefore, the reliabilities of the models can be ordered according to their accuracies.
In this study, the logit boost model, based on ensemble machine learning, achieved an accuracy of 91.44%, which was 2.52-6.05 percentage points higher than those of the logistic, multilayer perceptron, Bayes net, and NB machine learning models. Similarly, a previous study [2] found that a decision tree model (the CHAID algorithm) produced LSS maps with higher accuracy (94.01%) than the QUEST decision tree (90.37%) and frequency ratio (86.70%). The other algorithms examined in the current study also exhibited high accuracy. Thus, the Bayes net, NB, logistic, and multilayer perceptron models can also be used as alternative models for mapping land subsidence hazard risk. Even though the logit boost model, as an ensemble model, had not been used to predict land subsidence in previous research, the results of the current study indicate that it can achieve high accuracy.
However, some limitations of the models might be a consideration for future studies.For example, the Bayes net model assumes no missing values, and this model also needs to be updated, especially for estimating the conditional probabilities [65].The benefits and drawbacks of the machine learning models are influenced by several factors, such as the availability of datasets, characteristics of the study area, and condition of the region [18].The use of Bayesian algorithms, such as the Bayes net and Naïve Bayes, has not been fully verified in natural hazard assessments [18].According to Mezaal [66], the multilayer perceptron algorithm also has limitations, such as overlearning and high computational complexity.
It has been reported that sinkhole subsidence attributable to underground mining is caused by shallow depth, weak overburden, geological discontinuities, solution of rocks, rainfall, groundwater, and earthquakes [25]. However, this study used a spatial database obtained from previous studies because of the limited availability of data. No new surveys of land subsidence have been conducted in the study area for 14 years. If real-time monitoring data and additional data are obtained in the study area, a 4D underground subsidence model [67] with 3D geological modeling could be constructed to predict land subsidence hazard areas accurately. Thus, continuous monitoring and detailed new surveys of causative factors are essential in the study area. The maps produced in this study can be used as basic data for policymakers and further research. Future studies should develop alternative models and methods to determine the relative influence of factors affecting LSS, so that these methods can be applied in other regions.

Conclusions
Land subsidence is a hazardous effect of coal mine abandonment, including in Korea. To prevent damage and loss of life in the Taebaek region, it is necessary to effectively predict areas with high subsidence risk. In this study, we used Bayesian (i.e., Bayes net and NB), functional (i.e., logistic and multilayer perceptron), and meta-ensemble (i.e., logit boost) machine learning models to perform LSS assessments. Although all models had sufficient performance, the logit boost meta-ensemble machine learning model had the highest accuracy (91.44%) among the five models. The logit boost model also had higher predictive accuracy (by 2.52, 4.68, 5.02, and 6.05 percentage points, respectively) than the logistic, multilayer perceptron, Bayes net, and NB models. According to previous studies [11,57] in the same study area, the fuzzy operator with 84.40-88.98% accuracy, frequency ratio with 86.70% accuracy, CHAID decision tree with 94.01% accuracy, and QUEST decision tree with 90.37% accuracy have been applied to subsidence hazard assessment, but the five models used in this study had rarely been applied. Based on these case studies, the land subsidence hazard rating can be applied to future policy decisions using additional data.

Figure 1. The study area in Taebaek, South Korea.

- Mainly milky white to light-green coarse to very coarse sandstone with greenish-gray to gray shale interbeds. Intercalations of pinkish sandstone, purple shale, and grayish-green sandy shale in the upper part. The sandstone is less compact than that of the Hambaegsan Formation.
- Mainly milky white to light-gray coarse sandstone with some interbeds of black shale with thickness of 2-3 m. Some pebbly sandstones occur at the base.
- Jangseong (Pj), 80-150 m


Figure 3. Geological cross sections in the study area.
As shown in Figure 4, the mapping process consisted of five steps: (a) spatial database construction, (b) random categorization of land subsidence locations into training and validation datasets at a ratio of 1:1, (c) selection of land subsidence conditioning factors, (d) application of machine learning methods to map LSS, and (e) validation and comparison of the five models.
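Step (b) can be sketched as a seeded random shuffle followed by an even split. The function name `split_half`, the seed, and the use of integer cell identifiers are assumptions for illustration; only the cell count of 3863 comes from the study.

```python
import random

def split_half(cell_ids, seed=0):
    """Randomly split cell identifiers into training and validation halves."""
    shuffled = list(cell_ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 3863 land subsidence cells, identified here by hypothetical integer IDs
training, validation = split_half(range(3863))
```

With an odd number of cells the two halves differ by one, which matches the 1931 training and 1932 validation cells reported for the study area.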

Figure 4. Flowchart for the generation of land subsidence susceptibility (LSS) maps using various machine learning models including Bayes net, naïve Bayes (NB), logistic, multilayer perceptron, and logit boost models.

Figure 6. Susceptibility index rank (x-axis) and subsidence occurrence (y-axis) of the five algorithms.

Table 3. Borehole data in the study area.