A Unique Conditions Model for Landslide Susceptibility Mapping

Abstract: Several methods and approaches have been proposed to assess landslide susceptibility. The likelihood of landslides occurring can be determined by applying statistical models to historical landslides, taking into account controlling factors. Popular methods for predicting the probability of landslides are weights-of-evidence and logistic regression. We discuss the assumptions and interpretations of these methods, the relationships between them, and their strengths and weaknesses in the case of categorical factors. Of particular interest is the conditional independence of the controlling factors and its effect on model bias. To avoid lack of conditional independence of factors and model bias, we present a unique conditions model that is always unbiased. To illustrate the theoretical developments, a practical application is given using observed landslides and geo-environmental factors from a previous study. The unique conditions model appears superior to the other models.


Introduction
Landslide susceptibility models are used to predict the spatial occurrence of slope failure given a range of geo-environmental conditions, allowing identification of landslide-prone areas and supporting spatial planning to reduce landslide risk [1][2][3]. Various methods and approaches have been proposed to assess landslide susceptibility, such as heuristic or index-based zoning techniques, physically based models, and statistically based classification methods [4][5][6].
The likelihood of occurrences can be determined by fitting a statistical model to historical landslides, taking into account explanatory factors affecting landslides, such as geology, topography, hydrometry, land use, etc. The main advantages of statistical models are their high efficiency and the insight they give into the relationships between the spatial factors used to identify areas prone to landslides [7,8]. Lee [5] assessed the status of landslide susceptibility mapping based on 776 papers published over a 20-year period (1999-2018) and found that the commonly used statistical methods were logistic regression, the frequency ratio, and weights-of-evidence. A review of statistically based modeling of landslide susceptibility, including 565 peer-reviewed articles from 1983 to 2016, was presented by Reichenbach et al. [6], who noted that the most applied statistical methods for modeling landslide susceptibility were logistic regression, neural network analysis, and weights-of-evidence.
Weights-of-evidence is a very popular technique for landslide susceptibility mapping because it is easy to use and can easily be incorporated in geographic information systems [7][8][9]; recent examples are [10][11][12][13][14][15][16]. The purpose of weights-of-evidence is to weigh and combine the controlling factors to predict the probability of landslide occurrence. However, weights-of-evidence is hampered by the assumption of conditional independence of the controlling factors, which is often untrue in practice. Violation of conditional independence between factors has received much attention in geosciences, in particular in mineral prospectivity modeling. When there is significant conditional dependency between factors, the probabilities derived from weights-of-evidence are biased and generally too large compared to the observations [9,17]. Several approaches have been proposed to account for model bias or to relax the conditional independence assumption, such as modified weighting [18,19], additive mixed terms [20,21], semi-Naïve Bayes approaches [22,23], or machine learning algorithms such as decision trees, random forest, and artificial neural networks [24][25][26][27]. However, to date, no generally accepted improvement has been found.
Logistic regression is one of the most common methods for modeling landslide susceptibility because it is easy to implement and very efficient for analyzing relationships between a binary response variable and numerical or categorical explanatory variables [28][29][30]. Budimir et al. [31] presented a review of landslide probability mapping using logistic regression, based on 75 peer-reviewed papers, and concluded that there is no consistent methodology for applying logistic regression analysis to landslide susceptibility. In particular, the method by which explanatory factors or factor classes are selected is often not well explained. Furthermore, the majority of the published papers apply a combination of frequency ratio and logistic regression, where factor classes are replaced by their landslide frequency ratio and logistic regression is only applied to the factors; recent examples are [32][33][34][35][36][37].
Most studies using logistic regression to predict landslide susceptibility provide little or no information on the conditional independence of factors and model bias. In mineral prospectivity modeling, on the other hand, it is well known that logistic regression models, as opposed to weights-of-evidence, always produce unbiased estimates, regardless of whether the controlling factors are conditionally independent with respect to the target variable [19,38]. Moreover, it is well known that weights-of-evidence and logistic regression produce similar results if the predictor factors are categorical and conditionally independent [18][19][20][21]38,39].
A disadvantage of logistic regression is that estimated regression coefficients can have large variances unless there is conditional independence of the controlling factors [38]. However, in the case where the factors are categorical, interaction terms in logistic regression models can compensate for violations of conditional independence [21,39]. Therefore, combinations of factors or factor classes can be added to the model as additional terms to compensate for the lack of conditional independence of the factors. Additional interaction terms result in a hierarchy of models, where each former model is a special case of the successive latter model and is therefore more restrictive [39]. However, the practical application of this method has been questioned because the number of additive terms can increase rapidly, so that the estimation of the logistic regression coefficients becomes increasingly difficult, if not impossible, given the accuracy of the numerical solution procedure and the limited number of training data [20,38].
In this study, we focus on statistical methods for landslide susceptibility mapping that predict the conditional probability of landslides with categorical controlling factors, in particular weights-of-evidence and logistic regression. We investigate how modeling techniques, conditional independence of the factors, and model bias are related. A unique conditions model is proposed that reproduces observed landslide probabilities for any combination of categorical controlling factors, without any bias. The feasibility, strengths, and weaknesses of the modeling approaches are illustrated and tested through application to a practical case study.

Preliminaries
We denote the observation of a landslide in the study area with a binary indicator $x_0$, such that $x_0 = 1$ indicates the presence and $x_0 = 0$ the absence of a landslide. Similarly, landslide controlling factors are denoted with a set of binary indicators $x = \{x_{ij}\}$, where $i = 1, \ldots, n$ refers to $n$ factor types and $j = 1, \ldots, n_i$ to the $n_i$ classes of factor $i$, so that $x_{ij} = 1$ indicates the presence and $x_{ij} = 0$ the absence of factor class $x_{ij}$. We assume that all factors completely overlap the study area and that the classes of each factor do not overlap, so
$$\sum_{j=1}^{n_i} x_{ij} = 1, \quad i = 1, \ldots, n \tag{1}$$
We also define the unconditional landslide probability $p_0 = p(x_0 = 1)$, generally denoted as the prior probability; the conditional landslide probability for a single factor class $p_{ij} = p(x_0 = 1 \mid x_{ij} = 1)$; and the conditional landslide probability for all factors combined $p_x = p(x_0 = 1 \mid x)$, which is commonly referred to as the posterior probability. When factor class $x_{ij}$ promotes landslides, $p_{ij}$ exceeds $p_0$, and vice versa. The same applies to the combined set of factors; when $p_x$ is larger than $p_0$, the environmental conditions are more favorable for landslides to occur.
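The indicator notation can be made concrete with a small sketch. The class labels and the six-cell raster below are hypothetical and serve only to show how one categorical factor expands into binary indicators that satisfy Equation (1).

```python
# Hypothetical class labels of one factor (e.g. land use) for six raster cells.
labels = ["forest", "crop", "forest", "shrub", "crop", "forest"]
classes = sorted(set(labels))

# indicators[cls][c] is 1 if cell c belongs to class cls, else 0.
indicators = {cls: [1 if lab == cls else 0 for lab in labels] for cls in classes}

# Equation (1): the indicators of one factor sum to one in every cell.
row_sums = [sum(indicators[cls][c] for cls in classes) for c in range(len(labels))]
```

Because every cell belongs to exactly one class, each entry of `row_sums` equals one, which is the linear dependence that later forces one class per factor to be dropped in logistic regression.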
Estimates of the prior probability and the conditional probability for a given factor class can be obtained directly from landslide observations as follows:
$$p_0 = \frac{1}{A}\int_A x_0 \, dA \tag{2}$$
$$p_{ij} = \frac{1}{A_{ij}}\int_A x_0 \, x_{ij} \, dA \tag{3}$$
where $A$ is the total area of the study domain and $A_{ij}$ is the area occupied by factor class $x_{ij} = 1$. The posterior probability $p_x$ is not easy to derive from the data, and finding a suitable model to predict posterior probabilities is the main goal of a landslide susceptibility study. Various statistically based methods and approaches have been applied in practice, but little attention is paid to whether estimated probabilities are reliable. Model bias refers to systematic errors, which can result from inaccurate data or from bias in the algorithms used to validate the model. In geosciences, it is common to verify that the mean of the posterior probability is equal to the observed prior probability [9,17], so
$$\frac{1}{A}\int_A p_x \, dA = p_0 \tag{4}$$
In addition, one can also verify whether the conditional landslide probability for a single factor class agrees with the observations, i.e.,
$$\frac{1}{A_{ij}}\int_A p_x \, x_{ij} \, dA = p_{ij} \tag{5}$$
Note that if Equation (5) holds, then Equation (4) is also satisfied, because
$$\frac{1}{A}\int_A p_x \, dA = \frac{1}{A}\sum_{j=1}^{n_i}\int_A p_x \, x_{ij} \, dA = \sum_{j=1}^{n_i}\frac{A_{ij}}{A}\, p_{ij} = \frac{1}{A}\int_A x_0 \, dA = p_0 \tag{6}$$
which can be generalized as: any area partitioned into unbiased sub-areas is unbiased. However, in practice, models for landslide susceptibility mapping are often biased, resulting in incorrect predictions, which are generally overlooked or ignored.
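On a raster, the area integrals reduce to cell counts. The following minimal sketch assumes a hypothetical ten-cell raster with a single factor and checks that the area-weighted class probabilities add up to the prior, which is the partition argument made above.

```python
from collections import Counter

# Hypothetical raster: (landslide indicator x0, class of one factor) per cell.
cells = [(1, "a"), (0, "a"), (0, "a"), (0, "a"),
         (1, "b"), (1, "b"), (0, "b"), (0, "b"), (0, "b"), (0, "b")]

area = Counter(cls for _, cls in cells)           # class area in cell counts
slides = Counter(cls for x0, cls in cells if x0)  # landslide cells per class

p0 = sum(x0 for x0, _ in cells) / len(cells)               # prior probability
p_class = {cls: slides[cls] / area[cls] for cls in area}   # class probabilities

# Area-weighted mean of the class probabilities recovers the prior.
weighted = sum(area[cls] / len(cells) * p_class[cls] for cls in area)
```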

Weights-of-Evidence
Weights-of-evidence is a very popular and widely used method for predicting the probability of landslides. In weights-of-evidence, the posterior probability $p_x$ is derived from Bayes' theorem as
$$p_x = p(x_0 = 1 \mid x) = \frac{p(x \mid x_0 = 1)\, p_0}{p(x)} \tag{7}$$
Similarly, the posterior probability for the absence of landslides $p(x_0 = 0 \mid x)$ is derived as
$$1 - p_x = p(x_0 = 0 \mid x) = \frac{p(x \mid x_0 = 0)\,(1 - p_0)}{p(x)} \tag{8}$$
Combined, this leads to an expression for the odds of the presence versus the absence of a landslide:
$$\frac{p_x}{1 - p_x} = \frac{p_0}{1 - p_0}\,\frac{p(x \mid x_0 = 1)}{p(x \mid x_0 = 0)} \tag{9}$$
Using the logit function, this can be rewritten as
$$\operatorname{logit}(p_x) = \operatorname{logit}(p_0) + \ln\frac{p(x \mid x_0 = 1)}{p(x \mid x_0 = 0)} \tag{10}$$
Furthermore, weights-of-evidence assumes conditional independence of the controlling factors, so that the joint probabilities on the right-hand side of Equation (10) can be derived from the product of individual probabilities, so
$$\operatorname{logit}(p_x) = \operatorname{logit}(p_0) + \sum_{i=1}^{n} w_{ij} \tag{11}$$
where, for each factor $i$, $j$ is the class that is present and $w_{ij}$ are factor class weights given by
$$w_{ij} = \ln\frac{p(x_{ij} = 1 \mid x_0 = 1)}{p(x_{ij} = 1 \mid x_0 = 0)} \tag{12}$$
Equation (11) is a statistical equation. To use it as a model for the prediction of landslide probabilities in a domain, it must be reformulated in algebraic form as
$$\operatorname{logit}(p_x) = \operatorname{logit}(p_0) + \sum_{i=1}^{n}\sum_{j=1}^{n_i} w_{ij}\, x_{ij} \tag{13}$$
clearly showing that the weights only apply if the corresponding factor class is present.
In practice, estimates of the probabilities in Equation (12) can be obtained from observed landslides, so the weights are derived as
$$w_{ij} = \operatorname{logit}(p_{ij}) - \operatorname{logit}(p_0) \tag{14}$$
showing that there is a one-to-one relationship between the weight $w_{ij}$ and the observed landslide probability $p_{ij}$ of a factor class. Because landslides are rare, landslides may not be observed if the area of a factor class is small, so that $p_{ij} = 0$, which poses a problem for the application of Equation (14) because the logarithm of zero is infinite. In such a case, it is customary to set the weight equal to zero, although this violates Equation (14) and introduces a model bias because $w_{ij} = 0$ implies that $p_{ij} = p_0$, which contradicts what is observed.
In the case of conditional independence of all factors, Equations (4) and (5) hold, showing that the posterior probabilities are unbiased; the proof is given in Section 2.3. However, in practice, the conditional independence of controlling factors is generally not guaranteed, so the posterior probabilities obtained by weights-of-evidence are biased. Usually, violation of conditional independence results in posterior probabilities that are too large, and conversely, if the model results are found to be biased, this may be due to a lack of conditional independence of the factors.
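The relation between a class weight and the observed class probability can be sketched as follows. The helper names are hypothetical, and returning `None` for a class without observed landslides is an illustrative choice, not the convention criticized above of setting the weight to zero.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def weight(p_class, p_prior):
    """Weight of a factor class from its observed probability; None when undefined."""
    if p_class <= 0.0 or p_class >= 1.0:
        return None  # the logit is infinite; the caller must decide how to treat it
    return logit(p_class) - logit(p_prior)

p0 = 0.019
w_same = weight(0.019, p0)  # class no different from the prior
w_high = weight(0.05, p0)   # class that promotes landslides
w_none = weight(0.0, p0)    # class without observed landslides
```

A weight of zero corresponds to a class probability equal to the prior, so substituting zero for an undefined weight silently replaces an observed probability of zero with $p_0$.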
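The overprediction caused by dependent factors can be illustrated with an extreme, hypothetical case: the same two-class factor is entered twice, so the second copy adds no information but its weights are still summed. All counts below are invented for illustration.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical domain: 100 cells of class "A" (10 landslides), 100 of class "B" (2).
n_cells = {"A": 100, "B": 100}
n_slides = {"A": 10, "B": 2}
p0 = sum(n_slides.values()) / sum(n_cells.values())
w = {c: logit(n_slides[c] / n_cells[c]) - logit(p0) for c in n_cells}

def mean_posterior(n_copies):
    """Area-weighted mean posterior when the same factor is entered n_copies times."""
    total = sum(n_cells.values())
    return sum(n_cells[c] / total * inv_logit(logit(p0) + n_copies * w[c])
               for c in n_cells)

mean_once = mean_posterior(1)   # single factor: mean posterior equals the prior
mean_twice = mean_posterior(2)  # duplicated factor: perfectly dependent evidence
```

With the factor used once, the mean posterior reproduces the prior; with the duplicated factor, the mean posterior exceeds it, which is the bias described above.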

Logistic Regression
Multiple logistic regression is probably the most commonly used technique to predict posterior landslide probabilities. Starting from Equation (13), the idea arises to derive the weights by logistic regression. However, there is a complication, namely that the factor classes are linearly dependent, as shown by Equation (1), which is not allowed in multiple regression. To get around this, one class in each factor must be removed: usually the first class, although any class will do. Therefore, the logistic regression model is formulated as follows:
$$\operatorname{logit}(p_x) = \beta_0 + \sum_{i=1}^{n}\sum_{j=2}^{n_i} \beta_{ij}\, x_{ij} \tag{15}$$
where $\beta_0$ and $\beta_{ij}$ are model parameters to be estimated by maximum likelihood, which is a measure of fit between predicted probabilities and the observed data. The log-likelihood is given by [40] as
$$\ln L = \int_A \left[x_0 \ln p_x + (1 - x_0)\ln(1 - p_x)\right] dA \tag{16}$$
Maximum likelihood is obtained by setting the derivatives of the log-likelihood for each parameter equal to zero, so
$$\frac{\partial \ln L}{\partial \beta_0} = \int_A (x_0 - p_x)\, dA = 0 \tag{17}$$
$$\frac{\partial \ln L}{\partial \beta_{ij}} = \int_A (x_0 - p_x)\, x_{ij}\, dA = 0 \tag{18}$$
which can be solved to determine $\beta_0$ and $\beta_{ij}$. In practice, this requires specialized software because the logistic regression model is non-linear. Note that these equations are equivalent to Equations (4) and (5), which express the model bias. Therefore, the maximum likelihood solution also ensures that the posterior probabilities predicted by the model are unbiased, which is an important advantage of using logistic regression.
In the case of conditional independence of the factors, logistic regression and weights-of-evidence are equivalent [21,37,39]. To prove the correspondence between weights-of-evidence and logistic regression, eliminate the first class of each factor in Equation (13) by substituting
$$x_{i1} = 1 - \sum_{j=2}^{n_i} x_{ij} \tag{19}$$
which gives
$$\operatorname{logit}(p_x) = \operatorname{logit}(p_0) + \sum_{i=1}^{n} w_{i1} + \sum_{i=1}^{n}\sum_{j=2}^{n_i} (w_{ij} - w_{i1})\, x_{ij}$$
Comparison with Equation (15) shows the correspondence between the weights and the logistic regression coefficients as
$$\beta_0 = \operatorname{logit}(p_0) + \sum_{i=1}^{n} w_{i1} \tag{20}$$
$$\beta_{ij} = w_{ij} - w_{i1} \tag{21}$$
Similar expressions have been presented in the literature, for example [21,37,39].
A final note about logistic regression is that it cannot handle missing data. Therefore, factor classes without observed landslides should be excluded from the model.
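For a single categorical factor, the maximum likelihood solution is available in closed form, which makes it easy to check that the likelihood equations are the same statements as the unbiasedness conditions. The sketch below uses hypothetical counts and takes class 1 as the reference class; a general multi-factor fit would need an iterative solver instead.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical single factor with three classes: class -> (cells, landslide cells).
counts = {1: (200, 4), 2: (100, 5), 3: (50, 1)}
p_obs = {j: s / n for j, (n, s) in counts.items()}

# Closed-form maximum likelihood solution with class 1 as the reference class.
beta0 = logit(p_obs[1])
beta = {j: logit(p_obs[j]) - beta0 for j in counts if j != 1}

def predicted(j):
    """Predicted landslide probability in class j."""
    return inv_logit(beta0 + beta.get(j, 0.0))

# Likelihood equations: residuals summed over the domain and over each class.
score0 = sum(s - n * predicted(j) for j, (n, s) in counts.items())
score = {j: counts[j][1] - counts[j][0] * predicted(j) for j in beta}
```

Both `score0` and every entry of `score` vanish up to rounding, so the fitted model reproduces the observed probability in the whole domain and in each class.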

Unique Conditions
One way to avoid conditional dependency is to overlay factors to create combined factor classes, which can improve conditional independence [20]. Ultimately, all factors can be overlaid and all factor classes combined to identify unique conditions, which can be indicated as
$$y_k = \prod_{i=1}^{n} x_{ij} \tag{22}$$
where the subscript $j$ indicates any factor class of factor $i$ in the range 1 to $n_i$, and $y_k$ is a binary indicator such that $y_k = 1$ indicates the presence and $y_k = 0$ the absence of a unique combination of factor classes. There are $\prod_{i=1}^{n} n_i$ possibilities for $y_k$, but most of these will not occur because many factor classes do not overlap. Furthermore, all occurring combinations are categorical and non-overlapping, and fully cover the study area, so
$$\sum_{k} y_k = 1 \tag{23}$$
Therefore, the unique conditions form a conditionally independent set $y = \{y_k\}$ of controlling factors, so that a model for predicting the conditional landslide probability can be obtained as
$$\operatorname{logit}(p_y) = \operatorname{logit}(p_0) + \sum_{k} w_k\, y_k \tag{24}$$
where $p_y$ is the posterior landslide probability given the set of unique conditions $y$, and $w_k$ are weights that can be obtained from landslide observations, similarly as in Equation (14), so that
$$w_k = \operatorname{logit}(p_k) - \operatorname{logit}(p_0) \tag{25}$$
where $p_k$ is the frequency of landslides observed in the area $A_k$ occupied by factor $y_k = 1$, given by
$$p_k = \frac{1}{A_k}\int_A x_0\, y_k\, dA \tag{26}$$
When Equation (25) is inserted into Equation (24) and Equation (23) is used, it follows that
$$\operatorname{logit}(p_y) = \sum_{k} \operatorname{logit}(p_k)\, y_k, \quad \text{i.e.,} \quad p_y = \sum_{k} p_k\, y_k \tag{27}$$
This shows that the predicted posterior probability $p_y$ is constant in the area occupied by $y_k = 1$ and equal to the probability $p_k$ observed in that area. Since the study area is completely covered by the set $y$ and each factor class is covered by a subset of $y$, Equations (4) and (5) apply, which shows that there is no model bias. Furthermore, the model is unbiased in any subset that can be composed of $y$. Therefore, we can assume that there is no other model that can produce better results than this.
When all factors are conditionally independent, the weights-of-evidence model and the unique conditions model are equivalent, because both methods solve Equation (10) exactly and predict the same posterior probabilities: the latter by combining the observed probabilities of all possible combinations of the factor classes, and the former by combining the observed probabilities of the factor classes, which should lead to the same result if the factors are conditionally independent. Since the unique conditions model is unbiased, weights-of-evidence must also be unbiased because the results are the same if the factors are conditionally independent.
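The unique conditions model amounts to grouping raster cells by their combination of factor classes and assigning each group its observed landslide frequency. The following minimal sketch assumes a hypothetical eight-cell raster with two factors; the class names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical cells: (landslide indicator x0, (class of factor 1, class of factor 2)).
cells = [
    (1, ("schist", "steep")), (0, ("schist", "steep")),
    (0, ("schist", "gentle")), (0, ("schist", "gentle")),
    (1, ("gneiss", "steep")), (1, ("gneiss", "steep")), (0, ("gneiss", "steep")),
    (0, ("gneiss", "gentle")),
]

# Accumulate the cell count and landslide count of every occurring combination.
tally = defaultdict(lambda: [0, 0])
for x0, combo in cells:
    tally[combo][0] += 1
    tally[combo][1] += x0

p_k = {combo: slides / n for combo, (n, slides) in tally.items()}
posterior = [p_k[combo] for _, combo in cells]  # constant within each unique condition

def bias(select):
    """Mean posterior minus observed frequency over the cells chosen by select(combo)."""
    chosen = [(x0, p) for (x0, combo), p in zip(cells, posterior) if select(combo)]
    return sum(p - x0 for x0, p in chosen) / len(chosen)

bias_domain = bias(lambda combo: True)
bias_steep = bias(lambda combo: combo[1] == "steep")
```

The bias is zero for the whole domain and for any factor class, because every such area is a union of unique conditions in which the prediction equals the observed frequency.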

Test Case
For demonstration and discussion, we consider a case study derived from Kayastha et al. [41] and De Smedt et al. [23]. In these studies, conditional probabilities were calculated to predict landslides in a river basin in Nepal. The basin covers an area of 124.26 km² and 295 landslides were observed, ranging in size from about 400 m² to 0.1 km². The total area of landslides is 2.35 km² or 1.9% of the total basin. The prior probability of landslides is therefore $p_0 = 0.019$. The geo-environmental conditions that influence landslides consist of eight factors with three to nine classes, as listed in Table 2. All factors are categorical and given in the form of raster maps with a resolution of 20 m, as shown in Figure 1. Details can be found in the original studies.
Weights-of-evidence results are obtained using the R Information package [42], and for logistic regression, we used the glm function for fitting generalized linear models by maximum likelihood from the R stats package [43]. The unique conditions method can be performed using standard GIS techniques or programmed numerically accordingly, as done in this study (Figure 2).

Conditional Independence of the Factors
We first present a simple illustrative application by considering only a single controlling factor, namely geology, which is the most important predictor of landslides, as found in the previous studies [23,41]. Because the nine geology classes are non-overlapping, they are conditionally independent, so the weights-of-evidence model and the unique conditions model become identical. The results for weights-of-evidence and logistic regression are presented in Table 1. The second column lists the observed landslide probability $p_j$ for each geology class (subscript $i$ is dropped because there is only one factor), which is also the probability $p_k$ predicted by the unique conditions model. The third column gives the weights $w_j$ derived from the observed probabilities $p_j$ as given by Equation (14) or (25). Thus, predictions of the posterior landslide probability with weights-of-evidence given geology as the only controlling factor correspond exactly to the observed probabilities, because the geology classes are conditionally independent.
Table 1. Results obtained by weights-of-evidence, logistic regression, and the unique conditions model, using only geology as a controlling factor: geology classes; $p_j$: observed and unique conditions model probabilities; $w_j$: weights-of-evidence; and $\beta_j$: logistic regression coefficients. The last row lists the prior probability $p_0$, $\operatorname{logit}(p_0)$, and the intercept of the logistic regression model $\beta_0$, respectively.
The last column gives the coefficients $\beta_j$ of the logistic regression model obtained by maximum likelihood, which relate to the $w_j$ values as given by Equation (21). For example, $\beta_2 = -0.536$ is equal to $w_2 - w_1 = -1.780 + 1.244 = -0.536$. The last row of the table lists the prior probability $p_0$, $\operatorname{logit}(p_0)$, and the intercept of the logistic regression $\beta_0$, the latter corresponding to Equation (20) since $\beta_0 = -5.193$ is equal to $\operatorname{logit}(p_0) + w_1 = -3.949 - 1.244 = -5.193$. Hence, all model results are equivalent and unbiased due to the conditional independence of the controlling factor.
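The correspondence between weights and logistic regression coefficients under conditional independence can be written as a small conversion routine. The function name and data layout are hypothetical; the numbers are the geology values quoted in the text.

```python
def weights_to_coefficients(logit_p0, weights):
    """Map {factor: [w_i1, w_i2, ...]} to (beta0, {factor: [beta_i2, ...]}).

    Valid only when the factors are conditionally independent; the first
    class of every factor is taken as the reference class.
    """
    beta0 = logit_p0 + sum(w[0] for w in weights.values())
    betas = {f: [wj - w[0] for wj in w[1:]] for f, w in weights.items()}
    return beta0, betas

# Geology-only case, first two classes as quoted in the text.
beta0, betas = weights_to_coefficients(-3.949, {"geology": [-1.244, -1.780]})
```

When several dependent factors are combined, the coefficients obtained this way no longer match the maximum likelihood estimates, which is exactly the discrepancy reported for the full model.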

Conditional Dependence of the Factors
In the following illustrative application, all controlling factors are considered. The results for weights-of-evidence and logistic regression are presented in Table 2. The unique conditions model is also applied, but these results are too extensive to present in tabular form. The column marked by $p_{ij}$ lists the observed landslide probability for all factor classes. The column denoted by $w_{ij}$ gives the weights-of-evidence, which relate to the observed probabilities as given by Equation (14). However, predictions of the posterior landslide probability with weights-of-evidence do not match the observed probabilities, because there is no conditional independence of the factors. For example, the mean of the posterior probability is 0.027, which is larger than the observed prior probability $p_0 = 0.019$. Likewise, the mean of the posterior probabilities in each factor class area is not equal to the observed probability in that area. The weights-of-evidence model is therefore biased because the controlling factors are not conditionally independent.
The $\beta_{ij}$ coefficients of the logistic regression model are again obtained by maximum likelihood. However, unlike the previous case, the resulting $\beta_{ij}$ values are not related to the $w_{ij}$ values as expressed by Equation (21). For example, for the second geology class, we obtain $\beta_{62} = -0.116$, which is clearly different from $w_{62} - w_{61} = -0.53$. Moreover, the logit of the prior probability $\operatorname{logit}(p_0)$ and the estimated intercept of the logistic regression $\beta_0$, listed in the last row of Table 2, do not correspond to Equation (20) since $\beta_0 = -7.438$, while $\operatorname{logit}(p_0) + \sum_{i=1}^{n} w_{i1} = -3.949 - 4.793 = -8.742$, because the factors are not conditionally independent. Nevertheless, predictions of the posterior probability with logistic regression for each factor class satisfy Equation (5) and agree with the observed probabilities. Thus, the logistic model is unbiased with respect to the factor classes and the overall domain, while the weights-of-evidence model is not.
The results obtained with the unique conditions model are also unbiased because the unique condition areas do not overlap, and are therefore conditionally independent. This model therefore precisely predicts the observed landslide probabilities in the total domain and in all factor classes. So, in this case, weights-of-evidence is biased while logistic regression and the unique conditions model are unbiased. Furthermore, logistic regression is unbiased for the factor classes and the total domain, while the unique conditions model is unbiased for all possible combinations of factor classes. Thus, the models are not equivalent: logistic regression is better than weights-of-evidence, and the unique conditions model performs best.
Maps of the estimated posterior probabilities obtained with the different models are presented in Figure 3. Note that the distribution of the probabilities is very skewed. Most values are lower than the prior probability $p_0 = 0.019$ and correspond to areas where landslide susceptibility is very low. These areas are shown in blue in the figures and cover a large part of the study area. Posterior probabilities that are greater than the prior probability are shown in green and yellow in the maps. These are areas prone to landslides and mainly occur in the eastern and southern parts of the study area. Note that the map obtained with weights-of-evidence shows more of these areas than the other maps, which is the result of the model bias leading to an overprediction of landslide susceptibility. Also note that the unique conditions model indicates some landslide-prone areas in the northeastern part of the domain that are absent or less pronounced on the other maps. Orange and red colors indicate areas with very high landslide susceptibility. Such areas are present in the western part of the map obtained with the unique conditions model and, to a lesser extent, in the map obtained with weights-of-evidence, but the latter is likely an overprediction caused by model bias.
There appears to be a close similarity between the logistic regression and unique conditions maps, but the logistic map is more blurred, while the unique conditions map is crisper. It appears that the map obtained by logistic regression is a smoothed version of the map obtained by the unique conditions model.
The performance of the models for classification of landslide susceptibility is evaluated by the receiver operating characteristic (ROC) curve method and the resulting area under the curve (AUC) [44]. The results are shown in Figure 4. The AUC value is 0.760 for weights-of-evidence, 0.772 for logistic regression, and 0.928 for the unique conditions model. Thus, weights-of-evidence and logistic regression have almost equal discriminatory power for landslide susceptibility classification. The bias of weights-of-evidence has little effect on its discriminatory power. Nevertheless, the resulting landslide susceptibility classification will have little physical meaning due to the model bias. It is clear that the unique conditions model outperforms the other models because its discriminatory power is much higher and the AUC value is close to one (i.e., perfect classification).
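The AUC can be read as the probability that a randomly chosen landslide cell receives a higher posterior probability than a randomly chosen landslide-free cell. The sketch below implements this pairwise definition directly on hypothetical scores; it is quadratic in the number of cells, so a rank-based formulation would be preferable for a raster of several hundred thousand cells.

```python
def auc(scores, labels):
    """Area under the ROC curve as the chance that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical posterior probabilities and landslide indicators for six cells.
scores = [0.90, 0.40, 0.35, 0.10, 0.05, 0.02]
labels = [1, 0, 1, 0, 0, 0]
value = auc(scores, labels)
```

Because the AUC depends only on the ranking of the scores, a biased model can still rank cells well, which is consistent with the similar AUC values of weights-of-evidence and logistic regression.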

Discussion
The theoretical developments and illustrative examples show that conditional independence of the controlling factors and model bias are related, as also reported in the literature [17][18][19][20][21][22]38,39]. In this study, it is clearly shown that in the case where controlling factors are conditionally independent, the weights-of-evidence, logistic regression, and unique conditions models are equivalent, meaning they will yield the same posterior probabilities. Therefore, in practice, one can choose any of these methods based on simplicity of the technique or the skill and experience of the user. The equivalence between weights-of-evidence and logistic regression in the case of conditional independence of the factors has been demonstrated in other studies [18][19][20][21]38,39], but the equivalence with the unique conditions method is a new contribution from this study.
Conditional independence of the controlling factors is, in practice, the exception rather than the rule. When there is no conditional independence, weights-of-evidence produces biased posterior probabilities, which is usually ignored or disregarded in practice, especially in landslide susceptibility studies, where weights-of-evidence has proven to be a very popular technique [4][5][6]. On the other hand, logistic regression provides unbiased posterior probabilities for individual factor classes and the overall study area, but not for higher levels when factors are combined. Methods have been proposed in the literature to improve weights-of-evidence and logistic regression by including so-called mixed terms, that is, combinations of factors, based on trial and error or search algorithms that improve the likelihood of the predictions [4,21,33,35]. Such methods may be justified, but it seems likely that their results will never match the results of the unique conditions model.
The above discussion is further illustrated by considering the ROC curves and AUC values obtained with the different models, as shown in Figure 4. When only landslide susceptibility classification is considered, weights-of-evidence and logistic regression perform almost equally. Because weights-of-evidence is easier to perform, it may be preferable in practice if only classification is pursued. However, neither of these methods can match the discriminatory power of the unique conditions model. Moreover, Figure 3 shows that the posterior probability map obtained with the unique conditions model is more detailed and covers the entire range of probabilities from zero to one, while for logistic regression, the predicted probabilities range only from zero to a maximum of 0.36, and for weights-of-evidence from zero to 0.81, the latter likely being an overestimate caused by the model bias.
It is generally accepted that direct estimation of conditional probabilities for all combinations of controlling factors is infeasible due to the excessive computational requirements. When the controlling factors consist of $n$ binary patterns, $2^n$ unique combinations are possible, making it very difficult, if not impossible, to directly estimate the conditional probabilities of all combinations [17,20]. For instance, in the present case, there are a total of 47 factor classes, which would imply more than $10^{14}$ possible combinations. In practice, however, this is not the case, because classes of the same factor do not overlap. In such a case, the possible unique combinations reduce to the product of the number of classes in each factor. In the present case, this would amount to 629,856 possible combinations, which is still a large number. However, in addition, not all classes of different factors overlap, so the actual number of unique combinations may be much smaller. Hence, the trick is to consider only the combinations that actually occur and ignore the rest. This can be achieved, as shown in Figure 2, by comparing the combination of factors in a unit area with all other unit areas in the study domain and, if the conditions match, counting the number of unit areas and the number of observed landslides for this combination, making it possible to estimate the observed landslide probability using Equation (26). Because the unique combinations of all factors do not overlap, they are conditionally independent, such that the posterior probabilities are equal to the observed probabilities as given in Equation (27). The numerical derivation can be tedious if the unit area is small relative to the total domain. In the present case, there are 310,649 unit cells, which is large but achievable with modern computing power.
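The size of the search space can be checked with a few lines of arithmetic. The figures are those quoted in the text; the observation that the number of occurring combinations cannot exceed the number of raster cells is an added remark that follows from each cell having exactly one combination.

```python
n_classes = 47           # total number of factor classes in the test case
n_cells = 310_649        # raster cells in the study domain
product_bound = 629_856  # product of the class counts per factor, as quoted

# Upper bound if every class were an independent binary pattern.
binary_patterns = 2 ** n_classes

# Each cell carries exactly one combination, so occurring combinations are
# bounded by both the class-count product and the number of cells.
occurring_bound = min(product_bound, n_cells)
```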
The number of unique conditions actually occurring is 28,605, which is much less than the theoretically possible number of combinations. The size of the unique conditions areas ranges from 1 to 467 grid cells, i.e., 400 m² to 187 ha. The average size of the unique conditions area is 4.34 ha. In general, the areas with unique conditions are quite small, so in many of these areas, no landslide has been observed. In fact, 66% of the total domain appears to be free of landslides. This could be interpreted as missing data, and the posterior probability could be set equal to the prior value. However, this would conflict with the unbiasedness of the model, so we chose to be consistent and set the posterior probability equal to zero. Furthermore, 76% of the domain is found to have a posterior probability lower than the prior probability, and thus can be assumed to have low landslide susceptibility. There are also 171 unit areas with unique conditions and observed landslides, implying a landslide probability of one. These represent 3% of all observed landslides in the study area.
At first glance, one might conclude that the unique conditions model does not provide information about the importance of each factor class. However, this is not the case, because the model is unbiased and precisely predicts the average posterior landslide probability in each factor class area. Because these are equal to the observed landslide probabilities, the $p_{ij}$ values or the corresponding weights $w_{ij}$ are measures of the importance and predictive power of each factor class. Such information could be used prior to modelling to discard factors with classes that exhibit little or negligible discriminatory power. For instance, in the present case, one might decide to remove the slope shape factor because it has low $w_{ij}$ values. This reduces the number of unique conditions to 13,341, which can save computing time. The size of the unique conditions areas now ranges from 1 to 1090 grid cells, i.e., 400 m² to 436 ha, with an average of 11.4 ha. Now, 54% of the total area is free of landslides and 74% has a posterior probability that is lower than the prior probability. However, the resulting posterior landslide probabilities are not much different from the previous results and the AUC value becomes 0.90. Removing the slope shape factor therefore has little effect, apart from the gain in computing time.
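A screening of factor classes along these lines could be sketched as follows. The sketch assumes that the weight of a class is the difference between the logit of its observed landslide probability and the logit of the prior probability, by analogy with the index in Equation (28); this form, the function name `class_weights`, and the input layout are assumptions made for illustration only.

```python
import numpy as np

def class_weights(factor, landslide):
    """Observed landslide probability and weight for each class of one factor.

    factor    : (n_cells,) integer class code of a single controlling factor
    landslide : (n_cells,) 0/1 indicator of observed landslides
    """
    factor = np.asarray(factor).ravel()
    landslide = np.asarray(landslide, dtype=float).ravel()
    p0 = landslide.mean()  # prior probability, assumed strictly between 0 and 1
    classes, inverse = np.unique(factor, return_inverse=True)
    inverse = inverse.ravel()
    n_cells = np.bincount(inverse)
    n_slides = np.bincount(inverse, weights=landslide)
    p = n_slides / n_cells  # observed probability per class
    # Assumed weight: logit(p) - logit(p0); infinite when p is exactly 0 or 1.
    with np.errstate(divide="ignore"):
        w = np.log(p / (1.0 - p)) - np.log(p0 / (1.0 - p0))
    return classes, p, w
```

A factor whose classes all have weights close to zero adds little discriminatory power and is a candidate for removal before the unique conditions are enumerated.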
Landslide probability estimates are not well suited for mapping landslide susceptibility, because the distribution over the study area is very skewed and there is no clear rule for classifying the probability values in landslide susceptibility categories. Therefore, we propose a landslide susceptibility index (LSI), similar to [23], defined as

$$\mathrm{LSI} = \operatorname{logit}(p_y) - \operatorname{logit}(p_0), \tag{28}$$

which for the unique conditions model becomes

$$\mathrm{LSI} = \sum_k \left[\operatorname{logit}(p_k) - \operatorname{logit}(p_0)\right] x_k = \sum_k w_k x_k. \tag{29}$$

Such a classification has the advantage of being easy to interpret, since positive values indicate areas prone to landslides and, conversely, negative values indicate areas less prone to landslides. Moreover, landslide susceptibility classes can be easily obtained without subjective judgment by dividing the LSI values into equal intervals.
The resulting LSI map for the present case is shown in Figure 5. Since the probability values can be zero or one, the weights can go to infinity. Therefore, the weights are limited to a range of −10 to +10. The LSI map shows much more detail than the posterior landslide probability maps. LSI values around zero are represented by the yellow color and correspond to areas that are no more or less prone to landslides than observed. Negative LSI values, represented by the green and blue colors, indicate areas not prone to landslides. These take up a large part of the basin. On the other hand, areas with orange and red colors are prone to landslides and cover only some small parts of the basin. Thus, transformation of posterior landslide probability into corresponding LSI values allows a simple and clear interpretation of landslide susceptibility.
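The transformation from posterior probability to LSI, including the limit of −10 to +10 and a division into equal intervals, can be sketched as follows. The function names, the `limit` parameter, and the choice of five classes in the usage note are illustrative assumptions, not values prescribed by the study.

```python
import numpy as np

def landslide_susceptibility_index(p_post, p_prior, limit=10.0):
    """LSI = logit(p_post) - logit(p_prior), limited to [-limit, +limit]."""
    p_post = np.asarray(p_post, dtype=float)
    # Probabilities of exactly 0 or 1 give infinite logits; the resulting
    # infinities are clipped below, so the divide warnings are silenced.
    with np.errstate(divide="ignore"):
        lsi = np.log(p_post / (1.0 - p_post)) - np.log(p_prior / (1.0 - p_prior))
    return np.clip(lsi, -limit, limit)

def equal_interval_classes(lsi, n_classes, limit=10.0):
    """Assign each LSI value to one of n_classes equal-width intervals,
    numbered from 0 (least susceptible) to n_classes - 1 (most susceptible)."""
    edges = np.linspace(-limit, limit, n_classes + 1)
    # Only interior edges are used, so both end values fall in the outer classes.
    return np.digitize(lsi, edges[1:-1])
```

For example, with five classes the interior class boundaries lie at −6, −2, 2, and 6, so cells with an LSI near zero fall in the middle class.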

Conclusions
We examined three statistical methods for landslide susceptibility mapping that predict the conditional probability of landslides with categorical controlling factors (weights-of-evidence, logistic regression, and a unique conditions model) by considering all possible combinations of the controlling factors. The strengths and weaknesses of the models were illustrated and tested through application to a practical case study.
It is shown that when all factors are conditionally independent, all models are equivalent and result in unbiased predictions of the posterior landslide probability. When there is a conditional dependency between factors, the posterior probabilities derived from weights-of-evidence are biased and generally too large compared to the observations. However, the bias of weights-of-evidence has little effect on its discriminatory power. On the other hand, logistic regression produces unbiased estimates with respect to the factor classes and the overall study area, regardless of whether the controlling factors are conditionally independent.
The unique conditions model is always unbiased because the unique condition areas do not overlap and are therefore conditionally independent. Consequently, this model predicts the landslide probabilities without bias in the total domain, in all factor classes, and in any area that can be composed by combining factor classes. Moreover, the unique conditions model outperforms the other models because its discriminatory power is much higher, with an AUC value close to one. The application of the unique conditions model can become computationally cumbersome if there are too many controlling factors and overlaps. This can also lead to unique conditions areas becoming too small to be meaningful. To avoid this, the most important factors can first be selected based on the observed landslide probabilities.
Because landslide probability estimates are not well suited for landslide susceptibility mapping, we propose a landslide susceptibility index that has the advantage of being easy to interpret without subjective judgment.
Although the landslide dataset used in this study does not include all possible landslide-causing factors, the results of this study are promising and show potential for broader practical use. However, the quality and quantity of input data are important for achieving good results, so further research is necessary. We therefore recommend that future research consider other field cases with more complete thematic layers and provide a geomorphological evaluation and cross-validation of the predictions. We also recommend validating the findings of this work against other data-driven methods, such as machine learning and deep learning models.

Figure 1. Raster maps of the observed landslides indicator $x_0$ and the controlling factors $x_{ij}$; the legend labels of the factor classes are listed in the same order as in Table 2.

Figure 2. Workflow for the unique conditions model explained in pseudocode.

Figure 3. Raster maps of the posterior landslide probability obtained with weights-of-evidence, logistic regression, and the unique conditions model.

Figure 4. ROC curves obtained using weights-of-evidence, logistic regression, and the unique conditions model.


Figure 5. Map of the landslide susceptibility index (LSI) obtained with the unique conditions model.

Table 2. Results obtained by weights-of-evidence and logistic regression, using all controlling factors. Columns: factor classes; $p_{ij}$, observed probabilities; $w_{ij}$, weights-of-evidence; $\beta_{ij}$, logistic regression coefficients. The last row lists the prior probability $p_0$, logit($p_0$), and the intercept $\beta_0$ of the logistic regression model, respectively.