Ecoregion-Based Landslide Susceptibility Mapping: A Spatially Partitioned Modeling Strategy for Oregon, USA

Xu, Zhixiang; Zuo, Peng; Zhao, Wen; Zhou, Zeyu; Shao, Xiangyu; Yu, Junpo; Yu, Haize; Wang, Weijie; Gan, Junwei; Duan, Jinshun; Jin, Jiming

doi:10.3390/app152011242


by Zhixiang Xu¹,†, Peng Zuo¹,†, Wen Zhao²,†, Zeyu Zhou¹, Xiangyu Shao¹, Junpo Yu¹, Haize Yu¹, Weijie Wang¹, Junwei Gan¹, Jinshun Duan¹ and Jiming Jin¹,*

¹ College of Resources and Environment, Yangtze University, Wuhan 430100, China

² Lanzhou Institute of Arid Meteorology, China Meteorological Administration/Key Laboratory of Arid Climate Change and Reducing Disaster of Gansu Province/Key Open Laboratory of Arid Climate Change and Reducing Disaster, China Meteorological Administration, Lanzhou 730020, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Appl. Sci. 2025, 15(20), 11242; https://doi.org/10.3390/app152011242

Submission received: 30 September 2025 / Revised: 16 October 2025 / Accepted: 17 October 2025 / Published: 20 October 2025


Abstract

Conventional non-partitioned Landslide Susceptibility Mapping (LSM), which neglects geospatial heterogeneity, often has limitations in accurately capturing local risk patterns. To address this challenge, this study investigated the effectiveness of localized modeling in the environmentally diverse state of Oregon, USA, by comparing ecoregion-based local models with the non-partitioned model. We partitioned Oregon into seven distinct units using the U.S. Environmental Protection Agency (EPA) Level III Ecoregions and developed one global and seven local models with the eXtreme Gradient Boosting (XGBoost) algorithm. A comprehensive evaluation framework, including the Area Under the Curve (AUC), Landslide Density (LD), and the Total Deviation Index (TDI), was used to compare the models. The results demonstrated the clear superiority of the partitioned strategy. Moreover, different ecoregions were found to have distinct dominant landslide conditioning factors, revealing strong spatial non-stationarity. Although all models generated high AUC values (>0.93), LD analysis showed that the local models were significantly more efficient at identifying high-risk zones. This advantage was particularly pronounced in critical, landslide-prone western areas; for instance, in the Willamette–Georgia–Puget Lowland, the local model’s LD value in the ‘very high’ susceptibility class was over 3.5 times that of the global model. High TDI values (some >35%) further confirmed fundamental spatial discrepancies between the risk maps obtained by the two strategies. This research substantiated that, in geographically complex terrains, partitioned modeling is an effective approach for more accurate and reliable LSM, providing a scientific basis for developing targeted regional disaster mitigation policies.

Keywords: landslide susceptibility; partitioned modeling; machine learning; ecoregion; spatial non-stationarity

1. Introduction

Landslides are one of the most widespread and destructive natural hazards globally, posing a severe threat to human life, infrastructure, and economic development [1,2]. Primarily triggered by intense rainfall, seismic activity, and anthropogenic engineering activities, landslides cause substantial economic losses worldwide each year [3,4]. The U.S. National Research Council (NRC) has reported that landslides affect every U.S. state, causing an average of 25 to 50 fatalities and economic damages of several billion USD annually [5]. Furthermore, analysis by the United States Geological Survey (USGS) indicates that approximately 44% of the country’s land area is at risk of landslides, underscoring the urgent demand for effective hazard assessment and mitigation strategies [6].

Landslide Susceptibility Mapping (LSM) is a critical tool for spatial planning and disaster management [7]. It aims to identify and delineate areas prone to landslides by analyzing the relationships between historical landslide events and a series of conditioning factors, such as topography, geology, hydrological conditions, and land cover [8,9]. Over the past few decades, LSM methodologies have evolved from qualitative assessments to quantitative statistical models and, subsequently, to machine learning algorithms. Early approaches were mostly qualitative, relying on expert knowledge and heuristic models [10]. The rapid advancement of geospatial technologies and computational power has since facilitated the widespread use of statistical models such as the frequency ratio and logistic regression [11,12]. In recent years, Machine Learning (ML) algorithms have revolutionized the field, demonstrating superior performance in capturing the complex, non-linear relationships between landslide occurrence and conditioning factors [13,14,15]. For instance, Chowdhury et al. (2024) applied various models, including Random Forest (RF), in Bangladesh, producing highly reliable landslide susceptibility maps with Area Under the Receiver Operating Characteristic Curve (AUC) values exceeding 0.90 on test data [16]. Similarly, Sun et al. (2024) used a Support Vector Machine (SVM) model, optimizing its hyperparameters with a Bayesian algorithm, to achieve a high predictive accuracy of 96.32% on the test set for highways on the Qinghai–Tibet Plateau [17]. These successful applications highlight the robust capability of ML methods.

However, a key challenge persists in large-scale LSM research: spatial heterogeneity. Most studies tend to apply a “one-size-fits-all” model to a large area, such as an entire state or a major river basin [18,19,20]. This approach relies on the implicit assumption of spatial stationarity, implying that the relationship between landslide occurrence and its conditioning factors remains consistent across the entire study area. Yet this assumption is frequently untenable in areas characterized by significant environmental heterogeneity, where the dominant conditioning factors may vary substantially among sub-regions [21]. A global model, which seeks to identify general patterns, essentially “smooths out” the diverse local relationships, producing a generalized or averaged trend. Consequently, such models may exhibit suboptimal predictive performance at local scales, leading to an underestimation of risk in some areas and an overestimation in others [22,23]. Therefore, achieving effective LSM in large-scale, spatially heterogeneous regions represents a critical and urgent scientific challenge.

To address this issue, a promising approach is to partition a large, heterogeneous region into several smaller, relatively homogeneous sub-regions for independent modeling. This strategy has been explored in recent research. For example, Yu et al. (2020) applied Geographically Weighted Regression to their study of China’s Three Gorges Reservoir area. Based on the spatial patterns of key environmental factors such as elevation and distance to rivers, they partitioned the area into 18 predictive units and then developed a separate susceptibility model for each [24]. Similarly, Triplett et al. (2025) partitioned the state of Minnesota into five regions based on landscape characteristics, then built a distinct susceptibility model for each region to better address landslide issues in its post-glacial landscape [25]. While these pioneering studies effectively underscore the necessity of partitioning based on geographical homogeneity, the criteria for defining these partitions often rely on a limited set of variables or on landscape classifications specific to a particular environment. This leaves an important gap in exploring whether a robust framework for geographical stratification can yield more reliable and interpretable results. The key challenge, therefore, is to identify a partitioning basis that is not only data-driven but also conceptually grounded in the integrated ecosystem characteristics (encompassing climate, geology, soils, and land use) that collectively govern landslide processes.

This study selected the state of Oregon, a region with significant spatial heterogeneity, to investigate an ecoregion-based modeling strategy using the eXtreme Gradient Boosting (XGBoost) algorithm [26,27,28], which was chosen for its strong predictive ability. We used the Level III Ecoregions classification system developed by the U.S. Environmental Protection Agency (EPA) as the basis for regional partitioning [29]. The primary objectives of this research were: (1) to develop a statewide landslide susceptibility model for Oregon and customized models for its individual Level III ecoregions; (2) to comprehensively compare the two modeling strategies by evaluating their quantitative performance, spatial patterns, and risk identification capabilities; and (3) to analyze and compare the dominant landslide conditioning factors within different ecoregions to reveal the spatial non-stationarity of landslide-controlling mechanisms.

2. Materials and Methods

2.1. Study Area and Ecoregions

The study area is the state of Oregon, USA (Figure 1), with an area of about 250,000 km². The state exhibits a pronounced climatic gradient: the western part is influenced by a maritime climate with abundant precipitation, whereas the eastern part lies in the rain shadow of the Cascade Range and is comparatively arid. This climatic disparity has sculpted a complex and diverse geomorphological landscape, ranging from forest-covered coastal mountains and glacier-capped volcanic peaks to high plains dominated by shrubs and grasslands, and it also features a variety of landforms, including fertile agricultural river valleys, coastal beaches, desert salt flats, and wetlands [30,31]. This intricate geological and environmental context results in the widespread distribution of landslide hazards, with potentially significant regional variations in their formative mechanisms [32,33]. Oregon is divided into 10 Level III ecoregions, which were consolidated into seven modeling units for the purpose of this study.

The ecoregion classification system developed by the EPA was used as the basis for spatial partitioning. Ecoregions are defined as areas of general similarity in ecosystems and in the type, quality, and quantity of environmental resources. This system scientifically classifies areas by identifying differences in ecological carrying capacity and the potential response of the environment to natural or anthropogenic disturbances. The delineation of ecoregion boundaries is based on the core principle of integrated analysis, which involves analyzing the spatial patterns and combinations of a suite of biotic and abiotic factors that collectively reflect an ecosystem’s status. These factors typically include geology, geomorphology, climate, soils, and land use [34,35,36]. Level III ecoregions were selected for this study to balance macro-scale patterns with local characteristics. This level was deemed optimal because the coarser levels, Level I and Level II (both with three regions), cannot capture Oregon’s significant environmental variations. Conversely, the finer Level IV classification is overly detailed (65 regions), which would hinder uniform pattern analysis at the statewide scale.

The seven Level III ecoregions exhibit significant differences in their environmental characteristics. The Coast Range is the most landslide-active region, with extreme maritime rainfall and fragile Tertiary sedimentary geology. The Cascades feature steep volcanic topography and loose volcanic ash soils, where high rainfall and snowmelt readily trigger debris flows. The Eastern Cascades Slopes and Foothills form a semi-arid transition zone with sparse vegetation and erodible soils, rendering the area highly sensitive to anthropogenic disturbances such as road construction. The Willamette Valley ecoregion and the Oregon portion of the Puget Lowland/Strait of Georgia, the latter of which covers only a very small area within the state, were combined to form a single composite modeling unit named the “Willamette–Georgia–Puget Lowland”. This region is a low-lying agricultural and population center where landslides primarily occur on localized slopes affected by river erosion and human engineering activities. The Klamath Mountains are distinguished by a highly complex and fragmented geological and tectonic setting, which creates an inherent landslide susceptibility. The Columbia Plateau is arid, with landslides largely confined to the slopes of incised river valleys affected by human hydrological activities such as irrigation. Finally, the Blue Mountains, Northern Basin and Range, and Snake River Plain ecoregions were consolidated into a composite modeling unit. This decision was based on two primary factors. First, these adjacent areas share a semi-arid to arid climate and a complex geological background. Second, historical landslide data in this part of the state are sparse, making robust individual modeling for each ecoregion challenging. The goal of this consolidation was to develop a robust model for Oregon’s eastern geological environments, where landslides are relatively infrequent. This composite region is hereafter referred to as the ‘Blue Mountains Complex’.

2.2. Data Preparation

2.2.1. Landslide Inventory and Sample Construction

The landslide inventory data were sourced from the official Statewide Landslide Information Database for Oregon (SLIDO), maintained by the Oregon Department of Geology and Mineral Industries (DOGAMI) [37]. We used the latest version, SLIDO-4.5, from which 15,386 landslide points were extracted as positive samples. For the global model, negative samples (non-landslide points) were generated using a random sampling method with buffer exclusion. First, a 5 km radius buffer was created around each landslide point. Subsequently, an equal number of 15,386 negative samples were randomly generated outside these buffer zones. The final dataset for the global model comprised 30,772 samples with a 1:1 positive-to-negative sample ratio. The buffer radius was selected based on previous studies and the typical impact area of landslides in the region [38], aiming to mitigate sampling risks associated with the influence zones of known landslides and potential inventory incompleteness.

Within each ecoregion, all landslide points were selected as positive samples. An equal number of negative samples were then generated within that ecoregion using the identical strategy applied to the global model (buffer exclusion and random sampling). This independent construction process ensured that each partitioned dataset reflects its local geographical characteristics, while maintaining methodological comparability between the local models and the global model.
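The buffer-exclusion sampling described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the code used in the study: the function name `sample_negatives`, the rectangular bounding box standing in for an ecoregion polygon, and the toy coordinates are all hypothetical, and coordinates are assumed to be in a projected system with metre units.

```python
import math
import random


def sample_negatives(landslides, bounds, n_samples, buffer_m=5000.0,
                     seed=42, max_tries=1_000_000):
    """Draw random points that lie outside the buffer of every landslide.

    landslides: list of (x, y) tuples in metres (projected coordinates assumed).
    bounds: (xmin, ymin, xmax, ymax); a rectangle used here as a stand-in
            for the real region boundary.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    tries = 0
    while len(negatives) < n_samples:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough points outside the buffers")
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # Keep the candidate only if it is farther than buffer_m from all landslides.
        if all(math.hypot(x - lx, y - ly) > buffer_m for lx, ly in landslides):
            negatives.append((x, y))
    return negatives


# Hypothetical toy region of 100 km x 100 km with three landslide points.
landslides = [(20_000.0, 20_000.0), (50_000.0, 60_000.0), (80_000.0, 30_000.0)]
negatives = sample_negatives(landslides, (0.0, 0.0, 100_000.0, 100_000.0),
                             n_samples=3)
```

The brute-force distance check is adequate for a toy example; for an inventory with thousands of points, a spatial index or a GIS buffer-and-erase operation would be the more practical choice.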

To ensure all models were trained and evaluated under identical conditions, a standardized dataset partitioning scheme was adopted. The entire process was based on sample labels (landslide = 1, non-landslide = 0) and used a fixed random seed (random_state = 42) to guarantee reproducibility. Both the total dataset for the global model and the independent sample sets for ecoregions were randomly split into a training set (64%), a validation set (16%), and a test set (20%). All sets were used as follows: the training set for model parameter learning; the validation set for hyperparameter tuning and overfitting prevention; and the test set, which was withheld during training and tuning, for a final, objective evaluation of the model’s generalization capability.
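A 64/16/20 split of this kind can be sketched in a few lines. The example below assumes that "based on sample labels" means a stratified split that preserves the 1:1 class ratio in every subset; the function name `stratified_split` and the toy label vector are hypothetical, and only the seed value of 42 is taken from the text.

```python
import random
from collections import Counter


def stratified_split(labels, fractions=(0.64, 0.16, 0.20), seed=42):
    """Return (train, val, test) index lists with class ratios preserved."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls in sorted(set(labels)):
        # Shuffle the indices of one class, then cut them into three parts.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_val = round(fractions[1] * len(idx))
        train += idx[:n_train]
        val += idx[n_train:n_train + n_val]
        test += idx[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical balanced sample set: 50 landslide (1) and 50 non-landslide (0) points.
labels = [1] * 50 + [0] * 50
train, val, test = stratified_split(labels)
class_counts = Counter(labels[i] for i in train)
```

In a library-based workflow, the same result is commonly obtained by calling scikit-learn's `train_test_split` twice with the `stratify` and `random_state` arguments, first holding out 20% for testing and then 20% of the remainder for validation.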

2.2.2. Conditioning Factors

Based on landslide formation mechanisms, previous research, and data availability for the study area, a total of 12 environmental variables were selected as initial factors (see Table 1) [39,40,41]. These factors were grouped into four categories, and their spatial distributions are shown in Figure 2. The categories are as follows: (1) Topographic and geomorphological factors: Elevation, Slope, Aspect, Curvature, Topographic Wetness Index (TWI), and Distance to Rivers (Dist_Rivers); (2) Climatic factors: Mean Annual Precipitation (MAP) and Mean Annual Temperature (MAT); (3) Surface cover factors: Lithology, Soil Type, and Land Cover; and (4) Anthropogenic factor: Distance to Roads (Dist_Roads). The codes and corresponding full names for the classes within the categorical factors (e.g., Lithology, Soil Type) are detailed in Table 2. All environmental variables were processed into 30-m resolution raster data layers and standardized to the NAD_1983_UTM_Zone_10N projected coordinate system to ensure spatial consistency [42]. These preprocessing steps were performed using ArcMap 10.8 and Python 3.9.

2.3. Methodological Workflow

The technical workflow of this study is illustrated in Figure 3 and consists of three main stages: (1) Data Preparation, which includes 12 conditioning factors (fully consistent with details in Figure 2) and the factor filtering process; (2) Model Development, which involves variable selection followed by the construction of one global and seven local models using the XGBoost algorithm; and (3) Validation and Comparison, which involves systematically comparing the two modeling strategies based on model performance, risk level classification, and factor contribution analysis.

2.4. Multicollinearity Analysis Method

In landslide susceptibility modeling, high correlation among environmental variables—known as multicollinearity—can compromise model stability and interpretability, hindering an accurate assessment of each variable’s independent contribution. To select independent and effective predictors, this study used a two-step diagnostic approach combining the Pearson Correlation Coefficient and the Variance Inflation Factor (VIF) [43,44].

First, the Pearson Correlation Coefficient (r) was used for an initial screening. This metric measures the strength and direction of the linear relationship between two continuous variables. Its formula is:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{1}$$

where $x_i$ and $y_i$ are the observed values of two different variables, $\bar{x}$ and $\bar{y}$ are their respective means, and $n$ is the number of samples. The value of $r$ ranges from −1 to 1; the closer its absolute value is to 1, the stronger the linear correlation between the two variables, and vice versa.
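Equation (1) translates directly into code. The sketch below is a plain-Python illustration; the function name `pearson_r` and the elevation and temperature values are hypothetical and are not taken from the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the square roots of the summed squared deviations.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (math.sqrt(ss_x) * math.sqrt(ss_y))


# Hypothetical sample values: temperature falls as elevation rises.
elevation_m = [100.0, 500.0, 1000.0, 1500.0, 2000.0]
temperature_c = [12.0, 10.1, 7.4, 4.9, 2.2]
r = pearson_r(elevation_m, temperature_c)
```

For these made-up values, `r` is strongly negative, the kind of relationship that would flag a variable pair for closer inspection during screening.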

Second, the Variance Inflation Factor (VIF) was used for a more comprehensive assessment. VIF quantifies the extent to which an independent variable can be explained by all other independent variables in the model, thereby providing a more effective diagnosis of multicollinearity in a multiple regression context. The VIF for the i-th variable is calculated as:

$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2} \tag{2}$$

where $R_i^2$ is the coefficient of determination from a linear regression of the $i$-th variable on all other independent variables. The VIF value lies in the range $[1, +\infty)$, with a baseline of 1 indicating complete independence (no collinearity). A higher VIF value signifies more severe multicollinearity.
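To give a feel for the scale of Equation (2), the following worked values show how VIF grows as a variable becomes more predictable from the others. The $R_i^2$ values are illustrative only; they are chosen because they map onto the VIF cut-offs of 5 and 10 that are commonly used in practice.

```latex
\begin{aligned}
R_i^2 = 0.50 &\;\Rightarrow\; \mathrm{VIF}_i = \frac{1}{1 - 0.50} = 2, \\
R_i^2 = 0.80 &\;\Rightarrow\; \mathrm{VIF}_i = \frac{1}{1 - 0.80} = 5, \\
R_i^2 = 0.90 &\;\Rightarrow\; \mathrm{VIF}_i = \frac{1}{1 - 0.90} = 10.
\end{aligned}
```

In other words, a VIF below 5 means that less than 80% of a variable's variance is explained linearly by the remaining variables.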

2.5. eXtreme Gradient Boosting (XGBoost) Model

This study used the eXtreme Gradient Boosting (XGBoost) algorithm as the core model for LSM. Proposed by Chen and Guestrin (2016), XGBoost is an advanced, efficient, and scalable implementation of the Gradient Boosting Decision Tree (GBDT) algorithm [45]. It has been widely recognized for its exceptional predictive accuracy in complex classification and regression tasks [46]. The algorithm sequentially builds an ensemble of decision trees, where each new tree is trained to correct the residuals of the preceding ones. The core advantage of XGBoost lies in its objective function, which incorporates a regularization term to control model complexity and effectively prevent overfitting. In the t-th iteration, the objective function can be expressed as:

$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{3}$$

where $l$ is the loss function measuring the discrepancy between the predicted and true values, $y_i$ is the true label of the $i$-th sample, $\hat{y}_i^{(t-1)}$ is the cumulative prediction from the previous $t-1$ iterations, and $f_t(x_i)$ is the output of the $t$-th tree. $\Omega(f_t)$ is the regularization term that penalizes model complexity, defined as:

$$\Omega(f_t) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2 \tag{4}$$

Here, T is the number of leaf nodes in the decision tree, and w is the vector of leaf weights. γ and λ are hyperparameters that control the penalty strength. To optimize model performance, this study utilized a combination of k-fold cross-validation and grid search for hyperparameter tuning on the validation set. All model development and analysis were conducted in a Python 3.9 environment, primarily using the XGBoost library (version 2.1.3).
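The regularization term in Equation (4) is simple enough to evaluate by hand for a single tree. The sketch below does so in plain Python; the function name `regularization`, the leaf weights, and the penalty strengths are hypothetical and are not the tuned settings of this study. In the XGBoost Python package, the two penalties are exposed as the `gamma` and `reg_lambda` parameters.

```python
def regularization(leaf_weights, gamma, lam):
    """Omega(f_t) = gamma * T + 0.5 * lambda * ||w||^2 for one tree.

    leaf_weights: the weights w of the tree's leaves, so T = len(leaf_weights).
    """
    n_leaves = len(leaf_weights)
    l2_norm_sq = sum(w * w for w in leaf_weights)
    return gamma * n_leaves + 0.5 * lam * l2_norm_sq


# Hypothetical trees: a small one and a larger one with bigger leaf weights.
small_tree = [0.4, -0.2, 0.1]
large_tree = [0.9, -0.8, 0.7, -0.6, 0.5, -0.4]

penalty_small = regularization(small_tree, gamma=1.0, lam=1.0)
penalty_large = regularization(large_tree, gamma=1.0, lam=1.0)
```

The larger tree receives the higher penalty on both counts (more leaves and larger weights), which is how the objective discourages overly complex trees.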

2.6. Evaluation Metrics

To evaluate the models, we established a holistic framework that incorporated not only conventional predictive performance metrics but also specialized spatial validation metrics.

First, to assess model predictive performance, a suite of metrics based on the Confusion Matrix and the Receiver Operating Characteristic (ROC) curve was used. The confusion matrix clearly illustrates the correspondence between model predictions and true labels through four categories: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) [47]. Based on this, we calculated several metrics—including Accuracy, Precision, Recall, and F1-Score—to assess the model’s classification effectiveness from different perspectives. Furthermore, to evaluate the model’s overall discriminative ability across various classification thresholds, we plotted the ROC curve, which depicts the relationship between the True Positive Rate (TPR, i.e., Recall) and the False Positive Rate (FPR) [48]. The Area Under the Curve (AUC) serves as a quantitative measure of the ROC curve’s performance. A value closer to 1 indicates superior model performance, while a value approaching 0.5 suggests performance no better than random chance. The evaluation metrics used in this study are defined as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{5}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{6}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{7}$$

$$\mathrm{F1\text{-}Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{8}$$
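Equations (5) to (8) can be computed from the four confusion-matrix counts as in the sketch below. The function name `classification_metrics` and the counts are hypothetical, and the code assumes that none of the denominators is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test set of 200 points: 100 landslides and 100 non-landslides.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

With these counts the accuracy is 0.85 and the recall is 0.80, while the F1-score falls between precision and recall because it is their harmonic mean.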

Second, two key metrics were used to assess the spatial predictive efficacy and distributional validity of the susceptibility maps: Landslide Density (LD) and the Total Deviation Index (TDI).

The LD was used to quantify the predictive efficiency of each susceptibility class. For the $i$-th susceptibility class, its LD ($\mathrm{LD}_i$) is calculated as:

$$\mathrm{LD}_i = \frac{N_{LS,i} / N_{LS,\mathrm{Total}}}{A_i / A_{\mathrm{Total}}} \tag{9}$$

where $N_{LS,i}$ is the number of landslide points within the $i$-th susceptibility class; $N_{LS,\mathrm{Total}}$ is the total number of landslide points in the study area; $A_i$ is the area covered by the $i$-th susceptibility class; and $A_{\mathrm{Total}}$ is the total area of the study region. The theoretical range of this metric is $[0, +\infty)$. An LD value of 1 indicates that the landslide density within that class is equal to the average landslide density of the study area. A value greater than 1 signifies a density higher than the average, while a value less than 1 indicates a density lower than the average. The higher the LD value, the higher the predictive efficiency of that risk class.
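A short sketch of the LD calculation in Equation (9) is given below. The function name `landslide_density`, the landslide counts, and the class areas are hypothetical values for a five-class map and do not come from the study's results.

```python
def landslide_density(landslide_counts, class_areas):
    """LD per susceptibility class: landslide share divided by area share."""
    if len(landslide_counts) != len(class_areas):
        raise ValueError("one count and one area are needed per class")
    total_landslides = sum(landslide_counts)
    total_area = sum(class_areas)
    return [
        (n / total_landslides) / (a / total_area)
        for n, a in zip(landslide_counts, class_areas)
    ]


# Hypothetical map with classes ordered from very low to very high susceptibility.
counts = [5, 10, 20, 65, 100]                  # landslide points per class
areas_km2 = [500.0, 200.0, 150.0, 100.0, 50.0]  # area per class

ld = landslide_density(counts, areas_km2)
```

In this toy case the highest class holds half of the landslides on 5% of the area, so its LD is 10, while the lowest class has an LD of 0.05. The area-weighted mean of the LD values is always 1, which is why 1 serves as the reference level.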

The TDI was used to measure the overall inconsistency in the spatial patterns between two classified maps. It is calculated by taking half the sum of the absolute differences in the area percentages for all corresponding risk classes between the two models. The formula is as follows:

$$\mathrm{TDI} = \frac{1}{2} \sum_{i=1}^{k} \left| P_{\mathrm{Area\_Local},\,i} - P_{\mathrm{Area\_Global},\,i} \right| \times 100\% \tag{10}$$

where $k$ is the total number of susceptibility classes (in this study, $k = 5$); $P_{\mathrm{Area\_Local},\,i}$ is the percentage of the area occupied by the $i$-th susceptibility class in the local model map; and $P_{\mathrm{Area\_Global},\,i}$ is the percentage of the area occupied by the $i$-th susceptibility class in the global model map. The TDI value ranges from 0% to 100%. A value of 0% indicates perfect spatial agreement between the two maps, whereas 100% signifies complete disagreement. The result can be interpreted as the percentage of the area that would need to be reclassified to make the spatial distributions of the two maps identical. A higher TDI value indicates a greater discrepancy between the assessment results of the two strategies.
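The TDI in Equation (10) reduces to a one-line sum once the class area percentages are known. In the sketch below, the function name `total_deviation_index` and the two sets of percentages are hypothetical; inputs are assumed to be expressed in percent and to sum to 100 for each map.

```python
def total_deviation_index(local_pct, global_pct):
    """Half the summed absolute difference in class area percentages."""
    if len(local_pct) != len(global_pct):
        raise ValueError("both maps must use the same number of classes")
    return 0.5 * sum(abs(l - g) for l, g in zip(local_pct, global_pct))


# Hypothetical area shares (%) for five classes, very low to very high.
local_pct = [50.0, 20.0, 15.0, 10.0, 5.0]
global_pct = [30.0, 30.0, 20.0, 10.0, 10.0]

tdi = total_deviation_index(local_pct, global_pct)  # 20.0
```

The absolute differences are 20, 10, 5, 0 and 5 percentage points, which sum to 40; halving gives a TDI of 20%, because every unit of area gained by one class is lost by another and would otherwise be counted twice.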

3. Results

3.1. Multicollinearity Analysis

The multicollinearity analysis resulted in the selection of a final set of independent variables for the global model and for each of the seven local models.

Figure 4 presents the Pearson correlation matrix for the global model. The correlation coefficient (r) ranges from −1 (strong negative correlation) to +1 (strong positive correlation), where the strength of the correlation increases as its absolute value, |r|, approaches 1. The analysis revealed a strong negative correlation between Elevation and Temperature (r = −0.88), which exceeded the critical threshold for high correlation (|r| ≥ 0.7). Given that elevation is a primary conditioning factor influencing slope stability, the Temperature factor was removed to avoid multicollinearity. A VIF test conducted on the remaining 11 variables confirmed that all values were below 5.0, well under the commonly accepted threshold of 10. This indicated that no significant multicollinearity issues remained among the selected variables.

The same multicollinearity analysis was performed independently for the conditioning factors within each of the seven ecoregions. Due to differences in the natural conditions of each ecoregion, their variable correlation patterns varied, leading to the removal of different variables for each local model. After targeted variable selection for each ecoregion, the final set of input variables for all seven local models was confirmed to have VIF values below 5.0, further mitigating potential multicollinearity. The final selection of factors for each model is shown in Figure 5, where a dot indicates that the factor (y-axis) was used in the corresponding model (x-axis).

3.2. Conditioning Factor Importance Analysis

As shown in Figure 6, the dominant conditioning factors for landslides exhibited significant spatial heterogeneity across the ecoregions, a variability closely linked to each region’s unique geological environment and the intensity of human activity. In the Coast Range and Klamath Mountains, the fragile geology and complex lithology made Soil Type a core conditioning factor, while Distance to Roads highlighted the exacerbating effect of human disturbance on unstable geological conditions. In the Cascades, the combination of loose volcanic ash soils and steep terrain rendered Distance to Roads a key factor. Similarly, in the Eastern Cascades Slopes and Foothills, sparse vegetation and erodible soils offered minimal resistance to anthropogenic excavation, making Distance to Roads equally important. For the Willamette–Georgia–Puget Lowland and the Columbia Plateau, although the causes differ (river erosion and urban development in the former, agricultural irrigation in the latter), both demonstrated that human modification of the natural environment was the primary driver of landslides, making Land Cover the decisive factor in these two regions. In the Blue Mountains Complex, where landslide events are sparse, Distance to Roads emerged as the primary factor, indicating that the limited number of landslides were mainly triggered by localized human-induced cut-slope disturbances. In stark contrast to these models that reveal localized mechanisms, the global model identified MAP as the primary conditioning factor. This reflected the macro-scale climatic gradient that governed the regional distribution pattern of landslides across the entire state.

3.3. Model Performance Comparison

The performance of each model on the independent test set is presented in Figure 7 and Table 3. All models achieved AUC values greater than 0.93. The “Blue Mountains Complex” local model recorded the highest AUC value at 0.9882, followed closely by the “Entire Oregon” global model with an AUC of 0.9864. In terms of the F1-score, which balances precision and recall, the “Entire Oregon” global model achieved the highest score (0.9447). The “Columbia Plateau” local model exhibited the lowest recall (0.7667) and F1-score (0.8364) among all models.

Although all models demonstrated high predictive accuracy based on conventional metrics like AUC and F1-score, these macro-level indicators did not fully reveal the fundamental differences in spatial prediction efficiency and risk level classification. To this end, the spatial predictive efficiency of the susceptibility maps was evaluated using the LD and TDI metrics.

3.4. Spatial Distribution of Susceptibility

Global and local landslide susceptibility maps were generated by independently classifying the susceptibility indices for each region using the Natural Breaks (Jenks) method. This approach ensured that each regional map had its own unique class thresholds. This method divides the continuous values into five classes, reflecting the natural clustering characteristics within each area. Figure 8 illustrates the landslide susceptibility maps generated from the seven local models, which revealed distinct spatial patterns that directly correlate with each ecoregion’s unique geographical environment and landslide mechanisms. Coast Range (Figure 8a): High-susceptibility zones were widely distributed in mountainous areas with significant topographic relief. Cascades (Figure 8b) and Eastern Cascades Slopes and Foothills (Figure 8c): High-susceptibility zones were primarily distributed in linear and zonal patterns along road networks. Willamette–Georgia–Puget Lowland (Figure 8d): In this relatively flat region, high-susceptibility zones were scattered, concentrated in areas of local topographic variation such as riverbanks, terrace edges, and isolated hills. Klamath Mountains (Figure 8e): The spatial pattern of high susceptibility was complex, showing an overlay of block-like patterns related to lithology and soil, and linear patterns influenced by roads. Columbia Plateau (Figure 8f): High-susceptibility zones were extremely localized, mainly concentrated along the sides of incised canyons. Blue Mountains Complex (Figure 8g): High-susceptibility zones were also prominently distributed along road and river valley networks.

Figure 9 compares the statewide susceptibility maps produced by the two modeling strategies. Figure 9a shows the composite statewide susceptibility map, which was created by mosaicking the seven independently classified local maps (Figure 8). To ensure a fair comparison, the map in Figure 9b was generated through a region-specific classification of the global model’s output. The process involved clipping the statewide continuous susceptibility map into seven subsets based on ecoregion boundaries. Each subset was then reclassified using the unique Natural Breaks thresholds derived from its corresponding local model. These seven reclassified subsets were subsequently mosaicked to produce the final map in Figure 9b. This protocol guarantees that the subsequent qualitative pattern assessment (Section 3.4) and quantitative accuracy validation (Section 3.5) are founded on an impartial and directly comparable basis.
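The region-specific reclassification step can be illustrated with a small threshold lookup. This is a sketch under stated assumptions: the break values are invented, the function name `classify` is hypothetical, the class names "Low" and "Moderate" are assumed labels for the intermediate classes, and a value equal to a break is assigned to the lower class.

```python
from bisect import bisect_left

CLASS_NAMES = ["Very Low", "Low", "Moderate", "High", "Very High"]


def classify(index_value, breaks):
    """Map a continuous susceptibility index to one of five classes.

    breaks: four ascending upper bounds that separate the five classes.
    """
    if len(breaks) != len(CLASS_NAMES) - 1 or list(breaks) != sorted(breaks):
        raise ValueError("expected four ascending break values")
    # bisect_left counts how many breaks lie strictly below the value.
    return CLASS_NAMES[bisect_left(breaks, index_value)]


# Hypothetical Natural Breaks thresholds taken from two local model maps.
local_breaks = {
    "Coast Range": [0.15, 0.35, 0.60, 0.85],
    "Columbia Plateau": [0.05, 0.15, 0.30, 0.55],
}

# The same global-model index value is classified differently per ecoregion.
global_index = 0.40
coast_class = classify(global_index, local_breaks["Coast Range"])
plateau_class = classify(global_index, local_breaks["Columbia Plateau"])
```

Here the same index of 0.40 is rated "Moderate" under the first set of thresholds and "High" under the second, which shows why both maps must share region-specific thresholds before their class areas and landslide densities are compared.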

At a macro level, both maps show similar trends, indicating that the overall susceptibility in western Oregon is higher than in the east. However, the mosaicked local model map (Figure 9a) exhibits greater spatial detail, particularly in the eastern part of the state, where it clearly delineates linear high-risk zones along roads and rivers. In contrast, the global model (Figure 9b) produces a smoother and more generalized pattern with less distinct boundaries for high-risk zones. For instance, in the western region, the global model classifies large areas as continuous “Very High” risk zones, whereas the local models identify more internal variation within the same area. Similarly, the global model lacks the ability to capture the linear high-susceptibility features that the local models identified in the eastern region.

3.5. Analysis of Spatial Pattern Differences

Building on the qualitative pattern assessment in Section 3.4, we applied the established evaluation protocol to quantitatively validate the models’ accuracy. This protocol eliminates biases from the classification method, ensuring that the subsequent LD and TDI calculations reflect genuine differences in model performance.

As illustrated in Figure 10, the two modeling strategies yielded significant differences in susceptibility classification and landslide identification. In the humid western regions (e.g., Coast Range and Willamette–Georgia–Puget Lowland), the global model tended to over-predict risk, classifying large areas as “High” and “Very High” without a corresponding gain in landslide capture. In the arid eastern regions (e.g., Columbia Plateau), it tended to under-predict risk, designating over 90% of the area as “Very Low” and consequently failing to identify many landslides captured by the local model’s higher-risk zones. In contrast, the local models performed consistently across all ecoregions, showing a sharp increase in identified landslides with each successive susceptibility level.

The LD analysis (Table 4) further confirms these findings. In the four landslide-prone ecoregions in the west, the local models consistently outperformed the global model in the “Very High” risk class, particularly in the Willamette–Georgia–Puget Lowland, where the LD values were 6.87 and 1.94, respectively. However, in the three eastern arid to semi-arid ecoregions, the global model exhibited higher LD values in the “Very High” risk class.
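The comparison in the "Very High" class can be made explicit by taking the ratio of local to global LD for every ecoregion. The short calculation below uses only the "Very High" values transcribed from Table 4; the variable names are illustrative.

```python
# "Very High" class LD values transcribed from Table 4: (local, global).
LD_VERY_HIGH = {
    "Coast Range": (2.14, 1.15),
    "Cascades": (3.68, 1.89),
    "Eastern Cascades Slopes and Foothills": (8.72, 67.13),
    "Willamette–Georgia–Puget Lowland": (6.87, 1.94),
    "Klamath Mountains": (4.79, 3.62),
    "Columbia Plateau": (21.34, 34.67),
    "Blue Mountains Complex": (29.67, 175.24),
}

# Ratio above 1 means the local model concentrates more landslides per
# unit area in its "Very High" class than the global model does.
ratios = {region: local / glob
          for region, (local, glob) in LD_VERY_HIGH.items()}
local_better = sorted(r for r, ratio in ratios.items() if ratio > 1)

for region, ratio in ratios.items():
    print(f"{region}: local/global = {ratio:.2f}")
```

The ratio exceeds 1 in the four western ecoregions and is about 3.54 in the Willamette–Georgia–Puget Lowland, while it falls below 1 in the three eastern ecoregions, matching the pattern described in the text.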

The TDI results (Table 4) revealed the spatial discrepancies between the susceptibility maps generated by the two strategies. TDI values varied widely among ecoregions, from 4.82% in the Klamath Mountains to 43.3% in the Columbia Plateau, and were particularly high in the Columbia Plateau (43.3%), Willamette–Georgia–Puget Lowland (35.11%), and Coast Range (34.13%). In these three regions, over a third of the land area was classified into different risk levels by the two models. This result indicates that the “smoothing” effect of the global model failed to capture critical local geographical features, leading to a spatial risk zonation that diverged markedly from the more realistic patterns produced by the local models.
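The statement that TDI corresponds to the share of land classified into different risk levels by the two models suggests a simple cell-by-cell computation. The snippet below is an illustration under that assumption only; the formal definition in Section 2.6 is not reproduced in this section, and the function name `total_deviation_index` and the toy class grids are hypothetical.

```python
def total_deviation_index(classes_a, classes_b):
    """Percentage of cells assigned to different classes by two maps.

    Both inputs are equal-length sequences of class labels for the same
    cells, listed in the same order.
    """
    if len(classes_a) != len(classes_b):
        raise ValueError("maps must cover the same cells")
    if not classes_a:
        raise ValueError("maps must contain at least one cell")
    differing = sum(a != b for a, b in zip(classes_a, classes_b))
    return 100.0 * differing / len(classes_a)


# Toy example: ten cells, classes coded 1 (Very Low) to 5 (Very High).
local_map = [1, 1, 2, 3, 5, 5, 4, 2, 1, 1]
global_map = [1, 1, 1, 3, 4, 5, 4, 3, 1, 1]
tdi = total_deviation_index(local_map, global_map)
```

In this toy case three of the ten cells change class, so the index is 30%. Note that a measure of this kind records only whether the class differs, not by how many levels.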

4. Discussion

In this study, all models achieved AUC values above 0.93, demonstrating the strong performance and suitability of the XGBoost algorithm for this study area. However, a high AUC value does not necessarily translate into a practically useful susceptibility map. The global model, despite its high AUC value (0.9864), performed poorly in critical regions, which reveals the limitations of over-relying on macro-level metrics such as AUC. As Lobo et al. (2008) pointed out, AUC has several inherent flaws in the context of spatial prediction models [49]. While AUC is effective at evaluating a model’s overall ranking ability, in large-scale study areas its value can easily be inflated by vast stable areas (such as eastern Oregon), thereby masking the model’s underperformance in critical high-risk zones. In the landslide-prone ecoregions of western Oregon, the local models proved markedly more effective than the global model. For instance, the LD analysis showed that in the Willamette–Georgia–Puget Lowland, the local model’s LD value in the “Very High” class (6.87) was over 3.5 times that of the global model (1.94). Furthermore, high TDI values (some exceeding 35%) revealed fundamental discrepancies in the spatial patterns produced by the two modeling strategies. Therefore, we propose a comprehensive framework for evaluating LSMs that incorporates a suite of metrics, including AUC, LD, and TDI. Such a framework is crucial to ensure that assessment results genuinely reflect a map’s practical value for disaster prevention and mitigation.
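The inflation of AUC by large stable areas can be illustrated with a toy calculation. The sketch below computes AUC as the probability that a landslide cell outranks a non-landslide cell and then adds many trivially easy negatives. All scores are hypothetical and chosen only to show the mechanism; they are not study data.

```python
def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores in a landslide-prone zone: the model separates
# landslide and non-landslide cells only moderately well.
landslide = [0.6, 0.7, 0.8]
non_landslide_hard = [0.5, 0.65, 0.75]

# Hypothetical stable terrain: many non-landslide cells that are trivial
# to rank below every landslide.
non_landslide_easy = [0.05] * 97

auc_hard = auc(landslide, non_landslide_hard)
auc_all = auc(landslide, non_landslide_hard + non_landslide_easy)

print(f"AUC within the landslide-prone zone: {auc_hard:.2f}")
print(f"AUC after adding stable terrain:     {auc_all:.2f}")
```

The ranking inside the landslide-prone zone is unchanged between the two calculations, yet the overall AUC rises from about 0.67 to 0.99 once the easy negatives are included. This is why class-level measures such as LD are a useful complement.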

The spatial non-stationarity of landslide-driving mechanisms is a common issue in large-scale regions [22,50]. The global model identified “Mean Annual Precipitation” as the primary conditioning factor, which reflected the macro-scale climatic gradient across the state but was essentially a generalized conclusion that “smoothed out” diverse local relationships. In contrast, the ecoregion-based modeling, by conducting analyses within relatively homogeneous units, revealed more nuanced and diverse local dominant conditioning factors. An interesting finding was that variables related to human activity (“Distance to Roads” and “Land Cover”) were key drivers of landslide risk in six of the seven ecoregions. This finding was consistent with several previous studies in this region. Forest management history, such as logging and road construction, can exert a greater influence on landslide frequency and scale than even extreme weather events. This conclusion is supported by long-term research from the H.J. Andrews Experimental Forest in Oregon, led by Catalina Segura [51]. The classic study by Wemple, Jones, and Grant (1996) revealed the physical mechanism: the construction of forest roads significantly increases landslide frequency by altering local hydrological pathways [52]. This indicated that while the natural environment provided the foundational conditions for landslide development, high-intensity human activities had become the most direct and active drivers of risk. The primary advantage of the partitioned approach was its ability to capture the critical, human-dominated local geomorphic processes that were overlooked by the global model.

When addressing spatial heterogeneity, a central challenge in partitioned modeling is how to scientifically define the “local” units for analysis. Compared to methods based on artificial administrative boundaries or single geographic variables, the EPA Level III Ecoregions used in this study provide a more physically interpretable basis for modeling, as these units are delineated through an integrated analysis of multiple interacting factors, including geology, climate, and soils. The results strongly validated the effectiveness of this approach. The models for different ecoregions not only identified distinct combinations of conditioning factors (Figure 6) but also generated susceptibility maps with unique spatial “fingerprints” that closely correspond to these dominant factors (Figure 8). For example, in the Cascades, where road influence is dominant, high-susceptibility zones exhibit a clear linear pattern. In contrast, in the Coast Range, where Soil Type is the primary factor, high-susceptibility zones show a block-like distribution. These significantly different spatial patterns indicated that each local model had captured the dominant geomorphic processes within its region. Therefore, we concluded that ecoregion classification provided a scientific and effective geographical partitioning framework for addressing spatial heterogeneity in large-scale LSM, which could significantly enhance the local accuracy and reliability of the results.

This study also faces several limitations that require further exploration. First, regarding the landslide inventory, this research relies on a point-based inventory, which may be incomplete and simplifies the true morphology of landslides. Second, concerning spatial resolution, while standardizing all factors to a 30 m resolution proved to be an effective strategy for this study area, coarser-resolution source data, particularly for climate, may not fully capture local rainfall variations in complex terrain even when resampled to 30 m. Future research could address these limitations by using polygon-based inventories to better represent landslide morphology and by integrating ground-based rain gauge measurements to improve the precision of the precipitation factor.

5. Conclusions

This study demonstrates the superiority of an ecoregion-based partitioned approach for large-scale LSM through a systematic comparison with a single global model in the geographically heterogeneous state of Oregon, USA. Although all models achieved high AUC values (>0.93), this macro-level metric concealed critical differences in spatial predictive performance. As measured by the Landslide Density (LD) index, the local models performed substantially better in the key landslide-prone western regions. In the Willamette–Georgia–Puget Lowland, for example, the local model’s LD value in the “Very High” risk class was over 3.5 times that of the global model. Furthermore, the risk maps generated by the two strategies disagreed fundamentally in their spatial patterns (TDI > 35% in some regions), highlighting the practical value of the partitioned models in identifying high-risk zones. This advantage stems from the ability of partitioned modeling to address spatial non-stationarity. While the global model tended to identify macro-climatic factors (e.g., precipitation), the local models revealed diverse combinations of conditioning factors within different ecoregions. A core finding of this study is that anthropogenic variables, specifically “Distance to Roads” and “Land Cover,” emerged as dominant drivers of landslide risk across the majority of ecoregions. This research therefore supports the use of ecoregion classification as an effective framework for partitioning in large-scale LSM to address spatial heterogeneity, and it underscores the importance of abandoning a “one-size-fits-all” global approach in geographically complex regions. A partitioned strategy adapted to local characteristics is a critical pathway to accurate risk assessment and effective disaster mitigation policies.
Future work could build upon the partitioned framework presented in this study by incorporating additional environmental variables and advanced deep learning models to further refine prediction accuracy. The potential for transferability of this approach stems from its use of ecologically homogeneous units to address spatial non-stationarity. This methodology is applicable to other large, heterogeneous regions, provided that a scientifically sound partitioning framework, a reliable landslide inventory, and relevant conditioning factor data are available.

Author Contributions

Conceptualization, Z.X. and W.Z.; methodology, Z.X. and P.Z.; software, Z.X. and J.Y.; validation, Z.X. and X.S.; investigation, Z.X. and W.W.; resources, Z.X., J.D. and J.G.; data curation, Z.X. and H.Y.; visualization, Z.X. and P.Z.; supervision, Z.Z.; writing—original draft preparation, Z.X.; writing—review and editing, W.Z. and J.J.; project administration, J.J.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Science and Technology Projects of the Xizang Autonomous Region, China (No. XZ202402ZD0001); the Major Program of Gansu Joint Scientific Research Fund (Grant No. 25JRRA1106): Development of a High-Precision Short-Term Forecasting System for New Energy Power Generation in Gansu; and the Basic Research Program of Qinghai Province (2024-ZJ-904).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Acknowledgments

The authors acknowledge the use of Gemini 2.5 Pro (Google) and DeepSeek (available online: https://www.deepseek.com/, accessed on 1 August 2025) for their valuable assistance in refining the language and enhancing the clarity of this manuscript. These tools played a supplementary role in improving the overall readability and presentation of the paper.

Conflicts of Interest

The authors declare no conflicts of interest. The funding sources had no involvement in the study’s design, data collection, analysis, or interpretation, nor in the writing of the manuscript or the decision to publish the results.

References

1. Turner, A.K. Social and environmental impacts of landslides. Innov. Infrastruct. Solut. 2018, 3, 70.
2. Carrión-Mero, P.; Montalván-Burbano, N.; Morante-Carballo, F.; Quesada-Román, A.; Apolo-Masache, B. Worldwide Research Trends in Landslide Science. Int. J. Environ. Res. Public Health 2021, 18, 9445.
3. Song, C.; Yu, C.; Li, Z.; Utili, S.; Frattini, P.; Crosta, G.; Peng, J. Triggering and recovery of earthquake accelerated landslides in Central Italy revealed by satellite radar observations. Nat. Commun. 2022, 13, 7278.
4. Li, S.; Feng, W.; Yi, X.; Liu, K.; Guo, C.; Tang, X.; Wu, Z. Clustered shallow landslides triggered by heavy rainfall in May 2022 in Wuping County, Fujian Province, China. Bull. Eng. Geol. Environ. 2025, 84, 257.
5. Mirus, B.B.; Jones, E.S.; Baum, R.L.; Godt, J.W.; Slaughter, S.; Crawford, M.M.; Lancaster, J.; Stanley, T.; Kirschbaum, D.B.; Burns, W.J.; et al. Landslides across the USA: Occurrence, susceptibility, and data limitations. Landslides 2020, 17, 2271–2285.
6. Mirus, B.B.; Belair, G.M.; Wood, N.J.; Jones, J.; Martinez, S.N. Parsimonious High-Resolution Landslide Susceptibility Modeling at Continental Scales. AGU Adv. 2024, 5, e2024AV001214.
7. Chicas, S.D.; Li, H.; Mizoue, N.; Ota, T.; Du, Y.; Somogyvári, M. Landslide susceptibility mapping core-base factors and models’ performance variability: A systematic review. Nat. Hazards 2024, 120, 12573–12593.
8. Chang, Z.; Huang, J.; Huang, F.; Bhuyan, K.; Meena, S.R.; Catani, F. Uncertainty Analysis of Non-Landslide Sample Selection in Landslide Susceptibility Prediction Using Slope Unit-Based Machine Learning Models. Gondwana Res. 2023, 117, 307–320.
9. Agboola, G.; Beni, L.H.; Elbayoumi, T.; Thompson, G. Optimizing landslide susceptibility mapping using machine learning and geospatial techniques. Ecol. Inf. 2024, 81, 102583.
10. Zhu, A.X.; Wang, R.; Qiao, J.; Qin, C.Z.; Chen, Y.; Liu, J.; Du, F.; Lin, Y.; Zhu, T. An expert knowledge-based approach to landslide susceptibility mapping using GIS and fuzzy logic. Geomorphology 2014, 214, 128–138.
11. Shano, L.; Raghuvanshi, T.K.; Meten, M. Landslide susceptibility mapping using frequency ratio model: The case of Gamo highland, South Ethiopia. Arab. J. Geosci. 2021, 14, 623.
12. Jennifer, J.J.; Saravanan, S.; Abijith, D. Application of Frequency Ratio and Logistic Regression Model in the Assessment of Landslide Susceptibility Mapping for Nilgiris District, Tamilnadu, India. Indian Geotech. J. 2021, 51, 773–787.
13. Ado, M.; Amitab, K.; Maji, A.K.; Jasińska, E.; Gono, R.; Leonowicz, Z.; Jasiński, M. Landslide susceptibility mapping using machine learning: A literature survey. Remote Sens. 2022, 14, 3029.
14. Shao, X.; Yan, W.; Yan, C.; Zhao, W.; Wang, Y.; Shi, X.; Dong, H.; Li, T.; Yu, J.; Zuo, P.; et al. Explainable Machine Learning for Mapping Rainfall-Induced Landslide Thresholds in Italy. Appl. Sci. 2025, 15, 7937.
15. Zuo, P.; Zhao, W.; Yan, W.; Jin, J.; Yan, C.; Wu, B.; Shao, X.; Wang, W.; Zhou, Z.; Wang, J. Landslide Susceptibility Mapping Using an LSTM Model with Feature-Selecting for the Yangtze River Basin in China. Water 2025, 17, 167.
16. Chowdhury, M.S.; Rahman, M.N.; Sheikh, M.S.; Sayeid, M.A.; Mahmud, K.H.; Hafsa, B. GIS-based landslide susceptibility mapping using logistic regression, random forest and decision and regression tree models in Chattogram District, Bangladesh. Heliyon 2024, 10, e23424.
17. Sun, K.; Li, Z.; Wang, S.; Hu, R. A support vector machine model of landslide susceptibility mapping based on hyperparameter optimization using the Bayesian algorithm: A case study of the highways in the southern Qinghai–Tibet Plateau. Nat. Hazards 2024, 120, 11377–11398.
18. Woodard, J.B.; Mirus, B.B.; Crawford, M.M.; Or, D.; Leshchinsky, B.A.; Allstadt, K.E.; Wood, N.J. Mapping Landslide Susceptibility Over Large Regions With Limited Data. J. Geophys. Res. Earth Surf. 2023, 128, e2022JF006810.
19. Lizárraga, J.J.; Buscarnera, G. Probabilistic modeling of shallow landslide initiation using regional scale random fields. Landslides 2020, 17, 1979–1988.
20. Loche, M.; Alvioli, M.; Marchesini, I.; Bakka, H.; Lombardo, L. Landslide susceptibility maps of Italy: Lesson learnt from dealing with multiple landslide types and the uneven spatial distribution of the national inventory. Earth-Sci. Rev. 2022, 232, 104125.
21. Hong, H.; Pradhan, B.; Sameen, M.I.; Kalantar, B.; Zhu, A.; Chen, W. Improving the accuracy of landslide susceptibility model using a novel region-partitioning approach. Landslides 2018, 15, 753–772.
22. Chalkias, C.; Polykretis, C.; Karymbalis, E.; Soldati, M.; Ghinoi, A.; Ferentinou, M. Exploring spatial non-stationarity in the relationships between landslide susceptibility and conditioning factors: A local modeling approach using geographically weighted regression. Bull. Eng. Geol. Environ. 2020, 79, 2799–2814.
23. Li, Y.; Huang, S.; Li, J.; Huang, J.; Wang, W. Spatial Non-Stationarity-Based Landslide Susceptibility Assessment Using PCAMGWR Model. Water 2022, 14, 881.
24. Yu, X.; Gao, H. A landslide susceptibility map based on spatial scale segmentation: A case study at Zigui-Badong in the Three Gorges Reservoir Area, China. PLoS ONE 2020, 15, e0229818.
25. Triplett, L.D.; Hammer, M.N.; DeLong, S.B.; Gran, K.B.; Jennings, C.E.; Engle, Z.T.; Bartley, J.K.; Blumentritt, D.J.; Breckenridge, A.J.; Day, S.; et al. Factors influencing landslide occurrence in low-relief formerly glaciated landscapes: Landslide inventory and susceptibility analysis in Minnesota, USA. Nat. Hazards 2025, 121, 11799–11827.
26. Kavzoglu, T.; Teke, A. Predictive Performances of Ensemble Machine Learning Algorithms in Landslide Susceptibility Mapping Using Random Forest, Extreme Gradient Boosting (XGBoost) and Natural Gradient Boosting (NGBoost). Arab. J. Sci. Eng. 2022, 47, 7367–7385.
27. Sahin, E.K. Comparative analysis of gradient boosting algorithms for landslide susceptibility mapping. Geocart. Int. 2022, 37, 2441–2465.
28. Kavzoglu, T.; Teke, A. Advanced hyperparameter optimization for improved spatial prediction of shallow landslides using extreme gradient boosting (XGBoost). Bull. Eng. Geol. Environ. 2022, 81, 201.
29. Ecoregions of North America. Available online: https://www.epa.gov/eco-research/ecoregions-north-america (accessed on 1 July 2025).
30. Orr, E.L.; Orr, W.N. Oregon Geology, 6th ed.; Oregon State University Press: Corvallis, OR, USA, 2012.
31. Orr, W.N.; Orr, E.L. Geology of the Pacific Northwest, 3rd ed.; Waveland Press: Long Grove, IL, USA, 2018.
32. Oregon Department of Geology and Mineral Industries. Landslide Hazards in Oregon; Oregon Department of Geology and Mineral Industries: Portland, OR, USA, 2008. Available online: https://d3itl75cn7661p.cloudfront.net/dogami/fs/landslide-factsheet.pdf (accessed on 1 August 2025).
33. Burns, W.J.; Madin, I.P. Protocol for Inventory Mapping of Landslide Deposits from Light Detection and Ranging (Lidar) Imagery; Special Paper 42; Oregon Department of Geology and Mineral Industries: Portland, OR, USA, 2009. Available online: https://d3itl75cn7661p.cloudfront.net/dogami/dds/slido/sp-42_onscreen.pdf (accessed on 1 August 2025).
34. Omernik, J.M. Ecoregions of the Conterminous United States. Ann. Assoc. Am. Geogr. 1987, 77, 118–125.
35. Hughes, R.M.; Whittier, T.R.; Rohm, C.M.; Larsen, D.P. A regional framework for establishing recovery criteria. Environ. Manag. 1990, 14, 673–683.
36. Whittier, T.R.; Hughes, R.M.; Larsen, D.P. Correspondence between ecoregions and spatial patterns in stream ecosystems in Oregon. Can. J. Fish. Aquat. Sci. 1988, 45, 1264–1278.
37. Statewide Landslide Information Database for Oregon (SLIDO). Available online: https://www.oregon.gov/dogami/slido/Pages/data.aspx (accessed on 1 August 2025).
38. Gu, T.; Duan, P.; Wang, M.; Li, J.; Zhang, Y. Effects of non-landslide sampling strategies on machine learning models in landslide susceptibility mapping. Sci. Rep. 2024, 14, 7201.
39. Lee, S. Current and future status of GIS-based landslide susceptibility mapping: A literature review. Korean J. Remote Sens. 2019, 35, 179–193.
40. El-Haddad, B.A.; Youssef, A.M.; Mahdi, A.M.; Karimi, Z.; Pourghasemi, H.R. Seismic-induced landslides susceptibility mapping of the NEOM area, northwestern Saudi Arabia using machine learning models. Earth Sci. Inform. 2025, 18, 443.
41. Sun, Y.; Dai, H.L.; Xu, L.; Asaditaleshi, A.; Ahmadi Dehrashid, A.; Adnan Ikram, R.M.; Moayedi, H.; Ahmadi Dehrashid, H.; Thi, Q.T. Development of the artificial neural network’s swarm-based approaches predicting East Azerbaijan landslide susceptibility mapping. Environ. Dev. Sustain. 2025, 27, 6065–6102.
42. Yu, X.; Chen, H. Research on the influence of different sampling resolution and spatial resolution in sampling strategy on landslide susceptibility mapping results. Sci. Rep. 2024, 14, 1549.
43. Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson correlation coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; Volume 2, pp. 1–4.
44. Daoud, J.I. Multicollinearity and Regression Analysis. J. Phys. Conf. Ser. 2017, 949, 012009.
45. Chen, T.Q.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
46. Can, R.; Kocaman, S.; Gokceoglu, C. A Comprehensive assessment of XGBoost algorithm for landslide susceptibility mapping in the upper basin of Ataturk Dam, Turkey. Appl. Sci. 2021, 11, 4993.
47. Yang, J.; Song, C.; Yang, Y.; Xu, C.; Guo, F.; Xie, L. New method for landslide susceptibility mapping supported by spatial logistic regression and GeoDetector: A case study of Duwen Highway Basin, Sichuan Province, China. Geomorphology 2019, 324, 62–71.
48. Chen, W.; Lei, X.; Chakrabortty, R.; Pal, S.C.; Sahana, M.; Janizadeh, S. Evaluation of different boosting ensemble machine learning models and novel deep learning and boosting framework for head-cut gully erosion susceptibility. J. Environ. Manag. 2021, 284, 112015.
49. Lobo, J.M.; Jiménez-Valverde, A.; Real, R. AUC: A Misleading Measure of the Performance of Predictive Distribution Models. Glob. Ecol. Biogeogr. 2008, 17, 145–151.
50. Lu, F.; Zhang, G.; Wang, T.; Ye, Y.; Zhen, J.; Tu, W. Analyzing spatial non-stationarity effects of driving factors on landslides: A multiscale geographically weighted regression approach based on slope units. Bull. Eng. Geol. Environ. 2024, 83, 394.
51. Forest Landslides’ Frequency, Size Influenced More by Road Building, Logging than Heavy Rain. Available online: https://news.oregonstate.edu/news/forest-landslides%E2%80%99-frequency-size-influenced-more-road-building-logging-heavy-rain (accessed on 1 August 2025).
52. Wemple, B.C.; Jones, J.A.; Grant, G.E. Channel network extension by logging roads in two basins, Western Cascades, Oregon. J. Am. Water Resour. Assoc. 1996, 32, 1195–1207.

Figure 1. Overview map of the study area.

Figure 2. Spatial distribution of conditioning factors: (a) Elevation; (b) Slope; (c) Aspect; (d) Curvature; (e) Topographic Wetness Index (TWI); (f) Distance to Roads (Dist_Roads); (g) Distance to Rivers (Dist_Rivers); (h) Lithology; (i) Soil Type; (j) Land Cover; (k) Mean Annual Precipitation (MAP); (l) Mean Annual Temperature (MAT).

Figure 3. Technical workflow for partitioned landslide susceptibility mapping.

Figure 4. Pearson correlation coefficient matrix for the initial conditioning factors of the global model.

Figure 5. Final conditioning factors selected for each model. (A) Coast Range; (B) Cascades; (C) Eastern Cascades Slopes and Foothills; (D) Willamette–Georgia–Puget Lowland; (E) Klamath Mountains; (F) Columbia Plateau; (G) Blue Mountains Complex; (H) Entire Oregon (Global Model).

Figure 6. Radar charts of conditioning factor importance for each model: (a) Coast Range; (b) Cascades; (c) Eastern Cascades Slopes and Foothills; (d) Willamette–Georgia–Puget Lowland; (e) Klamath Mountains; (f) Columbia Plateau; (g) Blue Mountains Complex; (h) Entire Oregon (Global Model).

Figure 7. ROC curves and AUC values for each model on the test set.

Figure 8. Landslide susceptibility maps for each ecoregion (generated by local models): (a) Coast Range; (b) Cascades; (c) Eastern Cascades Slopes and Foothills; (d) Willamette–Georgia–Puget Lowland; (e) Klamath Mountains; (f) Columbia Plateau; (g) Blue Mountains Complex.

Figure 9. Comparison of statewide landslide susceptibility maps: (a) Mosaicked result of local models; (b) Result of the global model. Both maps use a unified legend and classification.

Figure 10. Comparison of predictive efficiency between local and global models within each ecoregion. The bar charts show the percentage of the total ecoregion area for each susceptibility class (VL: Very Low, L: Low, M: Moderate, H: High, VH: Very High), and the line graphs show the number of captured landslides. (a) Coast Range; (b) Cascades; (c) Eastern Cascades Slopes and Foothills; (d) Willamette–Georgia–Puget Lowland; (e) Klamath Mountains; (f) Columbia Plateau; (g) Blue Mountains Complex.

Table 1. Data sources and resolution information for the conditioning factors.

Data Name	Data Sources	Resolution/Scale
Landslides	SLIDO-4.5	-
Elevation	GEOHub DEM	10 m
Slope	GEOHub DEM	10 m
Aspect	GEOHub DEM	10 m
Topographic Wetness Index	GEOHub DEM	10 m
Curvature	GEOHub DEM	10 m
Land Cover	North American Land Cover, 2020 (Landsat, 30 m)	30 m
Soil Type	Harmonized World Soil Database version 2.0	1000 m
Lithology	USGS Geologic Maps of US States	1:500,000
Distance to Rivers	National Rivers Inventory (NRI)	1:24,000
Distance to Roads	Oregon All Public Roads Dataset	1:24,000
Mean Annual Temperature	PRISM	4000 m
Mean Annual Precipitation	PRISM	4000 m

Table 2. Category factor codes and their corresponding full names.

Factors	Code	Description
Lithology	1	Metamorphic, amphibolite
	2	Metamorphic, schist
	3	Ice
	4	Metamorphic, serpentinite
	5	Sedimentary, clastic
	6	Igneous, intrusive
	7	Igneous and Metamorphic, undifferentiated
	8	Igneous, volcanic
	9	Metamorphic, undifferentiated
	10	Igneous and Sedimentary, undifferentiated
	11	Metamorphic and Sedimentary, undifferentiated
	12	Unconsolidated, undifferentiated
	13	Igneous, undifferentiated
	14	Metamorphic, volcanic
	15	Water
Land cover	1	No data
	2	Temperate or sub-polar needleleaf forest
	3	Temperate or sub-polar broadleaf deciduous forest
	4	Mixed forest
	5	Temperate or sub-polar shrubland
	6	Temperate or sub-polar grassland
	7	Wetland
	8	Cropland
	9	Barren land
	10	Urban and built-up
	11	Water
	12	Snow and ice
Soil type	1	Acrisols (umbric)
	2	Acrisols (humic)
	3	Cambisols (umbric)
	4	Phaeozems (greyic)
	5	Phaeozems (luvic)
	6	Kastanozems (haplic)
	7	Kastanozems (luvic)
	8	Luvisols (haplic)
	9	Regosols (eutric)
	10	Solonetz (haplic)
	11	Andosols (vitric)
	12	Vertisols (rendzic)
	13	Regosols (calcaric)
	14	Calcisols (haplic)
	15	Water Bodies
	16	Urban Areas

Table 3. Performance evaluation metrics for the local and global models on the independent test set.

Ecological Zone	Accuracy	Precision	Recall	F1_Score	AUC
Coast Range	0.9368	0.9416	0.9364	0.9390	0.9839
Cascades	0.8469	0.8544	0.8511	0.8528	0.9372
Eastern Cascades Slopes and Foothills	0.9167	0.9000	0.9474	0.9231	0.9690
Willamette–Georgia–Puget Lowland	0.9089	0.9169	0.9049	0.9109	0.9686
Klamath Mountains	0.9183	0.9107	0.9358	0.9231	0.9721
Columbia Plateau	0.8548	0.9200	0.7667	0.8364	0.9719
Blue Mountains Complex	0.9517	0.9315	0.8831	0.9067	0.9882
Entire Oregon	0.9427	0.9324	0.9572	0.9447	0.9864
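As a quick consistency check on Table 3, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The snippet below uses only values transcribed from the table; small gaps are expected because the published precision and recall are rounded to four decimals.

```python
# (precision, recall, reported F1) transcribed from Table 3.
TABLE_3 = {
    "Coast Range": (0.9416, 0.9364, 0.9390),
    "Cascades": (0.8544, 0.8511, 0.8528),
    "Eastern Cascades Slopes and Foothills": (0.9000, 0.9474, 0.9231),
    "Willamette–Georgia–Puget Lowland": (0.9169, 0.9049, 0.9109),
    "Klamath Mountains": (0.9107, 0.9358, 0.9231),
    "Columbia Plateau": (0.9200, 0.7667, 0.8364),
    "Blue Mountains Complex": (0.9315, 0.8831, 0.9067),
    "Entire Oregon": (0.9324, 0.9572, 0.9447),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Largest difference between the recomputed and the reported F1 values.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in TABLE_3.values())

for zone, (p, r, f1) in TABLE_3.items():
    print(f"{zone}: recomputed {f1_score(p, r):.4f}, reported {f1:.4f}")
```

Every recomputed value agrees with the reported F1 to within rounding. The Columbia Plateau row also shows how a comparatively low recall (0.7667) pulls F1 well below precision (0.9200).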

Table 4. Quantitative comparison of local and global models.

Regions	Level	LD_Local	LD_Global	TDI(%)
Coast Range	Very Low	0.09	0.22	34.13
	Low	0.36	0.34
	Moderate	0.74	0.46
	High	1.15	1.11
	Very High	2.14	1.15
Cascades	Very Low	0.18	0.5	13.48
	Low	0.46	0.49
	Moderate	0.82	0.77
	High	1.73	1.29
	Very High	3.68	1.89
Eastern Cascades Slopes and Foothills	Very Low	0.09	0.19	16.02
	Low	1.54	4.22
	Moderate	3.79	16.11
	High	5.59	38.49
	Very High	8.72	67.13
Willamette–Georgia–Puget Lowland	Very Low	0.13	0.33	35.11
	Low	0.35	0.72
	Moderate	0.94	0.7
	High	2.94	1.13
	Very High	6.87	1.94
Klamath Mountains	Very Low	0.13	0.17	4.82
	Low	0.19	0.22
	Moderate	0.78	0.67
	High	2.65	2.49
	Very High	4.79	3.62
Columbia Plateau	Very Low	0.12	0.37	43.3
	Low	0.29	2.48
	Moderate	1.12	5.58
	High	4.16	16.4
	Very High	21.34	34.67
Blue Mountains Complex	Very Low	0.06	0.17	16.41
	Low	0.64	4.28
	Moderate	2.86	17.72
	High	11.28	87.83
	Very High	29.67	175.24

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
