Article

Multi-Model Machine Learning Mapping of Gully Erosion Susceptibility in the Heihe Region of the Xiaoxing’an Mountains, China

1 Harbin Center for Integrated Natural Resources Survey, China Geological Survey, Harbin 150086, China
2 School of Earth Sciences and Resources, China University of Geosciences (Beijing), Beijing 100083, China
3 Observation and Research Station of Earth Critical Zone in Black Soil, Harbin, Ministry of Natural Resources, Harbin 150086, China
4 College of Geophysics, Chengdu University of Technology, Chengdu 610059, China
5 International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
6 Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(11), 1844; https://doi.org/10.3390/rs18111844
Submission received: 11 April 2026 / Revised: 20 May 2026 / Accepted: 25 May 2026 / Published: 4 June 2026

Highlights

What are the main findings?
  • In Northeast China’s Heihe Mollisol (black soil) belt, anthropogenic factors—land use and the Human Footprint Index—outweighed topography in driving gully erosion, with the highest susceptibility concentrated in the southwestern cultivated lowlands.
  • Tree-based models (XGBoost best, AUC 0.95) outperformed logistic regression, but spatial cross-validation across districts of the Heihe region exposed a 0.11 AUC optimism from random splitting that conventional studies overlook.
What are the implications of the main findings?
  • The Heihe case shows that gully susceptibility mapping in Mollisol farmlands requires spatially explicit validation and interpretable ML to avoid overstated accuracy and to credibly attribute risk to human disturbance.
  • For black-soil regions like Heihe, a 20% NDVI increase could cut high-susceptibility area by 12%, identifying targeted revegetation in the southwestern lowlands (e.g., Beian, Nenjiang) as a practical complement to engineering controls.

Abstract

Gully erosion is a major driver of irreversible soil loss in Northeast China’s Mollisol belt, a region that supplies roughly one-quarter of the national grain output. Existing susceptibility assessments in this region have rarely combined multi-model comparison with spatially explicit cross-validation, and the predictive contribution of composite anthropogenic indicators such as the Human Footprint Index (HFI) has not been quantitatively benchmarked against conventional topographic variables. This study addresses these gaps for the Heihe region by combining an inventory of 4020 gully polygons supported by field checks in Xunke County, 16 VIF-screened environmental factors, three tree-based ensemble models and a logistic regression baseline. Under stratified random splitting, XGBoost achieved the highest discrimination (AUC = 0.95, κ = 0.74); under leave-one-district-out spatial cross-validation all tree-based models retained AUC above 0.83, confirming that random-split metrics overestimate discrimination by approximately 0.11 AUC units due to spatial autocorrelation and inter-district covariate shift. SHAP analysis identified LULC and HFI as the dominant predictors, exceeding all topographic variables, while slope gradient contributed least—consistent with the low-relief, intensively cultivated character of the study area. Susceptibility was highest in the southwestern agricultural lowlands. A one-factor sensitivity test in which only NDVI was increased by 20% suggested a reduction in modelled high-susceptibility area of approximately 12%, although co-occurring land-cover and hydrological changes were not simulated. The multi-model framework, integrating spatial cross-validation and post hoc interpretability, provides an explicit estimate of conventional evaluation optimism and supports spatially differentiated erosion management.

1. Introduction

Gully erosion ranks among the most severe forms of land degradation globally, removing fertile topsoil irreversibly, fragmenting arable land, and delivering substantial sediment loads to downstream waterways [1]. Unlike sheet or rill erosion, gully incision crosses a hydrological threshold beyond which the incised channel becomes self-reinforcing, making recovery without active intervention difficult [1,2]. In Northeast China’s Mollisol (black soil) belt, the process is particularly destructive because it affects some of the world’s most productive agricultural soils and is driven by the combined effects of seasonal freeze–thaw cycling and concentrated summer rainfall on cultivated hillslopes [3,4]. Long-term monitoring shows that woodland-to-farmland conversion increased gully area by a factor of 5.6 between 1968 and 2018, with gully erosion alone responsible for an annual soil loss of 1.46 mm [5]. Given that this region produces roughly one-quarter of China’s grain supply, spatially explicit susceptibility assessment is important from both geomorphic and food-security perspectives.
Methodological approaches to susceptibility assessment have evolved considerably over the past three decades. Early empirical indices such as RUSLE and qualitative morphological classifications [1,6] were followed by process-based models such as WEPP [7], and more recently by data-driven machine learning (ML) frameworks that are now widely used [8,9]. Empirical indices such as RUSLE combine factors multiplicatively but assume stationary, additive-in-log responses to slope, rainfall, and cover, which poorly capture the threshold-dependent dynamics of gully initiation and offer limited ability to discriminate among land-management scenarios. Process-based models require temporally resolved hydromechanical parameterization—including soil erodibility, infiltration capacity, and freeze–thaw-induced cohesion loss [3,10]—data that remain scarce across the Heihe region. By contrast, ML algorithms can learn complex, high-dimensional predictor–response relationships directly from geo-environmental covariates, and their outputs can now be interpreted using post hoc explanation methods. ML is particularly suited to the present setting because (i) the 16 environmental predictors span continuous, categorical, and distance-derived data types that conventional empirical indices cannot integrate within a single functional form; (ii) documented interactions between freeze–thaw intensity, land use, and rainfall on Mollisol hillslopes are nonlinear and at least second-order [3,4]; and (iii) post hoc explanation methods can recover variable-level and sample-level diagnostics that process-based formulations, given current data limitations, cannot easily provide.
Among ML approaches, tree-based ensemble algorithms—Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Gradient Boosting Machine (GBM)—have shown high accuracy for gully susceptibility mapping across diverse geomorphic settings. In Northeast China’s black soil region, Li et al. [9] reported area under the receiver operating characteristic curve (AUC) values of 0.88–0.96, and a recent stacking ensemble study improved individual-model accuracy by up to 27.8% [11]. Alongside model development, gully inventory construction has benefited from sub-meter and UAV imagery combined with object-based image analysis [12], though regional-scale mapping over large and heterogeneous areas still typically relies on multi-resolution imagery interpretation supported by field verification. For model interpretation, SHapley Additive exPlanations (SHAP) have been widely adopted [13], and these analyses often rank land use, NDVI, and terrain indices among the strongest predictors of gully occurrence. Despite these advances, most published studies still rely on a single algorithm and evaluate performance through random train–test splits that ignore the spatial structure of the data; nearby training and test samples share similar environmental conditions, and the resulting autocorrelation can inflate reported accuracy metrics.
Several gaps remain for the Heihe region. First, existing erosion studies in the Northeast China black-soil belt tend to treat rainfall-driven and freeze–thaw processes as independent controls rather than as co-occurring mechanisms that interact differently across the region’s contrasting forested uplands and cultivated plains [4,14]; a spatially integrated assessment capturing both regimes within a single framework is lacking. Second, while anthropogenic pressure plays a central role in gully initiation in this intensively managed landscape, the predictive value of composite indicators such as the Human Footprint Index (HFI)—which integrates road density, human population, land transformation, and other disturbance dimensions beyond what LULC alone captures—has not been systematically tested alongside conventional topographic and climatic variables for this region. Third, spatially explicit cross-validation has rarely been applied in local susceptibility studies, leaving the geographic transferability of reported model performance largely unexamined. Fourth, without a simple linear baseline, the actual gain attributable to nonlinear ensemble complexity cannot be objectively quantified under a common protocol of predictors, partitioning and spatial CV.
The Heihe region, straddling the transition from the Xiaoxing’an Mountains to the Songnen Plain, encompasses contrasting erosion regimes—from freeze–thaw-dominated forested uplands to rainfall-driven gullying in intensively farmed lowlands—within a single administrative unit, making it a suitable case study for addressing the gaps above. This study therefore combines a remote sensing-based gully inventory supported by field surveys, 16 VIF-screened environmental factors, and a four-model ML framework to (1) characterize the spatial distribution of gully erosion across the region; (2) compare RF, XGBoost, GBM, and a logistic regression baseline under both random-split and leave-one-district-out spatial cross-validation; (3) quantify predictor contributions through multi-model importance ranking and SHAP analysis, map susceptibility at five levels, and delineate priority intervention zones; and (4) simulate vegetation restoration scenarios to evaluate the potential of revegetation as a risk-reduction measure.

2. Data and Methods

2.1. Study Area

Heihe (47°42′–51°03′N, 124°45′–129°18′E) is situated in northwestern Heilongjiang Province, China, with a total area of approximately 68,726 km² (Figure 1). Elevation declines from northwest to southeast: low-to-medium mountains (300–800 m a.s.l.) account for 64.3% of the area, giving way through rolling foothills to alluvial plains at 90–120 m (35%).
The region has a cold-temperate continental monsoon climate. Mean annual temperature ranges from −1.3 to 0.4 °C and annual precipitation averages 500–550 mm, with most rainfall falling in summer. Mean January temperatures range from approximately −24 to −28 °C and mean July temperatures from 20 to 22 °C; multi-decadal extreme minima reach −40 °C only during episodic cold spells, while summer maxima rarely exceed 35 °C. The resulting deep seasonal frost and repeated freeze–thaw cycles progressively disrupt soil structure and contribute to gully headcut advance [15,16].
Dark brown forest soils cover 65.8% of the area on the hilly and mountainous terrain; Mollisols (black soils) occupy 12.9% on the lowland plains, with meadow, marsh, volcanic ash, and paddy soils making up the remainder [17,18]. The Mollisols have a deep, humus-rich A-horizon that supports high agricultural productivity under natural or managed conditions, yet becomes prone to compaction and structural breakdown once surface vegetation is cleared. Parent material composition—Quaternary loess deposits, weathered granite, and fluvial sediments—varies considerably across the landscape, resulting in marked spatial differences in soil erodibility that underlie much of the observed heterogeneity in gully density and distribution [18].

2.2. Data Used in This Study

2.2.1. Remote Sensing Data

The gully distribution inventory was compiled through structured visual interpretation of multi-source, multi-resolution satellite imagery acquired during snow-free and low-vegetation periods (May–October 2023). The primary data sources include Sentinel-2 MSI (10 m multispectral) [19], Gaofen-1 (2 m panchromatic/8 m multispectral) [20], and Gaofen-2 (0.8 m panchromatic/4 m multispectral) [21]. Sensor characteristics and acquisition parameters are summarized in Table 1. The sub-meter Gaofen imagery was the primary source for delineating individual gully polygons, while Sentinel-2 provided regional spectral context.
To improve internal consistency of the visual interpretation, gully polygons were delineated using fixed geomorphic and image-interpretation criteria: elongated incised forms, tonal or shadow contrast indicating channel walls, connection to local drainage or field-margin flow paths, and persistence across high-resolution Gaofen imagery and Sentinel-2 context where available. Ambiguous linear features were rechecked against the multi-source image stack and, for Xunke County, against the field-photo and GPS-waypoint records described below. We did not conduct a formal independent inter-interpreter agreement test (e.g., a double-blind mapping exercise with a kappa statistic); the inventory should therefore be interpreted as a structured visual product supported by field checks in Xunke County rather than as a statistically audited classification map.

2.2.2. Field Survey

The field campaign was conducted during July–August 2023 to ground-check the visually interpreted gully inventory and to characterize typical gully morphology. The campaign was centered on Xunke County, which was selected as the primary survey area because it (i) lies near the geographic center of the Heihe region, traversing the regional forest–cropland transitional belt, (ii) contains the four dominant land-cover classes (forest, dryland cropland, paddy field, and meadow), and (iii) has the second-highest gully density among the six administrative units (0.096 gullies km⁻²; see Section 3.1). Supplementary field checks were also carried out in neighboring Sunwu, Beian and Nenjiang, though with considerably fewer waypoints than in Xunke.
In total, more than 2000 handheld GPS waypoints were collected across the campaign, the majority along south-to-north traverse segments within Xunke County that intersect representative geomorphic units from forested uplands to cultivated lowlands; the remaining waypoints are distributed among the three supplementary districts. The campaign was designed as a regional-scale field-checking exercise (consistency check against remote sensing interpretation) rather than as a point-level morphometric measurement programme, and per-site quantitative attributes (e.g., wall inclination, cross-section dimensions) were observed and photographically documented but not systematically digitized. The Aihui District, dominated by dense forest with very low gully density, was not visited on the ground; morphological characterization there relies on remote-sensing evidence alone. Representative field photographs are shown in Figure 2.
Qualitative observations indicated that V-shaped cross-sections occurred more frequently on the steeper hillslopes encountered along upland traverses, consistent with active headward incision, while U-shaped profiles were more commonly observed along valley floors and cultivated field margins, consistent with lateral widening or partial infilling. Field-observed gully density was visibly higher in the cropland-dominated southwestern plains and visibly lower in the forested northeastern uplands, qualitatively consistent with the spatial patterns derived from remote-sensing extraction.

2.2.3. Environmental Factors

Sixteen environmental factors (Table 2) were selected based on their established relevance to gully erosion processes [3,9] and data availability. These factors fall into five categories:
Topographic variables (7 factors): elevation (DEM), slope, aspect, relief, topographic wetness index (TWI), topographic position index (TPI), and curvature, all derived from the 12.5 m ALOS PALSAR Radiometric Terrain Corrected (RTC) DEM product [22] and resampled to the 25 m analysis grid. The 12.5 m ALOS product is interpolated from coarser globally available DEM sources (SRTM/JAXA AW3D backbone), so its effective information content is closer to ∼30 m than to the nominal pixel size.
Climatic variables (2 factors): mean annual temperature (MAT) from ERA5-Land reanalysis (0.1° resolution, ∼10 km at study-area latitudes) [23] and total annual precipitation from TerraClimate (∼4 km resolution) [24], both representing long-term climatological means.
Vegetation and land cover (2 factors): NDVI from 2024 Landsat surface reflectance (30 m) and a project-compiled 2024 LULC raster interpreted from Sentinel-2 imagery and stored on the 25 m analysis grid, using CLUD-style class labels. The gully-mapping imagery was acquired in 2023 (Table 1), while the NDVI composite uses 2024 data because the equivalent 2023 growing-season imagery was not available within the project timeline; this one-year offset is noted as a source of predictor uncertainty.
Soil (1 factor): soil type from the HWSD2 database (1 km) [25], retained as a categorical regional-scale lithology proxy.
Anthropogenic and proximity variables (4 factors): the annual Global Human Footprint Index (HFI, 1 km) [26], and Euclidean distances to buildings, roads, and streams derived from national geographic databases. The HFI layer was resampled to 25 m by nearest-neighbor assignment. The three Euclidean-distance fields are derived from buildings, primary/secondary roads, and named streams only; they do not include field-scale agricultural footpaths and tractor tracks, which are a well-documented locus of gully head initiation in black-soil cropland [5]. These variables therefore serve as landscape-scale anthropogenic surrogates rather than direct measures of within-field disturbance.
The spatial distributions of all 16 environmental factors are presented in Figure 3. Topographic factors exhibit a clear northwest–southeast gradient corresponding to the transition from mountainous terrain to low-relief agricultural plains. Climatic variables (MAT and precipitation) display latitudinal zoning patterns, while vegetation and land cover factors reflect the mosaic of forest, cropland, and transitional zones that characterize the study area. Anthropogenic factors (HFI and proximity measures) are concentrated in the southwestern plains, coinciding with the major agricultural and settlement areas.

2.3. Methodology

The methodological framework (Figure 4) integrates multi-source environmental data, field validation, and machine learning to produce gully erosion susceptibility maps. Satellite imagery was preprocessed through geometric correction, radiometric calibration, and atmospheric correction; gully polygons were then delineated through combined visual interpretation and field verification using high-resolution imagery.

2.3.1. Sample Construction (Positive and Negative Samples)

Binary classification of gully versus non-gully pixels requires both positive and negative samples. Positive samples were obtained by rasterizing the 4020 gully polygons (Section 2.2.1) onto the 25 m analysis grid; a random subset of up to 5000 cells was retained from this mask. Negative samples were drawn at random without replacement from cells that (i) lie inside the study-area boundary, (ii) do not intersect any mapped gully polygon in the multi-source inventory (Section 2.2.1), and (iii) have complete values across all 16 environmental layers. The non-gully status of each candidate cell was verified by overlay against the full gully inventory; any cell overlapping a mapped polygon, including partial overlaps, was excluded. A 1:2 positive-to-negative ratio was adopted so that the negative class reflects the predominance of non-gully terrain in the study area while the positive class retains sufficient prevalence to avoid degenerate decision boundaries.
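The sampling rules above can be illustrated with a short sketch. It is written in Python for readability and is not the authors' implementation (the study used R); the function name, the cell representation as (row, column) tuples, and the toy grid are all assumptions made for illustration.

```python
import random


def build_samples(gully_cells, candidate_cells, max_pos=5000, neg_per_pos=2, seed=42):
    """Draw positive and negative cells at a 1:neg_per_pos ratio.

    gully_cells: set of (row, col) cells touched by any mapped gully polygon.
    candidate_cells: cells inside the study area with complete predictor values.
    """
    rng = random.Random(seed)
    positives = sorted(gully_cells)
    if len(positives) > max_pos:
        # Keep a random subset of at most max_pos gully cells.
        positives = rng.sample(positives, max_pos)
    # Exclude every candidate that overlaps a mapped gully, even partially.
    pool = sorted(c for c in set(candidate_cells) if c not in gully_cells)
    n_neg = min(len(pool), neg_per_pos * len(positives))
    negatives = rng.sample(pool, n_neg)  # sampling without replacement
    return positives, negatives


# Toy example: a 20 x 20 grid with 10 gully cells on the diagonal.
toy_gully = {(r, r) for r in range(10)}
toy_grid = [(r, c) for r in range(20) for c in range(20)]
pos, neg = build_samples(toy_gully, toy_grid)
print(len(pos), len(neg))  # 10 20
```

Sorting the cells before sampling makes the draw reproducible for a given seed, which is a design choice of this sketch rather than something stated in the text.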
To reduce model-specific bias, four machine learning algorithms were employed and compared: Logistic Regression (LR) as a linear baseline, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Gradient Boosting Machine (GBM). The three tree-based ensemble algorithms all aggregate multiple decision trees but differ in their ensemble strategies, so comparing them helps assess prediction stability and identify consensus patterns. The inclusion of LR provides a benchmark for quantifying the improvement gained by nonlinear ensemble methods.
Logistic Regression (LR) serves as the linear baseline model, fitting a generalized linear model with a logit link function to estimate gully occurrence probability as a function of the 16 environmental predictors.
Random Forest (RF) constructs an ensemble of independent classification trees using bootstrap aggregation (bagging). Each tree is grown on a random bootstrap sample of the training set, and at each split a random subset of features is evaluated. The final prediction is determined by majority voting across all trees. Feature importance is assessed through permutation-based mean decrease in accuracy. RF is implemented with the ranger package in R [27]. Three hyperparameters were tuned by random search over 50 candidate configurations under 5-fold cross-validation on the training set: number of trees num.trees ∈ [200, 1000], number of variables tried per split mtry ∈ [2, 10], and minimum terminal node size min.node.size ∈ [1, 10]. The maximization objective for the search was cross-validated AUC. The search ranges and selected best configuration are summarized in Table 3.
Extreme Gradient Boosting (XGBoost) builds trees sequentially, with each new tree fitted to the residual errors of the previous ensemble. Regularization parameters (L1 and L2 penalties) are incorporated to limit overfitting. Six hyperparameters were tuned by random search over 50 candidate configurations under 5-fold cross-validation: learning rate η ∈ [0.01, 0.3], maximum tree depth ∈ [3, 10], subsample ratio ∈ [0.5, 1.0], column sample ratio per tree ∈ [0.5, 1.0], minimum child weight ∈ [1, 10], and minimum loss reduction γ ∈ [0, 5]. The maximum number of boosting rounds was capped at 500, with the operational number of rounds for each candidate determined within the inner 5-fold cross-validation by early stopping (patience = 30 rounds). XGBoost is implemented with the xgboost package in R version 4.5.3 (R Core Team, 2026) and natively supports the computation of SHapley Additive exPlanations (SHAP) values, which were used for global and local interpretability analysis [13]. Search ranges and the selected best configuration are reported in Table 3.
Gradient Boosting Machine (GBM) also follows a sequential boosting strategy but uses Friedman’s gradient descent approach with a Bernoulli loss function for binary classification. Four hyperparameters were tuned by random search over 50 candidate configurations under 5-fold cross-validation: number of trees n.trees ∈ [200, 1000], interaction depth ∈ [3, 7], shrinkage rate ∈ [0.01, 0.1], and bag fraction ∈ [0.5, 1.0]. The operational number of trees for each candidate was determined within the inner 5-fold cross-validation by monitoring the out-of-bag deviance improvement. GBM is implemented with the gbm package in R. Search ranges and the selected best configuration are reported in Table 3.
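As a rough illustration of the random-search procedure, the Python sketch below draws 50 candidate configurations from the XGBoost ranges stated above and keeps the best-scoring one. Several things here are assumptions: the text does not say how candidates were sampled (uniform sampling is used here), minimum child weight is treated as an integer, and `toy_score` is a hypothetical placeholder for the mean 5-fold cross-validated AUC that the study maximized in R.

```python
import random

# Search ranges as stated in the text for XGBoost.
XGB_SPACE = {
    "eta": ("float", 0.01, 0.3),
    "max_depth": ("int", 3, 10),
    "subsample": ("float", 0.5, 1.0),
    "colsample_bytree": ("float", 0.5, 1.0),
    "min_child_weight": ("int", 1, 10),
    "gamma": ("float", 0.0, 5.0),
}


def sample_config(space, rng):
    """Draw one configuration uniformly from the search space (assumed scheme)."""
    cfg = {}
    for name, (kind, lo, hi) in space.items():
        cfg[name] = rng.randint(lo, hi) if kind == "int" else rng.uniform(lo, hi)
    return cfg


def random_search(space, score_fn, n_candidates=50, seed=0):
    """Return the best of n_candidates random configurations under score_fn."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_candidates):
        cfg = sample_config(space, rng)
        score = score_fn(cfg)  # stand-in for mean 5-fold cross-validated AUC
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_score(cfg):
    """Hypothetical objective used only to make the sketch runnable."""
    return -abs(cfg["eta"] - 0.1) - 0.01 * abs(cfg["max_depth"] - 6)


best_cfg, best_score = random_search(XGB_SPACE, toy_score)
print(best_cfg)
```

In a real run, `score_fn` would train the model on four folds, evaluate AUC on the fifth, and average over the five folds; the same loop applies to the RF and GBM ranges with a different search space.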
All four models were trained on the same 70% stratified random training set and evaluated on the held-out 30% test set. Model performance was assessed using accuracy, Cohen’s κ coefficient, AUC, precision, recall, and F1-score for both classes (gully and non-gully). To assess the impact of spatial autocorrelation on reported metrics, a leave-one-district-out spatial cross-validation (Spatial CV) was additionally conducted, in which each of the six administrative districts was held out in turn as the test set while the remaining five districts were used for training. Full-extent susceptibility predictions were generated by applying each tree-based model to the 16-layer environmental factor raster stack, and the resulting continuous probability maps were classified into five susceptibility levels (very low, low, moderate, high, very high) using the Jenks natural breaks algorithm.
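The two evaluation designs described above differ only in how samples are partitioned. The Python sketch below shows one possible way to build a 70/30 stratified random split and the leave-one-district-out folds; it is an illustration, not the study's R code. The toy labels are invented, and `District_6` is a placeholder for the sixth administrative unit, which is not named in this section.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices so each class keeps roughly train_frac in the training set."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)


def leave_one_district_out(districts):
    """Yield (held_out, train_idx, test_idx), holding out one district per fold."""
    by_district = defaultdict(list)
    for i, d in enumerate(districts):
        by_district[d].append(i)
    for held_out in sorted(by_district):
        test_idx = by_district[held_out]
        train_idx = [i for d, idx in by_district.items() if d != held_out for i in idx]
        yield held_out, sorted(train_idx), test_idx


# Toy data: 10 gully (1) and 20 non-gully (0) samples, matching the 1:2 ratio.
labels = [1] * 10 + [0] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 21 9

# Toy district labels, two samples per district.
districts = ["Aihui", "Beian", "Nenjiang", "Sunwu", "Xunke", "District_6"] * 2
folds = list(leave_one_district_out(districts))
print([name for name, _, _ in folds])
```

Under the random split, test samples can sit next to training samples from the same district; under the district folds they cannot, which is why the two designs give different AUC estimates.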
Beyond single-model assessment, further analyses were conducted: (a) SHAP analysis for the XGBoost model to provide feature-level interpretability; (b) frequency ratio (FR) validation to independently verify susceptibility class discrimination by computing the ratio of observed gully density in each susceptibility class to the overall study-area density; and (c) vegetation restoration scenario simulations (NDVI +10% and +20%) to evaluate the potential of revegetation in reducing high-susceptibility areas. Model uncertainty was quantified as the pixel-level standard deviation of susceptibility probabilities across the three models, and priority treatment zones were identified by overlaying high-susceptibility consensus areas with observed gully density hotspots.
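Two of the analyses above reduce to simple raster arithmetic, sketched here in Python under stated assumptions. The text does not say whether a population or sample standard deviation was used (population is assumed), nor whether scaled NDVI was capped (a cap at 1.0 is assumed), and the class threshold is assumed to stay fixed between baseline and scenario. The predictor function and all values are hypothetical toys, not the study's model or data.

```python
from statistics import pstdev


def pixelwise_uncertainty(prob_maps):
    """Standard deviation of predicted probability across models, per pixel.

    prob_maps: list of equally sized 2D lists, one per model.
    """
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    return [[pstdev([m[r][c] for m in prob_maps]) for c in range(cols)]
            for r in range(rows)]


def high_class_reduction(predict, pixels, high_threshold, ndvi_factor=1.2):
    """Relative drop in the number of high-susceptibility pixels after scaling NDVI."""
    baseline = sum(predict(p) >= high_threshold for p in pixels)
    # Only NDVI changes; all other predictors are held fixed (one-factor test).
    scenario_pixels = [dict(p, ndvi=min(1.0, p["ndvi"] * ndvi_factor)) for p in pixels]
    scenario = sum(predict(p) >= high_threshold for p in scenario_pixels)
    return (baseline - scenario) / baseline if baseline else 0.0


# Toy probability maps for three models on a 1 x 2 grid.
rf, xgb, gbm = [[0.9, 0.2]], [[0.9, 0.4]], [[0.9, 0.3]]
uncertainty = pixelwise_uncertainty([rf, xgb, gbm])
print(uncertainty)

# Toy model in which susceptibility falls linearly with NDVI.
toy_pixels = [{"ndvi": 0.2}, {"ndvi": 0.45}, {"ndvi": 0.5}, {"ndvi": 0.8}]
reduction = high_class_reduction(lambda p: 1.0 - p["ndvi"], toy_pixels, 0.5)
print(round(reduction, 3))  # 0.667
```

The first pixel, where all three toy models agree, has zero uncertainty; the second, where they disagree, does not.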
Resolution Harmonization
All 16 environmental factors were resampled to a common 25 m analysis grid prior to model training. The 25 m working resolution was chosen as a pragmatic compromise: it is fine enough to encode the within-field heterogeneity of the visually interpreted gully polygons (whose minimum mapping unit is 300 m²), and coarse enough to limit the artificial introduction of structure not present in the coarser climate, soil and HFI source layers. Fine-resolution layers were aggregated to 25 m by mean resampling for continuous fields; coarse-resolution climate layers were disaggregated by bilinear interpolation; categorical variables (LULC, soil type) used nearest-neighbor assignment; and HFI was assigned by nearest neighbor without spatial smoothing.
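Two of the resampling rules above can be sketched in a few lines of Python. This is a simplified illustration that assumes the source and target grids are perfectly aligned and related by an integer factor (for example, factor 2 for 12.5 m to 25 m, or factor 40 for 1 km to 25 m); real rasters would normally be handled by a GIS library, and bilinear interpolation is not shown.

```python
def aggregate_mean(grid, factor):
    """Aggregate a fine 2D grid to a coarser one by block mean."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [grid[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(grid, factor):
    """Repeat each coarse cell factor x factor times (nearest neighbor)."""
    return [[grid[r // factor][c // factor]
             for c in range(len(grid[0]) * factor)]
            for r in range(len(grid) * factor)]


# A 2 x 2 block of fine elevation values collapses to one 25 m cell.
coarse = aggregate_mean([[1, 3], [5, 7]], 2)
print(coarse)  # [[4.0]]

# Each coarse categorical or HFI value is copied to the finer cells it covers.
fine = upsample_nearest([[1, 2]], 2)
print(fine)  # [[1, 1, 2, 2], [1, 1, 2, 2]]
```

Nearest-neighbor assignment copies values without creating new ones, which is why it is the natural choice for categorical layers such as LULC and soil type.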
Scale Considerations
Resampling coarse-resolution fields (ERA5-Land MAT at ∼10 km, TerraClimate precipitation at ∼4 km, HFI at 1 km, HWSD2 soil at 1 km) to the 25 m grid is a coordinate harmonization, not a downscaling that adds spatial information. Climate, HFI and soil-type predictor contributions should therefore be interpreted as regional-scale gradient signals rather than 25 m-resolved forcings [28].
Multicollinearity Screening
Multicollinearity among predictors was assessed by variance inflation factor (VIF) [29,30]. All 16 variables fell below the conventional VIF < 10 threshold and were retained (Table 4). The two highest values—Slope (VIF = 8.04) and TWI (VIF = 8.41)—reflect their shared derivation from the same DEM but encode distinct geomorphic dimensions (gravitational potential versus flow concentration); both were kept because tree-based algorithms are less sensitive to collinearity than OLS-style regression [31].
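For readers unfamiliar with the statistic, the VIF of predictor j equals 1/(1 − R²_j), where R²_j comes from regressing predictor j on all the others; equivalently, it is the j-th diagonal element of the inverse of the predictor correlation matrix. The Python sketch below uses the second form. It is an illustration with invented toy values, not the study's computation, and it assumes no predictor is constant or perfectly collinear with the others.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    centered = []
    for col in columns:
        m = sum(col) / len(col)
        centered.append([x - m for x in col])
    norms = [sum(x * x for x in col) ** 0.5 for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Two toy predictors with correlation 0.8, so VIF = 1 / (1 - 0.64).
toy_a = [1.0, 2.0, 3.0, 4.0]
toy_b = [1.0, 3.0, 2.0, 4.0]
vifs = vif([toy_a, toy_b])
retained = [v < 10 for v in vifs]  # conventional VIF < 10 screening rule
print([round(v, 2) for v in vifs], retained)  # [2.78, 2.78] [True, True]
```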

3. Results

3.1. Spatial Distribution of Gully Erosion

A total of 4020 mapped gully polygons were identified across the study area (Table 5; Figure 5), exhibiting pronounced spatial heterogeneity. At the district scale, gully density peaks in Sunwu County (0.105/km²) and Xunke County (0.096/km²), both dominated by cultivated hillslopes, while the predominantly forested Aihui District contains only 76 gullies (0.005/km²). Gully clusters are concentrated within cropland-dominated catchments where tillage-disturbed topsoil is exposed to concentrated runoff during spring snowmelt and summer convective storms. In this dataset, gully occurrence is highest on gentle-to-moderate slopes rather than on the steepest terrain, indicating that in this low-relief landscape, prolonged flow convergence in poorly drained agricultural depressions, rather than gravity-driven overland flow, is likely to be the main initiation mechanism.
To quantify these patterns, gully-cell occurrence was tabulated against LULC class (Table 6) and slope-gradient class (Table 7). For each predictor class i the frequency ratio is defined as
$$\mathrm{FR}_i = \frac{\text{gully cells in class } i \,/\, \text{total gully cells}}{\text{area cells in class } i \,/\, \text{total area cells}},$$
so FR > 1 indicates overrepresentation of gullies in that class relative to its areal share, and FR < 1 indicates underrepresentation.
At the pixel scale, gully cells are strongly concentrated in the Sparse Woodland class (FR = 2.49): this single class accounts for 37.7% of the analysis-area cells but contains 93.9% of all gully cells. Dryland Cropland and Paddy Field are strongly underrepresented (FR = 0.027 and 0.070). This contrasts with the district-scale pattern in which the most cultivated districts (Sunwu, Xunke) have the highest gully density (Table 5), suggesting that many mapped gully channels intersect cells labelled as Sparse Woodland due to vegetated thalwegs and field margins in the LULC map. The slope-gradient table confirms that gully cells are concentrated on gentle slopes (FR = 1.50 in the 0–2° bin, 1.03 in 2–5°), dropping sharply on steeper terrain (FR = 0.01 in the >20° bin).
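The Sparse Woodland figure can be checked directly from the two shares quoted above. The short Python sketch below applies the frequency-ratio definition to those percentages; the function name is an illustrative choice, and percentages are passed in place of raw cell counts, which gives the same ratio.

```python
def frequency_ratio(gully_in_class, gully_total, cells_in_class, cells_total):
    """Share of gully cells in a class divided by the class's share of all cells."""
    return (gully_in_class / gully_total) / (cells_in_class / cells_total)


# Sparse Woodland: 93.9% of gully cells on 37.7% of analysis-area cells.
fr_sparse_woodland = frequency_ratio(93.9, 100.0, 37.7, 100.0)
print(round(fr_sparse_woodland, 2))  # 2.49
```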

3.2. Multi-Model Performance Comparison

Four models (LR as the baseline, RF, XGBoost, and GBM) were trained on the same predictor set and evaluated on the independent test set obtained from a 70%/30% stratified random split. As shown in Table 8 and Figures 6 and 7, all three tree-based ensemble models achieved strong predictive performance, with AUC values exceeding 0.93, and outperformed the LR baseline (AUC = 0.85), suggesting that the multi-source environmental factor framework and nonlinear ensemble methods are useful for this task. The random-split AUC values are optimistic estimates of out-of-region accuracy because positive and negative test samples share spatial structure with the training set; the leave-one-district-out estimates reported under Spatial Cross-Validation later in this section (AUC ≈ 0.84, ΔAUC ≈ 0.10–0.11) provide a more reliable indicator of geographic transferability. Figure 6 displays the metric comparison across RF, XGBoost, and GBM, while Figure 7 shows the ROC curves for all four models including LR.
XGBoost achieved the highest overall performance, with an AUC of 0.95, accuracy of 0.89, and Kappa coefficient of 0.74. RF ranked second with an AUC of 0.95, accuracy of 0.88, and Kappa of 0.72, while GBM yielded slightly lower but still robust metrics (AUC = 0.94, accuracy = 0.86, Kappa = 0.68). The LR baseline exhibited markedly lower performance (AUC = 0.85, Kappa = 0.45, F1 = 0.62), indicating that substantial nonlinear predictor interactions exist that linear models fail to capture.
In terms of class-specific performance, XGBoost demonstrated balanced discrimination for both gully (class 1) and non-gully (class 0) samples, achieving a recall of 0.84 for gully detection and a precision of 0.81. RF exhibited slightly higher precision for the non-gully class (0.92) but marginally lower gully recall (0.83). GBM showed the lowest gully recall among tree-based models (0.79), while LR showed the poorest gully recall overall (0.60).
The consistently high performance of all three tree-based models (AUC > 0.93) suggests that the selected environmental factors capture major spatial controls on gully erosion and that ensemble methods can represent nonlinear interactions among anthropogenic, topographic, and climatic variables. The AUC improvement of the best ensemble model (XGBoost) over LR is 0.10. The stronger performance of XGBoost may reflect its built-in regularization mechanisms and sequential error-correction strategy.
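For reference, the metrics reported in this section can all be derived from predicted scores and true labels. The Python sketch below computes accuracy, Cohen's κ, precision, recall, and F1 from a confusion matrix, and AUC as the probability that a random gully sample outscores a random non-gully sample (ties counted as one half). The labels, scores, and 0.5 decision threshold are invented toy values, not the study's data or settings.

```python
def safe_div(a, b):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return a / b if b else 0.0


def confusion_metrics(y_true, y_pred):
    """Accuracy, Cohen's kappa, and gully-class precision, recall, and F1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    precision, recall = safe_div(tp, tp + fp), safe_div(tp, tp + fn)
    return {
        "accuracy": accuracy,
        "kappa": safe_div(accuracy - chance, 1 - chance),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


def auc(y_true, scores):
    """Rank-based AUC: share of (gully, non-gully) pairs ordered correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy sample: 3 gully and 6 non-gully cells (1:2 ratio).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1]
y_pred = [int(s >= 0.5) for s in scores]
metrics = confusion_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(round(toy_auc, 3), round(metrics["kappa"], 2))  # 0.944 0.5
```

AUC depends only on how samples are ranked, whereas κ, precision, and recall depend on the chosen threshold, which is one reason the two kinds of metric can rank models differently.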

3.2. Spatial Cross-Validation

To evaluate the impact of spatial autocorrelation on model performance estimates, a leave-one-district-out spatial cross-validation was performed (Table 9). In each fold, one of the six administrative districts was held out as the test set while the remaining five were used for training.
Compared with the random-split evaluation, spatial CV produced lower AUC values by 0.10–0.11 for tree-based models and by 0.05 for LR. Because the leave-one-district-out folds differ in both spatial proximity and covariate distribution (e.g., Aihui is heavily forested with 76 gullies; Beian is intensively cultivated), the AUC gap reflects both spatial autocorrelation and inter-district covariate shift. The degradation was most pronounced for XGBoost (ΔAUC = 0.11) and least for LR (ΔAUC = 0.05), consistent with the finding that high-capacity ensembles are more susceptible to overfitting spatially structured signals [28]. Despite the expected decline, all tree-based models maintained AUC > 0.83 under spatial CV and LR achieved 0.80, indicating adequate spatial generalization. The relatively large F1-score standard deviations (RF: 0.26; XGBoost: 0.25; GBM: 0.28) reflect pronounced inter-district variability in class balance and covariate distribution.
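The fold construction described above can be sketched in a few lines. This is an illustrative sketch only: the helper `leave_one_district_out` and the ten-sample district list are hypothetical, and the actual model fitting and scoring inside each fold are omitted.

```python
from collections import defaultdict


def leave_one_district_out(districts):
    """Yield (held_out_district, train_indices, test_indices) for each fold.

    `districts` holds one district label per sample; every sample from the
    held-out district goes to the test set, all others to the training set.
    """
    by_district = defaultdict(list)
    for index, name in enumerate(districts):
        by_district[name].append(index)

    for held_out in sorted(by_district):
        test_idx = by_district[held_out]
        train_idx = [i for name, idx in by_district.items()
                     if name != held_out for i in idx]
        yield held_out, train_idx, test_idx


# Hypothetical sample-to-district assignment for ten samples.
districts = ["Aihui", "Beian", "Beian", "Nenjiang", "Wudalianchi",
             "Xunke", "Sunwu", "Sunwu", "Nenjiang", "Aihui"]
folds = list(leave_one_district_out(districts))
```

Because no district contributes samples to both sides of a fold, spatially adjacent positives and negatives cannot leak between training and testing, which is the property the random split lacks.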

3.3. Environmental Factor Importance and Interpretability

The relative importance of 16 environmental factors was assessed through model-specific importance metrics across all three algorithms, supplemented by SHAP (SHapley Additive exPlanations) analysis for the XGBoost model.
As shown in Figure 8, all three models consistently identified land use/land cover (LULC) and the Human Footprint Index (HFI) as the two most influential predictors. In RF and XGBoost, LULC ranked first (normalized importance = 1.00), followed by HFI (0.72 and 0.87, respectively). In GBM, HFI ranked first, with LULC second. This cross-model consensus indicates that anthropogenic factors are the strongest predictive correlates of gully erosion susceptibility in the Heihe region, though variable importance reflects predictive association rather than direct causation.
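A normalized importance of 1.00 for the top-ranked factor suggests that each model's raw importances were rescaled by their maximum. The sketch below illustrates that rescaling under this assumption; the raw scores are hypothetical and chosen only so that the result matches the RF ratio quoted above (HFI = 0.72 relative to LULC).

```python
def normalize_importance(raw_scores):
    """Rescale importances so that the largest value equals 1.0."""
    top = max(raw_scores.values())
    return {name: score / top for name, score in raw_scores.items()}


# Hypothetical raw importances; only their ratios matter after rescaling.
raw_rf = {"LULC": 0.25, "HFI": 0.18, "DEM": 0.16, "Slope": 0.02}
normalized_rf = normalize_importance(raw_rf)
```

Max-scaling makes rankings comparable across models whose native importance measures use different units, but the scaled values should not be compared as absolute effect sizes.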
Elevation (DEM) and mean annual temperature (MAT) occupied the third and fourth positions across all models, with normalized importance values ranging from 0.54 to 0.77, reflecting the fundamental topographic and thermal gradients governing seasonal freezing extent and the transition between agricultural plains and forested uplands. Distance to buildings showed moderate importance (0.22–0.43), serving as a proxy for localized anthropogenic disturbance. Total annual precipitation, terrain relief, NDVI, TPI, and distance to streams exhibited lower but consistent importance. Notably, slope gradient ranked among the least important predictors across all three models (normalized importance < 0.13), contrasting with the typically high importance of slope in many gully susceptibility studies on steep terrain (e.g., the Loess Plateau) and reflecting the low-relief character of the Heihe agricultural plains.
SHAP analysis for the XGBoost model (Figure 9) confirmed LULC as the most influential feature (mean |SHAP| = 1.168), followed by HFI (0.724), DEM (0.531), MAT (0.496), and Relief (0.458). The SHAP summary plot revealed that high LULC values (cropland and sparse woodland) consistently pushed predictions toward higher susceptibility, in agreement with the per-class FR in Table 6, while HFI exhibited a threshold effect with SHAP values increasing sharply beyond approximately 10. DEM showed a negative monotonic relationship, with lower elevations associated with higher susceptibility. Curvature, aspect, and slope contributed the least (mean |SHAP| < 0.13).
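The mean |SHAP| ranking is simply the per-feature average of absolute SHAP values over all samples. The following sketch assumes a matrix of SHAP values has already been produced by an explainer; the helper `mean_abs_shap` and the three-sample matrix are hypothetical and do not reproduce the study's values.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across samples."""
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    scores = {name: total / len(shap_rows)
              for name, total in zip(feature_names, totals)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP matrix: rows are samples, columns follow feature_names.
feature_names = ["LULC", "HFI", "Slope"]
shap_rows = [
    [1.2, -0.6, 0.10],
    [-1.0, 0.8, -0.05],
    [1.3, 0.7, 0.00],
]
ranking = mean_abs_shap(shap_rows, feature_names)
```

Taking absolute values before averaging matters: a feature that pushes some predictions up and others down would otherwise average toward zero despite being influential.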

3.4. Gully Erosion Susceptibility Mapping

Full-extent susceptibility maps were generated by applying all three trained models to the 16-layer environmental factor raster stack. Continuous probability outputs were classified into five susceptibility levels using the Jenks natural breaks algorithm (Table 10).
Table 11 summarizes the area and percentage of each susceptibility class for the three models. Across all models, the very-low class dominates (58–72% of the study area), while the very-high class covers 4.6–6.2%, confirming the localized nature of severe gully erosion susceptibility. XGBoost assigns the largest proportion to the very-low class (72.0%); this reflects sharper class separation in its output probability distribution, evident in the higher High/Very High Jenks break (0.708 for XGBoost vs. 0.607 for RF and 0.658 for GBM in Table 10), rather than a stricter probability calibration in any post hoc calibrated sense.
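Once the Jenks breaks are known, assigning each pixel to a susceptibility level is a lookup against the sorted break values. In the sketch below only the High/Very High break (0.708, XGBoost) comes from the text; the three lower breaks are hypothetical placeholders, and treating each break as the inclusive upper bound of the lower class is an assumption.

```python
from bisect import bisect_left
from collections import Counter

CLASSES = ["Very Low", "Low", "Moderate", "High", "Very High"]

# First three breaks are hypothetical; 0.708 is the XGBoost High/Very High break.
XGB_BREAKS = [0.12, 0.30, 0.50, 0.708]


def classify(probability, breaks=XGB_BREAKS):
    """Map a probability to a susceptibility class (upper bounds inclusive)."""
    return CLASSES[bisect_left(breaks, probability)]


def class_shares(probabilities, breaks=XGB_BREAKS):
    """Percentage of pixels falling in each susceptibility class."""
    counts = Counter(classify(p, breaks) for p in probabilities)
    return {name: 100.0 * counts[name] / len(probabilities) for name in CLASSES}


# Hypothetical pixel probabilities.
pixels = [0.02, 0.05, 0.08, 0.20, 0.45, 0.60, 0.708, 0.90]
shares = class_shares(pixels)
```

A higher top break, as reported for XGBoost, moves pixels out of the very-high class without any change in the underlying probabilities, which is why class areas are not directly comparable across models with different breaks.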
The three models produced broadly consistent susceptibility patterns (Figure 10, Figure 11 and Figure 12), characterized by a “high in the southwest, low in the northwest” spatial gradient. Very-high-susceptibility zones were concentrated in the agricultural plains of southwestern Beian City, western Nenjiang City, and parts of Wudalianchi City, while very-low-susceptibility zones corresponded to the forested uplands in the northern Aihui District and southeastern Xunke County. Moderate susceptibility zones formed a transitional belt along the forest–cropland boundary.
Despite the overall spatial agreement, localized differences among models were observed. XGBoost produced slightly sharper spatial gradients at the forest–cropland boundary, likely reflecting its stronger regularization and sequential error-correction. GBM tended to assign moderately higher susceptibility values in transitional zones, while RF exhibited the smoothest spatial transitions. Model uncertainty, quantified as the pixel-level standard deviation across three models (Figure 13), was highest along the forest–cropland transitional belt, indicating that these areas are most sensitive to model selection.
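The pixel-level uncertainty measure can be illustrated with a small sketch. The text does not state whether the population or sample standard deviation was used; the population form (`statistics.pstdev`) is assumed here, and the one-row probability grids are hypothetical.

```python
from statistics import pstdev


def model_disagreement(*grids):
    """Per-pixel population standard deviation across model probability grids.

    Each grid is a list of rows with identical shape.
    """
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [[pstdev([grid[r][c] for grid in grids]) for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical 1 x 2 grids: the models agree on the first pixel (forest
# interior) and disagree on the second (forest-cropland transition).
rf_grid = [[0.10, 0.50]]
xgb_grid = [[0.10, 0.20]]
gbm_grid = [[0.10, 0.80]]
uncertainty = model_disagreement(rf_grid, xgb_grid, gbm_grid)
```

With only three models the standard deviation is a coarse indicator, so it is best read as a relative map of where model choice matters rather than as a calibrated uncertainty.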

3.5. Frequency Ratio Validation

To independently verify that the susceptibility classification discriminates gully-prone terrain, a frequency ratio (FR) analysis was performed for each model. For every Jenks-classified susceptibility level, the proportion of observed gully pixels within that class was divided by the proportion of total study-area pixels in the same class. An FR > 1 indicates that gullies are overrepresented relative to the area, while FR < 1 indicates underrepresentation. As shown in Table 12, FR values increase monotonically from very-low to very-high susceptibility across all three models. In the very-high class, FR ranges from 10.48 (GBM) to 14.55 (XGBoost), meaning gully occurrence is roughly 10–15 times more concentrated than random expectation. Conversely, the very-low class consistently yields FR < 0.10, indicating that forested uplands classified as very-low susceptibility contain negligible gully occurrence. The monotonic FR gradient is consistent with the interpretation that the susceptibility classification captures the underlying spatial pattern of gully erosion.
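The frequency ratio defined above can be written out as a short sketch. The pixel counts are hypothetical and were chosen only to show an FR pattern of the kind described; they are not the values in Table 12.

```python
def frequency_ratio(gully_pixels, class_pixels):
    """FR per class: share of gully pixels divided by share of total area."""
    total_gully = sum(gully_pixels.values())
    total_area = sum(class_pixels.values())
    return {
        name: (gully_pixels[name] / total_gully) / (class_pixels[name] / total_area)
        for name in class_pixels
    }


# Hypothetical counts, ordered from very-low to very-high susceptibility.
class_pixels = {"Very Low": 7000, "Low": 1500, "Moderate": 800,
                "High": 400, "Very High": 300}
gully_pixels = {"Very Low": 5, "Low": 10, "Moderate": 15,
                "High": 25, "Very High": 45}

fr = frequency_ratio(gully_pixels, class_pixels)
fr_values = list(fr.values())
is_monotonic = all(a < b for a, b in zip(fr_values, fr_values[1:]))
```

In this toy case the very-high class holds 45% of gully pixels on 3% of the area, giving FR = 15, while the very-low class yields FR ≈ 0.07.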

3.6. District-Level Analysis and Priority Zones

District-level statistics (Table 13) revealed substantial spatial heterogeneity across the six administrative units of Heihe City. Sunwu County exhibited the highest gully erosion ratio (defined as the total gully-affected area divided by the district area, expressed as a percentage; 0.170%) and gully density (0.105 gullies/km²) despite its relatively small area, indicating concentrated erosion hotspots. Beian City ranked second in gully density (0.083 gullies/km²) and had the highest proportion of high-susceptibility areas across all three models (RF: 33.4%, XGBoost: 21.3%, GBM: 29.0%). In contrast, Aihui District exhibited the lowest erosion metrics, with only 76 identified gullies and a negligible erosion ratio (0.007%).
Priority treatment zones were identified by overlaying multi-model high-susceptibility consensus areas with observed gully density hotspots (Figure 14). Level-1 priority zones were most extensive in Nenjiang City (635.7 km²) and Beian City (584.2 km²), corresponding to the intensively cultivated plains with both high predicted susceptibility and high observed gully density. Level-2 and Level-3 priority zones extended into Wudalianchi City, Xunke County, and Sunwu County, covering areas where moderate susceptibility overlaps with emerging erosion trends.
The gully erosion ratio analysis (Figure 15a) highlights the disproportionate erosion intensity in Sunwu County and Xunke County relative to their administrative area, while the multi-model comparison (Figure 15b) demonstrates consistent ranking among the three models with XGBoost producing slightly more conservative high-susceptibility estimates. These cross-model discrepancies are most pronounced in transitional zones between cultivated and forested landscapes, reinforcing the need for multi-model comparative assessment strategies.

4. Discussion

4.1. Multi-Factor Associations and Spatial Patterns

The multi-model susceptibility assessment shows a “high in the southwest, low in the northwest” spatial pattern in gully erosion susceptibility across the Heihe region, closely aligned with the spatial distribution of key environmental factors (Figure 3).
The HFI (Figure 3m) displays a pattern of “high in the southwest and low in the northwest,” with over 70% of high-intensity areas concentrated in the western and southwestern plains, corresponding to the low-elevation plain zone that supports over 80% of the city’s cultivated land. The spatial coupling between intensive human activity and flat terrain may increase anthropogenic stress, reduce soil aggregate stability, and lower hydrological connectivity thresholds, thereby facilitating the transition from rill to gully erosion. The LULC pattern (Figure 3k) further modulates erosion processes: cultivated land and forest land, accounting for approximately 45.3% and 46.7%, respectively, represent two contrasting erosion response systems. Cultivated areas form “erosion vulnerability windows” during post-harvest and spring snowmelt periods, while forested areas act as natural erosion barriers through canopy interception, root stabilization, and humus protection [32]. A scale-of-evidence caveat applies: at the district scale gullies cluster in cropland-dominated areas (Sunwu, Xunke), yet at the pixel scale the highest FR belongs to the Sparse Woodland class (FR = 2.49; Table 6), likely because vegetated thalwegs and field margins surrounding gully channels are mapped as woodland in the LULC raster. The LULC importance signal therefore reflects a gully-corridor configuration rather than a simple “high LULC value = high susceptibility” relationship.
SHAP analysis (Figure 9) provides feature-level insight into the nonlinear predictor–susceptibility relationships. High LULC values (cropland and sparse woodland classes) consistently push predictions toward higher susceptibility, with the Sparse Woodland class producing the strongest positive SHAP values—consistent with the pixel-level FR pattern (Table 6). HFI exhibits a threshold effect—SHAP values increase sharply beyond an HFI of approximately 10, suggesting that anthropogenic disturbance has limited impact on gully initiation below this threshold, but increases more rapidly above it. DEM shows a negative monotonic SHAP pattern: lower elevations are associated with higher susceptibility, whereas forested uplands above 300 m are strongly protective. MAT displays a positive marginal effect consistent with the acceleration of freeze–thaw weathering and earlier snowmelt in warmer zones. SHAP and variable importance values reflect statistical predictive associations rather than mechanistic causation; targeted field experiments would be required to confirm the causal pathways implied by these patterns.
The frequency ratio validation (Table 12) supports the physical meaningfulness of the susceptibility classification: FR values increase monotonically from <0.1 in the very-low class to 10–15 in the very-high class across all three models. This pattern suggests that the Jenks-classified maps reflect spatial differentiation in gully occurrence probability rather than statistical artifacts—a validation dimension that many published susceptibility studies based solely on AUC and accuracy metrics overlook. The achieved AUC values (0.94–0.95) are broadly consistent with recent assessments in comparable environments: Li et al. [9] reported 0.88–0.96 for multiple ML algorithms in Northeast China’s black soil region, and Zhang et al. [11] reported strong performance for stacking ensembles in the broader black-soil belt. The inclusion of LR as a linear baseline shows that nonlinear ensembles improve discrimination by 0.10 AUC units, consistent with reported gains in multi-model comparisons [11].
The dominance of HFI over classical topographic indices (slope, TWI) reflects a geomorphic context in which human-driven soil disturbance is the most informative predictive signal for gully occurrence in this region [9]. On steeper terrain such as the Loess Plateau, slope often ranks among the top predictors; in this dataset the ranking is reordered, with anthropogenic indicators (LULC, HFI) above all topographic variables, consistent with Zhang et al. [5] who identified land-use conversion as the primary driver of gully expansion in the Mollisol region. This reordering does not imply independence between anthropogenic and topographic controls, because HFI, LULC and elevation are intercorrelated through the flat-land farming pathway. A partial Spearman correlation analysis on the joint sample (n = 13,734) shows that the HFI–gully association drops from ρ = 0.467 (marginal) to ρ = 0.205 (p < 10⁻¹²⁹) when DEM, slope, MAT and precipitation are controlled for, attenuated but still significant (Table 14, top). A four-group variance partitioning (Table 14, bottom) yields an adjusted R² of 0.286, with unique fractions of 0.030 (anthropogenic), 0.017 (soil–vegetation), 0.015 (topographic) and 0.012 (climatic); the four-group shared fraction (0.080) is the single largest component, confirming that the predictor families co-vary rather than act independently.
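The variance-partitioning figures quoted above can be checked with simple arithmetic. The sketch below assumes, as is standard for variance partitioning, that the unique and shared fractions sum to the total adjusted R²; the two-group and three-group shared fractions are not listed in this section, so only their combined size is inferred.

```python
adjusted_r2 = 0.286

# Unique fractions reported in the text (Table 14, bottom).
unique_fractions = {
    "anthropogenic": 0.030,
    "soil_vegetation": 0.017,
    "topographic": 0.015,
    "climatic": 0.012,
}
shared_all_four = 0.080

unique_total = sum(unique_fractions.values())

# Assumption: all fractions add up to the adjusted R-squared, so whatever is
# left must sit in the two-group and three-group overlaps.
other_shared = adjusted_r2 - unique_total - shared_all_four

unique_share_of_total = unique_total / adjusted_r2
```

Under this assumption the unique fractions sum to 0.074, roughly a quarter of the explained variance, while about 0.132 lies in partial overlaps, which reinforces the point that most of the explanatory power is shared among predictor families.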
The low predictive ranking of slope likely reflects a resolution artifact rather than geomorphic irrelevance. In low-relief agricultural plains, gully heads are initiated at sub-meter slope breaks—tillage furrows, wheel ruts, headcut edges—that the working DEM (∼30 m effective resolution) cannot resolve [33]. Future work incorporating UAV-LiDAR or sub-meter photogrammetric DEMs would help disentangle the resolution constraint from a genuine landscape effect.
Freeze–thaw processes also merit consideration: Zhou et al. [3] showed that freeze–thaw treatment can increase soil erodibility by 20–45%, and Ma et al. [10] demonstrated that soil loss at permanent gully headcuts is governed by hydromechanical soil response rather than rainfall amount alone. The present framework uses only static climatic variables (MAT, annual precipitation) and does not capture dynamic freeze–thaw indicators; incorporating snow-cover, soil-temperature, and freeze–thaw cycle layers in future work would sharpen attribution in upland zones.
Spatially, two distinct erosion regimes emerge. In the northern and southeastern uplands, lower temperatures and prolonged snow cover make spring freeze–thaw erosion the dominant process. In the southwestern lowlands, warmer temperatures combine with intensive cultivation to produce rainfall-driven gully incision as the primary mechanism. The well-vegetated eastern highlands (Figure 3j) illustrate how intact forest canopy, humus accumulation, and root networks collectively suppress gully initiation, reinforcing the protective role of vegetation identified by the variable importance analysis.

4.2. Vegetation Restoration Scenario Simulation

To evaluate the potential of vegetation management in reducing gully erosion susceptibility, two NDVI enhancement scenarios were simulated: a 10% and a 20% increase in NDVI values across the study area, with all other predictors held constant (Figure 16 and Figure 17).
Under the NDVI +10% scenario, the reduction in high-risk area ranged from 4.7% to 5.5% across models (RF: 421 km², 5.5%; XGBoost: 320 km², 4.7%; GBM: 405 km², 5.0%). Under the NDVI +20% scenario, the reduction was substantially larger, ranging from 11.2% to 12.4% (RF: 935 km², 12.3%; XGBoost: 849 km², 12.4%; GBM: 912 km², 11.2%). Doubling the NDVI increment more than doubled the reduction in every model; this more-than-proportional response indicates a threshold-like vegetation effect in the fitted model, though the apparent threshold could shift under scenarios in which LULC and HFI also change.
These projections should be interpreted as a one-dimensional sensitivity analysis rather than a revegetation forecast. In practice, a 20% NDVI increase would co-occur with LULC reclassification and changes in HFI—the two highest-ranked predictors—as well as altered soil structure and hydrological connectivity, none of which are perturbed here. The experiment therefore creates out-of-distribution feature combinations (e.g., high-NDVI cells still labelled as bare cropland) whose model response carries additional uncertainty. The simulated reductions are best read as model-elasticity estimates; site-level assessment remains necessary before translating them into restoration targets. Nonetheless, the consistent response across all three models confirms that NDVI is a genuinely informative predictor and supports its use as a monitoring indicator for vegetation management.
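The one-at-a-time perturbation described above can be sketched as follows. Everything here is illustrative: `toy_predict` is a hypothetical stand-in for a trained classifier, the five cells and the 0.5 high-risk break are invented, and capping NDVI at 1.0 after scaling is an assumption.

```python
def toy_predict(cell):
    """Hypothetical susceptibility model: falls with NDVI, rises on cropland."""
    score = 0.9 - 0.8 * cell["ndvi"] + 0.3 * cell["cropland"]
    return max(0.0, min(1.0, score))


def high_risk_count(cells, predict, high_break):
    """Number of cells whose predicted susceptibility exceeds the break."""
    return sum(1 for cell in cells if predict(cell) > high_break)


def ndvi_scenario_reduction(cells, predict, high_break, factor):
    """Fractional drop in high-risk cells when NDVI alone is scaled by `factor`."""
    baseline = high_risk_count(cells, predict, high_break)
    # Only NDVI changes; the land-cover flag is deliberately left untouched,
    # which is what produces out-of-distribution feature combinations.
    perturbed = [dict(cell, ndvi=min(1.0, cell["ndvi"] * factor)) for cell in cells]
    scenario = high_risk_count(perturbed, predict, high_break)
    return (baseline - scenario) / baseline


# Hypothetical cells: NDVI in [0, 1], cropland flag 0 or 1.
cells = [
    {"ndvi": 0.30, "cropland": 1},
    {"ndvi": 0.80, "cropland": 1},
    {"ndvi": 0.85, "cropland": 1},
    {"ndvi": 0.60, "cropland": 0},
    {"ndvi": 0.45, "cropland": 0},
]
reduction_10 = ndvi_scenario_reduction(cells, toy_predict, 0.5, 1.10)
reduction_20 = ndvi_scenario_reduction(cells, toy_predict, 0.5, 1.20)
```

The perturbed cropland cells keep their land-cover flag while gaining forest-like NDVI, which makes concrete why the simulated reductions are better read as model elasticities than as forecasts.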
The AUC gap between random-split and leave-one-district-out cross-validation (0.10–0.11 for the tree-based models, 0.05 for LR) provides a quantitative estimate of the optimism inherent in the random-split design, attributable jointly to spatial autocorrelation and covariate shift across districts. All tree-based models maintained AUC > 0.83 under spatial CV and LR achieved AUC = 0.80, indicating reasonable spatial transferability beyond the training domain. Several caveats apply: field validation was centered on Xunke County with supplementary checks in Sunwu, Beian and Nenjiang, while Aihui District was not visited on the ground; the gully inventory (2023 imagery) and NDVI predictor (2024 composite) have a one-year temporal offset; national-database distance layers exclude field-scale agricultural footpaths and tractor tracks, which are a well-documented locus of gully initiation in black-soil cropland [5]; SHAP analysis was conducted only for XGBoost; and susceptibility maps describe relative probability rather than absolute erosion rates, requiring interpretation alongside field evidence and local management context. Despite these constraints, the combination of spatial cross-validation, FR verification and SHAP analysis provides a more complete picture than a single-model, random-split evaluation alone.

4.3. Management Implications and Strategies

The susceptibility maps and priority zone analysis support a differentiated, zonal governance strategy. In very-high-susceptibility agricultural lowlands—primarily southwestern Beian City and western Nenjiang City—integrated soil conservation packages (conservation tillage, vegetative buffer strips, and engineered gully-check structures) should be prioritized, complemented by optimized drainage layouts that divert concentrated runoff. In the moderate-susceptibility transitional belt, converting marginal cropland to perennial grass or woodland and adopting strip intercropping can raise landscape-level erosion resistance. In very-low-susceptibility forest zones, the management focus should shift to maintaining ecosystem integrity through long-term monitoring networks that detect early degradation signals before gullying initiates.
At the district scale, Beian City and Nenjiang City merit the highest investment priority given their extensive Level-1 priority zones, while Sunwu County and Xunke County—smaller but with the highest gully densities—require targeted, localized interventions. The vegetation restoration scenario analysis suggests that a 20% NDVI improvement could reduce high-susceptibility area by approximately 12%, offering a potential complement to engineering measures, though actual outcomes would depend on co-occurring land-cover and hydrological changes.

5. Conclusions

This study mapped regional gully erosion susceptibility in the Heihe region by combining a remote sensing-based gully inventory, 16 environmental factors, and a four-model ML framework including a logistic regression baseline. Overall, the results indicate that human land disturbance, represented by LULC and the Human Footprint Index, is a stronger predictor of gully occurrence in this region than topographic controls. Tree-based ensembles (XGBoost AUC 0.95) outperformed the linear baseline (AUC 0.85) by a clear margin, and leave-one-district-out cross-validation showed that all three tree-based models retained AUC above 0.83, indicating that predictive performance is not solely attributable to spatial autocorrelation. The maps indicate high susceptibility in the southwestern agricultural lowlands, with the highest-risk zones concentrated in Beian City and Nenjiang City. A sensitivity analysis suggests that a 20% NDVI increase could reduce high-susceptibility area by roughly 12%, so revegetation may complement engineering-based control measures, although co-occurring land-cover and hydrological changes were not simulated.

Author Contributions

Conceptualization, J.Z. and X.G.; methodology, J.Z., Y.C. and F.W.; software, J.L.; validation, J.Z., Y.C. and D.W.; formal analysis, J.Z. and J.L.; investigation, J.Z., Y.C. and D.W.; data curation, J.L. and D.W.; writing—original draft preparation, J.Z.; writing—review and editing, X.G., F.W. and B.C.; visualization, J.L.; supervision, X.G.; project administration, J.Z. and X.G.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the China Geological Survey project (Grant No. DD20230477).

Data Availability Statement

The gully erosion inventory and susceptibility maps generated in this study are available from the corresponding author upon reasonable request. Sentinel-2 imagery is publicly available through the Copernicus Data Space Ecosystem (https://dataspace.copernicus.eu). Gaofen-1 and Gaofen-2 imagery were obtained from the China Centre for Resources Satellite Data and Application (https://data.cresda.cn). The ERA5-Land reanalysis data, TerraClimate data, HWSD2 soil data, and Human Footprint Index data are publicly accessible from their respective repositories.

Acknowledgments

We gratefully acknowledge the field survey team at the Harbin Center for Integrated Natural Resources Survey for their support during the field campaign, and the anonymous reviewers for their constructive comments that helped improve this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Poesen, J.; Nachtergaele, J.; Verstraeten, G.; Valentin, C. Gully Erosion and Environmental Change: Importance and Research Needs. Catena 2003, 50, 91–133.
  2. Morgan, R.P.C.; Nearing, M.A. (Eds.) Handbook of Erosion Modelling, 1st ed.; Wiley: Hoboken, NJ, USA, 2010; ISBN 978-1-4051-9010-7.
  3. Zhou, P.; Guo, M.; Zhang, X.; Zhang, S.; Qi, J.; Chen, Z.; Wang, L.; Xu, J. Quantifying the Effect of Freeze–Thaw on the Soil Erodibility of Gully Heads of Typical Gullies in the Mollisols Region of Northeast China. Catena 2023, 228, 107180.
  4. Zhang, X.; Zhang, Y.; Qi, J.; Marek, G.W.; Srinivasan, R.; Feng, P.; Hu, K.; Liu, D.L.; Chen, Y. Effects of Changes in Freeze–Thaw Cycles on Soil Hydrothermal Dynamics and Erosion Degradation Under Global Warming in the Black Soil Region. Water Resour. Res. 2025, 61, e2024WR038318.
  5. Zhang, S.; Guo, M.; Liu, X.; Chen, Z.; Zhang, X.; Xu, J.; Han, X. Historical evolution of gully erosion and its response to land use change during 1968–2018 in the Mollisol region of Northeast China. Int. Soil Water Conserv. Res. 2024, 12, 388–402.
  6. Castillo, C.; Gómez, J.A. A Century of Gully Erosion Research: Urgency, Complexity and Study Approaches. Earth-Sci. Rev. 2016, 160, 300–319.
  7. Flanagan, D.C.; Gilley, J.E.; Franti, T.G. Water Erosion Prediction Project (WEPP): Development History, Model Capabilities, and Future Enhancements. Trans. ASABE 2007, 50, 1603–1612.
  8. Wang, J.; Yang, J.; Li, Z.; Ke, L.; Li, Q.; Fan, J.; Wang, X. Research on Soil Erosion Based on Remote Sensing Technology: A Review. Agriculture 2024, 15, 18.
  9. Li, H.; Jin, J.; Dong, F.; Zhang, J.; Li, L.; Zhang, Y. Gully Erosion Susceptibility Prediction Using High-Resolution Data: Evaluation, Comparison, and Improvement of Multiple Machine Learning Models. Remote Sens. 2024, 16, 4742.
  10. Ma, C.; Wang, S.; Zheng, D.; Zhang, Y.; Tang, J.; Wen, Y.; Dong, J. Understanding Soil Loss in Mollisol Permanent Gully Head Cuts through Hydrological and Hydromechanical Responses. Hydrol. Earth Syst. Sci. 2025, 29, 823–842.
  11. Zhang, M.; Qin, W.; Wang, X.; Yin, Z.; Xu, X.; Zhou, L.; Han, X. Assessment of Gully Erosion Susceptibility in Northeast China’s Black Soil Region Using New Stacking Model with Multiple Machine Learning Algorithms. Soil Tillage Res. 2025, 257, 106964.
  12. Shahabi, H.; Jarihani, B.; Tavakkoli Piralilou, S.; Chittleborough, D.; Avand, M.; Ghorbanzadeh, O. A Semi-Automated Object-Based Gully Networks Detection Using Different Machine Learning Models: A Case Study of Bowen Catchment, Queensland, Australia. Sensors 2019, 19, 4893.
  13. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (NIPS 2017); Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 4765–4774.
  14. Zhang, G.; Yang, Y.; Liu, Y.; Wang, Z. Advances and Prospects of Soil Erosion Research in the Black Soil Region of Northeast China. J. Soil Water Conserv. 2022, 36, 1–12.
  15. Zhao, H.; Gong, L.; Qu, H.; Zhu, H.; Li, X.; Zhao, F. The Climate Change Variations in the Northern Greater Khingan Mountains during the Past Centuries. J. Geogr. Sci. 2016, 26, 585–602.
  16. Li, J.; Meng, L.; Bai, J. Dynamic Monitoring and Practice of Gully Erosion in Typical Black Soil Region of Northeast China. Northeast Water Resour. Hydropower 2012, 3, 4–6.
  17. Kong, D.; Chu, N.; Luo, C.; Liu, H. Analyzing Spatial Distribution and Influencing Factors of Soil Organic Matter in Cultivated Land of Northeast China: Implications for Black Soil Protection. Land 2024, 13, 1028.
  18. Song, H.; Wang, Z.; Yang, K. Remote Sensing and GIS-Based Study on Dynamic Monitoring of Gully Erosion in the Typical Black Soil Region of Eastern Changchun. Geomat. Spat. Inf. Technol. 2016, 39, 4.
  19. Gascon, F.; Bouzinac, C.; Thépaut, O.; Jung, M.; Francesconi, B.; Louis, J.; Lonjou, V.; Lafrance, B.; Massera, S.; Gaudel-Vacaresse, A.; et al. Copernicus Sentinel-2A Calibration and Products Validation Status. Remote Sens. 2017, 9, 584.
  20. Wei, J.; Yang, H.; Tang, W.; Li, Q. Spatiotemporal-Spectral Fusion for Gaofen-1 Satellite Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 5002205.
  21. Tang, Z.; Sun, Y.; Wan, G.; Zhang, K.; Shi, H.; Zhao, Y.; Chen, S.; Zhang, X. Winter Wheat Lodging Area Extraction Using Deep Learning with GaoFen-2 Satellite Imagery. Remote Sens. 2022, 14, 4887.
  22. Alaska Satellite Facility. ALOS PALSAR Radiometric Terrain Correction (RTC) Product Guide; ASF DAAC: Fairbanks, AK, USA, 2015; Available online: https://asf.alaska.edu/data-sets/sar-data-sets/alos-palsar/ (accessed on 1 May 2026).
  23. Muñoz-Sabater, J.; Dutra, E.; Agustí-Panareda, A.; Albergel, C.; Arduini, G.; Balsamo, G.; Boussetta, S.; Choulga, M.; Harrigan, S.; Hersbach, H.; et al. ERA5-Land: A State-of-the-Art Global Reanalysis Dataset for Land Applications. Earth Syst. Sci. Data 2021, 13, 4349–4383.
  24. Abatzoglou, J.T.; Dobrowski, S.Z.; Parks, S.A.; Hegewisch, K.C. TerraClimate, a High-Resolution Global Dataset of Monthly Climate and Climatic Water Balance from 1958–2015. Sci. Data 2018, 5, 170191.
  25. FAO; IIASA. Harmonized World Soil Database Version 2.0; FAO: Rome, Italy; IIASA: Laxenburg, Austria, 2023.
  26. Mu, H.; Li, X.; Wen, Y.; Huang, J.; Du, P.; Su, W.; Miao, S.; Geng, M. A Global Record of Annual Terrestrial Human Footprint Dataset from 2000 to 2018. Sci. Data 2022, 9, 176.
  27. Wright, M.N.; Ziegler, A. ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. J. Stat. Softw. 2017, 77, 1–17.
  28. Ploton, P.; Mortier, F.; Réjou-Méchain, M.; Barbier, N.; Pibernat, S.; Rossi, V.; Decourdière, C.; Cornu, G.; Viennois, G.; Bayol, N.; et al. Spatial Validation Reveals Poor Predictive Performance of Large-Scale Ecological Mapping Models. Nat. Commun. 2020, 11, 4540.
  29. O’Brien, R.M. A Caution Regarding Rules of Thumb for Variance Inflation Factors. Qual. Quant. 2007, 41, 673–690.
  30. Hair, J.F.; Black, W.C.; Babin, B.J.; Anderson, R.E. Multivariate Data Analysis, 7th ed.; Pearson: Upper Saddle River, NJ, USA, 2009; ISBN 978-0138132637.
  31. Belsley, D.A.; Kuh, E.; Welsch, R.E. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity; Wiley Classics Library; John Wiley & Sons: Hoboken, NJ, USA, 2004; ISBN 978-0471691174.
  32. Zhang, M.; Zhang, K.; Cen, Y.; Wang, P.; Xia, J. Effects of Grass Cover on the Overland Soil Erosion Mechanism Under Simulated Rainfall. Water Resour. Res. 2025, 61, e2023WR036888.
  33. Tarolli, P. High-Resolution Topography for Understanding Earth Surface Processes: Opportunities and Challenges. Geomorphology 2014, 216, 295–312.
Figure 1. Location of the study area. The blue boundary delineates Heilongjiang Province; the red boundary delineates the study area.
Figure 2. Typical gully erosion features observed during the Xunke field campaign. Panels (a–c) show untreated, naturally evolving gullies: (a) linear gully distribution in a transitional hillslope–cropland zone; (b) vegetation conditions surrounding an active gully; (c) a gully formed after a rainstorm event. Panel (d) shows a rehabilitated gully on cultivated land where engineering check-dam infill and surface stabilization have been applied.
Figure 3. Spatial distributions of the 16 environmental factors used for gully erosion susceptibility modeling. (a–d) Topographic factors: elevation, slope, aspect, and relief; (e–g) Topographic indices: TWI, TPI, and curvature; (h,i) Climatic factors: mean annual temperature and annual precipitation; (j–l) Vegetation, land cover, and soil: NDVI, LULC, and soil type; (m–p) Anthropogenic and proximity factors: Human Footprint Index, and distances to buildings, roads, and streams.
Figure 4. Methodological flowchart of the gully erosion susceptibility assessment framework.
Figure 5. Spatial distribution of the centroids of the 4020 mapped gully polygons across the six administrative districts of the Heihe region. Each red dot represents the centroid of one gully polygon. Gully clusters are concentrated in the central and southwestern agricultural plains.
Figure 6. Comparison of performance metrics (Accuracy, Kappa, AUC, Precision, Recall, and F1-Score) across RF, XGBoost, and GBM.
Figure 7. ROC curves for all four models including the LR baseline. The dashed diagonal line represents the random classifier baseline (AUC = 0.5). All tree-based models achieve AUC > 0.93, while LR reaches 0.85.
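The AUC values summarized in the Figure 7 caption can be read as the probability that a randomly chosen gully sample receives a higher score than a randomly chosen non-gully sample. The following is a minimal sketch of that rank-based (Mann–Whitney) formulation; the labels and scores are hypothetical illustration data, not values from this study, and the function name `auc_rank` is an assumption.

```python
def auc_rank(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Assumes both classes are present (no guard against empty classes).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = gully, 0 = non-gully.
example_labels = [0, 0, 1, 1]
example_scores = [0.10, 0.40, 0.35, 0.80]
example_auc = auc_rank(example_labels, example_scores)  # 3 of 4 pairs correct
```

A score of 0.5 under this definition corresponds to the dashed diagonal in the figure, since a random classifier orders half of the pairs correctly.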
Figure 8. Normalized variable importance ranking across the three models (RF, XGBoost, and GBM).
Figure 9. SHAP analysis for the XGBoost model: global feature importance ranking and SHAP summary plot showing the direction and magnitude of each feature’s contribution to susceptibility predictions.
Figure 10. Gully erosion susceptibility map produced by RF, classified into five levels using Jenks natural breaks.
Figure 11. Gully erosion susceptibility map produced by XGBoost, classified into five levels using Jenks natural breaks.
Figure 12. Gully erosion susceptibility map produced by GBM, classified into five levels using Jenks natural breaks.
Figure 13. Model uncertainty map showing the spatial distribution of pixel-level standard deviation of susceptibility predictions across the three models.
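The pixel-level standard deviation described in the Figure 13 caption can be sketched as follows. This is an illustration only: the rasters are represented as small nested lists, the values are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the caption does not specify which was used.

```python
from statistics import pstdev


def pixel_uncertainty(rf, xgb, gbm):
    """Per-pixel standard deviation across three equally shaped 2D grids.

    Each argument is a list of rows of susceptibility scores in [0, 1].
    Population SD is an assumption; swap in statistics.stdev for sample SD.
    """
    return [
        [pstdev(values) for values in zip(row_rf, row_xgb, row_gbm)]
        for row_rf, row_xgb, row_gbm in zip(rf, xgb, gbm)
    ]


# Hypothetical 1 x 2 rasters: the models disagree on the first pixel only.
rf_grid = [[0.2, 0.9]]
xgb_grid = [[0.4, 0.9]]
gbm_grid = [[0.6, 0.9]]
uncertainty = pixel_uncertainty(rf_grid, xgb_grid, gbm_grid)
```

Pixels where the three models agree yield a value of zero, so high values in such a map flag locations where the susceptibility class depends on the choice of model.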
Figure 14. District-level analysis: (a) spatial distribution of priority treatment zones; (b) proportion of high-susceptibility areas (Jenks classes 4–5) by district across the three models.
Figure 15. Comparison of district-level gully erosion metrics: (a) gully erosion ratio by district; (b) multi-model susceptibility comparison across administrative districts.
Figure 16. High-susceptibility area (km²) under baseline and two NDVI enhancement scenarios (+10% and +20%) for the three ensemble models.
Figure 17. Spatial distribution of susceptibility change (ΔSusceptibility) under the NDVI +10% (a) and +20% (b) scenarios (ensemble mean). Green: reduced susceptibility; red: increased susceptibility relative to baseline.
Table 1. Satellite data used in this study.
| Satellite | Sensor | Spatial Resolution | Acquisition Period | Bands Used |
|---|---|---|---|---|
| Sentinel-2 | MSI | 10/20/60 m | May–October 2023 | B2, B3, B4, B8, B11 |
| Gaofen-1 | PMS/WFV | 2/8/16 m | 2023 | Pan, R, G, B, NIR |
| Gaofen-2 | PMS | 0.8/4 m | 2023 | Pan, R, G, B, NIR |
Table 2. Environmental factors used for gully erosion susceptibility modeling.
| Category | Factor | Data Source | Resolution |
|---|---|---|---|
| Topography | DEM, Slope, Aspect, Relief, TPI, TWI, Curvature | ALOS DEM | 12.5 m |
| Climate | Mean Annual Temperature (MAT) | ERA5-Land | ~10 km |
| | Annual Precipitation | TerraClimate | ~4 km |
| Vegetation & land cover | NDVI | Landsat surface reflectance | 30 m |
| | LULC | Project-compiled from Sentinel-2 | 25 m |
| Soil | Soil type | HWSD2 | 1 km |
| Anthropogenic & proximity | Human Footprint Index (HFI) | Mu et al. (global annual HFI, Figshare) | 1 km |
| | Distance to buildings, roads, streams | National geographic databases | |
Table 3. Random-search ranges and best hyperparameter configuration per model (50 candidates per model, 5-fold CV-AUC objective on the 70% training set; final test AUC reported on the held-out 30% test set with the best configuration re-fitted on the full training set).
| Model | Hyperparameter | Search Range | Best Value |
|---|---|---|---|
| RF | num.trees | {200, 300, 500, 700, 1000} | 1000 |
| | mtry | {2, 3, …, 10} | 4 |
| | min.node.size | {1, 2, …, 10} | 3 |
| XGBoost | η (learning rate) | [0.01, 0.30] | 0.076 |
| | max_depth | {3, 4, …, 10} | 7 |
| | subsample | [0.5, 1.0] | 0.76 |
| | colsample_bytree | [0.5, 1.0] | 0.91 |
| | min_child_weight | {1, 2, …, 10} | 1 |
| | gamma (γ) | [0, 5] | 0.12 |
| | nrounds (early stop, patience = 30) | up to 500 | 429 |
| GBM | n.trees | {200, 300, 500, 700, 1000} | 1000 |
| | interaction.depth | {3, 4, 5, 6, 7} | 5 |
| | shrinkage | [0.01, 0.10] | 0.092 |
| | bag.fraction | [0.5, 1.0] | 0.79 |

| Best-config performance summary | CV-AUC (mean ± SD) | Test AUC |
|---|---|---|
| RF (best config) | 0.939 ± 0.007 | 0.95 |
| XGBoost (best config) | 0.940 ± 0.007 | 0.95 |
| GBM (best config) | 0.933 ± 0.005 | 0.94 |
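The random-search procedure summarized in Table 3 can be sketched as below for the XGBoost search space. This is a minimal illustration, not the authors' implementation: uniform sampling over each range is an assumption, the function names are hypothetical, and `toy_score` is a stand-in for the 5-fold CV-AUC objective, which would require fitting a model on the 70% training set.

```python
import random

# XGBoost search space copied from Table 3.
# Tuples are continuous intervals; ranges are integer grids.
XGB_SPACE = {
    "eta": (0.01, 0.30),
    "max_depth": range(3, 11),
    "subsample": (0.5, 1.0),
    "colsample_bytree": (0.5, 1.0),
    "min_child_weight": range(1, 11),
    "gamma": (0.0, 5.0),
}


def draw_candidate(space, rng):
    """Draw one configuration (uniform sampling is an assumption)."""
    candidate = {}
    for name, domain in space.items():
        if isinstance(domain, tuple):
            low, high = domain
            candidate[name] = rng.uniform(low, high)
        else:
            candidate[name] = rng.choice(list(domain))
    return candidate


def random_search(space, score_fn, n_candidates=50, seed=0):
    """Score n_candidates random configurations and keep the best one."""
    rng = random.Random(seed)
    candidates = [draw_candidate(space, rng) for _ in range(n_candidates)]
    scores = [score_fn(c) for c in candidates]
    best_index = max(range(n_candidates), key=scores.__getitem__)
    return candidates[best_index], scores[best_index], candidates


def toy_score(candidate):
    """Hypothetical placeholder for the mean 5-fold CV-AUC of a fitted model."""
    return -abs(candidate["eta"] - 0.076)


best, best_score, all_candidates = random_search(XGB_SPACE, toy_score)
```

In a real run, the best configuration would then be re-fitted on the full training set and evaluated once on the held-out 30% test set, as the table caption describes.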
Table 4. Variance inflation factor (VIF) values for the 16 environmental factors.
| Variable | VIF | Variable | VIF |
|---|---|---|---|
| TWI | 8.406 | D_Building | 1.443 |
| Slope | 8.039 | Curvature | 1.320 |
| DEM | 3.942 | Soil Types | 1.274 |
| MAT | 2.188 | D_Stream | 1.272 |
| Relief | 2.023 | NDVI | 1.171 |
| HFI | 1.673 | D_Road | 1.124 |
| TPI | 1.601 | LULC | 1.096 |
| Aspect | 1.045 | Precipitation | 1.009 |
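For orientation, the VIF values in Table 4 can be related to the share of each factor's variance explained by the remaining factors. The relation below assumes the standard VIF definition, where R_j² is the coefficient of determination from regressing factor j on the other 15 factors; the worked value for TWI is derived from the tabulated VIF, not reported by the source.

```latex
\[
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\quad\Longrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j},
\qquad
R_{\mathrm{TWI}}^2 = 1 - \frac{1}{8.406} \approx 0.881 .
\]
```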
Table 5. Gully count and density by administrative unit in the Heihe region.
| Administrative Unit | Area (km²) | Gully Count | Density (/km²) | Proportion (%) |
|---|---|---|---|---|
| Nenjiang City | 15,217 | 1044 | 0.069 | 25.9 |
| Xunke County | 10,685 | 1025 | 0.096 | 25.4 |
| Wudalianchi City | 15,086 | 781 | 0.052 | 19.4 |
| Beian City | 7190 | 594 | 0.083 | 14.7 |
| Sunwu County | 4775 | 500 | 0.105 | 12.4 |
| Aihui District | 13,911 | 76 | 0.005 | 1.9 |
| Total | 66,865 | 4020 | 0.060 | 100.0 |
Table 6. Per-LULC-class gully frequency ratio (FR) on the analysis grid (CLUD-style class labels). FR > 1: class is overrepresented among gully cells relative to its areal share; FR < 1: underrepresented.
| Rank | LULC Class (Code) | Area Cells | Area (%) | Gully Cells | Gully (%) | FR |
|---|---|---|---|---|---|---|
| 1 | Sparse Woodland (5) | 40,418,581 | 37.71 | 105,842 | 93.91 | 2.490 |
| 2 | Moderate Grassland (8) | 50,394 | 0.05 | 61 | 0.05 | 1.151 |
| 3 | Water Body (11) | 10,094,352 | 9.42 | 5170 | 4.59 | 0.487 |
| 4 | Paddy Field (1) | 1,177,779 | 1.10 | 87 | 0.08 | 0.070 |
| 5 | Dryland Cropland (2) | 54,153,565 | 50.53 | 1513 | 1.34 | 0.027 |
| 6 | Dense Grassland (7) | 1,173,430 | 1.09 | 29 | 0.03 | 0.023 |
| 7 | Shrubland (4) | 109,044 | 0.10 | 0 | 0.00 | 0.000 |
| 8 | Sparse Grassland (9) | 1610 | 0.00 | 0 | 0.00 | 0.000 |
| Total | | 107,178,755 | 100.00 | 112,702 | 100.00 | |
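The frequency ratio in Table 6 is the share of gully cells falling in a class divided by that class's share of the total area. The sketch below recomputes it from the cell counts listed in the table; the function and variable names are illustrative assumptions, while the counts are taken directly from Table 6.

```python
def frequency_ratio(area_cells, gully_cells):
    """Return FR per class: (gully share of class) / (area share of class)."""
    total_area = sum(area_cells.values())
    total_gully = sum(gully_cells.values())
    ratios = {}
    for name, area in area_cells.items():
        area_share = area / total_area
        gully_share = gully_cells.get(name, 0) / total_gully
        ratios[name] = gully_share / area_share
    return ratios


# Cell counts from Table 6 (analysis grid).
AREA_CELLS = {
    "Sparse Woodland": 40_418_581,
    "Moderate Grassland": 50_394,
    "Water Body": 10_094_352,
    "Paddy Field": 1_177_779,
    "Dryland Cropland": 54_153_565,
    "Dense Grassland": 1_173_430,
    "Shrubland": 109_044,
    "Sparse Grassland": 1_610,
}
GULLY_CELLS = {
    "Sparse Woodland": 105_842,
    "Moderate Grassland": 61,
    "Water Body": 5_170,
    "Paddy Field": 87,
    "Dryland Cropland": 1_513,
    "Dense Grassland": 29,
    "Shrubland": 0,
    "Sparse Grassland": 0,
}

fr_by_class = frequency_ratio(AREA_CELLS, GULLY_CELLS)
```

The same function applies unchanged to the slope bins of Table 7 and to the susceptibility classes of Table 12, since all three tables use the same ratio of proportions.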
Table 7. Slope-gradient class gully frequency ratio (FR). Slope bins follow geomorphic interpretation: <2° near-flat, 2–5° gentle, 5–10° moderate, 10–20° moderately steep, >20° steep.
| Rank | Slope Class | Area Cells | Area (%) | Gully Cells | Gully (%) | FR |
|---|---|---|---|---|---|---|
| 1 | 0–2° | 37,012,470 | 34.53 | 58,411 | 51.83 | 1.501 |
| 2 | 2–5° | 45,260,313 | 42.23 | 48,858 | 43.35 | 1.027 |
| 3 | 5–10° | 19,157,358 | 17.87 | 5175 | 4.59 | 0.257 |
| 4 | 10–20° | 5,273,552 | 4.92 | 253 | 0.22 | 0.046 |
| 5 | >20° | 475,062 | 0.44 | 5 | 0.00 | 0.010 |
| Total | | 107,178,755 | 100.00 | 112,702 | 100.00 | |
Table 8. Performance comparison of the four susceptibility models on the independent test set.
| Model | AUC | Accuracy | Kappa | Precision | Sensitivity | Specificity | F1 (Gully) |
|---|---|---|---|---|---|---|---|
| LR | 0.85 | 0.76 | 0.45 | 0.64 | 0.60 | 0.84 | 0.62 |
| RF | 0.95 | 0.88 | 0.72 | 0.79 | 0.83 | 0.90 | 0.81 |
| XGBoost | 0.95 | 0.89 | 0.74 | 0.81 | 0.84 | 0.91 | 0.82 |
| GBM | 0.94 | 0.86 | 0.68 | 0.78 | 0.79 | 0.89 | 0.79 |
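The threshold-dependent metrics in Table 8 all derive from a binary confusion matrix with gully as the positive class. The sketch below shows the standard definitions; the confusion-matrix counts in the example are hypothetical and are not the counts behind Table 8, which the source does not report.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (gully = positive class).

    No guard against empty rows or columns; assumes every denominator is > 0.
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - chance) / (1 - chance)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical balanced example: 50 gully and 50 non-gully test samples.
example_metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

AUC is not included here because it is computed from the continuous scores rather than from a single thresholded confusion matrix.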
Table 9. Leave-one-district-out spatial cross-validation results (mean ± SD across 6 folds).
| Model | AUC | Accuracy | Kappa | F1 (Gully) |
|---|---|---|---|---|
| LR | 0.80 ± 0.07 | 0.74 ± 0.10 | 0.31 ± 0.16 | 0.50 ± 0.21 |
| RF | 0.84 ± 0.05 | 0.76 ± 0.12 | 0.29 ± 0.22 | 0.43 ± 0.26 |
| XGBoost | 0.84 ± 0.05 | 0.75 ± 0.13 | 0.29 ± 0.21 | 0.44 ± 0.25 |
| GBM | 0.84 ± 0.06 | 0.76 ± 0.12 | 0.34 ± 0.23 | 0.50 ± 0.28 |
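The leave-one-district-out scheme behind Table 9 holds out all samples from one district per fold and trains on the other five. A minimal sketch of the fold construction and the mean ± SD aggregation follows; the sample labels and fold AUC values are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, stdev


def leave_one_district_out(districts):
    """Yield (held_out, train_idx, test_idx) once per distinct district."""
    for held_out in sorted(set(districts)):
        test_idx = [i for i, d in enumerate(districts) if d == held_out]
        train_idx = [i for i, d in enumerate(districts) if d != held_out]
        yield held_out, train_idx, test_idx


def summarize(fold_scores):
    """Return (mean, sample SD) across folds; sample SD is an assumption."""
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical district label for each of eight samples.
sample_districts = [
    "Aihui", "Xunke", "Sunwu", "Beian",
    "Wudalianchi", "Nenjiang", "Xunke", "Beian",
]
folds = list(leave_one_district_out(sample_districts))

# Hypothetical per-fold AUC values, for illustration only.
auc_mean, auc_sd = summarize([0.80, 0.90])
```

Because no district contributes samples to both the training and the test side of a fold, spatially autocorrelated neighbours cannot leak across the split, which is why these scores sit below the random-split values in Table 8.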
Table 10. Jenks natural break thresholds for the three susceptibility models.
| Class Boundary | RF | XGBoost | GBM |
|---|---|---|---|
| Very Low/Low | 0.085 | 0.090 | 0.093 |
| Low/Moderate | 0.242 | 0.272 | 0.260 |
| Moderate/High | 0.419 | 0.488 | 0.451 |
| High/Very High | 0.607 | 0.708 | 0.658 |
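Once the Jenks breaks in Table 10 are fixed, assigning a pixel to one of the five classes is a simple threshold lookup. The sketch below applies the tabulated breaks with a binary search; assigning a score that equals a break to the upper class is an assumption, as the source does not state how boundary values are handled.

```python
from bisect import bisect_right

# Class boundaries from Table 10, in ascending order.
JENKS_BREAKS = {
    "RF": [0.085, 0.242, 0.419, 0.607],
    "XGBoost": [0.090, 0.272, 0.488, 0.708],
    "GBM": [0.093, 0.260, 0.451, 0.658],
}
CLASS_LABELS = ["Very Low", "Low", "Moderate", "High", "Very High"]


def susceptibility_class(score, model):
    """Map a susceptibility score to its class label for the given model.

    A score exactly on a break goes to the upper class (assumption).
    """
    return CLASS_LABELS[bisect_right(JENKS_BREAKS[model], score)]


# The same score can fall into different classes depending on the model.
classes_at_045 = {m: susceptibility_class(0.45, m) for m in JENKS_BREAKS}
```

With these breaks, a score of 0.45 is classed as High by RF but Moderate by XGBoost and GBM, which is one reason the class areas in Table 11 differ between models.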
Table 11. Susceptibility class area statistics for the three models.
| Susceptibility | RF (km²) | RF (%) | XGBoost (km²) | XGBoost (%) | GBM (km²) | GBM (%) |
|---|---|---|---|---|---|---|
| Very Low | 37,772 | 58.30 | 46,668 | 72.03 | 41,948 | 64.75 |
| Low | 9664 | 14.92 | 6555 | 10.12 | 7821 | 12.07 |
| Moderate | 6923 | 10.69 | 4766 | 7.36 | 5827 | 8.99 |
| High | 6408 | 9.89 | 3791 | 5.85 | 5306 | 8.19 |
| Very High | 4020 | 6.21 | 3007 | 4.64 | 3886 | 6.00 |
Table 12. Frequency ratio (FR) validation: ratio of observed gully proportion to area proportion within each susceptibility class. FR > 1 indicates gully overrepresentation.
| Susceptibility Class | FR (RF) | FR (XGBoost) | FR (GBM) |
|---|---|---|---|
| Very Low | 0.01 | 0.02 | 0.02 |
| Low | 0.13 | 0.37 | 0.34 |
| Moderate | 0.60 | 1.15 | 0.98 |
| High | 1.97 | 3.23 | 2.84 |
| Very High | 11.57 | 14.55 | 10.48 |
Table 13. Gully erosion and susceptibility statistics by administrative district.
| District | Area (km²) | Gullies | Density (/km²) | RF High (%) | XGB High (%) | GBM High (%) |
|---|---|---|---|---|---|---|
| Aihui | 13,911 | 76 | 0.005 | 1.1 | 0.6 | 1.0 |
| Xunke | 10,685 | 1025 | 0.096 | 13.2 | 9.9 | 11.9 |
| Sunwu | 4775 | 500 | 0.105 | 19.5 | 14.7 | 17.3 |
| Beian | 7190 | 594 | 0.083 | 33.4 | 21.3 | 29.0 |
| Wudalianchi | 15,086 | 781 | 0.052 | 17.1 | 11.4 | 16.6 |
| Nenjiang | 15,217 | 1044 | 0.069 | 21.3 | 12.3 | 17.2 |
Table 14. Partial Spearman correlation and four-group variance partitioning for the gully indicator (joint training/test sample, n = 13,734). Top panel: partial vs. marginal Spearman correlation of HFI with the binary gully indicator. Bottom panel: adjusted R² fractions from a four-group variance partitioning (groups: anth = HFI, LULC, D_Building, D_Road; topo = DEM, Slope, Aspect, Curvature, Relief, TWI, TPI; clim = MAT, Total PPT; sv = Soil Types, NDVI, D_Stream). Only unique fractions and the four-way shared fraction are shown; pairwise and three-way shared fractions (summing to 0.132) are omitted for brevity.
Top: Spearman correlation of HFI with gully indicator
| Metric | ρ | p-value | Controlled for |
|---|---|---|---|
| Partial Spearman | 0.205 | < 10⁻¹²⁹ | DEM, Slope, MAT, Total PPT |
| Marginal Spearman | 0.467 | ≈ 0 | |

Bottom: 4-group variance partitioning (adjusted R²)
| Fraction | Adj. R² | Interpretation |
|---|---|---|
| [a] anthropogenic unique | 0.030 | HFI/LULC/D_Build/D_Road only |
| [d] soil–vegetation unique | 0.017 | Soil/NDVI/D_Stream only |
| [b] topographic unique | 0.015 | DEM–TPI block only |
| [c] climatic unique | 0.012 | MAT/Total PPT only |
| [o] shared across all four | 0.080 | co-variation among all four groups |
| Total explained (4 groups) | 0.286 | joint adjusted R² |
| Residual | 0.714 | unexplained |
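As a consistency check on the bottom panel of Table 14, the listed fractions and the omitted pairwise and three-way shared fractions (0.132, from the caption) can be summed to recover the total explained variance. All numbers below come from the table and its caption; only the arrangement is added.

```latex
\[
\begin{aligned}
R^2_{\mathrm{adj,\,total}}
  &= \underbrace{(0.030 + 0.015 + 0.012 + 0.017)}_{\text{unique: } [a]+[b]+[c]+[d]}
   + \underbrace{0.132}_{\text{pairwise and three-way shared}}
   + \underbrace{0.080}_{[o]} \\
  &= 0.074 + 0.132 + 0.080 = 0.286, \\
\text{Residual} &= 1 - 0.286 = 0.714 .
\end{aligned}
\]
```

The unique fractions together account for 0.074 of the 0.286 explained, so most of the explained variance is shared among groups and cannot be attributed to a single one.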
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zheng, J.; Wan, F.; Cai, Y.; Liu, J.; Wang, D.; Guo, X.; Chen, B. Multi-Model Machine Learning Mapping of Gully Erosion Susceptibility in the Heihe Region of the Xiaoxingán Mountains, China. Remote Sens. 2026, 18, 1844. https://doi.org/10.3390/rs18111844