Article

Data Biases in Geohazard AI: Investigating Landslide Class Distribution Effects on Active Learning and Self-Optimizing

1. Space Star Technology Co., Ltd., Beijing 100086, China
2. Department of Land, Air and Water Resources, University of California, Davis, CA 95616, USA
3. Beijing Key Laboratory of Precision Forestry, Beijing Forestry University, Beijing 100083, China
4. School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(13), 2211; https://doi.org/10.3390/rs17132211
Submission received: 5 February 2025 / Revised: 13 June 2025 / Accepted: 25 June 2025 / Published: 27 June 2025

Abstract

Data bias in geohazard artificial intelligence (AI) systems, particularly class distribution imbalances, critically undermines the reliability of landslide detection models. While active learning (AL) offers promise for mitigating annotation costs and addressing data biases, the interplay between landslide class proportions and AL efficiency remains poorly quantified; additionally, self-optimizing mechanisms to adaptively manage class imbalances are underexplored. This study bridges these gaps by rigorously evaluating how landslide-to-non-landslide ratios (1:1, 1:12, and 1:30) influence the effectiveness of a widely used AL strategy—margin sampling. Leveraging open-source landslide inventories, we benchmark margin sampling against random sampling using the area under the receiver operating characteristic curve (AUROC) and partial AUROC while analyzing spatial detection accuracy through classification maps. The results reveal that margin sampling significantly outperforms random sampling under severe class imbalances (1:30), achieving 12–18% higher AUROC scores and reducing false negatives in critical landslide zones. In balanced scenarios (1:1), both strategies yield comparable numerical metrics; however, margin sampling produces spatially coherent detections with fewer fragmented errors. These findings indicate that regardless of the landslide proportion, AL enhances the generalizability of landslide detection models in terms of predictive accuracy and spatial consistency. This work also provides actionable guidelines for deploying adaptive AI systems in data-scarce, imbalance-prone environments.

1. Introduction

As deadly natural disasters, landslides are triggered by extreme events, such as earthquakes, heavy rainfall, and rapid snowmelt [1]. From January 1995 to December 2014, 3876 landslides killed 163,658 people and injured 11,689 worldwide [2]. Beyond the tragic loss of life, landslides can also cause severe socio-economic damage by destroying homes, infrastructure, and agricultural land. Each year, landslides are responsible for billions of dollars in economic losses globally [3,4]. Moreover, climate change and urban expansion have increased the frequency and extent of landslides [5,6,7]. Therefore, the rapid and reliable identification of incipient landslides can contribute to early warning systems and rapid evacuation.
Machine learning and statistical methods have shown promise in landslide assessments, especially for mapping landslides and landslide-prone areas across large regions [8]. For example, the authors of [9] developed an automatic method for detecting slow-moving landslides over a large area by combining Synthetic Aperture Radar Interferometry with an improved Attention-YOLOv3 network. Furthermore, the authors of [10] used convolutional neural networks and the random forest algorithm based on satellite multitemporal interferometry to automatically map potential landslides. Additionally, the authors of [11] demonstrated that generalized additive models can achieve higher accuracy than the generalized linear model when predicting shallow landslide occurrence. However, despite these advances, existing studies commonly assume that landslide occurrences in training datasets are adequately represented and class-balanced, leaving a critical knowledge gap that limits the practical reliability of machine learning-based landslide assessments.
Recent advances in geohazard artificial intelligence (AI) have highlighted persistent challenges stemming from data biases, particularly those arising from skewed landslide class distributions in training datasets. For instance, the authors of [11] demonstrated how different class proportions can significantly impact classifier performance in 44 imbalanced datasets, including environmental datasets. Meanwhile, the authors of [12] demonstrated different predictive performances of algorithms across 31 datasets with varying imbalance ratios. Landslide detection models, which are predominantly trained on imbalanced datasets where non-landslide pixels vastly outnumber landslide instances, exhibit systematic biases toward majority classes, compromising their reliability in real-world scenarios. Studies reveal that such imbalances—often exacerbated by the inherent rarity of landslide events and uneven spatial sampling—lead to inflated false-negative rates, as well as the misclassification of critical landslide zones as stable terrain. For example, analyses of widely used inventories (e.g., NASA COOLR and regional landslide databases) demonstrate that class ratios exceeding 1:30 (landslide: non-landslide) degrade model recall by 20–40% compared to balanced settings [12,13].
Active learning (AL) has emerged as a promising tool for mitigating annotation costs and reducing the impact of class imbalances in the training set, as demonstrated by many studies in landslide assessments. For instance, the authors of [14] used query-by-committee active learning to reduce the computational time for training and classification in landslide detection. Moreover, in an Ecuadorian case study, the authors of [15] found that the uncertainty sampling active learning strategy achieved the best predictive performance in mapping landslides, compared to the query-by-committee strategy and random sampling. While AL has shown its potential in natural hazard assessments, its effectiveness under different degrees of class imbalance remains underexplored.
The current body of literature predominantly focuses on algorithmic enhancements (e.g., loss reweighting and synthetic oversampling) but neglects the dynamic interplay between class distributions and AL’s sample selection mechanisms [15,16,17]. Furthermore, few studies quantify how spatially clustered landslide signatures—common in geomorphologically complex regions—interact with class ratios to amplify sampling biases. This oversight perpetuates suboptimal model deployments, where AL strategies designed for balanced data fail to prioritize under-represented landslide features in imbalanced contexts. Addressing these gaps is critical for developing bias-resilient AI systems that are capable of adapting to the inherent data scarcity and heterogeneity of geohazard monitoring.
By collating recent findings, this work underscores the urgent need to evaluate landslide class distribution effects on AI-driven detection frameworks, particularly with respect to optimizing AL strategies for real-world imbalance scenarios. The experiments in the present study are conducted on landslide inventories with different class proportions (i.e., different landslide number percentages). The area under the ROC curve (AUROC) and the partial AUROC (pAUROC) are used to assess the quantitative performance of active learning, owing to their promise in model evaluation and their effectiveness for comparisons in landslide assessments [18,19,20]. The open-source landslide inventory of the Iburi seismic zone in Hokkaido, Japan, is used in our study.

2. Materials and Methods

2.1. Study Area

The study area is located in the southwestern part of Hokkaido, one of the most tectonically active regions in the world (Figure 1). The primary geological formations are sedimentary and volcanic rocks, with land cover predominantly consisting of forests and paddy fields [21,22]. The Iburi earthquake occurred on 6 September 2018 and reached the maximum seismic intensity of 7 on the Japan Meteorological Agency scale [23]. This earthquake triggered approximately 6000 landslides, burying numerous buildings and causing dozens of fatalities and missing persons [24]. The authors of [25] pointed out that the varying distances from the earthquake’s epicenter led to different spatial distributions of co-seismic landslides, causing landslide impacts of varied intensity across different regions. Consequently, they divided the study area into three zones of differing landslide-triggering severity to reflect the overall spatial patterns of co-seismic landslides. These predefined zones were adopted in this study (Figure 1); more details can be found in [25].
Since the three landslide-triggering severity zones contain different proportions of landslide and non-landslide points while sharing similar geological formations, this paper uses these three areas to explore how the landslide-to-non-landslide ratio affects the effectiveness of AL in constructing training samples.

2.2. Data

Digital elevation models (DEMs) with a 10 m × 10 m resolution were provided by the Geospatial Information Authority of Japan (source: https://fgd.gsi.go.jp/ (accessed on 5 February 2025)). Pre-event (3 August 2018) and post-event (21 September 2018) optical images were obtained from PlanetScope [26]. The images, featuring four spectral bands (blue, green, red, and near-infrared), have a 3 m × 3 m spatial resolution and a daily temporal resolution. All images were resampled to a 10 m × 10 m resolution using the B-spline interpolation method in SAGA (System for Automated Geoscientific Analyses) GIS 7.4.0. Cloud-covered areas in both the pre- and post-event images were removed, and only the overlapping cloud-free areas were retained, using SAGA GIS [27]. Consequently, the three study areas cover 173 km², 623 km², and 1216 km², respectively.
Landslide inventories for all study areas were obtained from [25]. A 200 m buffer was created around each landslide, and non-landslide points were selected from areas outside these buffers. Following the authors of [28], landslides smaller than 400 m² were excluded, and landslide presence points were randomly sampled within the landslide polygons. In total, the three study areas encompass 3564, 5380, and 5473 co-seismic landslides (Table 1).
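To make this sampling scheme concrete, the sketch below illustrates one way to derive presence and absence points from a polygon inventory with a 200 m exclusion buffer. The file names and the geopandas/shapely workflow are illustrative assumptions, not the authors' original implementation (geopandas ≥ 0.14 with shapely 2 is assumed for sample_points).

```python
import numpy as np
import geopandas as gpd
from shapely.geometry import Point

rng = np.random.default_rng(42)

# Hypothetical inputs: a landslide polygon inventory and the study-area outline,
# both in a projected CRS with metre units.
landslides = gpd.read_file("landslide_inventory.gpkg")
landslides = landslides[landslides.geometry.area >= 400]        # drop landslides < 400 m²
study_area = gpd.read_file("study_area.gpkg").geometry.unary_union

# Presence points: one randomly placed point inside each landslide polygon.
presence = landslides.geometry.sample_points(1).explode(index_parts=False)

# Absence candidates: the study area minus a 200 m buffer around all landslides.
buffered = landslides.geometry.buffer(200).unary_union
absence_region = study_area.difference(buffered)

# Rejection-sample absence points within the bounding box of the allowed region.
minx, miny, maxx, maxy = absence_region.bounds
absence = []
while len(absence) < len(presence):
    p = Point(rng.uniform(minx, maxx), rng.uniform(miny, maxy))
    if absence_region.contains(p):
        absence.append(p)
```

The rejection-sampling loop is a simple stand-in; in practice, the ratio of presence to absence points drawn this way would be adjusted to match the class proportions of each study area (Table 1).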

2.3. Predictor Variables

Predictor variables were derived from the DEM, as well as from the pre- and post-event optical images. Common terrain predictors include the local slope angle (°, slope), elevation (m, dem), plan and profile curvature (rad m⁻¹; plancurv and profcurv), catchment slope angle (cslope), the SAGA Topographic Wetness Index (TWI), and the upslope contributing area (m²) [8]. Previous studies have indicated that these terrain predictors can serve as proxies for destabilizing forces (e.g., slope and catchment slope), water availability (e.g., the logarithm of the size of the upslope contributing area (log.carea) and concave curvatures), and exposure to wind (convex curvatures) [29,30].
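As a minimal illustration of how such terrain predictors are derived, the NumPy sketch below computes the local slope angle from a 10 m DEM grid. The study itself generated its predictors with SAGA GIS and the RSAGA R package, so this is only a conceptual stand-in using a placeholder elevation array.

```python
import numpy as np

# Placeholder 10 m DEM (elevations in metres); in practice this would be read from the GSI DEM.
dem = np.random.default_rng(0).normal(150, 30, size=(500, 500))
cellsize = 10.0                                   # grid resolution in metres

dz_dy, dz_dx = np.gradient(dem, cellsize)         # elevation change per metre along rows (y) and columns (x)
slope_rad = np.arctan(np.sqrt(dz_dx**2 + dz_dy**2))
slope_deg = np.degrees(slope_rad)                 # local slope angle (°), cf. the predictor "slope"
```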
The normalized difference vegetation index (NDVI), which can be computed from optical images, is an effective index for distinguishing vegetated areas from landslide-affected areas [1,31,32,33]. Typically, the NDVI value drops sharply in landslide areas.
NDVI = \frac{NIR - Red}{NIR + Red}
Here, NIR and Red denote the near-infrared and red bands of the optical images, respectively. We used the difference between the pre- and post-event NDVI as one of the predictor variables.
The R package “RSAGA” [34] and SAGA GIS were used to generate the predictor variables. The predictor variables for landslide and non-landslide observations are summarized in Table 2. All predictors that presented outliers were winsorized at the 1st and 99th percentiles.
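The sketch below shows these two preprocessing steps, the pre-/post-event NDVI difference and the 1st/99th-percentile winsorization, using hypothetical reflectance arrays in place of the PlanetScope bands; the sign convention of the difference is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (400, 400)
# Placeholder reflectance arrays standing in for the resampled 10 m PlanetScope bands.
pre_red, pre_nir, post_red, post_nir = (rng.uniform(0.01, 0.6, shape) for _ in range(4))

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), with a small guard against division by zero."""
    return (nir - red) / np.maximum(nir + red, 1e-9)

# Pre-/post-event NDVI difference; fresh landslides typically show a sharp NDVI drop,
# so this difference is strongly negative over them.
d_ndvi = ndvi(post_nir, post_red) - ndvi(pre_nir, pre_red)

def winsorize_1_99(x):
    """Clip a predictor to its 1st and 99th percentiles to limit the influence of outliers."""
    lo, hi = np.percentile(x, [1, 99])
    return np.clip(x, lo, hi)

d_ndvi_w = winsorize_1_99(d_ndvi)   # same treatment applied to all predictors with outliers
```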

2.4. Active Learning

Active learning is a subfield of machine learning that aims to achieve good predictive performance using a small amount of training data [35,36,37]. The main hypothesis is that if a learning algorithm can select the samples it is most curious about, its performance will improve with only a small amount of training data. The active learning framework employs a query strategy to select instances based on the outputs (e.g., posterior probabilities) of the learning algorithm on unlabeled instances; these instances are then labeled by an “oracle” (e.g., a human annotator) (Figure 2). The selected instances are typically referred to as “informative” or “valuable” instances.
In general, active learning query settings can be classified into membership query synthesis, stream-based selective sampling, and pool-based sampling [37]. In membership query synthesis, the learner generates instances for the human annotator to label, and these instances may or may not follow the underlying natural distribution [38]. Previous studies have pointed out that membership query synthesis is not well suited to real-world applications, since arbitrary, unrealistic instances may be generated [35,39]. Stream-based selective sampling and pool-based sampling address this problem: they query unlabeled instances drawn from the actual distribution and present them to a human annotator for labeling [40,41]. The difference between the two is that stream-based sampling considers one unlabeled instance at a time and decides whether to label it, whereas pool-based sampling ranks the entire unlabeled dataset and selects multiple instances. In this work, pool-based sampling is adopted, as it is practical for real-world applications with large collections of unlabeled data.
Regarding how to determine the informativeness of unlabeled instances, three criteria have been proposed: the disagreement of a committee, the distance to the decision boundary, and the posterior probability [42]. The authors of [42] demonstrated that active learning strategies based on posterior probability are fast while yielding good results. As the simplest and most commonly used query framework, uncertainty sampling (US) queries the instances about whose labels the model is least certain, and it can be applied to any classifier that provides probabilistic outputs [36,37,43]. Among US strategies, margin sampling (MS), which considers the distance of a sample to the model’s decision boundary, is one of the most widely implemented and effective strategies, as confirmed in many studies, including land cover and landslide mapping [42,44]. In addition, the authors of [29] compared MS with other active learning strategies and demonstrated that it outperformed them in the context of landslide mapping. Therefore, MS is adopted here to investigate the impact of landslide inventories with different landslide number percentages on the performance of active learning. MS selects the instance x_S whose posterior probabilities for the first and second most likely classes, y_1^* and y_2^*, exhibit the smallest difference.
x_S = \arg\min_{x} \left( P_\theta(y_1^* \mid x) - P_\theta(y_2^* \mid x) \right)
Here, the posterior probability P_\theta(y \mid x_i) is the predicted probability that the i-th instance x_i belongs to class y according to the model \theta.
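A minimal pool-based implementation of this query rule is sketched below. Logistic regression stands in for the probabilistic learner (the study uses a GAM, see Section 2.5), and the data are synthetic, so this is only a generic illustration of the MS criterion.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def margin_sampling_query(model, X_pool, n_queries=1):
    """Return indices of the pool instances whose two largest class posteriors are closest."""
    proba = model.predict_proba(X_pool)          # shape (n_pool, n_classes)
    top_two = np.sort(proba, axis=1)[:, -2:]     # second-largest and largest posterior per row
    margin = top_two[:, 1] - top_two[:, 0]       # P_theta(y1*|x) - P_theta(y2*|x)
    return np.argsort(margin)[:n_queries]        # smallest margins = most "informative"

# Toy demonstration on synthetic data; the selected indices would be sent to the oracle for labeling.
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(100, 5))
y_lab = (X_lab[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)
X_pool = rng.normal(size=(1000, 5))

clf = LogisticRegression().fit(X_lab, y_lab)
query_idx = margin_sampling_query(clf, X_pool, n_queries=10)
```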

2.5. Experiment Design

The generalized additive model (GAM) is adopted to generate the posterior probabilities for MS, since it allows for nonlinear relationships between the features and the response while maintaining interpretability [45,46]. For example, the authors of [8] compared the GAM with logistic regression, weights of evidence, the support vector machine, random forest classification, and bootstrap-aggregated classification trees with penalized discriminant analysis, and found that the performances of the modeling techniques were mostly not significantly different from each other, while the GAM produced smooth prediction surfaces. Furthermore, the authors of [47] proposed a new active-transfer learning strategy based on MS with a GAM, resulting in stable and accurate landslide mapping. Owing to the complexity and limited transferability of segmentation procedures in object-based approaches across different regions and sensors [48], a pixel-based experiment was conducted in this study.
Regarding the ratio of landslides to non-landslides in the initial training data, we set the total number of initial training samples to 100 and used a 1:9 ratio of landslide to non-landslide points. Unlabeled landslide/non-landslide points were then selected from study areas 1, 2, and 3, respectively, to investigate the impact of the different class proportions in each study area.
To reduce the influence of random variation, we repeated the whole process 100 times. The mean and standard deviation of predictive accuracy (AUROC and pAUROC) based on the datasets with different class proportions were compared. In addition, the commonly used random sampling strategy (RS) was applied as a benchmark against which to assess the performance of MS. The workflow of the computational experiment is shown in Figure 3, and a simplified code sketch is given below.
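The sketch below outlines the structure of one such experimental run under assumed data arrays: the training set starts from 100 points at a 1:9 landslide-to-non-landslide ratio, grows by 25 points per epoch via MS or RS, and AUROC and pAUROC are tracked on a hold-out set. Logistic regression again stands in for the GAM, and scikit-learn's roc_auc_score(max_fpr=...) returns a standardized partial AUC, which differs in scale from the raw pAUROC values reported in Section 3.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def run_experiment(X, y, X_test, y_test, strategy="MS", n_init=100, n_epochs=10, batch=25, seed=0):
    """One AL run: grow the training set by `batch` points per epoch with MS or RS."""
    rng = np.random.default_rng(seed)
    pos = rng.choice(np.where(y == 1)[0], size=n_init // 10, replace=False)       # 1:9 initial ratio
    neg = rng.choice(np.where(y == 0)[0], size=n_init - n_init // 10, replace=False)
    labeled = np.concatenate([pos, neg])
    pool = np.setdiff1d(np.arange(len(y)), labeled)
    aurocs, paurocs = [], []
    for _ in range(n_epochs):
        clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
        p_test = clf.predict_proba(X_test)[:, 1]
        aurocs.append(roc_auc_score(y_test, p_test))
        paurocs.append(roc_auc_score(y_test, p_test, max_fpr=0.1))   # standardized partial AUC
        if strategy == "MS":
            proba = clf.predict_proba(X[pool])
            margin = np.abs(proba[:, 1] - proba[:, 0])               # binary-case margin
            picked = pool[np.argsort(margin)[:batch]]
        else:                                                        # random sampling benchmark
            picked = rng.choice(pool, size=batch, replace=False)
        labeled = np.concatenate([labeled, picked])
        pool = np.setdiff1d(pool, picked)
    return np.array(aurocs), np.array(paurocs)

# Toy usage with synthetic, highly imbalanced data (about 1 landslide : 30 non-landslides);
# repeating this with 100 different seeds and averaging the per-epoch curves mirrors Figures 4 and 5.
rng = np.random.default_rng(1)
n = 12000
y_all = (rng.random(n) < 1 / 31).astype(int)
X_all = rng.normal(size=(n, 6)) + 1.5 * y_all[:, None]
auc_ms, pauc_ms = run_experiment(X_all[:8000], y_all[:8000], X_all[8000:], y_all[8000:], "MS")
auc_rs, pauc_rs = run_experiment(X_all[:8000], y_all[:8000], X_all[8000:], y_all[8000:], "RS")
```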

3. Results

3.1. Mean and Standard Deviation of AUROCs and pAUROCs

The overall performance of MS was more robust than that of RS in most situations (Figure 4 and Figure 5). In terms of the mean AUROCs and pAUROCs, MS outperformed RS in most situations (Figure 4). However, when the ratio of landslide to non-landslide points was 1:1 (i.e., the dataset had a balanced class proportion), RS performed better than MS in terms of both AUROC and pAUROC, even when the training dataset was small, such as at epoch 4 with 200 landslide and non-landslide points (Figure 4a). As the ratio of landslide to non-landslide points decreased, i.e., as the class proportions became more imbalanced, MS obtained higher AUROCs and pAUROCs than RS. For example, with a training dataset of 200 points in study area 2, MS obtained an AUROC of 0.892 and a pAUROC of 0.0714, while RS obtained an AUROC of 0.869 and a pAUROC of 0.0651 (Figure 4b). In particular, when the dataset was clearly imbalanced, i.e., when the ratio of landslide to non-landslide points was 1:30, MS still obtained high AUROCs and pAUROCs with a small training dataset (Figure 4c); for example, MS obtained an AUROC of 0.866 and a pAUROC of 0.0681 based on 200 landslide and non-landslide points. The AUROCs and pAUROCs obtained by RS did not increase with the increasing size of the training dataset and were consistently lower than those obtained by MS.
Consistent with the mean AUROCs and pAUROCs, MS was more stable than RS when the classes in the dataset were imbalanced (Figure 5). However, when the classes were balanced, RS performed better (Figure 5a). Comparing the standard deviations of the AUROCs and pAUROCs obtained by MS across all scenarios, the trends in the values obtained over 100 repetitions were consistent from the class-balanced (1:1) to the highly class-imbalanced (1:30) samples. In contrast, RS showed smooth changes in the values obtained over 100 repetitions only when the classes were balanced; its performance became progressively less stable as the difference between the numbers of landslides and non-landslides in the dataset increased.

3.2. Self-Optimizing Ability

In both the class-balanced and class-imbalanced scenarios, MS was more capable than RS of detecting landslide points (Figure 6). In the class-imbalanced cases, the amount of landslide information in the training set gradually increased, which was consistent with the mean and standard deviation of the AUROCs and pAUROCs obtained by MS (Figure 4b,c, Figure 5b,c and Figure 6). Nevertheless, in the class-balanced case, although the landslide information in the training set obtained by MS grew faster than that obtained by RS, there was no significant difference in the mean and standard deviation of the AUROCs and pAUROCs obtained with the two sampling strategies (Figure 4a, Figure 5a and Figure 6).

3.3. Classification Mapping

The landslide classification maps revealed that MS detected landslides more coherently and accurately in space than RS (Figure 7). Regardless of how close the numbers of landslide and non-landslide points were, MS was able to detect the locations where landslides occurred, which was consistent with the mean and standard deviation of the AUROCs and pAUROCs reported above (Figure 5 and Figure 6). In contrast, RS did not detect landslides well in its classification map, despite obtaining a better mean and standard deviation of AUROCs and pAUROCs when the numbers of landslides and non-landslides were similar (Figure 7a).
In addition, we show zoomed-in landslide classification maps obtained with MS and RS using 350 landslide and non-landslide points from the landslide inventory (Figure 8). Almost all landslides were covered by areas categorized by MS as likely (high) or very likely (very high) to contain a landslide, which was consistent with the numerical performance indicators reported above (Figure 8a). Compared to MS, the areas of likely or very likely landslides detected by RS overlapped less with the actual landslide inventory (Figure 8b).

4. Discussion

4.1. Impact of Class Proportions in Selecting Sampling Strategies

The class proportions in the data can affect the performance of sampling strategies, and previous studies have demonstrated that they need to be taken into account when choosing a sampling strategy [49,50]. However, many studies today adopt commonly used sampling strategies under the assumption that the class proportions in the data are equal [51,52,53]. The authors of [54] indicated that class ratios are imbalanced in real-world applications, especially for natural hazards such as landslides; in most cases, the proportion of hazard instances is much smaller than that of non-hazard instances. Hence, when natural hazards occur, traditional sampling techniques may reduce the accuracy of hazard detection and thus the efficiency of rescue operations. Although the application of active learning to natural hazards is in its infancy, a growing number of studies have demonstrated its effectiveness as a sampling approach [29,47]. In line with previous studies, we found that MS was able to sample the class that makes up only a small percentage of the data.
When the class ratio in the data is close to balanced, the traditional sampling strategy may be more appropriate; however, further validation in combination with classification maps is needed. In our study, RS showed better detection accuracy than MS when the ratio of the number of landslides to non-landslides was close to 1:1 (Figure 4a and Figure 5a). Previous studies have pointed out that the class proportions obtained from random sampling are consistent with the class proportions present in the data [50]. Thus, when the class proportions in the data are close to each other, random sampling can be a fast and simple method for obtaining samples. Nevertheless, as shown in Figure 7 and Figure 8, the accuracy of landslide detection in the classification map obtained by RS was lower than that obtained by MS. Therefore, in terms of both the stability and the accuracy of the results, MS was better suited than RS to building the training dataset for landslide assessments.

4.2. Limitations and Outlook

Instances obtained by active learning are “informative” for building the training dataset. Previous studies have pointed out that allowing learning algorithms to pick their own samples can save time and effort while achieving highly accurate detection results, a finding confirmed by many studies [36,37,55]. Nevertheless, most studies on active learning sampling strategies are based on data with imbalanced classes and do not consider the balanced-class condition. This study compared the detection performance obtained using an active learning strategy and random sampling under different class ratios. Although active learning, by selecting the instances the learning algorithm is most curious about (i.e., informative instances), did not significantly improve the detection accuracy over random sampling when the numbers of landslides and non-landslides were similar, the combination of numerical performance metrics and landslide classification maps showed that the results obtained by MS were accurate and stable regardless of the class proportions, even when the number of training samples was small (Figure 4, Figure 5, Figure 7 and Figure 8).
One reason why the detection accuracy did not improve may be that the MS strategy selects instances close to the decision boundary. When the feature differences between classes are large and the classes are similar in size, adding instances close to the decision boundary may cause the learning algorithm to overfit, which in turn affects its detection effectiveness. Another reason may be related to the learning algorithm itself. Studies have shown that different learning algorithms perform differently in landslide assessments [8,33,56]; therefore, the choice of learning algorithm also affects the quality of the instances selected by MS and, in turn, the detection performance. Moreover, other active learning strategies exist, and the detection performance based on the “informative” instances they select requires further research.
In the meantime, the “informative” instances obtained by active learning need to be investigated further in multi-class studies. This study compared the performance of MS and RS in detecting landslides versus non-landslides, i.e., in a binary classification setting. Since the characteristics of the classes are more complex in multi-class settings, the training samples need to contain sufficient information for each class, and whether active learning can effectively extract information about each category requires further research. Finally, our findings are based on open-source co-seismic landslide inventories; landslides triggered by other events, such as rainfall-induced landslides, may exhibit distinct spatial distributions. Future research should explore the performance of sampling strategies across various event-based landslide types in order to validate and extend the generalizability of our active learning framework to diverse geohazard contexts.

5. Conclusions

Considering the impact of class ratios on machine learning, this study investigated the effectiveness of the MS active learning strategy using open-source landslide inventories with different landslide number percentages. The results indicate that different landslide number percentages can affect the performance of MS, but MS still performs well in imbalanced situations. In particular, when numerical metrics and classification maps are considered together, MS consistently outperforms RS. We suggest that the class proportions in the whole dataset should be taken into account when selecting a sampling strategy. The effectiveness of active learning strategies in multi-class studies will be investigated in future work.

Author Contributions

Conceptualization, J.M. and Z.W. (Zhihao Wang); methodology, Z.W. (Zhihao Wang); validation, J.M. and Z.W. (Zhihao Wang); investigation, J.M. and Z.W. (Zhihao Wang); resources, J.M.; data curation, J.M. and Z.W. (Zhihao Wang); writing—original draft preparation, J.M.; writing—review and editing, J.M., Z.W. (Zhihao Wang), Z.W. (Zhichao Wang), G.G. and T.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The landslide inventories that support the findings of this study are available from Zhang et al. (2019) [25]. The DEM that supports the findings of this study is available from the Geospatial Information Authority of Japan.

Acknowledgments

We would like to express our gratitude to Zhang et al. (2019) [25] and the Geospatial Information Authority of Japan for contributing the datasets used in this study.

Conflicts of Interest

Authors Jing Miao and Tianshu Ma were employed by the company Space Star Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Fiorucci, F.; Ardizzone, F.; Mondini, A.C.; Viero, A.; Guzzetti, F. Visual Interpretation of Stereoscopic NDVI Satellite Images to Map Rainfall-Induced Landslides. Landslides 2019, 16, 165–174. [Google Scholar] [CrossRef]
  2. Haque, U.; Da Silva, P.F.; Devoli, G.; Pilz, J.; Zhao, B.; Khaloua, A.; Wilopo, W.; Andersen, P.; Lu, P.; Lee, J. The Human Cost of Global Warming: Deadly Landslides and Their Triggers (1995–2014). Sci. Total Environ. 2019, 682, 673–684. [Google Scholar] [CrossRef]
  3. Haque, U.; Blum, P.; da Silva, P.F.; Andersen, P.; Pilz, J.; Chalov, S.R.; Malet, J.-P.; Auflič, M.J.; Andres, N.; Poyiadji, E.; et al. Fatal Landslides in Europe. Landslides 2016, 13, 1545–1554. [Google Scholar] [CrossRef]
  4. Teh, D.; Khan, T. Types, Definition and Classification of Natural Disasters and Threat Level. In Handbook of Disaster Risk Reduction for Resilience: New Frameworks for Building Resilience to Disasters; Eslamian, S., Eslamian, F., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 27–56. ISBN 978-3-030-61278-8. [Google Scholar]
  5. Jemec Auflič, M.; Bezak, N.; Šegina, E.; Frantar, P.; Gariano, S.L.; Medved, A.; Peternel, T. Climate Change Increases the Number of Landslides at the Juncture of the Alpine, Pannonian and Mediterranean Regions. Sci. Rep. 2023, 13, 23085. [Google Scholar] [CrossRef] [PubMed]
  6. Zhu, Y.; Qiu, H.; Liu, Z.; Ye, B.; Tang, B.; Li, Y.; Kamp, U. Rainfall and Water Level Fluctuations Dominated the Landslide Deformation at Baihetan Reservoir, China. J. Hydrol. 2024, 642, 131871. [Google Scholar] [CrossRef]
  7. Qiu, H.; Li, Y.; Zhu, Y.; Ye, B.; Yang, D.; Liu, Y.; Wei, Y. Do Post-Failure Landslides Become Stable? CATENA 2025, 249, 108699. [Google Scholar] [CrossRef]
  8. Goetz, J.N.; Brenning, A.; Petschko, H.; Leopold, P. Evaluating Machine Learning and Statistical Prediction Techniques for Landslide Susceptibility Modeling. Comput. Geosci. 2015, 81, 1–11. [Google Scholar] [CrossRef]
  9. Fu, L.; Zhang, Q.; Wang, T.; Li, W.; Xu, Q.; Ge, D. Detecting Slow-Moving Landslides Using InSAR Phase-Gradient Stacking and Deep-Learning Network. Front. Environ. Sci. 2022, 10, 963322. [Google Scholar] [CrossRef]
  10. Zhang, Y.; Li, Y.; Meng, X.; Liu, W.; Wang, A.; Liang, Y.; Su, X.; Zeng, R.; Chen, X. Automatic Mapping of Potential Landslides Using Satellite Multitemporal Interferometry. Remote Sens. 2023, 15, 4951. [Google Scholar] [CrossRef]
  11. Luengo, J.; Fernández, A.; García, S.; Herrera, F. Addressing Data Complexity for Imbalanced Data Sets: Analysis of SMOTE-Based Oversampling and Evolutionary Undersampling. Soft Comput. 2011, 15, 1909–1936. [Google Scholar] [CrossRef]
  12. Juang, C.S.; Stanley, T.A.; Kirschbaum, D.B. Using Citizen Science to Expand the Global Map of Landslides: Introducing the Cooperative Open Online Landslide Repository (COOLR). PLoS ONE 2019, 14, e0218657. [Google Scholar] [CrossRef] [PubMed]
  13. Kirschbaum, D.; Stanley, T.; Zhou, Y. Spatial and Temporal Analysis of a Global Landslide Catalog. Geomorphology 2015, 249, 4–15. [Google Scholar] [CrossRef]
  14. Stumpf, A.; Lachiche, N.; Malet, J.-P.; Kerle, N.; Puissant, A. Active Learning in the Spatial Domain for Remote Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2013, 52, 2492–2507. [Google Scholar] [CrossRef]
  15. Liu, Z.; Ding, H.; Zhong, H.; Li, W.; Dai, J.; He, C. Influence Selection for Active Learning. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 9254–9263. [Google Scholar]
  16. Fu, Y.; Zhu, X.; Li, B. A Survey on Instance Selection for Active Learning. Knowl. Inf. Syst. 2013, 35, 249–283. [Google Scholar] [CrossRef]
  17. Du, P.; Zhao, S.; Chen, H.; Chai, S.; Chen, H.; Li, C. Contrastive Coding for Active Learning under Class Distribution Mismatch. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 8907–8916. [Google Scholar]
  18. Frattini, P.; Crosta, G.; Carrara, A. Techniques for Evaluating the Performance of Landslide Susceptibility Models. Eng. Geol. 2010, 111, 62–72. [Google Scholar] [CrossRef]
  19. Brenning, A. Improved Spatial Analysis and Prediction of Landslide Susceptibility: Practical Recommendations. In Landslides and Engineered Slopes, Protecting Society Through Improved Understanding; Eberhardt, E., Froese, C., Turner, A.K., Leroueil, S., Eds.; Taylor & Francis: Banff, AB, Canada, 2012; pp. 789–795. [Google Scholar]
  20. Corominas, J.; Van Westen, C.; Frattini, P.; Cascini, L.; Malet, J.-P.; Fotopoulou, S.; Catani, F.; Van Den Eeckhaut, M.; Mavrouli, O.; Agliardi, F.; et al. Recommendations for the Quantitative Analysis of Landslide Risk. Bull. Eng. Geol. Environ. 2013, 73, 209–263. [Google Scholar] [CrossRef]
  21. Ozaki, M.; Taku, K. 1: 200,000 Land Geological Map in the Ishikari Depression and Its Surrounding Area with Explanatory Note. In Seamless Geoinformation of Coastal Zone “Southern Coastal Zone of the Ishikari Depression”, Seamless Geological Map of Costal Zone S-4; Geological Survey of Japan ALST: Tsukuba, Japan, 2014. [Google Scholar]
  22. Adriano, B.; Yokoya, N.; Miura, H.; Matsuoka, M.; Koshimura, S. A Semiautomatic Pixel-Object Method for Detecting Landslides Using Multitemporal ALOS-2 Intensity Images. Remote Sens. 2020, 12, 561. [Google Scholar] [CrossRef]
  23. Osanai, N.; Yamada, T.; Hayashi, S.; Kastura, S.; Furuichi, T.; Yanai, S.; Murakami, Y.; Miyazaki, T.; Tanioka, Y.; Takiguchi, S.; et al. Characteristics of Landslides Caused by the 2018 Hokkaido Eastern Iburi Earthquake. Landslides 2019, 16, 1517–1528. [Google Scholar] [CrossRef]
  24. Yamagishi, H.; Yamazaki, F. Landslides by the 2018 Hokkaido Iburi-Tobu Earthquake on September 6. Landslides 2018, 15, 2521–2524. [Google Scholar] [CrossRef]
  25. Zhang, S.; Li, R.; Wang, F.; Iio, A. Characteristics of Landslides Triggered by the 2018 Hokkaido Eastern Iburi Earthquake, Northern Japan. Landslides 2019, 16, 1691–1708. [Google Scholar] [CrossRef]
  26. Planet Team. Planet Application Program Interface: In Space for Life on Earth; San Francisco, CA, USA, 2017; p. 2. Available online: https://api.planet.com (accessed on 5 February 2025).
  27. Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for Automated Geoscientific Analyses (SAGA) v. 2.1. 4. Geosci. Model Dev. 2015, 8, 1991–2007. [Google Scholar] [CrossRef]
  28. García-Álvarez, D.; Olmedo, M.T.C.; Paegelow, M. Sensitivity of a Common Land Use Cover Change (LUCC) Model to the Minimum Mapping Unit (MMU) and Minimum Mapping Width (MMW) of Input Maps. Comput. Environ. Urban Syst. 2019, 78, 101389. [Google Scholar] [CrossRef]
  29. Wang, Z.; Brenning, A. Active-Learning Approaches for Landslide Mapping Using Support Vector Machines. Remote Sens. 2021, 13, 2588. [Google Scholar] [CrossRef]
  30. Muenchow, J.; Brenning, A.; Richter, M. Geomorphic Process Rates of Landslides along a Humidity Gradient in the Tropical Andes. Geomorphology 2012, 139, 271–284. [Google Scholar] [CrossRef]
  31. Tucker, C.J. Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sens. Environ. 1979, 8, 127–150. [Google Scholar] [CrossRef]
  32. Martha, T.R.; Kerle, N.; Jetten, V.; van Westen, C.J.; Kumar, K.V. Characterising Spectral, Spatial and Morphometric Properties of Landslides for Semi-Automatic Detection Using Object-Oriented Methods. Geomorphology 2010, 116, 24–36. [Google Scholar] [CrossRef]
  33. Huang, F.; Cao, Z.; Guo, J.; Jiang, S.-H.; Li, S.; Guo, Z. Comparisons of Heuristic, General Statistical and Machine Learning Models for Landslide Susceptibility Prediction and Mapping. Catena 2020, 191, 104580. [Google Scholar] [CrossRef]
  34. Brenning, A.; Bangs, D.; Becker, M. RSAGA: SAGA Geoprocessing and Terrain Analysis in R Package. 2018. Available online: https://CRAN.R-project.org/package=RSAGA (accessed on 5 February 2025).
  35. Tong, S. Active Learning: Theory and Applications. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2001. [Google Scholar]
  36. Settles, B. Synthesis Lectures on Artificial Intelligence and Machine Learning. In Active Learning; Springer International Publishing: Cham, Switzerland, 2012; ISBN 978-3-031-00432-2. [Google Scholar]
  37. Tharwat, A.; Schenck, W. A Survey on Active Learning: State-of-the-Art, Practical Challenges and Research Directions. Mathematics 2023, 11, 820. [Google Scholar] [CrossRef]
  38. Angluin, D. Queries and Concept Learning. Mach. Learn. 1988, 2, 319–342. [Google Scholar] [CrossRef]
  39. Baum, E.B.; Lang, K. Query Learning Can Work Poorly When a Human Oracle Is Used. In Proceedings of the International Joint Conference on Neural Networks, Beijing, China, 3–6 November 1992; Volume 8, p. 8. [Google Scholar]
  40. Cohn, D.; Atlas, L.; Ladner, R. Improving Generalization with Active Learning. Mach. Learn. 1994, 15, 201–221. [Google Scholar] [CrossRef]
  41. Lewis, D.D. A Sequential Algorithm for Training Text Classifiers: Corrigendum and Additional Data. SIGIR Forum 1995, 29, 13–19. [Google Scholar] [CrossRef]
  42. Tuia, D.; Volpi, M.; Copa, L.; Kanevski, M.; Munoz-Mari, J. A Survey of Active Learning Algorithms for Supervised Remote Sensing Image Classification. IEEE J. Sel. Top. Signal Process. 2011, 5, 606–617. [Google Scholar] [CrossRef]
  43. Lewis, D.D.; Catlett, J. Heterogeneous Uncertainty Sampling for Supervised Learning. In Machine Learning Proceedings 1994; Elsevier: Amsterdam, The Netherlands, 1994; pp. 148–156. [Google Scholar]
  44. Demir, B.; Persello, C.; Bruzzone, L. Batch-Mode Active-Learning Methods for the Interactive Classification of Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2010, 49, 1014–1031. [Google Scholar] [CrossRef]
  45. Goetz, J.N.; Guthrie, R.H.; Brenning, A. Integrating Physical and Empirical Landslide Susceptibility Models Using Generalized Additive Models. Geomorphology 2011, 129, 376–386. [Google Scholar] [CrossRef]
  46. Petschko, H.; Bell, R.; Glade, T. Effectiveness of Visually Analyzing LiDAR DTM Derivatives for Earth and Debris Slide Inventory Mapping for Statistical Susceptibility Modeling. Landslides 2016, 13, 857–872. [Google Scholar] [CrossRef]
  47. Wang, Z.; Brenning, A. Unsupervised Active–Transfer Learning for Automated Landslide Mapping. Comput. Geosci. 2023, 181, 105457. [Google Scholar] [CrossRef]
  48. Knevels, R.; Petschko, H.; Leopold, P.; Brenning, A. Geographic Object-Based Image Analysis for Automated Landslide Detection Using Open Source GIS Software. ISPRS Int. J. Geo-Inf. 2019, 8, 551. [Google Scholar] [CrossRef]
  49. Pourghasemi, H.R.; Kornejady, A.; Kerle, N.; Shabani, F. Investigating the Effects of Different Landslide Positioning Techniques, Landslide Partitioning Approaches, and Presence-Absence Balances on Landslide Susceptibility Mapping. Catena 2020, 187, 104364. [Google Scholar] [CrossRef]
  50. Sameen, M.I.; Pradhan, B.; Bui, D.T.; Alamri, A.M. Systematic Sample Subdividing Strategy for Training Landslide Susceptibility Models. Catena 2020, 187, 104358. [Google Scholar] [CrossRef]
  51. Pradhan, B.; Lee, S. Landslide Susceptibility Assessment and Factor Effect Analysis: Backpropagation Artificial Neural Networks and Their Comparison with Frequency Ratio and Bivariate Logistic Regression Modelling. Environ. Model. Softw. 2010, 25, 747–759. [Google Scholar] [CrossRef]
  52. Erener, A.; Sivas, A.A.; Selcuk-Kestel, A.S.; Düzgün, H.S. Analysis of Training Sample Selection Strategies for Regression-Based Quantitative Landslide Susceptibility Mapping Methods. Comput. Geosci. 2017, 104, 62–74. [Google Scholar] [CrossRef]
  53. Raja, N.B.; Çiçek, I.; Türkoğlu, N.; Aydin, O.; Kawasaki, A. Landslide Susceptibility Mapping of the Sera River Basin Using Logistic Regression Model. Nat. Hazards 2017, 85, 1323–1346. [Google Scholar] [CrossRef]
  54. Kuglitsch, M.M.; Pelivan, I.; Ceola, S.; Menon, M.; Xoplaki, E. Facilitating Adoption of AI in Natural Disaster Management through Collaboration. Nat. Commun. 2022, 13, 1579. [Google Scholar] [CrossRef] [PubMed]
  55. Huang, S.-J.; Jin, R.; Zhou, Z.-H. Active Learning by Querying Informative and Representative Examples. Adv. Neural Inf. Process. Syst. 2010, 23, 1936–1949. [Google Scholar] [CrossRef]
  56. Brock, J.; Schratz, P.; Petschko, H.; Muenchow, J.; Micu, M.; Brenning, A. The Performance of Landslide Susceptibility Models Critically Depends on the Quality of Digital Elevation Models. Geomat. Nat. Hazards Risk 2020, 11, 1075–1092. [Google Scholar] [CrossRef]
Figure 1. Location of the study areas and distribution of landslides in Hokkaido. Different colors (red, green, and blue) represent zones of landslide-triggering severities with varied ratios of landslides and non-landslides.
Figure 2. Active learning framework workflow for selecting “informative” instances in landslide detection assessments.
Figure 3. The workflow of the computational experiments using margin sampling and random sampling based on different ratios of landslide and non-landslide points in different study areas.
Figure 4. The mean AUROCs (left) and pAUROCs (right) of 100 repetitions obtained by MS and RS based on different study areas. (a) Study area 1 with the 1:1 ratio of landslide and non-landslide points; (b) study area 2 with the 1:12 ratio of landslide and non-landslide points; and (c) study area 3 with the 1:30 ratio of landslide and non-landslide points.
Figure 5. The standard deviation of AUROCs (left) and pAUROCs (right) of 100 repetitions obtained by MS and RS based on different study areas. (a) Study area 1 with a 1:1 ratio of landslide-to-non-landslide points; (b) study area 2 with a 1:12 ratio of landslide-to-non-landslide points; and (c) study area 3 with a 1:30 ratio of landslide-to-non-landslide points.
Figure 6. The proportion of landslide points in the training set obtained by MS and RS in different scenarios. (a) Study area 1 with a 1:1 ratio of landslide-to-non-landslide points; (b) study area 2 with a 1:12 ratio of landslide-to-non-landslide points; and (c) study area 3 with a 1:30 ratio of landslide-to-non-landslide points.
Figure 7. An example of the landslide classification maps obtained by MS and RS for different ratios of landslide to non-landslide points, based on 350 landslide and non-landslide points at epoch 10. The predicted probabilities were classified into four levels (very high, high, low, and very low) using the top 4th, 10th, and 50th percentiles of each strategy’s predictions.
Figure 8. Zoomed-in landslide maps obtained by (a) MS and (b) RS, overlaid on the landslide inventory, based on data with a 1:12 ratio of landslide to non-landslide points. The maps were based on 350 training points.
Table 1. Information about the three study areas.
                                         Study Area 1             Study Area 2             Study Area 3
Number of landslides                     3564                     5380                     5473
Ratio of landslides to non-landslides    1:1                      1:12                     1:30
Landslide type                           co-seismic landslides
Landslide process                        shallow debris slides
Size (km²)                               173                      623                      1216
Geological units                         sedimentary and volcanic rocks
Triggering mechanism                     earthquake
Table 2. Median and interquartile range (IQR) values of predictor variables for landslide and non-landslide observations in study areas 1, 2, and 3.
All values are given as median (IQR).

Predictor variable                                    Study Area 1: landslides | non-landslides      Study Area 2: landslides | non-landslides      Study Area 3: landslides | non-landslides
Slope angle (°, slope)                                18.71 (10.86) | 14.41 (13.94)                   18.50 (11.42) | 10.26 (16.05)                   18.51 (11.42) | 9.88 (17.54)
Plan curvature (radians per 100 m, plancurv)          −0.00013 (0.01251) | 0.00097 (0.01674)          −0.00007 (0.01244) | 0.00110 (0.01989)          −0.00007 (0.01242) | 0.00105 (0.02024)
Profile curvature (radians per 100 m, profcurv)       −0.00033 (0.00455) | 0.00003 (0.00414)          −0.00029 (0.00448) | 0.0000 (0.00336)           −0.00029 (0.00448) | 0.0000 (0.00295)
Upslope contributing area (log10 m², log.carea)       2.735 (0.663) | 2.87 (0.578)                    2.719 (0.651) | 2.874 (0.618)                   2.72 (0.65) | 2.91 (0.67)
Elevation (m, dem)                                    140.8 (69.2) | 156.6 (121.3)                    138.8 (73.8) | 117.7 (115.5)                    138.7 (73.99) | 117.75 (139.1)
TWI                                                   5.74 (2.08) | 6.38 (2.81)                       5.72 (2.06) | 7.07 (5.12)                       5.72 (2.05) | 7.38 (6.76)
Catchment slope angle (cslope)                        19.60 (7.35) | 13.75 (9.89)                     19.45 (8.03) | 10.85 (13.17)                    19.46 (8.04) | 10.95 (14.81)
NDVI                                                  −0.32 (0.28) | −0.08 (0.11)                     −0.29 (0.28) | −0.04 (0.17)                     −0.29 (0.28) | −0.03 (0.21)
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
