Article

GA-GBDT: A Spatio-Temporal Graph-Augmented Gradient Boosting Framework for GNSS Network–Based Landslide Event Warning in Mining Areas

1 China Railway Siyuan Survey and Design Group Co., Ltd., Wuhan 430063, China
2 Shenzhen Technology Institute of Urban Public Safety, Shenzhen 518038, China
3 School of Geography and Planning, Sun Yat-sen University, Guangzhou 510006, China
4 Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(11), 5569; https://doi.org/10.3390/app16115569
Submission received: 26 March 2026 / Revised: 22 May 2026 / Accepted: 26 May 2026 / Published: 2 June 2026
(This article belongs to the Section Earth Sciences)

Abstract

Landslide event warning in mining areas is essential for geohazard risk mitigation and infrastructure safety. With the increasing use of Global Navigation Satellite System (GNSS) monitoring networks, warning decisions are often derived from abnormal deformation responses in continuous displacement records. However, deriving stable and transferable warning decisions from GNSS networks is challenged by spatially coupled station responses, time-varying displacement patterns, and incomplete or disturbed observations. To address these issues, this study proposes a graph-augmented gradient boosting decision tree framework, termed GA-GBDT (Graph-Augmented Gradient Boosting Decision Trees), for multi-station landslide event warning in mining areas. The framework first constructs a weighted station graph to encode spatial dependence across stations. Based on this graph, a Gated Recurrent Unit (GRU) and a Graph Convolutional Network (GCN) are integrated to learn spatio-temporal embeddings, which are then fused with station-wise features and fed into XGBoost (eXtreme Gradient Boosting) for warning decision-making. Experiments on a 90-station GNSS network show that GA-GBDT outperforms representative rule-based, machine-learning, and deep-learning baselines, achieving more robust warning performance with improved generalization and false-alarm control. These results indicate that GA-GBDT improves warning robustness, decision stability, and cross-zone generalization for GNSS-based landslide warning in mining areas, with potential transferability to other slope warning scenarios.

1. Introduction

Landslides are among the most destructive geohazards, frequently threatening life safety and causing persistent disruptions to transportation corridors, urban slopes, and other critical infrastructure [1,2]. In operational practice, early warning often depends on whether a slope is entering an unsafe state and whether a landslide-related event is imminent, rather than on long-term deformation trends alone [3,4]. Consequently, monitoring has progressively evolved from intermittent field inspection toward continuous, automated, and networked observation, enabling risk detection at finer temporal resolution [5]. In this context, high-rate geodetic monitoring, especially Global Navigation Satellite System (GNSS) networks, provides an effective data basis for event-oriented warning because it delivers all-weather, continuous measurements with millimeter-level displacement sensitivity across broad areas [6]. Its network configuration also enables spatially distributed responses to be observed simultaneously [7,8]. However, converting multi-station spatio-temporal observations into stable and transferable event-warning decisions remains challenging when using operational GNSS networks [9]. The key difficulty lies in achieving temporally consistent and spatially coordinated warnings under heterogeneous station behaviors, incomplete observations, and scarce positive labels, especially when deformation patterns, noise levels, and event occurrence differ across monitoring zones [10]. These challenges are particularly relevant in mining areas, where slope deformation may be affected by excavation disturbance, local unloading, rainfall, and progressive slope adjustment and where warning events are often reflected by abnormal deformation responses in continuous monitoring records [11,12].
GNSS-based landslide monitoring has demonstrated strong capability in capturing displacement evolution with high temporal continuity, and it is particularly advantageous for large or complex slopes where distributed motion patterns emerge over time [13,14]. In practical warning applications, the focus is on identifying hazardous time periods associated with landslide activity from monitoring evidence and field interpretation, rather than on modeling long-term deformation trends alone [14]. Early GNSS monitoring studies have shown that geodetic measurements can capture slope kinematics and support instability recognition at the field scale [6]. Similar monitoring demands also arise in slopes affected by engineering activities such as excavation, unloading, and local terrain adjustment, where deformation may evolve continuously and spatially unevenly before a warning event becomes evident [15]. Nevertheless, GNSS network observations in practice often exhibit measurement fluctuations, data gaps, and station-specific response differences, while precursory signs of hazardous slope activity can be subtle and spatially nonuniform [16]. Therefore, event warning should not rely solely on isolated single-station signals but should jointly model cross-station dependence and temporal evolution. It should also remain robust under limited and imbalanced event labels, where positive labels denote time steps within field-interpreted warning-event periods [17,18].
Existing landslide event warning methods have generally evolved from criterion-based approaches to learning-based approaches, and further toward hybrid frameworks [19,20]. First, criterion-based approaches, including threshold- and empirical-criterion approaches [21], infer impending instability from changes in deformation rates or displacement trends and are valued for their transparent decision rules and operational simplicity. Representative examples include inverse-velocity failure-time prediction and related creep-based formulations, as well as the displacement-curve tangent-angle method (TAM) and its variants [22,23]. These methods can provide rapid short-term alerts, but their performance is often sensitive to threshold specification and may degrade under strong noise, site heterogeneity, and evolving external conditions. Second, learning-based approaches infer warning states from data by optimizing predictive objectives on historical observations, and they generally include machine learning and deep learning models [24,25,26]. In machine learning, Random Forest (RF) and boosting-type tree ensembles are widely used for landslide warning-related modeling because they can capture nonlinear feature interactions, handle mixed variable types, and provide feature-importance style interpretability [27,28]. Among them, Gradient Boosting Decision Trees (GBDT) and its optimized implementation Extreme Gradient Boosting (XGBoost) are often competitive on structured inputs [29], but their native input is a fixed-length vector, which makes it difficult to explicitly exploit graph-structured spatial dependence without a principled representation transformation [30]. 
In deep learning, recurrent sequence models such as Long Short-Term Memory (LSTM) networks [31] and Gated Recurrent Unit (GRU) networks can learn nonlinear temporal dependencies and capture progressive state transitions from displacement time series, while graph-based architectures such as Graph Convolutional Networks (GCN) [32] and other Graph Neural Networks (GNN) [33,34] can encode spatial dependence through message passing over station connections [35]. These models are expressive in representing complex temporal dynamics and inter-station interactions, but their robustness under distribution shift and their deployment stability in safety-critical scenarios remain active concerns when data are noisy and event labels are sparse. These limitations have motivated the development of hybrid early-warning strategies that integrate complementary modeling components for more robust warning decisions [36].
To address these limitations, hybrid early-warning frameworks that allocate complementary roles to different modules have been increasingly explored for both regional landslide warning and local slope monitoring in complex engineering areas [37]. Such designs have been motivated by the observation that operational landslide warning requires simultaneous robustness to noisy measurements, sufficient temporal expressiveness, and transferability across spatially heterogeneous settings [32,38,39]. Piciullo et al. reviewed territorial warning systems for rainfall-induced landslides and noted that performance may deteriorate when rules are transferred across regions without accounting for geomorphological and hydro-meteorological variability, which motivates the development of learning-based warning frameworks [40]. Huang et al. developed a real-time landslide risk early-warning framework for highway corridors by integrating susceptibility assessment, rainfall-threshold-based triggering, and vulnerability modeling, demonstrating the operational value of coupling spatial predisposition with time-varying triggers [41]. Nocentini et al. integrated a static susceptibility indicator with time-varying rainfall-related variables within a Random Forest model to estimate regional-scale spatiotemporal landslide probabilities and reported an AUC of 0.91 compared with 0.84 obtained by a conventional empirical model [42]. For graph-based modeling, Zeng et al. incorporated environmental-consistency constraints into graph neural networks for landslide susceptibility assessment, enhancing spatial inference under heterogeneous conditions [43]. Lin et al. compared XGBoost with Random-Forest-based ensembles for landslide susceptibility modeling and indicated that gradient-boosted trees can capture nonlinear feature interactions effectively while maintaining competitive generalization [44]. 
Collectively, these studies support the view that hybrid frameworks can better balance explicit spatial-dependence modeling and robust decision-making than single-model pipelines, while the overall effectiveness remains sensitive to how spatial relations are constructed and how learned representations are coupled with the final decision layer [45,46,47,48].
Motivated by these needs, a spatio-temporal graph-augmented gradient boosting framework, termed GA-GBDT, has been developed for multi-station GNSS network-based landslide event warning in mining slopes. The framework has been developed and validated using a 90-station GNSS monitoring dataset from a mining slope area in Chifeng City, Inner Mongolia Autonomous Region, China. The dataset contains continuous displacement time-series observations and warning-event labels derived from field interpretation, enabling the proposed method to be assessed under realistic network monitoring conditions. The framework begins by constructing a station-relation graph that represents the spatial proximity or statistical dependence among the monitoring stations. A spatio-temporal graph encoder is then employed to learn compact embeddings that summarize cross-station interactions together with temporal evolution. These embeddings are fused with site-wise input variables and passed to a GBDT model, implemented with XGBoost, to infer time-indexed deformation event probabilities. The main contribution of GA-GBDT lies in two aspects: (i) it constructs a physically and statistically informed monitoring-station graph by combining geographic proximity, regional consistency, and historical dynamic consistency and converts inter-station dependence and temporal evolution into compact graph-augmented features for GNSS network warning and (ii) it adopts a decoupled representation–decision architecture, in which the spatio-temporal graph encoder learns structured embeddings and the XGBoost decision layer performs warning inference on the augmented features, thereby enabling network-level deformation context to be incorporated into a controllable boosting-based warning framework.

2. Study Area and Data Processing

The monitoring dataset was collected from a landslide-prone mountainous area in Chifeng City, located within the transition belt between temperate arid and semi-arid climates. Precipitation in Chifeng shows a clear seasonal pattern, with rainfall mainly concentrated from late spring to early autumn and the highest monthly rainfall occurring in July [49]. Summer rainfall, especially during June–August, provides an important hydrological background for slope deformation because infiltration and runoff can weaken near-surface materials and reduce shear resistance along discontinuities. Geologically, the slopes are characterized by layered and fractured rock masses with pronounced structural controls. Interbedded sandstone–mudstone sequences are widely developed, and discontinuities may guide preferential deformation and provide potential weak zones for slope instability. For spatial organization and analysis, the 90 monitoring stations were grouped into six sub-regions, namely Zones A–F, as shown in Figure 1. The zoning was based on the spatial clustering of GNSS stations, DEM-derived surface morphology, optical-image interpretation of mining-affected slopes, and the approximate extent of mining activities. These zones are used as monitoring-oriented slope units rather than exact landslide boundaries or strictly defined geological landslide units.
Each station was equipped with a Unistrong MIS30 GNSS receiver sourced from Guangzhou Geoelectron Technology Co., Ltd. (Guangzhou, China). The GNSS antennas were installed on fixed monitoring pillars according to the field monitoring design. The antenna mounts were kept fixed during the monitoring period, and routine inspection was used to identify obvious equipment disturbance, foundation damage, or abnormal station behavior. The hourly displacement series were calculated from station coordinates obtained by short-baseline differential processing of each 1 h observation window, with the nominal positioning accuracy of ±2.5 mm horizontally and approximately ±5 mm vertically. To mitigate colored noise and improve the stability of displacement-derived motion descriptors, the raw displacement sequences were processed using an Unscented Kalman Filter, and the filtered composite displacement was used as the primary kinematic input. In addition, hourly precipitation observations were obtained from regional meteorological records consistent with the monitoring period and were used as an environmental forcing variable to represent precipitation-driven slope responses. Local and regional seismicity was not included as an explicit factor because no network-wide coseismic displacement offset or earthquake-related disturbance was observed in the GNSS records during the monitoring period. Therefore, the analysis focuses on slope-related deformation responses captured by the GNSS monitoring network. Station-position variability was quantified as the coordinate difference between each hourly GNSS solution and its reference position, and obvious coordinate jumps or station-specific disturbances were checked during data quality control.
Ground-truth warning-event labels were produced through consensus interpretation by the monitoring project team rather than by an automatic thresholding rule. The team included technical personnel familiar with GNSS displacement monitoring, field inspection information, and engineering records. The annotation jointly considered GNSS displacement evolution, field inspection information, and engineering records and was guided by engineering experience in slope monitoring. For ambiguous onset or termination periods, the candidate event boundaries were rechecked against the available field and engineering records and discussed within the monitoring team until a consensus label was reached. Event boundaries were treated as warning-oriented temporal intervals because the transition from normal deformation to abnormal deformation can be gradual. Each hourly timestamp was then assigned a binary indicator, where 1 denotes a sample within a labeled warning-event period and 0 denotes a non-event sample. The labels describe abnormal deformation periods used for warning evaluation and are not intended to classify each event into a specific geomorphological movement type or to provide a complete inventory of individual landslide bodies. The warning-event labels are highly imbalanced across the 90 GNSS stations. Positive warning-event samples account for only 5.01% of all hourly samples, whereas non-event samples account for 94.99%. The zone-wise class distribution is summarized in Table 1.
All station records were aligned to a unified hourly timeline and sorted chronologically. Duplicate timestamps were removed. Physically implausible spikes were identified using a three-sigma rule applied to displacement increments (i.e., the first-order difference), and flagged samples were treated as outliers and set to missing. Missing data in the GNSS displacement observations were reconstructed using a Singular Spectrum Analysis (SSA)-based procedure designed for incomplete geodetic time series, with the same reconstruction settings applied to all missing segments [50,51]. The filtered composite displacement was then adopted as the primary kinematic descriptor, and rainfall and other available environmental variables were synchronized to the same temporal grid. Continuous inputs were normalized using statistics computed only from the training subset, and the same transformation was applied to validation and test subsets to prevent information leakage. For each station, the full time series was split chronologically into training/validation/test sets with a ratio of 70%/15%/15%. The supervised objective was hourly binary classification of landslide events, with displacement and environmental variables serving as predictors for estimating event probability and warning decisions.
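The chronological split and training-only normalization described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `chronological_split`, `fit_standardizer`, and `standardize` are hypothetical, and z-score scaling is an assumption because the text does not name the normalization scheme.

```python
def chronological_split(series, train_pct=70, val_pct=15):
    """Split one station's time-ordered samples into train/val/test parts.

    Percentages are integers so the cut points are computed without
    floating-point rounding surprises.
    """
    n = len(series)
    i_train = n * train_pct // 100
    i_val = n * (train_pct + val_pct) // 100
    return series[:i_train], series[i_train:i_val], series[i_val:]


def fit_standardizer(train_values):
    """Compute mean and standard deviation from the training subset only."""
    mean = sum(train_values) / len(train_values)
    var = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant series
    return mean, std


def standardize(values, mean, std):
    """Apply the training-set statistics to any subset (train, val, or test)."""
    return [(v - mean) / std for v in values]


# Usage: statistics come from the training part and are reused unchanged.
displacement = [float(v) for v in range(20)]  # toy hourly series
train, val, test = chronological_split(displacement)
mean, std = fit_standardizer(train)
val_scaled = standardize(val, mean, std)
```

Because the validation and test subsets never contribute to `mean` or `std`, no information from later time steps leaks into the training stage.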

3. Methods

To achieve robust GNSS network–based landslide event warning under noise contamination and spatial heterogeneity, the proposed GA-GBDT framework follows the workflow illustrated in Figure 2. Given multi-station observation sequences, a fixed weighted undirected station graph is first constructed from geographic proximity, regional consistency, and historical dynamic consistency, explicitly encoding spatial dependence across stations. A temporal encoder then summarizes short-window evolution, and graph convolution propagates these temporal states over the station graph to obtain an interpretable spatio-temporal embedding. The embedding is concatenated with the original station-wise features and fed into the GBDT decision layer to output time-indexed event probabilities and binary warning sequences. To avoid optimization interference from coupled end-to-end training, the learning procedure is organized into two stages: Stage I learns spatio-temporal embeddings under a lightweight supervised head, and Stage II freezes the encoder and trains the XGBoost decision layer on the augmented features.

3.1. Spatial Graph Construction

3.1.1. Graph Representation

In multi-station landslide warning, the deformation response observed at one GNSS station is not completely independent of those observed at nearby or dynamically related stations. Stations located within the same slope unit or affected by similar deformation processes may exhibit correlated displacement evolution, whereas individual station records can be disturbed by local noise, missing observations, or short-term non-deformation fluctuations. Therefore, treating each station as an isolated time series may weaken the spatial consistency of warning decisions. To incorporate inter-station dependence explicitly, the GNSS monitoring network is represented as a weighted undirected graph, in which nodes denote monitoring stations and edges describe their spatial or dynamic relationships. This graph provides the structural basis for propagating spatial context across stations in the subsequent graph-based encoding process. The station graph is defined as follows:
$$G = (\mathcal{V}, \mathcal{E}, A)$$
where $\mathcal{V} = \{1, \ldots, N\}$ denotes the set of $N$ stations, $\mathcal{E}$ is the edge set, and $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix. The graph topology and edge weights are constructed by deterministic rules and kept fixed for subsequent spatio-temporal feature enhancement.

3.1.2. Edge Set Generation

Given the station coordinates (Lon, Lat), the surface distance $d_{ij}$ between each pair of stations $i$ and $j$ is calculated. A sparse connectivity pattern is preferred to prevent excessive noise propagation and to keep computation efficient. Therefore, the edge set is generated by the k-nearest neighbor (kNN) rule [52]:
$$\mathcal{E} = \bigcup_{i \in \mathcal{V}} \{ (i, j) \mid j \in \mathrm{kNN}(i; k) \}$$
where $\mathrm{kNN}(i; k)$ is the set of the $k$ nearest neighbors of node $i$, and $k$ is a fixed integer hyperparameter. This design yields comparable neighborhood sizes across stations and stabilizes the effective receptive field under varying station densities.
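The kNN edge rule can be illustrated with a short sketch. Two choices here are assumptions not stated in the text: the surface distance is approximated with the haversine formula on a spherical Earth, and the directed kNN relation is made undirected by keeping an edge whenever either endpoint selects the other. The names `haversine_km` and `knn_edges` are hypothetical.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in km (assumed stand-in for 'surface distance')."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def knn_edges(coords, k):
    """Return undirected edges (i, j) with i < j from the k-nearest-neighbor rule.

    coords: list of (lon, lat) tuples in degrees, one per station.
    """
    n = len(coords)
    edges = set()
    for i in range(n):
        # Sort all other stations by distance to station i.
        dists = sorted(
            (haversine_km(*coords[i], *coords[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return edges


# Usage with four toy stations along one parallel.
stations = [(119.000, 42.0), (119.001, 42.0), (119.003, 42.0), (119.010, 42.0)]
print(sorted(knn_edges(stations, k=1)))
```

With the union-based symmetrization, a station can end up with more than `k` neighbors when it is selected by stations it did not select itself.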

3.1.3. Edge Weighting and Normalization

Beyond connectivity, meaningful graph propagation requires edge weights to reflect interpretable inter-station similarity. In this study, the edge weight between stations i and j is constructed by jointly considering geographic proximity, regional consistency, and historical dynamic consistency:
$$w_{ij} = \lambda_d \, \phi(d_{ij}) + \lambda_g \, s_{ij}^{(g)} + \lambda_c \, s_{ij}^{(c)}, \quad \lambda_d + \lambda_g + \lambda_c = 1$$
where $\lambda_d$, $\lambda_g$, and $\lambda_c$ are non-negative weighting coefficients. They were selected through a small validation-grid search rather than jointly learned with the neural encoder, so as to preserve interpretability and avoid unstable edge reweighting under imbalanced warning labels. The final setting was fixed as $\lambda_d = 0.4$, $\lambda_g = 0.2$, and $\lambda_c = 0.4$, which provided balanced validation performance in terms of F1-score, FP_rate, and cross-zone stability. $d_{ij}$ is the surface distance between stations $i$ and $j$, and $\phi(d_{ij})$ is the corresponding geographic-proximity term computed using an exponential decay function scaled by the median inter-station distance. $s_{ij}^{(g)}$ is a binary regional-consistency term determined by whether the two stations belong to the same monitoring zone. $s_{ij}^{(c)}$ is the historical dynamic-consistency term, obtained by mapping the Pearson correlation coefficient between the training-period displacement series of the two stations into [0, 1]. This design assigns stronger connections to stations that are geographically close, located in the same monitoring zone, or exhibit similar historical deformation evolution.
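A minimal sketch of the three-term edge weight follows. The exact functional forms are not given in the text, so two are assumed here and marked in the code: the decay is taken as $\phi(d) = \exp(-d / d_{\mathrm{med}})$ with $d_{\mathrm{med}}$ the median inter-station distance, and the Pearson coefficient $r$ is mapped to [0, 1] by $(r + 1)/2$. The function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def edge_weight(d_ij, d_median, same_zone, r_ij,
                lam_d=0.4, lam_g=0.2, lam_c=0.4):
    """Combine proximity, zone membership, and dynamic consistency into w_ij."""
    phi = math.exp(-d_ij / d_median)   # ASSUMED decay form
    s_g = 1.0 if same_zone else 0.0    # binary regional-consistency term
    s_c = (r_ij + 1.0) / 2.0           # ASSUMED mapping from [-1, 1] to [0, 1]
    return lam_d * phi + lam_g * s_g + lam_c * s_c


# Usage: correlation is computed on training-period displacement only.
r = pearson([0.0, 1.0, 2.5, 4.0], [0.1, 0.9, 2.7, 3.8])
w = edge_weight(d_ij=0.3, d_median=0.5, same_zone=True, r_ij=r)
```

Since every term lies in [0, 1] and the coefficients sum to 1, the resulting weight is also bounded in [0, 1] under these assumptions.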
After constructing the weighted adjacency matrix $W = [w_{ij}]$, self-loops and symmetric normalization are applied to stabilize graph propagation under heterogeneous node degrees:
$$\hat{A} = \tilde{D}^{-1/2} (W + I) \tilde{D}^{-1/2}$$
where $I$ is the identity matrix, and $\tilde{D}$ is the degree matrix of $W + I$. The normalized adjacency matrix $\hat{A}$ provides a consistent propagation scale and reduces the risk that local anomalies are amplified purely due to node-degree effects.
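The normalization step is small enough to write out directly. The sketch below uses plain nested lists and a hypothetical function name, `normalize_adjacency`; it assumes `W` is symmetric and non-negative with a zero diagonal.

```python
def normalize_adjacency(W):
    """Return D^{-1/2} (W + I) D^{-1/2}, with D the degree matrix of W + I."""
    n = len(W)
    # Add self-loops.
    A = [[W[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Row sums of W + I; each is at least 1 because of the self-loop.
    deg = [sum(row) for row in A]
    inv_sqrt = [d ** -0.5 for d in deg]
    return [[inv_sqrt[i] * A[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


# Usage on a three-station chain 0 - 1 - 2 with unit weights.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
A_hat = normalize_adjacency(W)
```

The self-loop guarantees a strictly positive degree for every station, so the inverse square root is always defined even for an isolated node.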

3.2. Spatio-Temporal Graph Feature Enhancement

3.2.1. Observation Sequence Representation

Event warning depends on both temporal evolution and spatial coordination. To inject graph-structured dependence without altering the tree model itself, spatio-temporal feature enhancement is implemented by learning a low-dimensional embedding from spatio-temporal encoding and augmenting the original observations with this embedding. This design preserves the capacity of boosted trees for nonlinear interactions while allowing spatial dependence to enter the decision stage explicitly.
The feature vector of station $i$ at time $t$ is defined as $x_i(t)$. In this study, the station-wise input feature vector consisted of the filtered composite displacement and hourly precipitation. The framework can also accommodate additional environmental variables, such as groundwater level or soil moisture, when available. A sliding window of length $L$ is used to construct temporal segments aligned with time $t$:
$$X_i(t) = [x_i(t - L + 1), \ldots, x_i(t)]$$
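The window construction can be sketched in a few lines. The function name `sliding_windows` and the toy feature values are hypothetical; each feature vector holds the filtered displacement and hourly precipitation, as in the text.

```python
def sliding_windows(features, L):
    """Map each valid time index t to the window [x(t-L+1), ..., x(t)].

    features: time-ordered list of per-hour feature vectors for one station.
    The first L-1 time steps have no complete window and are skipped.
    """
    return {t: features[t - L + 1: t + 1] for t in range(L - 1, len(features))}


# Usage: four hours of [displacement, precipitation] with a window of three.
x = [[0.0, 0.0], [1.0, 0.1], [2.0, 0.0], [3.0, 0.5]]
windows = sliding_windows(x, L=3)
```

Each window ends at its own time index, so the representation at time $t$ uses only observations available up to $t$.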

3.2.2. Temporal Encoding with GRU

A stable state representation is required to summarize short-window dynamics while suppressing noise-driven fluctuations. GRU employs gating to selectively preserve informative history and attenuate irrelevant variations, which is suitable for monitoring sequences that exhibit both gradual trends and abrupt changes [53,54]. The window sequence is encoded as
$$h_i(t) = f_{GRU}(X_i(t); \theta_{GRU})$$
where $h_i(t) \in \mathbb{R}^{d_h}$ is the last hidden state, $d_h$ is the hidden dimension, and $\theta_{GRU}$ are learnable parameters. Encoding in a time-aligned state space facilitates subsequent spatial aggregation and mitigates the influence of inter-station phase differences.
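For reference, one common formulation of the GRU recurrence over the window steps $k = t - L + 1, \ldots, t$ is written out below. The text does not state which GRU variant or bias convention was used, so the symbols $u_k$, $r_k$, $\tilde{h}_k$, $W_\ast$, $U_\ast$, and $b_\ast$ are assumptions introduced for illustration. The update gate is written as $u_k$ to avoid a clash with the enhanced feature $z_i(t)$, and the station index $i$ is dropped from the hidden states for readability.

```latex
\begin{aligned}
u_k &= \operatorname{sigmoid}\left(W_u x_i(k) + U_u h_{k-1} + b_u\right) \\
r_k &= \operatorname{sigmoid}\left(W_r x_i(k) + U_r h_{k-1} + b_r\right) \\
\tilde{h}_k &= \tanh\left(W_h x_i(k) + U_h (r_k \odot h_{k-1}) + b_h\right) \\
h_k &= (1 - u_k) \odot h_{k-1} + u_k \odot \tilde{h}_k
\end{aligned}
```

Under this convention the recurrence starts from $h_{t-L} = 0$, and the window encoding is the final state, $h_i(t) = h_t$. Some libraries swap the roles of $u_k$ and $1 - u_k$ in the last line; the two forms are equivalent up to relabeling the gate.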

3.2.3. Spatial Aggregation with GCN

Spatial fusion is performed at each time t by propagating temporal states over the fixed station graph. GCN can be interpreted as weighted neighborhood averaging followed by a linear transformation, with normalized adjacency controlling scale under heterogeneous degrees [55]. Using A ^ as propagation weights, spatial aggregation is defined as follows:
$$s_i(t) = \sigma\left( \sum_{j=1}^{N} \hat{A}_{ij} W_s h_j(t) \right)$$
where $W_s$ is a learnable weight matrix, $\sigma(\cdot)$ is the ReLU activation function, and $s_i(t) \in \mathbb{R}^{d_s}$. Since $\hat{A}_{ij}$ encodes geographic proximity, regional consistency, and historical dynamic consistency, the aggregation injects cross-station context into node representations, capturing coordinated evolution patterns and reducing susceptibility to isolated local noise.
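The aggregation rule can be traced with a dependency-free sketch. The function name `gcn_aggregate` and the toy matrices are hypothetical; in practice $W_s$ would be learned in Stage I rather than set by hand.

```python
def gcn_aggregate(A_hat, H, W_s):
    """Compute s_i = ReLU(sum_j A_hat[i][j] * (W_s @ h_j)) for every station i.

    A_hat: N x N normalized adjacency; H: N x d_h matrix whose rows are h_j(t);
    W_s: d_s x d_h weight matrix. Returns an N x d_s matrix of rows s_i(t).
    """
    n, d_s = len(H), len(W_s)
    # Linear transform of every temporal state: row j holds W_s h_j.
    WH = [[sum(w * h for w, h in zip(row, H[j])) for row in W_s]
          for j in range(n)]
    S = []
    for i in range(n):
        # Weighted neighborhood sum using the normalized adjacency.
        agg = [sum(A_hat[i][j] * WH[j][c] for j in range(n)) for c in range(d_s)]
        S.append([max(0.0, v) for v in agg])  # ReLU
    return S


# Usage: two fully connected stations with hand-picked weights.
A_hat = [[0.5, 0.5], [0.5, 0.5]]
H = [[1.0, 0.0], [0.0, 1.0]]
W_s = [[1.0, 0.0], [0.0, -1.0]]
S = gcn_aggregate(A_hat, H, W_s)
```

In this toy case both stations receive the same averaged state, which shows how a single noisy station is diluted by its neighbors.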

3.2.4. Spatio-Temporal Embedding

Temporal and spatial representations are fused into a compact embedding that can be directly consumed by the tree-based classifier. Fusion is implemented by concatenation and a two-layer perceptron:
$$e_i(t) = \psi\left( [h_i(t) \,\Vert\, s_i(t)] \right) \in \mathbb{R}^{d}$$
where $\Vert$ denotes feature-wise concatenation, $\psi(\cdot)$ is a two-layer MLP (Linear–ReLU–Linear), and $d$ is the embedding dimension. The enhanced feature is then formed as follows:
$$z_i(t) = [x_i(t) \,\Vert\, e_i(t)]$$
Here, $x_i(t)$ preserves instantaneous evidence, whereas $e_i(t)$ injects graph-propagated spatio-temporal context, enabling event discrimination to be constrained by both local observations and coordinated behavior across stations.
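The fusion and augmentation steps amount to two concatenations around a small MLP. The sketch below is illustrative only: `linear` and `fuse` are hypothetical helper names, and the weights are hand-picked rather than learned.

```python
def linear(W, b, x):
    """Affine map W x + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def fuse(x, h, s, W1, b1, W2, b2):
    """Build the enhanced feature z = [x || psi([h || s])].

    psi is a two-layer MLP: Linear -> ReLU -> Linear.
    """
    hidden = [max(0.0, v) for v in linear(W1, b1, h + s)]  # h + s concatenates lists
    e = linear(W2, b2, hidden)                             # embedding e_i(t)
    return x + e                                           # z_i(t)


# Usage: 2 raw features, 1-dim temporal and spatial states, 1-dim embedding.
z = fuse(x=[1.0, 2.0], h=[1.0], s=[2.0],
         W1=[[1.0, 1.0], [-1.0, -1.0]], b1=[0.0, 0.0],
         W2=[[2.0, 5.0]], b2=[1.0])
```

The output length is the raw feature count plus the embedding dimension $d$, which is the fixed-length vector format the tree model expects.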

3.3. GBDT Decision Layer

Sample-level landslide warning is formulated as binary classification. Let $l_i(t)$ be the label provided by event ground truth. The enhanced feature $z_i(t)$ is fed into the gradient boosting decision trees to output event probabilities. Because $z_i(t)$ combines continuous variables and discrete encodings and may exhibit high-order nonlinear interactions, boosted trees are appropriate due to their additive residual-fitting mechanism and their ability to model complex feature interactions under explicit regularization.
XGBoost is used as the GBDT implementation [56,57]. The predicted probability is
$$\hat{p}_i(t) = \operatorname{sigmoid}\left( \sum_{m=1}^{M} f_m(z_i(t)) \right)$$
where $M$ is the number of trees, $f_m(\cdot)$ is the output of the $m$-th tree, and $\hat{p}_i(t) \in (0, 1)$. The objective is the log-loss with tree-complexity regularization:
$$L_{cls} = -\sum_{(i,t)} \left[ l_i(t) \log \hat{p}_i(t) + (1 - l_i(t)) \log (1 - \hat{p}_i(t)) \right] + \sum_{m=1}^{M} \Omega(f_m)$$
where $\Omega(f_m)$ constrains depth, number of leaves, and leaf weights. A fixed decision threshold $\tau$ was used to convert the predicted probability into a binary warning $\hat{l}_i(t) = \mathbb{I}(\hat{p}_i(t) \ge \tau)$, where $\mathbb{I}(\cdot)$ is the indicator function. In this study, $\tau$ was selected on the validation set through a grid search over candidate values from 0.05 to 0.95 with an interval of 0.05. For each candidate threshold, the F1-score and false-alarm ratio were computed from the validation warning sequence. The threshold yielding the highest validation F1-score was selected, and in the case of similar F1-scores, the threshold with a lower false-alarm ratio was preferred. The selected $\tau$ was then fixed and applied unchanged to the test set, cross-zone experiments, and event-level evaluation. With this decision layer, the spatio-temporal context encoded in $e_i(t)$ participates in the final classification through structured feature augmentation, strengthening discrimination of coordinated patterns across stations.
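The threshold selection rule can be sketched as a small grid search. The text does not quantify "similar F1-scores", so the tolerance `tol` below is an assumption, as are the function names. The false-alarm ratio follows the FP_rate convention of this paper (false alarms divided by all evaluated samples).

```python
def f1_and_fp_rate(labels, probs, tau):
    """F1-score and false-alarm ratio of the warnings I(p >= tau)."""
    tp = fp = fn = 0
    for y, p in zip(labels, probs):
        alarm = p >= tau
        if alarm and y == 1:
            tp += 1
        elif alarm and y == 0:
            fp += 1
        elif not alarm and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, fp / len(labels)


def select_threshold(labels, probs, tol=1e-3):
    """Pick tau from 0.05..0.95 (step 0.05) on validation data.

    Highest F1 wins; among thresholds within `tol` of the best F1 (ASSUMED
    meaning of 'similar'), the lowest false-alarm ratio is preferred.
    """
    candidates = [k / 100 for k in range(5, 100, 5)]
    scored = [(tau, *f1_and_fp_rate(labels, probs, tau)) for tau in candidates]
    best_f1 = max(f1 for _, f1, _ in scored)
    near_best = [row for row in scored if best_f1 - row[1] <= tol]
    return min(near_best, key=lambda row: row[2])[0]


# Usage on a toy validation set.
val_labels = [0, 0, 0, 1, 1]
val_probs = [0.1, 0.2, 0.6, 0.7, 0.9]
tau = select_threshold(val_labels, val_probs)
```

When several thresholds tie on both criteria, this sketch returns the smallest one; the text does not specify a rule for that case.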

3.4. Model Training and Prediction

The overall training and prediction procedure is organized into two stages, so that spatio-temporal representation learning and final decision learning are functionally decoupled while maintaining consistent data flow and feature format between the two components [58].
In Stage I, the spatio-temporal encoder consists of the GRU, the GCN, and the fusion MLP. For each sample, a window $X_i(t)$ is constructed by sliding over the time axis, and the fixed graph $\hat{A}$ is used for spatial propagation at the aligned time index $t$. The encoder outputs the embedding $e_i(t)$, which is concatenated with $x_i(t)$ to form $z_i(t)$. To ensure that the embedding is informative for event discrimination, a lightweight binary classification head is attached during Stage I training, and supervision is provided by the event label $l_i(t)$. The encoder is thereby optimized toward label-consistent spatio-temporal dependency representations. After Stage I converges, the auxiliary classification head is removed. The encoder parameters $\{\theta_{GRU}, W_s, \theta_\psi\}$ are frozen and subsequently used as a deterministic mapping from raw windows to embeddings.
In Stage II, all samples required for training and inference are transformed into enhanced features $z_i(t)$ by the frozen encoder. XGBoost is then trained on $z_i(t)$ to model the mapping from spatio-temporally augmented features to the event probability $\hat{p}_i(t)$. The additive boosting procedure progressively fits residuals, while $\Omega(f_m)$ regularizes tree complexity, providing a stable learning process in the augmented feature space.
During inference, the same window construction and encoding procedure is applied at each station and time index: $X_i(t)$ is built from the observation sequence, the frozen encoder produces $e_i(t)$, the enhanced feature $z_i(t)$ is formed, and XGBoost outputs $\hat{p}_i(t)$ and the binary warning $\hat{l}_i(t)$. Consequently, a time-indexed warning sequence $\hat{l}_i(t)$ is obtained for each station, which can be further organized as station-wise warning timelines consistent with the monitoring network representation. The separation between Stage I representation learning and Stage II decision learning keeps spatio-temporal dependency encoding explicit and fixed during tree training, which improves controllability and reduces the risk of optimization interference between the two components.

3.5. Evaluation Metrics

To evaluate warning quality at both the sample and event levels, metrics were computed from the time-indexed binary warning sequence and the event intervals obtained after thresholding the predicted probabilities.

3.5.1. Sample-Level Metrics

At the sample level, each time step is treated as one binary decision. True positives (TP) are alarmed samples that fall within true-event periods; false positives (FP) are alarmed samples outside true events; false negatives (FN) are event samples without alarms; and true negatives (TN) are non-event samples correctly not alarmed [59,60]. Based on these counts, the following metrics are reported:
A c c u r a c y = T P + T N T P + T N + F P + F N
P r e c i s i o n = T P T P + F P
R e c a l l = T P T P + F N
F 1 = 2 P r e c i s i o n R e c a l l P r e c i s i o n + R e c a l l
The sample-level false-alarm ratio used in spatial analyses is defined as the fraction of all evaluated samples that are false alarms [61]:
$$\mathrm{FP\_rate} = \frac{FP}{TP + TN + FP + FN}$$
To assess ranking quality independent of a fixed threshold, ROC-AUC is computed as the area under the ROC curve obtained by evaluating model performance over all possible probability thresholds. Average Precision (AP) is computed as the area under the precision–recall curve [62].
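As a concrete illustration of the sample-level definitions, the following sketch computes the confusion counts and the derived metrics from two binary sequences. The function name and the zero-division convention (returning 0.0 when a denominator is empty) are assumptions made for this example.

```python
from typing import Dict, Sequence


def sample_level_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute Accuracy, Precision, Recall, F1 and FP_rate from binary sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    total = tp + tn + fp + fn
    # Assumed convention: a metric with an empty denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # FP_rate as defined in the text: false alarms over ALL evaluated samples.
        "fp_rate": fp / total if total else 0.0,
    }


metrics = sample_level_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
)
```

Note that FP_rate, as defined here, divides by all evaluated samples, so it is not the conventional false-positive rate FP/(FP + TN).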

3.5.2. Event-Level Metrics

For event-level evaluation, consecutive alarm samples are merged into predicted event intervals, and consecutive true-event samples are merged into ground-truth event intervals. Let $\hat{\mathcal{E}}$ denote one predicted event interval and $\mathcal{E}$ denote its matched ground-truth event interval. Each interval is represented as a set of discrete time indices (e.g., hourly indices) contained in that interval. Event matching is performed in a one-to-one manner by maximizing temporal overlap, and event-level $\mathrm{Precision}_e$, $\mathrm{Recall}_e$, and $F1_e$ are computed in the same functional forms as the sample-level definitions but using event counts (detected events, false event alarms, and missed events).
For each matched predicted–true event pair $(\hat{\mathcal{E}}, \mathcal{E})$, the sample-level intersection-over-union (IoU) [63] is defined as
$$\mathrm{IoU}(\hat{\mathcal{E}}, \mathcal{E}) = \frac{|\hat{\mathcal{E}} \cap \mathcal{E}|}{|\hat{\mathcal{E}} \cup \mathcal{E}|}$$
In this equation, $\cap$ and $\cup$ denote the set intersection and union over discrete time indices, and $|\cdot|$ denotes the cardinality (i.e., the number of time indices/samples) of the corresponding set. The station-wise SampleIoU_Mean is computed as the mean IoU over all matched event pairs at that station.
The lead time measures warning timeliness for a matched event pair and is defined as
$$\mathrm{Lead}(\hat{\mathcal{E}}, \mathcal{E}) = \left(t_{\mathcal{E}}^{start} - t_{\hat{\mathcal{E}}}^{start}\right)\Delta t$$
Here, $t_{\mathcal{E}}^{start}$ is the start time index of the ground-truth event interval $\mathcal{E}$, $t_{\hat{\mathcal{E}}}^{start}$ is the start time index of the predicted interval $\hat{\mathcal{E}}$, and $\Delta t$ is the sampling interval (set to 1 h for hourly sampling). The station-wise LeadMean is the mean of $\mathrm{Lead}(\cdot)$ over all matched event pairs at that station. Positive lead indicates earlier warning than the true onset, while negative lead indicates delayed triggering.
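The event-level quantities can be sketched as follows. The interval merging, IoU, and lead-time functions follow the definitions above directly. The matching routine is only one possible reading of "one-to-one matching by maximizing temporal overlap": it uses a greedy assignment in order of decreasing overlap, which is an assumption of this example rather than the procedure confirmed by the paper.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) time indices


def to_intervals(flags: Sequence[int]) -> List[Interval]:
    """Merge runs of consecutive 1s into inclusive (start, end) index pairs."""
    intervals, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(flags) - 1))
    return intervals


def overlap(a: Interval, b: Interval) -> int:
    """Number of shared time indices between two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def interval_iou(pred: Interval, true: Interval) -> float:
    inter = overlap(pred, true)
    union = (pred[1] - pred[0] + 1) + (true[1] - true[0] + 1) - inter
    return inter / union


def lead_time(pred: Interval, true: Interval, dt_hours: float = 1.0) -> float:
    """Positive when the predicted interval starts before the true onset."""
    return (true[0] - pred[0]) * dt_hours


def match_events(pred: List[Interval], true: List[Interval]) -> List[Tuple[Interval, Interval]]:
    """Greedy one-to-one matching by decreasing overlap (assumed strategy)."""
    candidates = [(overlap(p, t), pi, ti)
                  for pi, p in enumerate(pred) for ti, t in enumerate(true)
                  if overlap(p, t) > 0]
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used_pred, used_true, pairs = set(), set(), []
    for _, pi, ti in candidates:
        if pi not in used_pred and ti not in used_true:
            used_pred.add(pi)
            used_true.add(ti)
            pairs.append((pred[pi], true[ti]))
    return pairs


true_events = to_intervals([0, 0, 0, 1, 1, 1, 1, 0, 0, 0])
pred_events = to_intervals([0, 1, 1, 1, 1, 1, 0, 0, 0, 0])
pairs = match_events(pred_events, true_events)
ious = [interval_iou(p, t) for p, t in pairs]
leads = [lead_time(p, t) for p, t in pairs]
```

In this toy case the alarm starts two samples before the labeled onset and ends one sample early, giving an IoU of 0.5 and a lead of 2 h at hourly sampling.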

4. Results

4.1. Hyperparameter Configuration and Model Robustness

This section presents the model configuration and robustness evaluation before the main performance comparison. The hyperparameter settings are first summarized for all methods to ensure a consistent experimental basis. The effects of the embedding dimension and kNN graph parameter are then examined through sensitivity analysis, followed by an evaluation of training stability and cross-zone generalization to assess the robustness of GA-GBDT under heterogeneous monitoring conditions.

4.1.1. Hyperparameter Settings

To maintain a consistent experimental configuration across all methods, a single fixed set of core hyperparameters was adopted for each model (Table 2), balancing model capacity and regularization to ensure stable learning under noisy and imbalanced GNSS event-warning data.
For the criterion-based baseline, the TAM was configured with a tangent-angle threshold of 85° and a persistence constraint requiring at least two consecutive samples to satisfy the threshold before issuing an event warning. For learning-based baselines, the RF used 500 trees with a moderate maximum depth and class reweighting to address label imbalance. The LSTM network employed a 32 h input window, two recurrent layers with 64 hidden units and 0.2 dropout, trained for 60 epochs using Adam (learning rate $1 \times 10^{-3}$) with early stopping. The GBDT baseline was implemented with XGBoost and used a standard regularized boosting configuration (max_depth = 6, eta = 0.05, n_estimators = 600, subsample = 0.8, colsample_bytree = 0.8) together with imbalance-aware weighting.
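The persistence constraint of the criterion-based baseline can be illustrated with a short sketch. It takes a precomputed tangent-angle series as input, since the angle computation itself is not restated here. Two details are assumptions of this example: a sample satisfies the criterion when its angle is greater than or equal to the threshold, and the warning is raised from the sample at which the run length first reaches the required count.

```python
from typing import List, Sequence


def persistent_alarm(
    angles_deg: Sequence[float],
    threshold_deg: float = 85.0,   # tangent-angle threshold from the text
    min_consecutive: int = 2,      # persistence constraint from the text
) -> List[int]:
    """Flag a sample once the threshold has held for `min_consecutive` samples in a row."""
    alarms, run = [], 0
    for angle in angles_deg:
        run = run + 1 if angle >= threshold_deg else 0
        alarms.append(1 if run >= min_consecutive else 0)
    return alarms


alarms = persistent_alarm([80.0, 86.0, 84.0, 86.0, 87.0, 88.0, 70.0])
```

The isolated exceedance at the second sample is suppressed, whereas the sustained exceedance later in the series triggers a warning from its second sample onward.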
For graph-based learning, the standalone GNN baseline adopted a k-nearest neighbor graph with k = 8 and a two-layer GCN with 64 hidden units and 0.2 dropout, trained with the same optimization schedule as the sequence model. For the proposed GA-GBDT framework, a station-relation graph was built using kNN with k = 8. After kNN graph construction, edges with historical dynamic correlation lower than 0.3 were pruned. The correlation was computed using a 72 h sliding window over the training-period displacement series. The spatio-temporal encoder used a GCN-based spatial aggregator and a GRU-based temporal module with an embedding dimension d = 32, which corresponds to the best-performing range indicated by the hyperparameter sensitivity curves. The learned embeddings were then fused with site-wise input variables and fed into an XGBoost decision layer using the same boosting configuration as the GBDT baseline (max_depth = 6, eta = 0.05, n_estimators = 600, subsample = 0.8, colsample_bytree = 0.8), with early stopping and imbalance-aware class weighting to stabilize training and reduce false alarms. To improve reproducibility, the experiments followed a consistent protocol with fixed data splitting, preprocessing, graph construction, and model configurations. Robustness was further examined in Section 4 through sensitivity analysis, training-stability assessment, cross-zone transfer tests, zone-wise comparisons, and representative-station metrics.
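The graph construction step (kNN followed by correlation-based pruning) can be sketched as below. The sketch assumes planar station coordinates, a precomputed station-by-station correlation matrix, and symmetrization by taking the union of each station's neighbor list. How the 72 h sliding-window correlations are aggregated into a single value per station pair is not specified here, so `corr` is treated as a given input.

```python
import math
from typing import List, Sequence, Tuple


def knn_graph(
    coords: Sequence[Tuple[float, float]],
    corr: Sequence[Sequence[float]],
    k: int = 8,             # neighbor count used in the text
    min_corr: float = 0.3,  # pruning threshold used in the text
) -> List[Tuple[int, int]]:
    """Return undirected edges (i, j), i < j, kept after correlation pruning."""
    n = len(coords)
    edges = set()
    for i in range(n):
        # Sort the other stations by Euclidean distance (ties broken by index).
        neighbors = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        for _, j in neighbors[:k]:
            edges.add((min(i, j), max(i, j)))  # union symmetrization (assumption)
    # Prune edges whose historical correlation is below the threshold.
    return sorted(e for e in edges if corr[e[0]][e[1]] >= min_corr)


# Toy example with four stations on a line and k = 1 (illustrative values only).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
corr = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.5],
    [0.0, 0.1, 0.5, 1.0],
]
edges = knn_graph(coords, corr, k=1, min_corr=0.3)
```

Here the edge between stations 1 and 2 is geometrically close but weakly correlated, so it is removed, while the more distant but better correlated pair (2, 3) is retained.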

4.1.2. Preprocessing Ablation Analysis

To examine the influence of preprocessing on downstream warning performance, a preprocessing ablation test was conducted by removing SSA reconstruction, UKF smoothing, and outlier removal separately while keeping the GA-GBDT model configuration unchanged. In the setting without SSA reconstruction, missing segments were not left untreated but were filled using a simpler interpolation strategy, so that the comparison reflects the contribution of SSA-based reconstruction to temporal-continuity preservation. As shown in Table 3, the full preprocessing pipeline achieves the best overall performance. Removing SSA reconstruction mainly reduces Recall and AP, indicating that inadequate reconstruction of missing segments can weaken the continuity of deformation evolution and reduce the model’s ability to capture positive warning samples. Removing UKF smoothing leads to lower Precision and a higher FP_rate, suggesting that short-term measurement fluctuations are more likely to be interpreted as abnormal deformation when noise suppression is weakened. Similarly, removing outlier removal increases false alarms, because isolated abnormal observations may disturb probability ranking and decision boundaries. Overall, the preprocessing pipeline improves data continuity and noise robustness, while the performance degradation under each ablation setting remains moderate. This indicates that GA-GBDT benefits from preprocessing but is not solely dependent on any single preprocessing step.

4.1.3. Sensitivity Analysis of Hyperparameters

Since the GA-GBDT performance is directly affected by the capacity of spatio-temporal embeddings and the density of the station-relation graph, the embedding dimension d and the kNN graph parameter k are treated as two key control variables that balance representation power and noise propagation. Therefore, with all other training and decision-layer settings fixed, this subsection evaluates the sensitivity of GA-GBDT to d and k, characterizing the performance trends and determining the values used in the subsequent experiments.
Figure 3a shows a non-monotonic response to the embedding dimension d, indicating that larger embeddings are not always beneficial. When d increases from 4 to 32, the event-level F1 improves from 0.688 to 0.860 (an absolute gain of +0.173), while the sample-level false-positive rate (FP_rate) drops from 0.0246 to 0.0108 (an absolute decrease of −0.0139, about 56% relative reduction). This demonstrates that moderate embedding capacity substantially enhances spatio-temporal representation and simultaneously strengthens false-alarm control. However, further enlarging the embedding leads to performance degradation: at d = 48 and d = 64, F1 decreases to 0.846 and 0.836, and FP_rate increases to 0.0122 and 0.0141, respectively. Compared with d = 32, d = 64 yields an F1 drop of about 0.024 (~2.8%) and an FP_rate increase of about 0.00326 (~30%). These trends suggest that overly high-dimensional embeddings may introduce redundant or noise-amplifying features, weakening the stability of the decision boundary. Overall, d = 24–32 provides a robust operating region, with d = 32 offering the best balance between F1 and FP_rate.
Figure 3b confirms that graph construction is a critical factor for event warning. Increasing the kNN neighbor size from k = 3 to k = 8 boosts F1 from 0.763 to 0.871 (absolute +0.108, ~14%) and reduces FP_rate from 0.0183 to 0.0112 (absolute −0.00711, ~39%). This indicates that incorporating an appropriate level of cross-station coupling improves consistency in event detection and markedly suppresses false alarms. When k increases to 12, F1 slightly decreases to 0.860, whereas FP_rate reaches the minimum (0.0110), reflecting a trade-off where denser connectivity can further reduce false positives but may mildly blur discriminative patterns. At k = 16, F1 drops to 0.834, and FP_rate rises to 0.0143, suggesting that excessive edges may introduce irrelevant neighbors and cause over-smoothing. Consequently, k = 8–12 is recommended as a stable range, where k = 8 favors peak F1 and k = 12 favors minimum FP_rate.
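The absolute and relative changes quoted in this sensitivity analysis follow from simple arithmetic, sketched below for the k = 3 to k = 8 comparison. Because the inputs are the rounded values printed in the text, recomputed figures may differ from the reported ones in the last digit.

```python
from typing import Tuple


def change(before: float, after: float) -> Tuple[float, float]:
    """Return (absolute change, relative change with respect to `before`)."""
    absolute = after - before
    return absolute, absolute / before


# Event-level F1 when k increases from 3 to 8 (rounded values from the text).
f1_abs, f1_rel = change(0.763, 0.871)      # about +0.108 and +14.2%

# Sample-level FP_rate for the same comparison.
fp_abs, fp_rel = change(0.0183, 0.0112)    # about -0.0071 and -38.8%
```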

4.1.4. Training Stability and Cross-Zone Generalization

In Figure 4a, the curves show the training dynamics of the spatio-temporal feature encoding process. Train/Val loss is the binary cross-entropy loss of the auxiliary classification objective. The training loss decreases rapidly in early epochs and reaches 0.116 at the end. The validation loss attains its minimum of 0.161 at epoch 39 and then gradually increases to 0.206, suggesting mild overfitting in late training. The AUC trajectories are consistent with this behavior. Validation AUC peaks at 0.916 at epoch 45 and ends at 0.890, whereas training AUC continues to rise and reaches 0.958. The final train–validation AUC gap is about 0.068, implying that early stopping or stronger regularization could further suppress late-stage overfitting while preserving discriminative power.
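The early-stopping remedy mentioned above can be sketched with a patience-based rule on the validation loss. The patience value, the `min_delta` tolerance, and the loss values in the example are hypothetical and are not the settings or results of this study.

```python
from typing import Sequence, Tuple


def early_stop_epoch(
    val_losses: Sequence[float],
    patience: int = 5,        # hypothetical patience, not taken from the paper
    min_delta: float = 0.0,   # minimum improvement that resets the counter
) -> Tuple[int, float]:
    """Return (best epoch, best validation loss) under patience-based early stopping."""
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` consecutive epochs
    return best_epoch, best_loss


# Illustrative loss curve: the minimum at epoch 2 is followed by a slow rise.
losses = [0.50, 0.40, 0.30, 0.35, 0.36, 0.20]
stopped = early_stop_epoch(losses, patience=2)
patient = early_stop_epoch(losses, patience=5)
```

With a short patience, training stops at the first minimum and the later improvement is never seen, which illustrates why the patience value trades overfitting control against the risk of stopping too early.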
The cross-zone generalization matrix in Figure 4b demonstrates non-trivial spatial transferability. The diagonal entries, representing training and testing within the same zone, have a mean F1 of 0.837. Off-diagonal entries, representing cross-zone transfer, achieve a mean F1 of 0.811, with an average drop of 0.026. This confirms that spatial domain shift introduces performance degradation but does not collapse detection capability. The worst transfer occurs when training on GC and testing on GE, yielding F1 = 0.786, while the best cross-zone transfer is obtained by training on GD and testing on GB with F1 = 0.834. From the test-zone perspective, GE has the lowest average F1 across different training zones (0.801), whereas GF shows the highest (0.824), indicating that GE is the most difficult zone to generalize to, likely due to stronger local heterogeneity. Overall, the matrix provides direct evidence that the proposed method is not overly dependent on a single zone’s statistics and maintains a relatively strong level of cross-zone applicability.
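The summary statistics of a transfer matrix (in-zone mean, cross-zone mean, and their difference) can be computed as in the sketch below. Rows are assumed to index the training zone and columns the test zone, and the 2 × 2 matrix is a placeholder rather than the values behind Figure 4b.

```python
from typing import Sequence, Tuple


def transfer_summary(f1: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Return (mean diagonal F1, mean off-diagonal F1, average drop)."""
    n = len(f1)
    diagonal = [f1[i][i] for i in range(n)]
    off_diagonal = [f1[i][j] for i in range(n) for j in range(n) if i != j]
    mean_diag = sum(diagonal) / len(diagonal)
    mean_off = sum(off_diagonal) / len(off_diagonal)
    return mean_diag, mean_off, mean_diag - mean_off


# Placeholder matrix: rows = training zone, columns = test zone (assumption).
toy_matrix = [
    [0.90, 0.80],
    [0.70, 0.80],
]
mean_diag, mean_off, drop = transfer_summary(toy_matrix)
```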

4.2. Sample-Level Event Warning Performance Comparison

To assess the sample-level landslide event warning capability, a comparative evaluation was conducted across six methods using overall metrics together with PR and ROC analyses, as shown in Figure 5.
Figure 5a summarizes the sample-level event warning performance of six methods. Overall, GA-GBDT achieves the best or near-saturated performance across all reported metrics, with Accuracy = 0.9779, Precision = 0.8504, Recall = 0.7999, F1-Score = 0.8244, AUC = 0.9898, and AP = 0.9467. Compared with the strongest competing baseline for each metric, GA-GBDT improves Accuracy, Precision, Recall, F1-Score, AUC, and AP by 0.0048, 0.0302, 0.0502, 0.0410, 0.0113, and 0.0324, corresponding to relative gains of approximately 0.49%, 3.68%, 6.70%, 5.23%, 1.15%, and 3.54%, respectively. These results indicate that GA-GBDT not only increases overall correctness (Accuracy) but also delivers a more favorable balance between event detection capability (Recall) and false-alarm control (Precision), leading to a higher F1-Score.
Among learning-based baselines, the deep models (GNN and LSTM) perform comparably, with GNN (Precision = 0.8202, Recall = 0.7497, F1 = 0.7834) outperforming LSTM (F1 = 0.7635). For tree-based baselines, plain GBDT remains competitive in ranking-oriented metrics (AUC = 0.9785 and AP = 0.9143) but is notably weaker than GA-GBDT in Precision and Recall (Precision = 0.7501, Recall = 0.6998), suggesting that relying solely on site-wise vector inputs limits both separability and positive-class capture. RF yields a further reduction in F1 (0.6993), implying constrained discriminative capacity under complex nonlinearity and noise. The criterion-based TAM performs worst overall (Precision = 0.55, F1 = 0.5739), with AUC = 0.8909 and AP = 0.6715, highlighting clear limitations in probability ranking and positive detection at the sample level.
The PR and ROC curves in Figure 5b,c further support these observations. GA-GBDT maintains higher precision over a broad recall range and achieves the highest AP of 0.9467, exceeding GNN (0.8763), LSTM (0.8599), and plain GBDT (0.9143), which indicates superior ranking quality for the positive class under class imbalance. In terms of ROC, GA-GBDT reaches an AUC of 0.9898 with a curve closer to the top-left corner, demonstrating strong discrimination across decision thresholds with high true-positive rates and low false-positive rates. In contrast, TAM shows a visibly smaller margin over the random baseline (AUC = 0.8909), reflecting weaker separability. Collectively, these results demonstrate that GA-GBDT provides stronger overall discrimination and more stable threshold behavior for sample-level landslide event warning.
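For readers who wish to reproduce threshold-free metrics of this kind, the sketch below computes ROC-AUC through its pairwise-ranking interpretation and AP as a step-wise sum of precision weighted by recall increments. This is one common estimator of the area under the precision–recall curve; it is not confirmed to be the exact implementation used in the study.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Sum of precision at each distinct threshold, weighted by the recall increment."""
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    ranked = sorted(zip(scores, y_true), key=lambda r: -r[0])
    ap, tp, fp, i = 0.0, 0, 0, 0
    while i < len(ranked):
        tp_before, j = tp, i
        # Samples with identical scores enter at the same threshold.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        ap += (tp - tp_before) / n_pos * (tp / (tp + fp))
        i = j
    return ap


labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, probs)
ap = average_precision(labels, probs)
```

The pairwise AUC loop is quadratic in the number of samples and is intended for illustration only; a rank-based formulation would be preferable for long hourly records.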

4.3. Analysis of Landslide Event Recognition Results at Representative Stations

To interpret station-scale event recognition behavior under diverse deformation patterns, representative-station comparisons were conducted by juxtaposing the displacement series with ground-truth event windows and the event detections produced by different methods, as shown in Figure 6. To further support the visual interpretation, quantitative metrics for these representative stations are summarized in Table 4.
Figure 6 visualizes the displacement time series (blue curves), ground-truth event windows (vertical dashed lines), and detected event intervals (colored bars) for six representative stations (GA10, GB28, GC45, GD64, GE77, and GF86). The selected cases cover diverse event-associated displacement responses. For instance, GA10 (Figure 6a) shows a step-like acceleration within the true window, whereas GB28 (Figure 6b) is characterized by an abrupt drop. GC45 (Figure 6c) and GD64 (Figure 6d) contain multiple event episodes linked to stage-wise trend reorganization. GE77 (Figure 6e) exhibits a complex drop-and-recovery pattern, and GF86 (Figure 6f) corresponds to a long-duration event window with sustained growth. These cases collectively represent abrupt-change, multi-episode, and long-duration event scenarios.
Across the representative stations, GA-GBDT produces event intervals with more complete temporal coverage and better agreement with the labeled event windows. This is supported by the quantitative results in Table 4, where GA-GBDT obtains the highest Accuracy, Precision, Recall, F1, and IoU values. In addition to the visual comparison, these metrics provide quantitative evidence from both sample-level classification and interval-overlap perspectives. In particular, its Recall reaches 0.932 ± 0.033, markedly higher than GNN (0.670 ± 0.153) and LSTM (0.660 ± 0.202), indicating more complete detection of event samples. Its IoU reaches 0.940 ± 0.033, also higher than GNN (0.667 ± 0.157) and LSTM (0.658 ± 0.206), confirming improved temporal overlap with the true event intervals. Meanwhile, the higher Accuracy, Precision, and F1 values indicate that this improvement is due not only to wider event coverage but also to a better overall balance between correct warning samples and false detections. The FP_rate values remain at a low level for all learning-based methods, suggesting that the improved event coverage of GA-GBDT does not lead to excessive false alarms in these representative cases.
In abrupt-change cases (GA10 in Figure 6a and GB28 in Figure 6b), detections are more concentrated on the primary abnormal phase with fewer within-window breaks. In multi-episode cases (GC45 in Figure 6c and GD64 in Figure 6d), the method more consistently covers each true episode while maintaining separations between them. In complex and long-duration cases (GE77 in Figure 6e and GF86 in Figure 6f), detections tend to remain more continuous, with improved persistence over extended windows. By comparison, criterion-based baselines more frequently show boundary offsets or fragmented detections, while some single-model learning baselines insufficiently cover or intermittently detect multi-episode and long-duration events.
Overall, the representative-station visualizations and the quantitative metrics in Table 4 jointly show that GA-GBDT yields more stable event-interval characterization under heterogeneous displacement responses.

4.4. Spatial Patterns of Warning Metrics

To characterize spatial variability in warning performance, station-wise LeadMean, SampleIoU, and FP_rate were aggregated and mapped onto the basemap (Figure 7). Stations form several clustered groups across the study area, and coherent color aggregation within the same groups is repeatedly observed. This pattern suggests that warning performance is shaped not only by model design but also by local terrain context, anthropogenic disturbance, and event-specific temporal signatures. To further support the interpretation of these spatial patterns, Table 5 summarizes the station-wise warning metrics of the six methods using mean ± standard deviation.
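A mean ± standard deviation summary of station-wise metrics can be produced as in the sketch below. Whether the study uses the sample or the population standard deviation is not stated; the sample form (n − 1 denominator) is assumed here, and the input values are placeholders.

```python
import statistics
from typing import Sequence


def mean_std(values: Sequence[float]) -> str:
    """Format station-wise values as 'mean ± sample standard deviation'."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumption)
    return f"{mean:.2f} ± {std:.2f}"


# Placeholder station-wise lead times in hours (not values from Table 5).
summary = mean_std([4.0, 6.0, 8.0])
```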
For LeadMean, GA-GBDT exhibits the most prominent high-lead pattern, with contiguous high-value points appearing across multiple clusters and frequent occurrences of the >6 h class in the eastern and central parts of the study area (Figure 7a1–f1). This visual pattern is consistent with Table 5, where GA-GBDT achieves the highest mean LeadMean of 5.85 ± 1.13 h, corresponding to a stable multi-hour lead window of approximately 5.4–6.1 h across zones. This behavior indicates that graph-augmented spatio-temporal learning strengthens early triggering while also implying a more sensitive triggering tendency that shifts the warning window forward. In contrast, RF and GBDT show a more conservative and spatially uniform lead distribution, with high-lead points emerging only sporadically. Such behavior is consistent with tree-based models that favor stable discriminative features and tend to trigger when deformation patterns become more definitive. TAM shows low or negative LeadMean values in several station clusters, indicating delayed triggering or the absence of advance warning. This may be caused by the sensitivity of fixed tangent-angle thresholds to station-specific fluctuations and gradual transition phases, which makes event onset difficult to detect in time.
For SampleIoU, the spatial maps show that the learning-based methods generally achieve higher interval-overlap levels than the criterion-based TAM, while GA-GBDT displays a larger proportion of medium-to-high IoU classes across the monitored station clusters (Figure 7a2–f2). This spatial concentration implies that events in those areas may exhibit more regular durations and clearer temporal envelopes, facilitating overlap-based alignment. GNN shows relatively stable IoU performance, but its high IoU points are less widespread than those of GA-GBDT. The quantitative summary in Table 5 further confirms this pattern: GA-GBDT obtains the highest mean SampleIoU of 0.933 ± 0.084, followed by GNN and LSTM. This indicates that the earlier triggering of GA-GBDT does not substantially weaken interval agreement; instead, it maintains strong temporal overlap with the labeled event windows at the station-wise scale.
For FP_rate, RF produces a relatively clean spatial pattern, with high false-alarm classes rarely observed and low FP_rate levels more prevalent in the western and northwestern clusters (Figure 7a3–f3). This result indicates that RF maintains robust specificity even under rugged terrain contexts. GBDT and LSTM also yield generally low FP_rate, although localized increases appear in a few central and southeastern clusters, potentially associated with short-term non-landslide fluctuations induced by local activities. TAM shows an elevated FP_rate more frequently in southwestern clusters, consistent with fixed thresholds being more easily activated by background variability. GA-GBDT exhibits localized FP hotspots in the northeastern and central clusters, which is compatible with its high-lead behavior, where earlier triggering and broader warning windows can increase overlap with non-event variability and consequently raise false-alarm rates. Nevertheless, Table 5 shows that the mean FP_rate of GA-GBDT remains low at 0.0172 ± 0.0068, which is comparable to or slightly lower than those of the other methods. Overall, Figure 7 and Table 5 jointly indicate that GA-GBDT achieves the clearest lead-time advantage while maintaining strong interval overlap and a low false-alarm level across the monitoring network.

4.5. Zone-Wise Spatial Heterogeneity and Robustness

To assess whether event-warning performance remains consistent under spatial heterogeneity, zone-wise distributions of event-level F1, sample-level FP_rate, and mean lead time were examined across the monitoring network in Figure 8.
Figure 8a compares the event-level F1 distributions across six zones (GA–GF). A consistent ranking is observed in every zone, where GA-GBDT achieves the highest median F1 throughout and the improvement is reflected by an overall upward shift of the box rather than a few outliers. The median F1 of GA-GBDT ranges from 0.692 to 0.748 (0.692 in GB, 0.698 in GD, and 0.748 in GF), yielding an across-zone median range of 0.057, which indicates relatively stable event-warning capability under spatial heterogeneity. Compared with the strongest baseline within each zone (typically the GNN-based model), GA-GBDT improves the median F1 by approximately 0.031–0.093, with the largest gain observed in GE (about +0.093). This suggests that the graph-augmented representation is particularly beneficial in more challenging zones. In contrast, the criterion-based TAM remains substantially lower (median F1 roughly 0.41–0.47) and shows more pronounced zone-dependent degradation.
Figure 8b reports the sample-level false-positive rate FP_rate. Overall, GA-GBDT delivers lower and more concentrated FP_rate distributions in most zones, with a median FP_rate of around 0.0053–0.0081 (0.00526 in GB, 0.00542 in GF, and 0.00810 in GE). Relative to the lowest-FP baseline in each zone, GA-GBDT reduces FP_rate markedly in some areas. For example, in GB, the median FP_rate decreases from 0.00829 (best baseline) to 0.00526 (GA-GBDT), an absolute reduction of 0.00304. In GF, it decreases from 0.00761 to 0.00542, an absolute reduction of 0.00219. A minor exception occurs in GA, where GA-GBDT exhibits a slightly higher median FP_rate than GNN (0.00675 vs. 0.00596), yet the difference is small (about $7.9 \times 10^{-4}$), and the overall FP_rate remains low. Cross-zone inspection also shows that GD and GE tend to have a higher FP_rate for multiple methods, implying that these zones may be characterized by more complex deformation backgrounds or stronger noise contamination that elevates false alarms at the network level.
Figure 8c evaluates warning timeliness using the mean lead time (LeadMean). GA-GBDT achieves median LeadMean values of 5.36–6.13 h (5.36 h in GD and 6.13 h in GE), with a small across-zone range of 0.77 h, demonstrating strong spatial robustness. Compared to the best lead-time baseline in each zone (consistently the GNN model), GA-GBDT increases the median lead time by approximately 2.39–3.57 h. This provides a stable, multi-hour operational window across heterogeneous zones. Importantly, the lead-time improvement is not obtained at the expense of false alarms, since GA-GBDT simultaneously maintains low FP_rate while improving F1 and LeadMean.
Overall, the three subplots confirm that spatial heterogeneity exists (e.g., lower F1 in GB/GD and a higher FP_rate in GD/GE), but GA-GBDT remains consistently superior or near-superior across all zones.
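The zone-wise medians and the across-zone range used in this comparison can be computed as sketched below. The sketch assumes that the zone is given by the first two characters of the station identifier (for example, GA10 belongs to zone GA), and the station values are placeholders rather than results of this study.

```python
import statistics
from collections import defaultdict
from typing import Dict, Mapping


def zone_medians(station_values: Mapping[str, float]) -> Dict[str, float]:
    """Group station-wise metric values by zone prefix and take the median per zone."""
    by_zone = defaultdict(list)
    for station, value in station_values.items():
        by_zone[station[:2]].append(value)  # zone = first two characters (assumption)
    return {zone: statistics.median(vals) for zone, vals in sorted(by_zone.items())}


def across_zone_range(medians: Mapping[str, float]) -> float:
    """Difference between the largest and smallest zone-wise median."""
    return max(medians.values()) - min(medians.values())


# Placeholder station-wise event-level F1 values.
station_f1 = {"GA10": 0.70, "GA11": 0.74, "GA12": 0.72, "GB28": 0.69, "GB29": 0.71}
medians = zone_medians(station_f1)
spread = across_zone_range(medians)
```

A small across-zone range of the medians is the quantity interpreted in the text as evidence of stable behavior under spatial heterogeneity.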

5. Discussion

5.1. Performance Advantages of the Proposed Framework

The results demonstrate that GA-GBDT delivers consistently strong performance at both the sample and event levels, and the advantage persists under conditions of noise contamination and spatial heterogeneity. This gain can be attributed to the synergy between explicit spatial-consistency evidence injection and controllable tree-based decision learning. First, the fixed weighted station graph encodes the geographic proximity, regional consistency, and historical dynamic consistency into propagation weights. Consequently, anomaly discrimination is not driven solely by instantaneous fluctuations at an individual station; instead, it is constrained by coordinated evolution patterns across nearby and dynamically consistent stations, reducing sensitivity to local noise, missing observations, and short-lived background perturbations. Second, GRU summarizes short-window dynamics via gating, preserving informative trend/acceleration cues while attenuating high-frequency variations. The subsequent GCN aggregation under symmetrically normalized adjacency stabilizes the propagation scale under heterogeneous node degrees, lowering the risk that isolated anomalies are amplified purely by degree effects. The resulting low-dimensional spatio-temporal embedding serves as an interpretable contextual descriptor and is concatenated with the original station-wise features before being fed into XGBoost, enabling the final classifier to exploit both the local evidence and graph-propagated context for nonlinear interaction modeling with imbalance-aware regularization.
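The stabilizing effect of symmetric normalization can be seen in a small sketch of the aggregation step. It assumes the common GCN convention of adding self-loops before normalizing, which is not confirmed for this framework, and it shows only the neighborhood aggregation; a full GCN layer would additionally apply a learned weight matrix and a nonlinearity.

```python
import math
from typing import List, Sequence


def normalize_adjacency(adj: Sequence[Sequence[float]], add_self_loops: bool = True) -> List[List[float]]:
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(adj)
    a = [[float(adj[i][j]) + (1.0 if add_self_loops and i == j else 0.0)
          for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[inv_sqrt[i] * a[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def propagate(norm_adj: Sequence[Sequence[float]], features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Aggregate node features over the normalized graph (no weights, no activation)."""
    n, dim = len(norm_adj), len(features[0])
    return [[sum(norm_adj[i][j] * features[j][k] for j in range(n))
             for k in range(dim)] for i in range(n)]


# Two connected stations with weighted edge 1.0 and one feature each.
norm = normalize_adjacency([[0.0, 1.0], [1.0, 0.0]])
smoothed = propagate(norm, [[2.0], [4.0]])
```

In this two-station case every normalized entry is 0.5, so each station's aggregated feature is the average of its own value and its neighbor's, independent of how many edges a station would have in a larger graph.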
This design is also important from a practical monitoring perspective. The warning decision is not made from a single displacement increment close to the nominal GNSS positioning precision. Instead, GA-GBDT learns from short-window deformation evolution and graph-constrained inter-station context, so weak displacement changes are evaluated according to their temporal persistence and their consistency with neighboring monitoring responses. Therefore, the framework is better interpreted as a data-driven detector of persistent abnormal deformation patterns than as a deterministic judge of millimeter-level single-step displacement. In practical applications, it can serve as an auxiliary module in operational monitoring platforms, supporting abnormal deformation screening, warning review, and field inspection prioritization for mining slopes, transportation corridors, railway slopes, and other GNSS-monitored slope systems.
The two-stage training strategy further strengthens robustness and controllability. During representation learning, a lightweight supervised head guides the GRU–GCN–MLP encoder to learn label-consistent spatio-temporal dependency representations. After the encoding parameters are determined, the encoder is frozen, and the tree-based decision layer is trained on the augmented features, avoiding optimization interference that may arise when representation learning and decision learning are tightly coupled end-to-end. This functional decoupling supports stable cross-zone behavior and yields larger gains in more challenging zones, indicating that graph-augmented representations effectively mitigate performance degradation induced by spatial domain shift.
A practical observation is the inherent tension between timeliness and boundary alignment. GA-GBDT achieves the highest LeadMean, and its primary benefit comes from earlier and more persistent triggering rather than from perfectly matching event boundaries. Although its mean SampleIoU remains the highest among the compared methods (Table 5), earlier alarms may widen predicted windows and are also more sensitive to boundary uncertainty in the ground truth, which can penalize IoU. Therefore, the sample-level gains reported above should be interpreted mainly as improved discrimination of hourly warning states, whereas the event-level practical benefit lies in providing earlier and more continuous warning intervals for operational response. The LeadMean values obtained in this study should therefore be understood as the average time advantage relative to the labeled event onset, rather than the time required to physically confirm a slope failure. From this perspective, the reported multi-hour lead window reflects the model response time under the adopted monitoring labels and sampling interval. In real deployment, such early triggering is more suitable for warning review, inspection scheduling, and risk screening than for directly confirming slope failure. In operational settings, this trade-off is often acceptable or even preferable; nevertheless, graded thresholds and simple post-processing constraints (e.g., minimum duration, merge/split rules) can be used to balance early warning and precise interval characterization.
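The post-processing constraints mentioned above (merge rules and a minimum duration) can be sketched as follows. The gap and duration thresholds are hypothetical defaults chosen for illustration and would need to be tuned for a given monitoring network.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) time indices


def postprocess(
    intervals: Sequence[Interval],
    max_gap: int = 1,        # merge alarms separated by at most this many samples (hypothetical)
    min_duration: int = 2,   # drop alarms shorter than this many samples (hypothetical)
) -> List[Interval]:
    """Merge nearby alarm intervals, then discard intervals that are too short."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        # Gap = number of non-alarm samples between this interval and the previous one.
        if merged and start - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return [(s, e) for s, e in merged if e - s + 1 >= min_duration]


cleaned = postprocess([(2, 3), (5, 8), (12, 12), (20, 24)])
```

Here the first two alarms are separated by a single non-alarm sample and are merged, while the isolated one-sample alarm is removed as too short.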

5.2. Limitations and Prospects

Several limitations warrant further investigation. (i) The station graph is fixed in both topology and weights. This design provides a stable structural prior and avoids unreliable time-varying edge estimation under noisy, incomplete, and imbalanced warning data. However, inter-station relations may change with seasonality or deformation regimes, so dynamic or regime-aware graphs should be explored when longer records or denser event labels are available. (ii) The event labels are warning-oriented deformation periods identified from GNSS monitoring records and field interpretation. They are not intended to provide detailed geomorphological classification of individual landslide movements. Nevertheless, subsurface geological structure and geotechnical conditions remain essential for mechanism-based interpretation, whereas this study focuses on GNSS network-based data-driven warning. In addition, event-window boundaries may contain annotation uncertainty, which affects IoU and lead-time interpretation. (iii) The interpretation of very slow deformation remains constrained by the precision level of hourly GNSS displacement solutions. When the movement rate is close to the nominal precision of the processed coordinate solutions, short-term single-station evidence may be insufficient to distinguish genuine deformation from measurement fluctuation. Therefore, reliable warning for weak or slow deformation should rely on persistent temporal evolution, spatially coherent responses, longer observation windows, and, when available, complementary sensors or field verification.
Future work can extend the framework along three complementary directions. First, graph and similarity enhancement can be pursued by introducing time-varying or regime-aware edge weighting or by adopting more robust similarity measures (e.g., coherence, mutual information), together with outlier-station masking and edge pruning to suppress noise diffusion and over-smoothing from irrelevant neighbors. Second, to further improve warning accuracy, richer environmental and contextual factors can be incorporated beyond displacement and rainfall proxies, such as groundwater variations, geology-related conditions [64], and indicators of human activity intensity [65]. These variables may help distinguish deformation patterns that appear similar but are driven by different mechanisms, thereby reducing localized false-alarm hotspots. Additional observation sources, such as total station measurements, ground-based radar, or high-frequency local sensors, may also help verify weak deformation signals close to the GNSS precision level and improve the reliability of warning confirmation. Third, the present results suggest a feasible modeling strategy that combines graph-based spatio-temporal feature augmentation with boosted-tree decision learning. The GBDT decision layer is implemented with XGBoost here, but other boosting variants may also be tested within the same data flow and feature format. Such extensions should be validated on additional monitoring networks before broader operational use. The same graph-augmented strategy may also be extended to other distributed ground-based sensor networks when spatially coordinated deformation observations are available.

6. Conclusions

This work proposes GA-GBDT to improve GNSS network–based landslide event warning under noise contamination and spatial heterogeneity. A fixed weighted station graph is constructed to encode spatial relations, a GRU–GCN encoder learns an interpretable spatio-temporal embedding, and the embedding is fused with station-wise features for an XGBoost decision layer to produce time-indexed event probabilities and binary warning sequences. The overall workflow improves network-level warning performance while maintaining controllable training and interpretability. Key conclusions are summarized as follows:
(1) GA-GBDT attains the highest overall warning accuracy and ranking quality, with near-saturated sample-level ranking metrics (AUC = 0.990, AP = 0.947) and F1 = 0.824 at a balanced Precision/Recall (0.850/0.800), indicating strong discrimination under class imbalance and a favorable balance between missed events and false alarms.
(2) Event-level results remain consistently strong across all zones, with the median event-level F1 maintained within approximately 0.69–0.75, demonstrating stable performance under spatial heterogeneity.
(3) Cross-zone transfer experiments demonstrate robust generalization, as the average F1 decreases by only about 0.026 under zone-to-zone training–testing shifts, indicating that warning capability is largely preserved under spatial distribution changes.
(4) Timeliness improves without compromising specificity, yielding a stable multi-hour lead time (LeadMean ≈ 5.4–6.1 h) while maintaining a low overall false-alarm rate (FP_rate ≈ 0.005–0.008).
Overall, the results demonstrate that combining graph-based spatio-temporal feature augmentation with boosted-tree decision learning is a feasible and effective strategy for GNSS network–based landslide event warning in mining areas. Rather than replacing field interpretation or engineering judgement, the proposed framework can serve as a data-driven auxiliary module for identifying abnormal deformation periods, supporting station-level warning screening, and prioritizing follow-up inspection in operational monitoring systems. Future work will consider incorporating richer environmental and anthropogenic factors, exploring time-varying graphs and more robust similarity measures, and applying graded thresholds with event-level post-processing to further improve reliability and operational adaptability in complex real-world monitoring scenarios.

Author Contributions

Conceptualization, J.W.; methodology, J.W.; validation, X.H.; formal analysis, J.W., T.O.C., and J.A.; investigation, J.W., B.Z., and Y.W.; resources, W.D. and B.Z.; data curation, J.W. and X.H.; writing—original draft preparation, J.W.; writing—review and editing, B.Z., X.H., T.O.C., Y.W., and J.A.; visualization, Y.W.; supervision, W.D. and C.C.; project administration, L.F.; funding acquisition, L.F., W.D., and C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Hubei Province of China (No. 2026AFB439), the China Postdoctoral Science Foundation (No. 2025M770247), the National Key Research and Development Project of China (No. 2021YFB3901203), and the Postdoctoral Research Project for China Railway Siyuan Survey and Design Group Co., Ltd. (No. KY2024074S).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

Authors Jinhua Wu, Liang Fei, Wei Dong, and Chengdu Cao were employed by China Railway Siyuan Survey and Design Group Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that this study received funding from China Railway Siyuan Survey and Design Group Co., Ltd. The funder had the following involvement with the study: provision of research data and review of the manuscript.

Abbreviations

The following abbreviations are used in this manuscript:
GNSS: Global Navigation Satellite System
GA-GBDT: Graph-Augmented Gradient Boosting Decision Trees
GBDT: Gradient Boosting Decision Trees
XGBoost: eXtreme Gradient Boosting
GRU: Gated Recurrent Unit
GCN: Graph Convolutional Network
GNN: Graph Neural Network
LSTM: Long Short-Term Memory
RF: Random Forest
TAM: Tangent Angle Method
MLP: Multilayer Perceptron
SSA: Singular Spectrum Analysis
kNN: k-nearest neighbor
AUC: Area Under the Receiver Operating Characteristic Curve
AP: Average Precision
ROC: Receiver Operating Characteristic
PR: Precision–Recall
TP: True Positive
FP: False Positive
TN: True Negative
FN: False Negative
FP_rate: False-positive rate
SampleIoU: Sample-level Intersection over Union
LeadMean: Mean lead time of matched warning events
BCE: Binary Cross-Entropy

References

  1. Froude, M.J.; Petley, D.N. Global fatal landslide occurrence from 2004 to 2016. Nat. Hazards Earth Syst. Sci. 2018, 18, 2161–2181.
  2. Duan, Y.; Ding, M.; He, Y.; Chen, H.; Zheng, H.; Fuchs, S.; Sokratov, S.; Dourado, F. Projections of global road risk exposed to landslides under climate change. Commun. Earth Environ. 2025, 7, 57.
  3. Werthmann, C.; Sapena, M.; Kühnl, M.; Singer, J.; Garcia, C.; Breuninger, T.; Gamperl, M.; Menschik, B.; Schäfer, H.; Schröck, S. Insights into the development of a landslide early warning system prototype in an informal settlement: The case of Bello Oriente in Medellín, Colombia. Nat. Hazards Earth Syst. Sci. 2024, 24, 1843–1870.
  4. Calvello, M. Early warning strategies to cope with landslide risk. Riv. Ital. Di Geotec. 2017, 2, 63–91.
  5. Casagli, N.; Intrieri, E.; Tofani, V.; Gigli, G.; Raspini, F. Landslide detection, monitoring and prediction with remote-sensing techniques. Nat. Rev. Earth Environ. 2023, 4, 51–64.
  6. Huang, G.; Du, S.; Wang, D. GNSS techniques for real-time monitoring of landslides: A review. Satell. Navig. 2023, 4, 5.
  7. Chen, Z.; Huang, G.; Xie, W.; Zhang, Y.; Wang, L. GNSS real-time warning technology for expansive soil Landslide—A case in ningming demonstration area. Remote Sens. 2023, 15, 2772.
  8. Michoud, C.; Bazin, S.; Blikra, L.H.; Derron, M.-H.; Jaboyedoff, M. Experiences from site-specific landslide early warning systems. Nat. Hazards Earth Syst. Sci. 2013, 13, 2659–2673.
  9. Wang, D.; Huang, G.; Zhang, Q.; Gao, Y.; Du, Y.; Liu, X. A multi-source landslide early warning model based on dynamic monitoring data and the SAAHP-FCE method. Geomat. Nat. Hazards Risk 2025, 16, 2513536.
  10. Dai, K.; Li, Z.; Xu, Q.; Bürgmann, R.; Milledge, D.G.; Tomas, R.; Fan, X.; Zhao, C.; Liu, X.; Peng, J. Entering the era of earth observation-based landslide warning systems: A novel and exciting framework. IEEE Geosci. Remote Sens. Mag. 2020, 8, 136–153.
  11. Lu, Y.; Jin, C.; Wang, Q.; Li, G.; Han, T. Deformation and failure characteristic of open-pit slope subjected to combined effects of mining blasting and rainfall infiltration. Eng. Geol. 2024, 331, 107437.
  12. Dong, X.; Li, S.; Ma, R.; Tian, W.; Zhao, K.; Xiang, H.; Zhu, J.; Qiu, Y. Cloud-based slope risk monitoring and early warning system for open-pit coal mines: A case study of Zhonglian Runshi. Sci. Rep. 2025, 15, 44396.
  13. Li, J.; Qin, J.; Kang, K.; Liang, M.; Liu, K.; Ding, X. Enhanced Spatiotemporal Landslide Displacement Prediction Using Dynamic Graph-Optimized GNSS Monitoring. Sensors 2025, 25, 4754.
  14. Strnad, D.; Mongus, D.; Horvat, Š.; Šegina, E. A multi-task deep learning approach for landslide displacement prediction with applications in early warning systems. Sci. Rep. 2025, 16, 196.
  15. Ding, Q.; Guo, C.; Fan, X.a.; Liu, X.; Gong, X.; Zhou, W.; Ma, G. Multi-source monitoring data helps revealing and quantifying the excavation-induced deterioration of rock mass. Eng. Geol. 2023, 325, 107281.
  16. Shu, B.; He, Y.; Wang, L.; Zhang, Q.; Li, X.; Qu, X.; Huang, G.; Qu, W. Real-time high-precision landslide displacement monitoring based on a GNSS CORS network. Measurement 2023, 217, 113056.
  17. Du, Y.; Ning, L.; Chicas, S.D.; Xie, M. A new early warning Criterion for assessing landslide risk. Nat. Hazards 2023, 116, 537–549.
  18. Liu, X.; Du, Y.; Huang, G.; Wang, D.; Zhang, Q. Mitigating GNSS multipath in landslide areas: A novel approach considering mutation points at different stages. Landslides 2023, 20, 2497–2510.
  19. Segoni, S.; Piciullo, L.; Gariano, S.L. A review of the recent literature on rainfall thresholds for landslide occurrence. Landslides 2018, 15, 1483–1501.
  20. Kang, J.; Wan, B.; Gao, Z.; Zhou, S.; Chen, H.; Shen, H. Research on machine learning forecasting and early warning model for rainfall-induced landslides in Yunnan province. Sci. Rep. 2024, 14, 14049.
  21. Segoni, S.; Battistini, A.; Rossi, G.; Rosi, A.; Lagomarsino, D.; Catani, F.; Moretti, S.; Casagli, N. An operational landslide early warning system at regional scale based on space–time-variable rainfall thresholds. Nat. Hazards Earth Syst. Sci. 2015, 15, 853–861.
  22. Chinkulkijniwat, A.; Salee, R.; Horpibulsuk, S.; Arulrajah, A.; Hoy, M. Landslide rainfall threshold for landslide warning in Northern Thailand. Geomat. Nat. Hazards Risk 2022, 13, 2425–2441.
  23. Chen, Q.; Huang, W.; Li, J. Landslide early warning based on improved tangential angle and displacement rate: A case study of the Leijiashan landslide in Shimen County, Hunan Province. Chin. J. Geol. Hazard Control 2024, 35, 133.
  24. Halter, T.; Lehmann, P.; Bast, A.; Aaron, J.; Stähli, M. In situ soil moisture data improve precipitation-based shallow landslide early warning through innovative machine learning methods: In situ soil moisture data improve shallow landslide. Landslides 2025, 22, 3599–3614.
  25. Ma, Z.; Mei, G.; Piccialli, F. Machine learning for landslides prevention: A survey. Neural Comput. Appl. 2021, 33, 10881–10907.
  26. Mondini, A.C.; Guzzetti, F.; Melillo, M. Deep learning forecast of rainfall-induced shallow landslides. Nat. Commun. 2023, 14, 2466.
  27. Park, S.; Kim, J. Landslide susceptibility mapping based on random forest and boosted regression tree models, and a comparison of their performance. Appl. Sci. 2019, 9, 942.
  28. Zheng, Z.; Zhang, K.; Wang, N.; Zhu, M.; He, Z. Machine learning–based systems for early warning of rainfall-induced landslide. Nat. Hazards Rev. 2024, 25, 04024027.
  29. Zeng, Z.; Tan, S.; Li, A.; Ling, Y.; Zhou, W. Landslide Hazard Zonation Driven by Multi-Rainfall Scenarios Based on the Optimal XGBoost Model—A Case Study of Yongren County, Yunnan Province, China. Sustainability 2025, 17, 11307.
  30. Marjanović, M.; Krautblatter, M.; Abolmasov, B.; Đurić, U.; Sandić, C.; Nikolić, V. The rainfall-induced landsliding in Western Serbia: A temporal prediction approach using Decision Tree technique. Eng. Geol. 2018, 232, 147–159.
  31. Gidon, J.S.; Borah, J.; Sahoo, S.; Majumdar, S.; Fujita, M. Bidirectional LSTM model for accurate and real-time landslide detection: A case study in Mawiongrim, Meghalaya, India. IEEE Internet Things J. 2023, 11, 3792–3800.
  32. Li, C.; Wang, Y.; Zhang, F.; Zhang, H. Edge-Aware Superpixel Dual-Graph GCN for Topographically Heterogeneous Landslide Susceptibility Assessment. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2026, 19, 5634–5648.
  33. Kuang, P.; Li, R.; Huang, Y.; Wu, J.; Luo, X.; Zhou, F. Landslide displacement prediction via attentive graph neural network. Remote Sens. 2022, 14, 1919.
  34. Zhang, Y.; He, Y.; Gao, F.; Huo, T.; Zhang, Q.; Lu, J.; Zhang, L. A spatiotemporal displacement prediction method for InSAR-detected landslides using a graph neural network coupling spatial and temporal features. Geomat. Nat. Hazards Risk 2025, 16, 2596362.
  35. Zhang, S.; Jia, H.; Wang, C.; Wang, X.; He, S.; Jiang, P. Deep-learning-based landslide early warning method for loose deposits slope coupled with groundwater and rainfall monitoring. Comput. Geotech. 2024, 165, 105924.
  36. Gupta, K.; Satyam, N. Integrating real-time sensor data for improved hydrogeotechnical modelling in landslide early warning in Western Himalaya. Eng. Geol. 2024, 338, 107630.
  37. Zeng, T.; Gong, Q.; Wu, L.; Zhu, Y.; Yin, K.; Peduto, D. Double-index rainfall warning and probabilistic physically based model for fast-moving landslide hazard analysis in subtropical-typhoon area. Landslides 2024, 21, 753–773.
  38. Li, H.; Zhu, Y.; Xu, Q.; Tang, R.; Pu, C.; He, Y. Condition monitoring of heterogeneous landslide deformation in spatio-temporal domain using advanced graph attention network. Geomat. Nat. Hazards Risk 2025, 16, 2519429.
  39. Li, Y.; Chen, T.; Lv, L.; Niu, R.; Plaza, A. IED-GCN: An Internal and External Decoupled Graph Convolutional Network for Landslide Susceptibility Assessment. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4414717.
  40. Piciullo, L.; Calvello, M.; Cepeda, J.M. Territorial early warning systems for rainfall-induced landslides. Earth-Sci. Rev. 2018, 179, 228–247.
  41. Huang, F.; Yang, Y.; Ma, G.; Rezania, M.; Chang, Z.; Catani, F.; Jiang, B.; Chen, X.; Guo, F. Real-time early warning of landslide disaster risks on major highways in Ganzhou City, China. J. Rock. Mech. Geotech. Eng. 2026, in press.
  42. Nocentini, N.; Rosi, A.; Segoni, S.; Fanti, R. Towards landslide space-time forecasting through machine learning: The influence of rainfall parameters and model setting. Front. Earth Sci. 2023, 11, 1152130.
  43. Zeng, H.; Zhu, Q.; Ding, Y.; Hu, H.; Chen, L.; Xie, X.; Chen, M.; Yao, Y. Graph neural networks with constraints of environmental consistency for landslide susceptibility evaluation. Int. J. Geogr. Inf. Sci. 2022, 36, 2270–2295.
  44. Lin, G.-W.; Hung, C.; Chang Chien, Y.-F.; Chu, C.-R.; Liu, C.-H.; Chang, C.-H.; Chen, H. Towards automatic landslide-quake identification using a random forest classifier. Appl. Sci. 2020, 10, 3670.
  45. Miao, F.; Zhao, F.; Wu, Y.; Li, L.; Török, Á. Landslide susceptibility mapping in Three Gorges Reservoir area based on GIS and boosting decision tree model. Stoch. Environ. Res. Risk Assess. 2023, 37, 2283–2303.
  46. Park, J.-Y.; Lee, S.-R.; Lee, D.-H.; Kim, Y.-T.; Lee, J.-S. A regional-scale landslide early warning methodology applying statistical and physically based approaches in sequence. Eng. Geol. 2019, 260, 105193.
  47. Pham, K.; Kim, D.; Le, C.V.; Choi, H. Dual tree-boosting framework for estimating warning levels of rainfall-induced landslides. Landslides 2022, 19, 2249–2262.
  48. Wang, J.; Wang, Z.; Peng, L.; Qian, C. Landslide Recognition Based on Machine Learning Considering Terrain Feature Fusion. ISPRS Int. J. Geo-Inf. 2024, 13, 306.
  49. China Weather Network. Climate Background of Chifeng. Available online: https://www.weather.com.cn/cityintro/101080601.shtml (accessed on 11 May 2026).
  50. Ji, K.; Shen, Y.; Chen, Q.; Wang, F. Extended singular spectrum analysis for processing incomplete heterogeneous geodetic time series. J. Geod. 2023, 97, 74.
  51. Ji, K.; Shen, Y.; Wang, F.; Chen, Q. An efficient improved singular spectrum analysis for processing GNSS position time series with missing data. Geophys. J. Int. 2025, 240, 189–200.
  52. Peterson, L.E. K-nearest neighbor. Scholarpedia 2009, 4, 1883.
  53. Zhang, W.; Li, H.; Tang, L.; Gu, X.; Wang, L.; Wang, L. Displacement prediction of Jiuxianping landslide using gated recurrent unit (GRU) networks. Acta Geotech. 2022, 17, 1367–1382.
  54. Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600.
  55. Zhang, S.; Tong, H.; Xu, J.; Maciejewski, R. Graph convolutional networks: A comprehensive review. Comput. Soc. Netw. 2019, 6, 1–23.
  56. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794.
  57. Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T. Xgboost: Extreme Gradient Boosting; R package version 0.4-2. 2015. Available online: https://www.rdocumentation.org/packages/xgboost/versions/0.4-2 (accessed on 11 May 2026).
  58. Ramraj, S.; Uzir, N.; Sunil, R.; Banerjee, S. Experimenting XGBoost algorithm for prediction and classification of different datasets. Int. J. Control Theory Appl. 2016, 9, 651–662.
  59. Palau, R.M.; Berenguer, M.; Hürlimann, M.; Sempere-Torres, D. Application of a fuzzy verification framework for the evaluation of a regional-scale landslide early warning system during the January 2020 Gloria storm in Catalonia (NE Spain). Landslides 2022, 19, 1599–1616.
  60. Piciullo, L.; Tiranti, D.; Pecoraro, G.; Cepeda, J.M.; Calvello, M. Standards for the performance assessment of territorial landslide early warning systems. Landslides 2020, 17, 2533–2546.
  61. Barnes, L.R.; Schultz, D.M.; Gruntfest, E.C.; Hayden, M.H.; Benight, C.C. Corrigendum: False alarm rate or false alarm ratio? Weather Forecast 2009, 24, 1452–1454.
  62. Zhu, M. Recall, Precision and Average Precision; Department of Statistics Actuarial Science, University of Waterloo: Waterloo, ON, Canada, 2004; Volume 2, p. 6.
  63. Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE Computer Society: Washington, DC, USA, 2019; pp. 658–666.
  64. Li, G.K.; Moon, S. Topographic stress control on bedrock landslide size. Nat. Geosci. 2021, 14, 307–313.
  65. Li, Y.; Wang, X.; Mao, H. Influence of human activity on landslide susceptibility development in the Three Gorges area. Nat. Hazards 2020, 104, 2115–2151.
Figure 1. Study area and station distribution.
Figure 2. Flowchart of the proposed GA-GBDT framework.
Figure 3. Sensitivity analysis of key hyperparameters in GA-GBDT.
Figure 4. Training dynamics of the spatio-temporal encoding process and cross-zone generalization of GA-GBDT.
Figure 5. Sample-level landslide event warning performance of six methods.
Figure 6. Representative-station landslide event recognition results of six methods.
Figure 7. Station-wise spatial patterns of LeadMean, SampleIoU, and FP_rate for six warning methods. Rows (a–f) denote TAM, RF, GBDT, LSTM, GNN, and GA-GBDT, and columns (1–3) represent LeadMean, SampleIoU, and FP_rate.
Figure 8. Zone-wise spatial heterogeneity and robustness of six methods: (a) event-level F1, (b) sample-level FP_rate, and (c) mean lead time (LeadMean, h) across zones GA–GF.
Table 1. Distribution of positive and negative warning-event samples across monitoring zones.
| Zone | Station Count | Total Samples | Positive Samples | Negative Samples | Positive Percentage | Negative Percentage |
| --- | --- | --- | --- | --- | --- | --- |
| GA | 20 | 99,680 | 7155 | 92,525 | 7.18% | 92.82% |
| GB | 16 | 79,708 | 2868 | 76,840 | 3.60% | 96.40% |
| GC | 18 | 89,712 | 2216 | 87,496 | 2.47% | 97.53% |
| GD | 18 | 89,712 | 4870 | 84,842 | 5.43% | 94.57% |
| GE | 9 | 44,856 | 2087 | 42,769 | 4.65% | 95.35% |
| GF | 9 | 44,856 | 3271 | 41,585 | 7.29% | 92.71% |
Table 2. Hyperparameter settings of the proposed and comparison models.
| Model | Main Configuration | Symbol Glossary |
| --- | --- | --- |
| TAM | trigger rule: TanAngle_deg ≥ 85° for ≥2 consecutive points (hourly sampling) | TanAngle_deg: tangent angle (degree) computed from the displacement curve. |
| RF | n_estimators = 500; max_depth = 12; min_samples_split = 2; min_samples_leaf = 1; max_features = sqrt; class_weight = balanced; random_state = fixed. | n_estimators: number of trees. max_depth: maximum tree depth. min_samples_split/leaf: minimum samples to split/in a leaf. max_features: features considered at each split. class_weight: class reweighting. |
| LSTM | Input window length = 32 h; layers = 2; hidden_size = 64; dropout = 0.2; batch_size = 256; optimizer = Adam (lr = 1 × 10⁻³); epochs = 60; early stopping patience = 10; loss = weighted BCE (positive weight set by class imbalance). | h: hours (sampling window). hidden_size: hidden-state dimension. dropout: dropout rate. batch_size: mini-batch size. lr: learning rate. BCE: binary cross-entropy. |
| GBDT | booster = gbtree; objective = binary:logistic; max_depth = 6; eta = 0.05; n_estimators = 600; subsample = 0.8; colsample_bytree = 0.8; min_child_weight = 1; gamma = 0; reg_lambda = 1.0; reg_alpha = 0.0; scale_pos_weight = imbalance ratio; early_stopping_rounds = 50. | eta: learning rate. subsample/colsample_bytree: row/feature sampling ratio per tree. min_child_weight: minimum sum of instance weight in a child. gamma: minimum loss reduction for a split. reg_lambda/reg_alpha: L2/L1 regularization. scale_pos_weight: positive-class weight. |
| GNN | Graph: kNN with k = 8; backbone: GCN; layers = 2; hidden_dim = 64; dropout = 0.2; node-wise MLP classifier head (1–2 layers); training: epochs = 60; optimizer = Adam (lr = 1 × 10⁻³); loss = weighted BCE. | kNN: k-nearest neighbor graph. k: number of neighbors per node. GCN: graph convolutional network. hidden_dim: node embedding hidden dimension. MLP: multilayer perceptron. |
| GA-GBDT | Graph construction: kNN with k = 8 (distance-based); correlation pruning: ρ ≥ 0.3 with a 72 h correlation window. Spatio-temporal encoder: spatial aggregation = GCN (layers = 1), embedding dimension d = 32; temporal module = GRU (layers = 1), hidden_size = 32; dropout = 0.2. Decision layer: XGBoost (max_depth = 6, eta = 0.05, n_estimators = 600, subsample = 0.8, colsample_bytree = 0.8, early_stopping_rounds = 50, scale_pos_weight set by imbalance). | ρ: correlation threshold for pruning edges. d: embedding dimension (graph encoder output). GRU: gated recurrent unit. 72 h: correlation computation window length. |
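The decision-layer settings in Table 2 can be written as a plain parameter dictionary. The sketch below does this in Python and derives one possible value for scale_pos_weight from the sample counts in Table 1. Two points are assumptions rather than details reported here: the weight is taken as the negative-to-positive ratio pooled over all zones (in practice it would normally be computed on the training split only), and the key `learning_rate` is used in place of eta, following the naming of the XGBoost scikit-learn interface. The constant and function names are illustrative.

```python
# (positive, negative) sample counts per zone, copied from Table 1.
ZONE_COUNTS = {
    "GA": (7155, 92525),
    "GB": (2868, 76840),
    "GC": (2216, 87496),
    "GD": (4870, 84842),
    "GE": (2087, 42769),
    "GF": (3271, 41585),
}


def imbalance_ratio(counts):
    """Negative-to-positive sample ratio pooled over all zones."""
    pos = sum(p for p, _ in counts.values())
    neg = sum(n for _, n in counts.values())
    return neg / pos


def decision_layer_params(counts):
    """GA-GBDT decision-layer settings from Table 2 as a plain dict."""
    return {
        "objective": "binary:logistic",
        "max_depth": 6,
        "learning_rate": 0.05,  # listed as eta in Table 2
        "n_estimators": 600,
        "subsample": 0.8,
        "colsample_bytree": 0.8,
        "early_stopping_rounds": 50,
        # Assumption: pooled ratio; a training-split ratio would differ.
        "scale_pos_weight": imbalance_ratio(counts),
    }


params = decision_layer_params(ZONE_COUNTS)
print(f"scale_pos_weight = {params['scale_pos_weight']:.2f}")
```

With the Table 1 counts, the pooled ratio is roughly 19 negative samples per positive sample, which gives a sense of the class imbalance the decision layer has to absorb.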
Table 3. Ablation analysis of the preprocessing pipeline for GA-GBDT.
| Preprocessing Setting | SSA Reconstruction | UKF Smoothing | Outlier Removal | Precision | Recall | F1-Score | AUC | AP | FP_rate |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Minimal QC only | No | No | No | 0.781 | 0.742 | 0.761 | 0.972 | 0.879 | 0.0138 |
| Without SSA reconstruction | No | Yes | Yes | 0.829 | 0.758 | 0.792 | 0.983 | 0.918 | 0.0086 |
| Without UKF smoothing | Yes | No | Yes | 0.817 | 0.789 | 0.803 | 0.986 | 0.928 | 0.0097 |
| Without outlier removal | Yes | Yes | No | 0.806 | 0.793 | 0.799 | 0.984 | 0.922 | 0.0104 |
| Full preprocessing pipeline | Yes | Yes | Yes | 0.850 | 0.800 | 0.824 | 0.990 | 0.947 | 0.0068 |
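Because F1 is the harmonic mean of Precision and Recall, the F1-Score column of Table 3 can be cross-checked from the two preceding columns. The short Python sketch below recomputes it for each preprocessing setting; the dictionary and function names are illustrative, and the values are copied from the table.

```python
# (precision, recall, reported F1) per preprocessing setting, from Table 3.
TABLE3 = {
    "Minimal QC only": (0.781, 0.742, 0.761),
    "Without SSA reconstruction": (0.829, 0.758, 0.792),
    "Without UKF smoothing": (0.817, 0.789, 0.803),
    "Without outlier removal": (0.806, 0.793, 0.799),
    "Full preprocessing pipeline": (0.850, 0.800, 0.824),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


for setting, (p, r, reported) in TABLE3.items():
    recomputed = f1_score(p, r)
    print(f"{setting}: reported {reported:.3f}, recomputed {recomputed:.3f}")
```

For all five settings the recomputed value agrees with the reported F1-Score to three decimal places; for the full pipeline, 2 × 0.850 × 0.800 / (0.850 + 0.800) ≈ 0.824.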
Table 4. Quantitative comparison for the representative stations.
| Method | Accuracy | Precision | Recall | F1 | FP_rate | IoU |
| --- | --- | --- | --- | --- | --- | --- |
| TAM | 0.833 ± 0.075 | 0.736 ± 0.185 | 0.453 ± 0.154 | 0.461 ± 0.184 | 0.0215 ± 0.0173 | 0.461 ± 0.158 |
| RF | 0.833 ± 0.113 | 0.791 ± 0.108 | 0.558 ± 0.161 | 0.558 ± 0.093 | 0.0226 ± 0.0102 | 0.554 ± 0.169 |
| LSTM | 0.893 ± 0.100 | 0.952 ± 0.028 | 0.660 ± 0.202 | 0.773 ± 0.141 | 0.0124 ± 0.0038 | 0.658 ± 0.206 |
| GBDT | 0.833 ± 0.112 | 0.803 ± 0.128 | 0.650 ± 0.208 | 0.751 ± 0.152 | 0.0186 ± 0.0088 | 0.539 ± 0.212 |
| GNN | 0.898 ± 0.079 | 0.945 ± 0.073 | 0.670 ± 0.153 | 0.776 ± 0.096 | 0.0164 ± 0.0103 | 0.667 ± 0.157 |
| GA-GBDT | 0.939 ± 0.029 | 0.957 ± 0.144 | 0.932 ± 0.033 | 0.906 ± 0.078 | 0.0185 ± 0.0236 | 0.940 ± 0.033 |
Table 5. Summary of station-wise warning metrics for the six methods.
| Method | LeadMean (h) | SampleIoU | FP_rate |
| --- | --- | --- | --- |
| TAM | −3.72 ± 10.12 | 0.792 ± 0.346 | 0.0183 ± 0.0091 |
| RF | −0.99 ± 8.56 | 0.824 ± 0.210 | 0.0187 ± 0.0067 |
| GBDT | −0.22 ± 6.33 | 0.895 ± 0.279 | 0.0186 ± 0.0076 |
| LSTM | 2.17 ± 1.47 | 0.899 ± 0.243 | 0.0187 ± 0.0099 |
| GNN | 3.04 ± 1.24 | 0.921 ± 0.144 | 0.0175 ± 0.0079 |
| GA-GBDT | 5.85 ± 1.13 | 0.933 ± 0.084 | 0.0172 ± 0.0068 |
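To make the three station-wise metrics in Table 5 concrete, the following Python sketch computes them for a single station from hourly binary label and warning sequences. It is only an illustration under stated assumptions: SampleIoU is taken as the sample-level intersection over union of the positive sets, FP_rate as FP/(FP + TN), and lead time as event onset minus warning onset (positive when the warning comes first), with a warning matched to an event whenever the two runs overlap. The matching rule and all function names are hypothetical and may differ from the definitions used in this study.

```python
def sample_iou(y_true, y_pred):
    """Intersection over union of positive samples (1.0 if both are empty)."""
    inter = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    union = sum(1 for t, p in zip(y_true, y_pred) if t or p)
    return inter / union if union else 1.0


def fp_rate(y_true, y_pred):
    """False positives divided by all true negatives' class size (FP + TN)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not p and not t)
    return fp / (fp + tn) if (fp + tn) else 0.0


def runs(seq):
    """(start, end) index pairs, end exclusive, for each run of nonzero values."""
    out, start = [], None
    for i, v in enumerate(seq):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(seq)))
    return out


def lead_mean(y_true, y_pred, step_hours=1.0):
    """Mean lead time in hours over matched events, or None if none match.

    Assumed matching rule: an event is matched by any overlapping warning
    run, and the earliest such run defines the warning onset.
    """
    warnings = runs(y_pred)
    leads = []
    for e_start, e_end in runs(y_true):
        overlapping = [w for w in warnings if w[0] < e_end and w[1] > e_start]
        if overlapping:
            w_start = min(w[0] for w in overlapping)
            leads.append((e_start - w_start) * step_hours)
    return sum(leads) / len(leads) if leads else None
```

Under this convention a negative mean lead time, as reported for TAM, RF, and GBDT in Table 5, corresponds to warnings that on average begin after the labeled event onset.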
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wu, J.; Fei, L.; Dong, W.; Cao, C.; Zhang, B.; Han, X.; Chan, T.O.; Wang, Y.; Awange, J. GA-GBDT: A Spatio-Temporal Graph-Augmented Gradient Boosting Framework for GNSS Network–Based Landslide Event Warning in Mining Areas. Appl. Sci. 2026, 16, 5569. https://doi.org/10.3390/app16115569
