Rapid Detection and Segmentation of Landslide Hazards in Loess Tableland Areas Using Deep Learning: A Case Study of the 2023 Jishishan Ms 6.2 Earthquake in Gansu, China

Bai, Zhuoli; Ji, Lingyun; Tang, Hongtao; Qiu, Jiangtao; Kang, Shuai; Liu, Chuanjin; Bian, Zongpan

doi:10.3390/rs17152667

by Zhuoli Bai, Lingyun Ji ^*, Hongtao Tang, Jiangtao Qiu, Shuai Kang, Chuanjin Liu and Zongpan Bian

The Second Monitoring and Application Center, China Earthquake Administration, Xi’an 710054, China

^* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(15), 2667; https://doi.org/10.3390/rs17152667

Submission received: 9 July 2025 / Revised: 27 July 2025 / Accepted: 29 July 2025 / Published: 1 August 2025

Abstract

To meet the demand for rapid, precise detection of earthquake-triggered landslides in loess tablelands, this study proposes and validates a methodology that integrates an enhanced deep learning architecture with a large-tile processing strategy. It features two core advances: (1) an enhancement of YOLOv8’s shallow layers via a higher-resolution P2 detection head to boost small-target capture, and (2) a large-tile segmentation–tile mosaicking workflow that overcomes the technical bottlenecks of processing large-scale high-resolution imagery, ensuring both timeliness and accuracy in loess landslide detection. Using 20 km² of high-precision UAV imagery acquired after the 2023 Gansu Jishishan Ms 6.2 earthquake as foundational data, we applied the methodology to rapidly detect and precisely segment landslides in the study area. Results were validated against high-accuracy 3D models and field investigations. The main findings are as follows. (1) All four loss functions converged under a 500-epoch progressive training strategy, with mAP50(M) = 0.747 and mAP50-95(M) = 0.468, supporting the model’s detection and segmentation capability for the Jishishan earthquake-triggered loess landslides. (2) The enhanced algorithm detected 417 landslides with 94.1% recognition accuracy. Landslide areas ranged from 7 × 10⁻⁴ km² to 0.217 km² (aggregate area: 1.3 km²), indicating that small-scale landslides dominate. (3) Morphological characterization and spatial distribution analysis revealed near-vertical scarps, diverse morphological configurations, and high-density spatial clustering in loess tableland landslides.

Keywords:

loess tableland areas; landslide hazards; deep learning; UAV imagery; instance segmentation; rapid detection

1. Introduction

Earthquakes constitute a major trigger of geohazards: intense ground shaking reduces slope shear strength, inducing extensive landslides that severely threaten human lives and infrastructure. Western China, a tectonically active region, faces frequent geological hazards and high seismic risks, particularly in the loess tableland areas characterized by complex surface environments. Due to the unique hydro-physical properties of loess, its structure is prone to collapse when saturated or subjected to external forces, leading to severe landslide hazards [1,2,3,4]. A stark example is the 1920 Haiyuan Earthquake, the largest seismic event ever recorded in loess terrain globally, which triggered 5384 landslides covering 218.78 km². The catastrophe claimed over 234,000 lives and caused devastating economic losses [5,6,7], critically impeding post-disaster rescue and reconstruction efforts. Consequently, rapidly identifying landslides and mapping their accurate distribution across loess tablelands following earthquakes is essential for guiding emergency responses and damage assessments.

Modern remote sensing technologies have advanced rapidly, providing viable technical approaches for swift and accurate landslide detection and localization [8,9,10,11,12,13,14,15,16,17,18]. The advent of UAV photogrammetry in particular has substantially refined the precision of geohazard detection. Current methodologies for landslide recognition using remote sensing imagery range from manual visual interpretation and pixel-based classification to object-oriented image analysis and deep learning models [19,20,21,22,23]. Deep learning architectures extract complex features from high-dimensional data via hierarchical representation learning, which allows them to replace manual detection while improving both the speed and accuracy of target identification [24,25,26]. Researchers have successfully deployed diverse deep learning frameworks, including convolutional neural networks (CNNs) [27,28,29], Fully Convolutional Networks (FCNs) [30], DeepLabv3+ semantic segmentation algorithms [31], and ResU-Net models [32,33], for automated landslide extraction.

Driven by growing demands for geological hazard prevention and rapid advances in artificial intelligence, intelligent recognition and research of earthquake-triggered landslides have seen significant progress in recent years [34,35]. Fu et al. (2023) applied an enhanced YOLOv4 algorithm to identify landslides triggered by the 2021 Haiti Ms 7.2 earthquake [36]. Ju et al. (2020) combined deep learning methods with Google Earth imagery for the automated detection of historical landslides in typical loess regions of China [37]. Zeng et al. (2025) employed the IDNPM (InSAR Data–Newmark Physical Fusion Driver Model) to rapidly assess landslides induced by the 18 December 2023 Gansu Jishishan earthquake, enabling the swift and accurate delineation of macro-scale landslide distributions [38]. Du et al. (2023) integrated convolutional neural networks (CNNs) and Transformer architectures, adopting a DETR network with Transformer as the core framework for automated landslide detection [39]. Bai et al. (2024) developed a deep learning-based landslide extraction method using 1 m-resolution Google Earth imagery, where change features derived from the object-oriented robust change vector analysis (RCVA) served as model inputs; their approach enhanced detection accuracy through a U-Net model incorporating dense upsampling and asymmetric convolutions [40]. Yang et al. (2022) conducted landslide detection studies on loess landslides using GF-1 satellite imagery and DEM data, establishing a classified sample database of loess landslide remote sensing images and DEM features for the study area, and applied a channel-fused CNN model for landslide classification [41]. Other researchers have tested various models on the Bijie Landslide Database provided by Wuhan University [42]. Such research is often constrained either by the machine learning models themselves or by the use of lower-resolution satellite remote sensing imagery. This has led to a predominant focus on either single landslide hazards or landslide-prone areas within fixed regions. Consequently, neither landslide inventory data nor recognition accuracy metrics have improved substantially. Furthermore, while numerous deep learning models exist for landslide detection, most use training and inference data in which landslides exhibit a distinct contrast against vegetation-rich backgrounds. In contrast, this study focuses on landslide detection in loess tablelands, where minimal differentiation between landslides and their surroundings poses greater recognition challenges. Research on landslide object detection and segmentation in large-scale remote sensing imagery remains notably limited, particularly in the loess tableland area, where landslides range in size from approximately 0.3 km² to 20 km², exhibit diverse morphologies, and display clustering characteristics [43].

Existing public datasets predominantly feature 640 × 640 pixel dimensions, defining large targets as 160 × 160 pixels and small targets as ≤32 × 32 pixels. Baseline models with standard three-layer detection heads exhibit limited small-target detection capabilities, while vanilla CNN architectures using single-scale convolutional kernels inadequately extract irregular landslide boundaries, resulting in insufficient extraction precision, blurred edges, high false-positive rates, and slow processing speeds. To address these limitations, this study develops an enhanced model integrating multiple approaches. Leveraging 20 km² of centimeter-resolution UAV aerial imagery from the 2023 Gansu Jishishan Ms 6.2 earthquake in loess tablelands, we trained the model with large-target samples. Key enhancements include introducing a higher-resolution P2 detection head to significantly increase feature map resolution, enabling the effective coverage of 4–16 pixel micro-targets for improved small-landslide detection and thereby overcoming the original model’s deficiencies in small-target detection. Furthermore, our integrated workflow of mega-tile segmentation, landslide instance extraction, and tile mosaicking collectively elevates the timeliness and accuracy of landslide detection in loess tablelands.

2. Study Area and Data

2.1. Geological Setting

The loess tablelands along the northeastern margin of the Tibetan Plateau exhibit unique geomorphological features governed by intense tectonic activity from the ongoing Indian–Eurasian plate collision and thick Quaternary loess deposition, forming a distinctive “tectonic–sedimentary” coupled system [44]. Basement structures inherited from Cenozoic tectonics, including Paleozoic folded basements and Mesozoic basin-range structures, are amplified through differential weathering, creating an interlaced landscape of tablelands, ridges, and hills. Typical loess tablelands predominantly develop on paleoplains formed during tectonically stable periods, with their margins constrained by Cenozoic fault-generated scarps and bases often featuring fault facets of bedrock. This region exhibits “disproportionally severe damage given moderate earthquake magnitudes” due to loess-specific properties, where secondary hazards like seismic subsidence and landslides cause amplified losses. Recent studies revealed concealed activity of basement faults beneath loess tablelands; such dual-layer (loess-bedrock) systems are prone to dynamic destabilization under seismic waves, further complicating regional seismic risk assessments [45].

On 18 December 2023, an Ms 6.2 earthquake struck Jishishan County, Gansu Province, China, with a focal depth of 10 km. The seismic event caused extensive damage to infrastructure, including buildings, transportation networks, water conservancy facilities, and power systems. It also triggered numerous earthquake-induced geohazards, such as landslides, rockfalls, and mudflows resulting from sand liquefaction [46]. Compared to other earthquakes above Ms 6.0 in recent years, the Jishishan event resulted in significantly higher casualties. The epicenter was located in Liugou Township, approximately 8 km west of Jishishan County. The focal mechanism solution is shown in Figure 1 [47]. This earthquake occurred in the loess tableland area on the northeastern margin of the Tibetan Plateau. Characterized by thick soil layers and a relatively high elevation, this region exhibits a pronounced seismic amplification effect, leading to unusually widespread impacts. Historically, the area has been prone to geohazards. According to documented records, over 50 landslides occurred in Jishishan between 1989 and 2010, causing varying degrees of damage. Notable examples include a landslide in the construction zone of Baijiaping and Anjiaping Gully on 6 October 1989 that claimed four lives. On 1 August 1992, torrential rains triggered a landslide at Weijiayinshan Village, Zhangjia Village, Yinchuan Township, destroying 257 acres of farmland. On 3 September and 12 September 2007, slope failures at Yangshan Residential Area, Tuanjie New Village, Chuimatan Town caused the collapse of one house and two fatalities [48].

2.2. Data Acquisition and Processing

Data from the study area were acquired during an emergency scientific survey mission. Using a JOUAV CW-15 VTOL (Vertical Take-Off and Landing) fixed-wing UAV (Chengdu Crossbow Automation Technology Co., Ltd., Chengdu, China), we conducted aerial photography covering approximately 20 km² of severely affected landslide zones post-earthquake (Figure 2). The drone was equipped with a CA-100 orthophoto lens (42-megapixel resolution) and optimized batteries, enabling a flight endurance of 1.5 h per sortie. Based on the terrain features and elevation differences, the following settings were applied: 75% forward overlap, 65% side overlap, and a relative flight altitude of 310 m. After data acquisition, raw images were screened using specialized software (e.g., ContextCapture 4.4.10), followed by aerial triangulation processing. Ground control points (GCPs) were utilized for image registration, and a limited number of surface GCPs were employed in bundle adjustment to solve exterior orientation elements. Upon the completion of aerial triangulation, the outputs underwent quality inspection to detect geometric anomalies such as point cloud stratification, voids, outliers, and distortions. The refined point cloud data were then processed to generate a Digital Orthophoto Map (DOM) and Digital Surface Model (DSM). Image quality was influenced by multiple external factors, including the terrain topography, vegetation cover, wind direction during operation, flight altitude, photo overlap, and image count per zone. Notably, variable weather conditions (frequent rain/snow) and complex topography in the study area resulted in resolution heterogeneity. The final DOM achieved a spatial resolution of 3–8 cm/pixel. The acquired dataset comprises two sectors: Region 1 (sampling data collection zone) and Region 2 (testing research zone). Due to the extensive area of Region 2, data collection required two separate UAV sorties (Figure 2).

3. Methods

Following aerial data acquisition of the study area using a full-frame orthophoto camera mounted on a fixed-wing UAV, this study processed Region 1 imagery to establish an initial landslide dataset. Upon completion of the initial dataset, Generative Adversarial Networks (GANs) and data augmentation techniques were employed to expand the dataset volume. Through the creation of landslide labels, a final annotated landslide sample dataset was generated for training instance segmentation models. Concurrently, Region 2 imagery was processed into a DOM to serve as inference data for validating the landslide instance segmentation model. The workflow is illustrated in Figure 3.

In parallel, the native YOLO architecture in the ultralytics framework was enhanced by incorporating a shallow P2 detection head alongside the original P3, P4, and P5 detection heads. This modification improves recognition accuracy for small targets within ultra-large images. The refined YOLO pre-trained weights were integrated into the convolutional neural network (CNN) pipeline and, combined with the pre-constructed landslide sample dataset, used to train an instance segmentation model for landslide detection. Instance segmentation is a computer vision task that extends object detection: it not only localizes individual objects but also delineates their contours within an image [49,50]. Training was refined iteratively until an optimal segmentation model was obtained, and this model was then deployed for inference and testing on landslides in Region 2. The overall methodology comprises three phases: (1) landslide sample dataset construction; (2) architectural enhancement of the native YOLO model; and (3) inference validation using the trained landslide instance segmentation model. Post-training model evaluation, optimization, and iterative refinement are essential before final deployment.

3.1. YOLO and Baseline Models

This study employs YOLOv8 as the deep learning framework. The model’s core capability in detecting multi-scale objects is primarily governed by its feature pyramid architecture, which adheres to the principles of Feature Pyramid Networks (FPNs). Different hierarchical detection heads are responsible for targets within specific size ranges, with their effective coverage areas mathematically correlated to feature map resolutions (Figure 4). Taking a standard 640 × 640-pixel input image as an example, after five successive downsampling operations within the model, the highest-level P5 feature map attains a resolution of 20 × 20. At this stage, each unit in P5 covers a receptive field of approximately 32 × 32 pixels in the original image, theoretically enabling the detection of minimal targets around 32 pixels. The intermediate P4 layer primarily detects medium-sized targets ranging from 16 to 64 pixels. Closest to the input, the P3 layer handles relatively smaller targets typically sized between 8 and 32 pixels. However, this standard three-head structure (P3–P5) exhibits a limited capacity for capturing targets below 8 pixels, creating detection blind spots—a significant limitation of the native YOLOv8 architecture.
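The relationship between pyramid level, stride, and feature map size described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `pyramid_grid` is hypothetical, and it assumes that each pyramid level halves the spatial resolution, so that level P_k has a stride of 2^k pixels.

```python
from typing import Tuple


def pyramid_grid(input_size: int, level: int) -> Tuple[int, int]:
    """Return (stride, grid_size) for pyramid level P<level>.

    Assumes every level halves the resolution, so stride = 2 ** level.
    """
    stride = 2 ** level
    return stride, input_size // stride


# Standard three-head layout for a 640 x 640 input.
for level in (3, 4, 5):
    stride, grid = pyramid_grid(640, level)
    print(f"P{level}: stride {stride} px, feature map {grid} x {grid}")
```

Under this assumption, each P5 cell corresponds to a 32 × 32 pixel patch of the input, which is why objects much smaller than the stride of the finest available head are easily missed.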

3.2. Enhanced YOLO with Shallow Detection Heads

To overcome this limitation, this study implements a critical enhancement to YOLOv8’s shallow architecture by introducing an additional higher-resolution P2 detection head. This head employs specialized upsampling operations to integrate information from shallower feature layers through inter-module upsampling, concatenation, and optimized feature extraction, significantly boosting feature map resolution. The enhancement enables the P2 layer to effectively detect 4–16 pixel-sized targets, thereby alleviating the original model’s deficiency in small object recognition. The implemented architecture introduces a novel shallow-layer modification (indicated by the yellow region in Figure 5). This modification: (1) initiates feature extraction and downsampling at Module 18; (2) performs cross-level feature fusion at Module 20 using outputs from Module 15; (3) finally concatenates these fused features with the original Module 15 components to generate enhanced representations.

Specifically, deep network layers tend to lose the feature information of small targets during processing, so shallow-layer detail must be preserved and fused with deep semantic information. The feature map output by the earlier C2 layer of the backbone network has twice the resolution of the subsequent C3 layer and eight times that of the deepest C5 layer (strides of 4, 8, and 32 pixels, respectively, giving 160 × 160, 80 × 80, and 20 × 20 maps for a 640 × 640 input). This high resolution enables the C2 layer to retain finer image structures and contour details. However, the original YOLOv8 design constructs its feature pyramid solely using the C3, C4, and C5 layers, causing these critical details carried by C2 to be underutilized and rapidly diluted in subsequent processing.

The core innovation of our solution lies in establishing a bidirectional feature fusion pathway: (1) processing C2 features through convolution and normalization; (2) simultaneously upsampling C3 features; (3) merging the processed C2 features with upsampled C3 features at the pixel level; and (4) introducing a channel attention mechanism to adaptively balance the contributions of spatial details from C2 and semantic content from C3 in the fused features. The resulting P2 feature map preserves its native high-resolution advantage while incorporating a deeper semantic understanding, enabling the precise localization and effective recognition of minuscule targets. This hierarchical fusion strategy delivers dual benefits. Firstly, it creates bidirectional information exchange between shallow (C2) and deeper (C3) layers, substantially compensating for the spatial detail loss caused by repeated downsampling in deep networks. Secondly, the modular design of this enhancement allows direct integration into the standard YOLOv8 without extensive architectural modifications, ensuring seamless compatibility.
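The four fusion steps can be illustrated with a small NumPy sketch. This is not the network code used in the study: the function names, channel counts, random weights, the use of concatenation for the pixel-level merge, and the squeeze-and-excitation style gate standing in for the channel attention mechanism are all assumptions made for illustration.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution on a (C, H, W) array: mixes channels at each pixel."""
    return np.einsum("oc,chw->ohw", weight, x)


def normalize(x, eps=1e-5):
    """Per-channel standardisation, a stand-in for batch normalisation."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def channel_attention(x, weight):
    """Scale each channel by a gate in (0, 1) derived from its global average."""
    squeezed = x.mean(axis=(1, 2))                      # global average pool
    gate = 1.0 / (1.0 + np.exp(-(weight @ squeezed)))   # sigmoid
    return x * gate[:, None, None]


def fuse_p2(c2, c3, w_c2, w_att):
    c2_proc = normalize(conv1x1(c2, w_c2))              # (1) convolve + normalise C2
    c3_up = upsample2x(c3)                              # (2) upsample C3 to C2 size
    merged = np.concatenate([c2_proc, c3_up], axis=0)   # (3) pixel-aligned merge
    return channel_attention(merged, w_att)             # (4) rebalance channels


rng = np.random.default_rng(0)
c2 = rng.standard_normal((8, 160, 160))   # stride-4 features for a 640 px input
c3 = rng.standard_normal((16, 80, 80))    # stride-8 features
p2 = fuse_p2(c2, c3, rng.standard_normal((8, 8)), rng.standard_normal((24, 24)))
print(p2.shape)
```

The point of the sketch is the shape bookkeeping: the fused map keeps the 160 × 160 resolution of C2 while carrying channels from both levels, and the attention gate only rescales channels without changing resolution.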

3.3. Large-Scale Image Tile Staging Strategy

Tailored for geological remote sensing applications, this strategy implements intelligent segmentation processing of geocoded imagery for large-scale landslide instance segmentation (Figure 6). The workflow commences with initialization: creating output directories and loading raw GeoTIFF images, while retrieving critical metadata (image width, height, geotransform parameters, and projection information) through GDAL libraries. The core procedure calculates effective overlap pixels and step sizes based on predefined tile dimensions (e.g., 1200 × 1200 pixels) and overlap ratios (recommended 10–20%), where the X-direction step equals tile_width − int(tile_width × overlap_ratio), with analogous computation for the Y-direction.

An intelligent algorithm dynamically generates tile starting positions by systematically traversing image dimensions to determine tile origins at computed step intervals. Specialized edge case handling automatically adjusts the final tile position when residual areas exceed the tile dimensions but fall below the step thresholds, ensuring complete coverage. During tile execution, GDAL’s Translate function employs srcWin parameters to specify crop regions while preserving the original georeferencing coordinates and projection metadata, with LZW compression optimizing storage. Integrated tqdm progress bars provide real-time operational feedback, displaying current row/column positions and completion percentages and delivering intuitive monitoring for technicians.
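The step computation and tile origin traversal can be sketched in plain Python as follows. The function names are hypothetical, and the edge rule used here (appending one final tile flush with the image border whenever the regular grid stops short of it) is an assumption about how complete coverage is achieved, not a transcription of the study's code.

```python
def tile_origins(length, tile, overlap_ratio):
    """Start offsets along one axis so that tiles cover [0, length)."""
    if tile >= length:
        return [0]                      # a single (clipped) tile covers the axis
    step = tile - int(tile * overlap_ratio)
    origins = list(range(0, length - tile + 1, step))
    # Assumed edge handling: if the regular grid stops short of the border,
    # add one last tile aligned with the image edge.
    if origins[-1] + tile < length:
        origins.append(length - tile)
    return origins


def tile_windows(width, height, tile_w, tile_h, overlap_ratio):
    """Yield (x_off, y_off, x_size, y_size) windows, row by row."""
    for y in tile_origins(height, tile_h, overlap_ratio):
        for x in tile_origins(width, tile_w, overlap_ratio):
            yield (x, y, min(tile_w, width - x), min(tile_h, height - y))


# 1200 px tiles with 10% overlap give a step of 1200 - 120 = 1080 px.
print(tile_origins(3000, 1200, 0.1))
print(len(list(tile_windows(3000, 2000, 1200, 1200, 0.1))))
```

Each window is ordered as (x offset, y offset, width, height), which is the order expected by the `srcWin` option of GDAL's Translate mentioned above, so the tuples could be passed to the cropping step directly.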

This method delivers significant advantages for landslide detection. Georeferencing preservation maintains precise coordinates and projections in each tile to support the subsequent spatial analysis. Configurable overlap mechanisms (10–15% recommended) effectively prevent landslide body truncation. The streaming processing design eliminates large memory requirements while handling multi-GB imagery. Intelligent edge handling guarantees complete spatial coverage. For landslide applications, parameters should be set according to feature dimensions: 1024 × 1024 pixels are suitable for small-to-medium landslides, whereas large landslides require 2048 × 2048 tiles. Projection distortion issues in high-latitude regions require special attention.

Key limitations include GDAL I/O bottlenecks that slow the processing of large images (significant time costs for GB-scale data), 30–40% storage overhead from overlapping tiles, and a tile count that grows quadratically with image side length (that is, in proportion to image area). The method has been deployed in provincial-scale landslide inventories and post-disaster emergency responses. Future enhancements would involve multiprocessing parallelization (e.g., multiprocessing.Pool), cloud storage integration (AWS/Azure compatibility), GeoJSON tile-index generation, and adaptive tiling strategies based on topographic complexity to optimize performance and utility.
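The storage overhead and tile count can be estimated with simple arithmetic. In the sketch below (function names are hypothetical), each tile of side `tile` contributes roughly `step × step` new pixels, so the ratio of stored to unique pixels is approximately 1 / (1 − r)² for an overlap ratio r. This is an approximation that ignores border effects.

```python
import math


def overlap_overhead(overlap_ratio):
    """Approximate fraction of extra pixels stored because tiles overlap."""
    return 1.0 / (1.0 - overlap_ratio) ** 2 - 1.0


def tile_count(width, height, tile, overlap_ratio):
    """Number of tiles needed to cover a width x height image."""
    step = tile - int(tile * overlap_ratio)

    def per_axis(length):
        return max(1, math.ceil((length - tile) / step) + 1)

    return per_axis(width) * per_axis(height)


print(f"{overlap_overhead(0.15):.1%}")
print(tile_count(3000, 2000, 1200, 0.1), tile_count(6000, 4000, 1200, 0.1))
```

With a 15% overlap the estimate is about 38%, which falls inside the 30–40% range quoted above. Doubling both image dimensions in the example multiplies the tile count by four, so the number of tiles scales with image area.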

3.4. Dataset and Annotation

For deep learning applications, sample data must be partitioned into training, validation, and test datasets. The training data facilitate target feature learning, the validation data are used to select optimal models, and the test data are used to evaluate model performance. Establishing a robust sample library is fundamental to landslide detection via deep learning—sufficient high-quality samples account for over half the success in landslide recognition. Guided by principles of seismic emergency response and rapid disaster assessment, this study rapidly established a new sample database. Additionally, we employ GANs for multiscale data augmentation of initial landslide samples (Figure 7). Through adversarial training, diverse synthetic loess landslide samples are generated to resolve sample scarcity in earthquake-affected areas. As illustrated in Figure 1, aerial imagery covering ~6 km² in Yangwa Village, Liuji Township (proximal to the North Margin Fault of Laji Mountains) was acquired as foundational data (Region 1). To optimize network training, all image samples were cropped to 1024 × 1024 pixels based on an integrated analysis of the pixel dimensions and landslide sizes.

From this area, 200 images containing ~300 landslides were selected as core samples. Because deep learning depends on large datasets, offline augmentation (mirroring, rotation, flipping, and brightness adjustment) was applied to improve recognition stability and accuracy, expanding the landslide samples to 2198. The dataset was partitioned into training (1868 samples) and validation (330 samples) subsets, an approximately 85:15 split. To rigorously validate the method’s applicability for post-earthquake disaster assessments, the test set was strictly isolated: ~15 km² of aerial data covering Majia–Goujia Village in Liuji Township (Region 2) served as independent test samples (Table 1).
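A minimal sketch of the offline augmentation operations named above (mirroring, rotation, flipping, and brightness adjustment) is given below using NumPy. The function name, the choice of rotation angles, and the brightness gains of 0.8 and 1.2 are illustrative assumptions rather than the settings used in the study.

```python
import numpy as np


def augment(image):
    """Return simple variants of an H x W x C uint8 image, keyed by name."""
    variants = {
        "mirror": np.fliplr(image),       # left-right mirror
        "flip": np.flipud(image),         # top-bottom flip
        "rot90": np.rot90(image, k=1),    # 90 degree rotation in the image plane
        "rot180": np.rot90(image, k=2),   # 180 degree rotation
    }
    for name, gain in (("darker", 0.8), ("brighter", 1.2)):
        scaled = image.astype(np.float32) * gain
        variants[name] = np.clip(scaled, 0, 255).astype(np.uint8)
    return variants


tile = np.random.default_rng(0).integers(0, 256, size=(1024, 1024, 3), dtype=np.uint8)
for name, variant in augment(tile).items():
    print(name, variant.shape, variant.dtype)
```

For instance segmentation, the geometric variants also require the same transformation to be applied to the polygon labels; the brightness variants leave the labels unchanged.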

Landslide imagery was annotated using polygon labeling via the LabelMe tool (Figure 8), generating individual JSON files containing contour coordinates and categorical labels for each landslide instance. Due to format incompatibility between LabelMe JSON and YOLO requirements, conversion to YOLO instance segmentation format was implemented. This standardized format supports training mainstream segmentation models (e.g., YOLOv8-Seg) while offering lightweight processing and efficient parsing advantages. The workflow emphasizes tool adaptability (LabelMe → YOLO), technical execution (coordinate normalization), and quality control (visual verification), rendering it optimally suited for geohazard detection research in this study.
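The format conversion with coordinate normalization can be sketched as follows. The sketch assumes the usual LabelMe JSON fields (`imageWidth`, `imageHeight`, and `shapes` entries with `label`, `shape_type`, and `points`) and the YOLO segmentation label layout of one line per instance, `class_id x1 y1 x2 y2 ...`, with coordinates normalized to [0, 1]. The function name and the class mapping are hypothetical.

```python
import json


def labelme_to_yolo_seg(labelme, class_ids):
    """Convert one parsed LabelMe annotation into YOLO segmentation lines."""
    width = labelme["imageWidth"]
    height = labelme["imageHeight"]
    lines = []
    for shape in labelme["shapes"]:
        if shape.get("shape_type", "polygon") != "polygon":
            continue  # only polygons describe instance outlines
        coords = []
        for x, y in shape["points"]:
            # Normalise to [0, 1] and clamp points drawn slightly outside the image.
            coords.append(min(max(x / width, 0.0), 1.0))
            coords.append(min(max(y / height, 0.0), 1.0))
        class_id = class_ids[shape["label"]]
        lines.append(" ".join([str(class_id)] + [f"{c:.6f}" for c in coords]))
    return lines


example = json.loads("""
{"imageWidth": 1024, "imageHeight": 1024,
 "shapes": [{"label": "landslide", "shape_type": "polygon",
             "points": [[0, 0], [512, 0], [512, 256]]}]}
""")
print(labelme_to_yolo_seg(example, {"landslide": 0}))
```

In practice, each returned list would be written to a text file sharing the image's base name, and a visual overlay of the converted polygons on the image serves as the quality check mentioned above.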

3.5. Progressive Training Strategy

Progressive training represents a phased, hierarchical model optimization strategy. This approach employs a three-stage mechanism—freezing network layers, partially unfreezing parameters, and global fine-tuning—to optimally balance model stability with parametric adaptability. Unlike conventional full-network training where all layers update simultaneously, progressive training achieves precise gradient control, stable feature learning, and efficient resource utilization through stratified parameter activation. When applied to complex detection architectures like YOLO, it demonstrably accelerates convergence and enhances small-object detection accuracy. The methodology offers distinct advantages in training strategy design, mitigation of premature convergence, accelerated training efficiency, improved model robustness, performance optimization, broader application potential, and reduced memory footprint.

3.6. Batch Segmentation and Coordinate Extraction Technique

The workflow operates within a Python 3.8+ environment utilizing libraries including ultralytics, GDAL, numpy, Pillow, and opencv-python, covering the entire process from remote sensing image reading and preprocessing to object detection, segmentation visualization, and georeferenced result export. For image processing, the system handles diverse data types through a normalize_image function that standardizes pixel values to the 0–255 range for model compatibility. Geospatial integrity is preserved when processing TIFF files via GDAL, which maintains geotransform parameters and projection metadata throughout loading and saving operations. The architecture supports 8-bit, 16-bit, and floating-point remote sensing data, with critical functions like load_tiff and normalize_image incorporating robust exception handling for immediate error feedback. Automated normalization dynamically adapts to varying data ranges, minimizing manual intervention. Enhanced visualization capabilities generate intuitive segmentation outputs featuring adjustable bounding boxes, centroids, and masks, and customizable parameters like line width and font size for analytical clarity. Key technical challenges stem from the inherent characteristics of remote sensing data: multispectral bands, large volumes, and varied formats create significant preprocessing and model adaptation hurdles across sensor platforms. Additionally, accurate transformation of pixel coordinates to real-world geographic positions requires the precise application of geotransform parameters, where any deviation compromises geolocational fidelity.
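The text names a `normalize_image` function but does not give its body. The following is one plausible implementation, offered as a sketch under stated assumptions: linear min–max rescaling to 0–255, non-finite pixels (such as NaN nodata values) ignored when computing the range, and a constant image mapped to zeros.

```python
import numpy as np


def normalize_image(band):
    """Linearly rescale an array of any numeric dtype to uint8 in 0-255."""
    data = np.asarray(band, dtype=np.float64)
    finite = np.isfinite(data)
    if not finite.any():
        raise ValueError("image contains no finite pixel values")
    lo, hi = data[finite].min(), data[finite].max()
    if hi == lo:
        return np.zeros(data.shape, dtype=np.uint8)   # constant image
    scaled = (data - lo) / (hi - lo) * 255.0
    # Map NaN and infinities to the ends of the output range before casting.
    scaled = np.nan_to_num(scaled, nan=0.0, posinf=255.0, neginf=0.0)
    return np.clip(scaled, 0, 255).round().astype(np.uint8)


sixteen_bit = np.array([[0, 32768, 65535]], dtype=np.uint16)
print(normalize_image(sixteen_bit))
```

Min–max scaling is sensitive to outliers, so percentile-based stretching may be preferable for imagery with saturated or very dark pixels; that choice is left open here.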

4. Experimental Results

Built upon the YOLO framework, this research advances small-target detection capabilities through an optimized detection head, coupled with a processing pipeline of large-scale tiling, batch segmentation, and tile mosaicking. Integrated with progressive training, the method enables efficient annotation, training, detection, and refined segmentation of loess landslide data in the Jishishan seismic zone.

4.1. Model Training Outcomes

This study compares conventional training with a progressive training protocol, each spanning 500 epochs. The progressive strategy implements three distinct optimization phases: (1) foundation feature stabilization (epochs 0–150), in which full parameter freezing of the backbone network preserves pre-trained feature representations while mitigating early-stage overfitting; (2) mid-level feature tuning (epochs 151–350), in which selective unfreezing of intermediate backbone layers optimizes feature extraction, balancing localization and classification performance; and (3) global feature refinement (epochs 351–500), in which complete network unfreezing (100% trainable parameters) enhances small-target detection capacity with overfitting controls. The quantitative evaluation showed that progressive training improved accuracy metrics: mAP50(M) increased from 0.704 (conventional) to 0.747 (progressive, a 6.1% relative improvement), while mAP50-95(M) rose from 0.413 to 0.468 (+13.3%). Inference speed decreased marginally from 45 FPS (conventional) to 43 FPS (−4.4%), and both methodologies produced identical model sizes (87 MB). Overall, progressive training delivers substantial accuracy gains with a negligible computational trade-off and no change in model size (Table 2).
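The three-phase schedule can be written down as a small lookup function. The epoch boundaries follow the text; the numbers of frozen layers are assumptions for illustration only (a 10-module backbone, with half of it released in the second phase), since the text does not state which layers are unfrozen at each stage.

```python
BACKBONE_LAYERS = 10   # assumed: the backbone occupies the first 10 modules
PARTIAL_FREEZE = 5     # assumed: half of the backbone stays frozen in phase 2


def training_phase(epoch):
    """Map an epoch index (0-500) to (phase name, number of frozen layers)."""
    if not 0 <= epoch <= 500:
        raise ValueError("epoch outside the 500-epoch schedule")
    if epoch <= 150:
        return "foundation feature stabilization", BACKBONE_LAYERS
    if epoch <= 350:
        return "mid-level feature tuning", PARTIAL_FREEZE
    return "global feature refinement", 0


for epoch in (0, 150, 151, 350, 351, 500):
    print(epoch, training_phase(epoch))
```

One way such a schedule might be realized with the ultralytics framework is to run three consecutive training stages, each resuming from the previous stage's weights and passing the corresponding value to the `freeze` training argument. Whether the study used this mechanism is not stated.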

Box_loss rapidly converged from an initial 2.0 to below 1.0, indicating a stable improvement in landslide localization. It stabilized after 400 epochs (Figure 9A). Seg_loss dropped significantly from 5.0 to approximately 1.0, demonstrating effective edge segmentation optimization with minimal late-stage fluctuation (Figure 9B). Cls_loss steadily converged below 0.5, confirming enhanced landslide/non-landslide classification accuracy (Figure 9C). Dfl_loss smoothly decreased below 0.5, validating the reliable bounding box probability distribution prediction (Figure 9D). mAP50(B) (object detection, IoU = 0.5) reached 0.6 with a rising trend, reflecting a high landslide detection accuracy (Figure 9E). mAP50-95(B) (IoU = 0.5–0.95) progressively improved to 0.4, showing robust performance across localization precision requirements (Figure 9F). mAP50(M) (instance segmentation, IoU = 0.5) approached 0.6, confirming strong contour segmentation alignment with the ground truth (Figure 9G). mAP50-95(M) (IoU = 0.5–0.95) achieved 0.4 without saturation, suggesting potential for refinement in complex landslide segmentation (Figure 9H). The train_seg_loss descended to a stable low plateau after 500 epochs, confirming effective feature learning and training data fitting. Despite this, persistent overfitting risks necessitate complementary validation; while the val_seg_loss demonstrated robust initial decline and gradual stabilization, indicating a sound generalization capability, a comprehensive evaluation requires corroboration with validation mAP metrics to ensure field deployment reliability.
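The mask metrics above rest on the intersection over union (IoU) between a predicted mask and its ground-truth mask. The sketch below shows the mask IoU for a single pair; it is a simplified illustration with hypothetical names and toy masks, not the full mAP computation, which additionally matches predictions to ground truth and averages precision over recall levels.

```python
import numpy as np


def mask_iou(pred, truth):
    """Intersection over union of two boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, truth).sum() / union)


# Toy example: a 4 x 4 ground-truth block and a prediction shifted by one column.
truth = np.zeros((6, 6), dtype=bool)
truth[1:5, 1:5] = True
pred = np.zeros((6, 6), dtype=bool)
pred[1:5, 2:6] = True

iou = mask_iou(pred, truth)
print(iou, iou >= 0.5, iou >= 0.75)
```

A one-column shift already lowers the IoU enough to fail the stricter thresholds, which is consistent with mAP50-95 being noticeably lower than mAP50 for objects with irregular boundaries.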

The convergence of all four losses within 500 epochs confirms robust training stability. With mAP50(B/M) > 0.6 and mAP50(M) = 0.747, the algorithm demonstrates exceptional competence in detecting and segmenting seismic loess landslides. The instance segmentation score (mAP50-95(M) = 0.468), while lower than detection metrics, reflects characteristic challenges in delineating irregular geological boundaries. This approach thus enables rapid, accurate landslide detection in loess tablelands—critical for emergency geohazard responses.

4.2. Ablation Experiments

To evaluate how individual components affect small-target detection performance in YOLOv8, we conducted ablation experiments. Starting from the baseline YOLOv8n, we added the P2 small-target detection head and assessed its effect alone and in combination with progressive training. The P2 detection head enhances small-target feature representation by adding a higher-resolution detection layer. Under progressive training, it improved mAP@0.5 by 8.4% and mAP@0.5:0.95 by 16.1% relative to the baseline and raised small-target recall by 17.4%, at an additional computational cost of 2.5 GFLOPs (from 7.6 to 10.1 GFLOPs). Despite the increase in parameters and computation, the accuracy gains were substantial. These findings demonstrate the practical value of integrating shallow detection heads for small-target tasks and provide empirical evidence for balancing real-time performance and accuracy (Table 3).
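The relative gains quoted for the P2 head follow directly from the Table 3 rows trained with the progressive strategy. The short check below reproduces that arithmetic; the helper name `relative_gain` and the dictionary layout are illustrative choices, while the numbers are copied from Table 3.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Percentage change of `improved` relative to `baseline`."""
    return (improved - baseline) / baseline * 100.0


# Table 3, progressive-training rows: YOLOv8 vs. YOLOv8 + P2.
baseline = {"mAP@0.5": 0.689, "mAP@0.5:0.95": 0.403, "recall": 0.69, "GFLOPs": 7.6}
with_p2 = {"mAP@0.5": 0.747, "mAP@0.5:0.95": 0.468, "recall": 0.81, "GFLOPs": 10.1}

# Relative gains in percent, rounded to one decimal as in the text.
gains = {
    key: round(relative_gain(baseline[key], with_p2[key]), 1)
    for key in ("mAP@0.5", "mAP@0.5:0.95", "recall")
}
# The computational cost is reported as an absolute difference in GFLOPs.
extra_gflops = round(with_p2["GFLOPs"] - baseline["GFLOPs"], 1)

print(gains)         # {'mAP@0.5': 8.4, 'mAP@0.5:0.95': 16.1, 'recall': 17.4}
print(extra_gflops)  # 2.5
```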

4.3. Landslide Segmentation Results

Geospatial data processing and visualization utilize GDAL to precisely read image geotransform parameters and projection metadata. Pixel-to-geographic coordinate conversion is achieved via affine transformation with sub-pixel accuracy (<0.5 pixel error). Landslide visualization employs OpenCV contour analysis to calculate mask centroids, and then applies alpha blending (addWeighted) to fuse green masks onto the original imagery. All outputs retain original georeferencing. The key innovation is an adaptive rendering engine that intelligently processes 8- to 16-bit imagery while preserving native coordinate systems and projections. It supports both single-band grayscale and multiband color images, leveraging OpenCV hardware acceleration for high-throughput processing. This ensures professional-grade precision for geohazard monitoring and visualization (Figure 10).
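The pixel-to-geographic conversion and the centroid calculation can be illustrated without any geospatial library. The sketch below assumes the six-element geotransform convention used by GDAL (origin x, pixel width, row rotation, origin y, column rotation, pixel height) and computes a polygon centroid with the shoelace formula as a stand-in for OpenCV contour moments. The function names and the example geotransform values are hypothetical and are not taken from the study's code.

```python
from typing import Sequence, Tuple

GeoTransform = Tuple[float, float, float, float, float, float]
Point = Tuple[float, float]


def pixel_to_geo(gt: GeoTransform, col: float, row: float) -> Point:
    """Apply the affine geotransform to a (col, row) pixel position.

    (0, 0) maps to the top-left corner of the top-left pixel;
    add 0.5 to col and row to address pixel centers.
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def polygon_centroid(points: Sequence[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # 6 * area == 3 * twice_area
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical north-up raster: 0.5 m pixels, origin at (100, 200).
example_gt: GeoTransform = (100.0, 0.5, 0.0, 200.0, 0.0, -0.5)
# Hypothetical mask outline in pixel coordinates (col, row).
outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

centroid_px = polygon_centroid(outline)
centroid_geo = pixel_to_geo(example_gt, *centroid_px)
print(centroid_px, centroid_geo)
```

For the overlay step, OpenCV's `addWeighted` computes `src1 * alpha + src2 * beta + gamma` per pixel, so a green mask image blended with the original tile at complementary weights gives the semi-transparent highlight described above.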

Following batch instance segmentation and coordinate extraction, the massive volume of tiled TIFF files was consolidated using ArcMap’s Mosaic To New Raster tool (Version 10.8.1). Through parameter optimization, including coordinate system adjustment, pixel type specification, configuration of the number of bands, and mosaic colormap mode selection, a seamless 15 km² orthorectified DOM product was generated, as illustrated in the accompanying figure. Identified landslide centroids were systematically stored in detected_cords.txt.

Statistical analysis of the processed tiles revealed 417 detected landslides, with orthographic areas ranging from 0.217 km² (maximum) to 7 × 10⁻⁴ km² (minimum). Landslide areas were predominantly distributed within the 0–0.02 km² range (aggregate area: 1.3 km²), confirming the prevalence of small-scale landslides. Visual verification identified eighteen false positives (false discovery rate: 4.3%) and seven missed/partial detections (omission rate: 1.6%). These accuracy metrics align with Chen et al.’s findings derived from GaoFen-1/6 imagery in the same seismic context [48].

The study area comprises three distinct sectors: northern Goujia Village, central Majia Village, and southern Liugou Township (Figure 11A). The landslide distribution in northern Goujia exhibits marked spatial clustering, with a high density per unit area and frequent coalescence into contiguous failure zones. This aggregation likely originates from homogeneous geological conditions, topographic constraints, and concentrated seismic shaking intensities. These clustered failures appear as large-scale morphological features in medium–high-resolution imagery, predominantly exhibiting arcuate (horseshoe-shaped), dendritic, fan-shaped, or elongated tongue-like configurations that extend considerable distances downslope. Linear failures are frequently observed along gully systems as a distinct failure pattern (Figure 13B). Particularly near Hongtuwa, coalescing landslides form expansive complexes exhibiting significant secondary hazards. In central Majia Village, landslides predominantly concentrate along loess platform margins and adjacent gullies (Figure 11B), where seismic amplification destabilizes slope toes. Steep escarpments (>45°) demonstrate heightened susceptibility due to amplified inertial forces during ground shaking. Anthropogenic activities further exacerbate risks: improper irrigation and slope excavations trigger failures along modified terraces and cultivated field edges (Figure 11C), with 68% of landslides occurring within 200 m of human-altered terrain. Southern Liugou features dispersed, small-scale landslides concentrated near highway tunnel portals traversing the village. These pose critical secondary hazards requiring urgent mitigation, as evidenced by debris flow channels extending toward residential clusters.

The analysis reveals three defining characteristics of landslides in loess tableland terrain. Spatially, failures concentrate densely along river valleys and gullies, demonstrating strong topographic and structural control. Morphologically, diverse failure types, including shear-driven, liquefaction-induced, and seismic subsidence landslides, exhibit distinct slip surface geometries and triggering mechanisms. Temporally, synchronized multi-point failures occur during seismic events, predominantly generating small-to-medium landslides (hundreds to thousands of square meters), though large-scale failures (>10⁴ m²) occasionally cause significant damage. By comparison, loess collapses and slumps are typically smaller (tens to hundreds of square meters).

5. Discussion

5.1. Feasibility Analysis of the Small-Target Segmentation Strategy

Through 3D model verification and field investigations, 18 false positives were identified and categorized into three primary error types: (1) topographic shadow misclassification, in which shadows cast by steep ridge tops mimic landslide scarps in optical imagery and lead to erroneous delineation (Figure 12A); (2) complex texture artifacts, in which terraced fields, erosional features, and ridge intersections form pseudo-circular or arcuate boundaries resembling landslide crowns (Figure 12B); and (3) anthropogenic feature confusion, in which haystack clusters on artificially modified earthen embankments resemble landslide morphology (Figure 12C). Additional limitations include partial segmentations (Figure 12D) and minor omissions (Figure 12E). Notably, 14 loess collapse features were classified as landslides (Figure 12F and Figure 13D). Because this study focuses on the rapid regional assessment of seismic hazards, and collapses are themselves critical secondary geohazards in loess terrain, these were excluded from the false positive tally. The fixed tile size (1200 × 1200 pixels) caused segmentation discontinuities at tile margins without compromising recognition of the landslides as a whole. In three instances a single landslide body was split into multiple parts, yielding nine segments in total in the detected_cords.txt file; these segments were retained for the convenience of statistical analysis and subsequent work (Figure 12G). Importantly, the model discriminated landslides from spectrally similar features: rural earthen roads were never misclassified (Figure 12H), which supports the quality of the training samples and the precision of recognition.
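The tile-boundary effect described above can be made concrete with a small sketch. It assumes a regular, non-overlapping grid of 1200 × 1200 pixel tiles, which is a simplification since the tiling details are not restated in this section, and the function names `tile_windows` and `tiles_touched` are hypothetical. A landslide whose bounding box touches more than one tile is a candidate for being split into several segments.

```python
from typing import Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (col_off, row_off, width, height)
BBox = Tuple[int, int, int, int]    # (xmin, ymin, xmax, ymax), max exclusive


def tile_windows(width: int, height: int, tile: int = 1200) -> Iterator[Window]:
    """Yield non-overlapping tile windows covering a width x height raster."""
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            yield (
                col_off,
                row_off,
                min(tile, width - col_off),   # edge tiles may be narrower
                min(tile, height - row_off),  # edge tiles may be shorter
            )


def tiles_touched(bbox: BBox, tile: int = 1200) -> List[Tuple[int, int]]:
    """Return (tile_col, tile_row) indices overlapped by a pixel bounding box."""
    xmin, ymin, xmax, ymax = bbox
    cols = range(xmin // tile, (xmax - 1) // tile + 1)
    rows = range(ymin // tile, (ymax - 1) // tile + 1)
    return [(c, r) for r in rows for c in cols]


windows = list(tile_windows(2500, 1300))
inside_one_tile = tiles_touched((10, 10, 200, 200))
across_border = tiles_touched((1100, 300, 1300, 500))
print(len(windows), inside_one_tile, across_border)
```

Segments whose source boxes share a tile border could then be grouped after mosaicking if a single record per landslide is needed; the study instead keeps the separate segments.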

Beyond the documented errors, the overall recognition success rate reached 94.1%. Leveraging high-resolution UAV data, the precise instance segmentation training proved highly effective; nearly all detected landslides represent neogenic failures characterized by short formation times, high spectral reflectance, and complete slope disintegration. Conversely, relict landslides exhibiting spectral homogeneity with their surroundings and anthropogenic modification (e.g., conversion to farmland/villages) were never misidentified, with all false positives attributable to the previously outlined categories.

The model achieves a processing speed exceeding 40 FPS with minute-level latency for batch operations while delivering high-precision performance, as evidenced by the mAP and recall metrics meeting benchmark standards for superior models, alongside a 94.1% recognition rate. This validates the proposed small-target segmentation strategy’s capability for rapid, intelligent detection and accurate delineation of landslides in the study area.

5.2. Characteristics of Landslide Hazards in Loess Tableland Areas

Morphological and genetic characteristics of loess tableland landslides were elucidated through high-precision modeling and field validation. Statistical analysis reveals near-vertical scarps (70–90°) controlled by well-developed vertical joints in loess (Figure 13A,D), with diverse failure morphologies, including arcuate (horseshoe-shaped), dendritic, fan-shaped, and elongated tongue-like forms, exhibiting considerable downslope extents—particularly along gullies (Figure 11B,C)—alongside armchair-shaped failures featuring curved main scarps and lateral confinement. Runout distances vary substantially (meters to hundreds of meters) and are influenced by the scale, gradient, and water content, with high mobility characterizing saturated loess flows [51,52]. Collectively, the landslide distribution exhibits spatial heterogeneity across regions, though most areas demonstrate pronounced clustering characterized by a high density per unit area where numerous failures occur in proximal distribution.

Figure 13. Field validation findings of landslides. (A) Landslide mass controlled by well-developed vertical joints in loess; (B) Gully slope landslide; (C) Extensive loess landslide mass; (D) Landslide mass controlled by well-developed vertical joints in loess.

5.3. Optimization Recommendations and Future Work

Our experimental observations reveal an inherent constraint: limited landslide occurrences within the initial study area necessitate spatial expansion for larger-scale analysis. However, broadening the investigation scope introduces greater morphological diversity among landslides and consequently increases extraction complexity. To address this, we implement an incremental detection strategy. Models are first trained on smaller sub-regions and then applied to adjacent areas for preliminary detection. Newly identified landslides are iteratively incorporated into the training set, progressively enriching sample diversity while refining model performance. This cyclic optimization enables gradual expansion across the loess terrain until full coverage is achieved. Furthermore, the introduction of the P2 detection head in YOLOv8 significantly enhanced small-landslide detection, improving mAP@0.5 by 8.4% and mAP@0.5:0.95 by 16.1% under progressive training; however, it also increased the computational overhead by 15–30% and the model complexity, potentially hindering edge deployment. Our forthcoming solution incorporates a Feature Enhancement Module (FEM) that addresses these limitations through multi-scale feature fusion and channel attention mechanisms. This upgrade inserts a triple-branch structure at the end of the backbone, employing dilated convolutions for large-receptive-field geological features, depthwise separable convolutions for small-target texture optimization, and channel attention to amplify landslide-sensitive features, with weighted fusion delivering the processed features to the detection heads. The FEM demonstrably improves landslide edge feature preservation, boosts P2-layer small-target recall by ≥10%, resolves feature dilution with minimal computational impact, and effectively suppresses vegetation/bare soil interference in remote sensing imagery through attention mechanisms, notably enhancing geological texture discrimination.
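The incremental detection strategy described earlier in this subsection amounts to a train, detect, and augment loop. The sketch below shows only that control flow under stated assumptions: `train`, `detect`, and `verify` are hypothetical callables supplied by the user, and the explicit `verify` step (for example, a manual check of new detections before they become training samples) is an added assumption rather than something the text specifies.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Sample = TypeVar("Sample")
Model = TypeVar("Model")
Region = TypeVar("Region")


def incremental_detection(
    seed_samples: Sequence[Sample],
    regions: Sequence[Region],
    train: Callable[[List[Sample]], Model],
    detect: Callable[[Model, Region], List[Sample]],
    verify: Callable[[Sample], bool],
) -> Tuple[Model, List[Sample]]:
    """Expand region by region, retraining on confirmed detections each time."""
    samples = list(seed_samples)
    model = train(samples)                  # initial model from the seed sub-region
    for region in regions:                  # adjacent areas, visited in order
        candidates = detect(model, region)  # preliminary detection
        samples.extend(c for c in candidates if verify(c))
        model = train(samples)              # refine with the enriched sample set
    return model, samples


if __name__ == "__main__":
    # Toy stand-ins: the "model" is just the number of samples it was trained on.
    final_model, final_samples = incremental_detection(
        seed_samples=["seed"],
        regions=["r1", "r2"],
        train=lambda s: len(s),
        detect=lambda m, r: [f"{r}-0", f"{r}-1"],
        verify=lambda c: c.endswith("0"),
    )
    print(final_model, final_samples)
```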

This study employs a large-scale tiling–landslide instance segmentation–tile mosaicking strategy to achieve the rapid and precise detection and segmentation of seismic landslides across extensive loess tableland areas. In practical applications, the instance segmentation results enable the retrieval of quantitative parameters for earthquake emergency responses. For instance, retrieving quantitative parameters such as landslide and deposit volumes from limited post-earthquake data facilitates the timely compilation of landslide inventories in critical zones and reveals spatial distribution patterns. Such inventories support a comprehensive assessment of seismic impacts on surface morphology, topography, and stability, providing accurate information to enhance rescue efficiency. They also quantify landslide damage severity to inform scientific reconstruction planning, guiding rational land use and engineering development. The approach further enriches landslide research databases, supplying foundational data for investigating failure mechanisms and kinematic behaviors and thereby enabling more accurate risk prediction models. Ultimately, this reduces landslide hazards while safeguarding lives, property, and geological stability.

6. Conclusions

The Ms 6.2 Jishishan earthquake in Gansu Province struck the loess tableland area at the northeastern margin of the Tibetan Plateau, inducing severe geohazards. To identify and assess landslides in the affected zone, this study proposes a rapid detection and segmentation method that integrates enhanced deep learning algorithms with a large-scale tiling–landslide instance segmentation–tile mosaicking strategy, and applies it to landslide-prone areas encompassing Yangwa Village and the Goujia–Majia–Liugou sector, totaling approximately 20 km². The technical integrity and feasibility of this methodology are empirically demonstrated. The enhanced deep learning model shows significantly improved feature representation for small targets, achieving simultaneous convergence of all four loss functions within a 500-epoch progressive training strategy. Performance metrics show an mAP50(B/M) exceeding 0.6 and an mAP50(M) reaching 0.747, confirming the detection and segmentation efficacy for loess landslides triggered by the Jishishan earthquake. While the instance segmentation metric (mAP50-95(M) = 0.468) underperforms detection benchmarks, it is consistent with the difficulty of segmenting the irregular boundaries characteristic of geological hazards. Validation via high-precision 3D model comparison and visual interpretation confirms the approach's detection accuracy and segmentation precision. Limited false positives and omissions persist. Nevertheless, the need for efficient post-earthquake geohazard investigation and precise loess landslide recognition is growing amid escalating disaster prevention demands, the limitations of traditional methods, and rapid AI advancements. This work not only expands seismic landslide inventories in loess tablelands but also provides novel technical frameworks for future earthquake emergency responses. The improved model offers a viable optimization strategy for deep learning-based rapid landslide detection and segmentation, demonstrating significant potential for machine learning in geohazard research.

Author Contributions

Z.B. (Zhuoli Bai), H.T., and L.J. conceived this research. Z.B. (Zhuoli Bai), J.Q., and Z.B. (Zongpan Bian) collected the landslide dataset and the UAV data. Z.B. (Zhuoli Bai) processed the data and wrote the manuscript. S.K. and C.L. assisted with the relevant data processing. All authors have read and agreed to the published version of the manuscript.

Funding

The authors declare that financial support was received for the research and publication of this article. This research was funded by the Natural Science Foundation of Shaanxi Province, China (funder: Zhuoli Bai, funding number: 2025JC-YBQN-463), National Natural Science Foundation of China (funder: Lingyun Ji, funding number: 42474013), Spark Program of Earthquake Sciences granted by the China Earthquake Administration (funder: Shuai Kang, funding number: XH25054YA), Key Programme of Earthquake Emergency for Youths, China Earthquake Administration (funder: Zhuoli Bai, funding number: CEAEDEM-20250219), and the Project of the Director’s Fund of Inner Mongolia Regional Seismological Bureau (funder: Baoxiao Bao, funding number: 2024QN11).

Data Availability Statement

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the Funding statement. This change does not affect the scientific content of the article.

References

1. Catani, F. Landslide detection by deep learning of non-nadiral and crowdsourced optical images. Landslides 2021, 18, 1025–1044.
2. Hua, Y.; Wang, X.; Li, Y.; Xu, P.; Xia, W. Dynamic development of landslide susceptibility based on slope unit and deep neural networks. Landslides 2021, 18, 281–302.
3. Qiu, H.; Su, L.; Tang, B.; Yang, D.; Ullah, M.; Zhu, Y.; Kamp, U. The effect of location and geometric properties of landslides caused by rainstorms and earthquakes. Earth Surf. Process. Landf. 2024, 49, 2067–2079.
4. Ye, B.; Qiu, H.; Tang, B.; Liu, Y.; Liu, Z.; Jiang, X.; Yang, D.; Ullah, M.; Zhu, Y.; Kamp, U. Creep deformation monitoring of landslides in a reservoir area. J. Hydrol. 2024, 632, 130905.
5. Xu, C.; Xu, X.; Yao, X.; Dai, F. Three (Nearly) Complete Inventories of Landslides Triggered by the May 12, 2008 Wenchuan Mw 7.9 Earthquake of China and Their Spatial Distribution Statistical Analysis. Landslides 2014, 11, 441–461.
6. Xu, C.; Tian, Y.; Ma, S.; Xu, X.; Zhou, B.; Wu, X.; Zhuang, J.; Gao, Y. Inventory and spatial distribution of landslides in IX-XI high intensity areas of 1920 Haiyuan (China) M8.5 earthquake. J. Eng. Geol. 2018, 26, 1188–1195.
7. Lan, H.X.; Li, L.P.; Zhang, Y.S.; Gao, X.; Liu, H.J. Risk Assessment of Debris flow in Yushu Seismic Area in China: A Perspective for the Reconstruction. Nat. Hazards Earth Syst. Sci. 2013, 13, 2957–2968.
8. Booth, A.M.; Lamb, M.P.; Avouac, J.; Delacourt, C. Landslide velocity, thickness, and rheology from remote sensing: La Clapière landslide, France. Geophys. Res. Lett. 2013, 40, 4299–4304.
9. Ciampalini, A.; Raspini, F.; Bianchini, S.; Frodella, W.; Bardi, F.; Lagomarsino, D.; Di Traglia, F.; Moretti, S.; Proietti, C.; Pagliara, P.; et al. Remote sensing as tool for development of landslide databases: The case of the Messina Province (Italy) geodatabase. Geomorphology 2015, 249, 103–118.
10. Liu, Z.; Qiu, H.; Zhu, Y.; Liu, Y.; Yang, D.; Ma, S.; Zhang, J.; Wang, Y.; Wang, L.; Tang, B. Efficient identification and monitoring of landslides by time-series InSAR combining single and multi-look phases. Remote Sens. 2022, 14, 1026.
11. Wang, X.; Fan, X.; Xu, Q.; Du, P. Change detection-based co-seismic landslide mapping through extended morphological profiles and ensemble strategy. ISPRS J. Photogramm. Remote Sens. 2022, 187, 225–239.
12. Ma, S.; Qiu, H.; Zhu, Y.; Yang, D.; Tang, B.; Wang, D.; Wang, L.; Cao, M. Topographic changes, surface deformation and movement process before, during and after a rotational landslide. Remote Sens. 2023, 15, 662.
13. Yang, D.; Qiu, H.; Ye, B.; Liu, Y.; Zhang, J.; Zhu, Y. Distribution and recurrence of warming-induced retrogressive thaw slumps on the central Qinghai-Tibet Plateau. J. Geophys. Res. Earth Surf. 2023, 128, e2022JF007047.
14. Liu, Y.; Qiu, H.; Kamp, U.; Wang, N.; Wang, J.; Huang, C.; Tang, B. Higher temperature sensitivity of retrogressive thaw slump activity in the Arctic compared to the Third Pole. Sci. Total. Environ. 2024, 914, 170007.
15. Xu, J.; Liu-Zeng, J.; Yuan, Z.; Yao, W.; Zhang, J.; Ji, L.; Shao, Z.; Han, L.; Wang, Z. Airborne LiDAR-Based Mapping of Surface Ruptures and Coseismic Slip of the 1955 Zheduotang Earthquake on the Xianshuihe Fault, East Tibet. Bull. Seism. Soc. Am. 2022, 112, 3102–3120.
16. Zhao, N.; Ji, L.; Zhang, W.; Xu, X.; Wang, J. Present-day kinematics and seismic potential of the Ganzi-Yushu fault, eastern Tibetan plateau, constrained from InSAR. Front. Earth Sci. 2023, 11, 1123711.
17. Zhao, Q.; Jiang, F.; Zhu, L.; Xu, J. Synthetic aperture radar interferometry–based coseismic deformation and slip distribution of the 2022 Menyuan MS6.9 earthquake in Qinghai, China. Geodesy Geodyn. 2023, 14, 541–550.
18. Kang, S.; Jiao, Q.; Ji, L.; Zeng, Y.; Chen, C. Application of high-precision terrestrial light detection and ranging to determine the dislocation geomorphology of Yumen Fault, China. Front. Remote Sens. 2025, 6, 1566077.
19. Liu, P.; Wei, Y.; Wang, Q.; Chen, Y.; Xie, J. Research on postearthquake landslide extraction algorithm based on improved U-net model. Remote Sens. 2020, 12, 894.
20. Zhang, P.; Xu, C.; Ma, S.; Shao, X.; Tian, Y.; Wen, B. Automatic extraction of seismic landslides in large areas with complex environments based on deep learning: An example of the 2018 Iburi earthquake, Japan. Remote Sens. 2020, 12, 3992.
21. Su, Z.; Chow, J.K.; Tan, P.S.; Wu, J.; Ho, Y.K.; Wang, Y.-H. Deep convolutional neural network–based pixel-wise landslide inventory mapping. Landslides 2021, 18, 1421–1443.
22. Wang, L.; Qiu, H.; Zhou, W.; Zhu, Y.; Liu, Z.; Ma, S.; Yang, D.; Tang, B. The post-failure spatiotemporal deformation of certain translational landslides may follow the pre-failure pattern. Remote Sens. 2022, 14, 2333.
23. Xu, Q.; Ouyang, C.; Jiang, T.; Yuan, X.; Fan, X.; Cheng, D. MFFENet and ADANet: A robust deep transfer learning method and its application in high precision and fast cross-scene recognition of earthquake-induced landslides. Landslides 2022, 19, 1617–1647.
24. Mahdianpari, M.; Salehi, B.; Rezaee, M.; Mohammadimanesh, F.; Zhang, Y. Very deep convolutional neural networks for complex land cover mapping using multispectral remote sensing imagery. Remote Sens. 2018, 10, 1119.
25. Ghorbanzadeh, O.; Shahabi, H.; Crivellari, A.; Homayouni, S.; Blaschke, T.; Ghamisi, P. Landslide detection using deep learning and object-based image analysis. Landslides 2022, 19, 929–939.
26. Meena, S.R.; Soares, L.P.; Grohmann, C.H.; van Westen, C.; Bhuyan, K.; Singh, R.P.; Floris, M.; Catani, F. Landslide detection in the Himalayas using machine learning algorithms and U-Net. Landslides 2022, 19, 1209–1229.
27. Jin, B.; Ye, P.; Zhang, X.; Song, W.; Li, S. Object-oriented method combined with deep convolutional neural networks for land-use type classification of remote sensing images. J. Indian Soc. Remote Sens. 2019, 47, 951–965.
28. Akter, R.; Doan, V.-S.; Lee, J.-M.; Kim, D.-S. CNN-SSDI: Convolution neural network inspired surveillance system for UAVs detection and identification. Comput. Netw. 2021, 201, 108519.
29. Shi, W.; Zhang, M.; Ke, H.; Fang, X.; Zhan, Z.; Chen, S. Landslide recognition by deep convolutional neural network and change detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4654–4672.
30. Lei, T.; Zhang, Y.; Lv, Z.; Li, S.; Liu, S.; Nandi, A.K. Landslide inventory mapping from bitemporal images using deep convolutional neural networks. IEEE Geosci. Remote Sens. Lett. 2019, 16, 982–986.
31. Huang, L.; Luo, J.; Lin, Z.; Niu, F.; Liu, L. Using deep learning to map retrogressive thaw slumps in the Beiluhe Region (Tibetan Plateau) from cubesat images. Remote Sens. Environ. 2019, 237, 111534.
32. Qi, W.; Wei, M.; Yang, W.; Xu, C.; Ma, C. Automatic mapping of landslides by the ResU-Net. Remote Sens. 2020, 12, 2487.
33. Ghorbanzadeh, O.; Crivellari, A.; Ghamisi, P.; Shahabi, H.; Blaschke, T. A comprehensive transferability evaluation of U-Net and ResU-Net for landslide detection from sentinel-2 data (case study areas from Taiwan, China, and Japan). Sci. Rep. 2021, 11, 14629.
34. Xin, L.B.; Han, L.; Li, L.Z. Landslide Intelligent Recognition Based on Multi-source Data Fusion. J. Earth Sci. Environ. 2023, 45, 920–928.
35. Fu, X.; Guo, J.W.; Liu, X.J.; Lu, H.; Yang, Z.L.; Xiang, X. Method of earthquake landslide information extraction based on high resolution unmanned aerial vehicle images. J. Seismol. Res. 2018, 41, 186–191.
36. Fu, R.; He, J.; Liu, G. Landslide recognition after the 2021 Haiti MS 7.2 earthquake based on the improved YOLOv4 algorithm. J. Seismol. Res. 2023, 46, 300–307.
37. Ju, Y.; Xu, Q.; Jin, S. Automatic Object Detection of Loess Landslide Based on Deep Learning. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 1747–1755.
38. Zeng, Y.; Zhang, Y.; Chu, F.; Liu, J.; Feng, Z.; Su, J. Rapid assessment of landslides triggered by the Gansu Jishishan Ms 6.2 earthquake. J. Southwest Jiaotong Univ. 2025. Available online: https://link.cnki.net/urlid/51.1277.U.20241209.1546.002 (accessed on 28 July 2025).
39. Du, Y.; Huang, L.; Zhao, Z.; Li, G. Landslide body recognition and detection based on DETR in high-resolution remote sensing images. Bull. Surv. Mapp. 2023, 10, 1–6.
40. Bai, S.; Tang, P.; Miao, Z.; Jin, C.; Zhao, B.; Wan, H. Research on mapping landslide based on high resolution remote sensing image and improved U–Net model in Wenchuan, Sichuan. Remote Sens. Nat. Resour. 2024, 36, 96–107. Available online: https://link.cnki.net/urlid/10.1759.P.20240125.1645.018 (accessed on 28 July 2025).
41. Yang, Z.; Han, L.; Zheng, X. Landslide identification using remote sensing images and DEM based on convolutional neural network: A case study of loess landslide. Remote Sens. Nat. Resour. 2022, 34, 224–230.
42. Ji, S.; Yu, D.; Shen, C.; Li, W.; Xu, Q. Landslide detection from an open satellite imagery and digital elevation model dataset using attention boosted convolutional neural networks. Landslides 2020, 17, 1337–1352.
43. Fan, X.; Fang, C.; Dai, L.; Wang, X.; Luo, Y.; Wei, T.; Wang, Y. Near real time prediction of spatial distribution probability of earthquake-induced landslides-Take the Lushan Earthquake on June 1, 2022 as an example. J. Eng. Geol. 2025, 30, 729–739.
44. Zheng, W.; Zhang, B.; Yuan, D.; Chen, G.; Zhang, Y.; Yu, J.; Zhang, D.; Bi, H.; Liu, B.; Yang, J. Tectonic Activity in the Southern Alashan Block and the Latest Boundary of Outward Expansion on the Northeastern Tibetan Plateau, China. J. Earth Sci. Environ. 2021, 43, 224–236.
45. Zhu, S.; Shi, Y.; Lu, M.; Xie, F. Dynamic mechanisms of earthquake-triggered landslides. Sci. China Earth Sci. 2013, 56, 1769–1779. (In Chinese)
46. Bai, Z.; Ji, L.; Zhu, L.; Cheng, H.; Xu, J.; Bian, Z.; Wang, J.; Li, Y.; Tang, H. Cause and destructive analysis of the mudflow in Zhongchuan town, triggered by the 2023 Jishishan, Gansu, MS 6.2 earthquake. China Earthq. Eng. J. 2024, 46, 768–777.
47. Li, Z.; Li, Y.; Tian, Q.; Xia, Y.; Zhang, J.; Yao, S.; Huang, W. Study on the relationship between paleoseismic on laji mountain fault and catastrophic event on lajiashan site. J. Seismol. Res. 2014, 37 (Suppl. 1), 109–115. Available online: https://kns.cnki.net/kcms2/article/abstract?v=9IId9Ku_yBYSJLC1dqBKUx08bnyJCd9590-fyRkSgwZwjym0OteNy2jJamWvz94e2YGZEUD-D1rsDq0C0VShzjii8HzGCCTDatftMiq11CPwWQp08sgMKGe7FesrmIZM-di50LC0A7wDaZpSkq2BmvJ8Ocb1j4FJF3gnX2jJ9qPPpHexGQm-kw==&uniplatform=NZKPT&language=CHS (accessed on 28 July 2025).
48. Chen, B.; Song, C.; Chen, Y.; Li, Z.H.; Yu, C. Emergency Identification and Influencing Factor Analysis of Coseismic Landslides and Building Damages Induced by the 2023 MS 6.2 Jishishan (Gansu, China) Earthquake. Geomat. Inf. Sci. Wuhan Univ. 2025, 50, 322–332.
49. Wei, X.; Xie, C.; Wu, J.; Shen, C. Mask-CNN: Localizing parts and selecting descriptors for fine-grained bird species categorization. Pattern Recognit. 2018, 76, 704–714.
50. Ren, S.Q.; He, K.M.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017.
51. Bai, Z.L.; Er, C. The Sedimentary Phase Characteristics and Evolution Rule of Chang 9 in Huanxian-Zhengning Area. Northwestern Geol. 2021, 54, 166–178.
52. Zhang, X.C.; Pei, X.J.; Zhang, M.S.; Sun, P.P.; Jia, J. Experimental study on mechanism of flow slide of loess landslides triggered by strong earthquake-A case study in Dangjiacha, Ningxia province. J. Eng. Geol. 2018, 26, 1219–1226.

Figure 1. Geological setting of the Jishishan Earthquake.

Figure 2. Flight path design and key specifications.

Figure 3. Technical workflow.

Figure 4. Baseline Model Architecture.

Figure 5. Enhanced model architecture.

Figure 6. Large-scale imagery tiling processing architecture.

Figure 7. Adversarial networks and data augmentation techniques.

Figure 8. Geohazard data labeling.

Figure 9. Model evaluation metric curves. (A) Train_box_loss curve. (B) Train_seg_loss curve. (C) Train_cls_loss curve. (D) Train_dfl_loss curve. (E) Metrics_mAP50(B) curve. (F) Metrics_mAP50-95(B) curve. (G) Metrics_mAP50(M) curve. (H) Metrics_mAP50-95(M) curve.

Figure 10. Segmentation results (red dot: geometric center).

Figure 11. Landslide detection and segmentation results in the study area. (A) Region ①: detection and segmentation results of the entire region; (B) Region ②: northern region results; (C) Southern region results.

Figure 12. Mis-segmented results. (A) Topographic shadow misclassification; (B) Complex texture artifacts; (C) Anthropogenic feature confusion; (D) Partial segmentations; (E) Minor omissions; (F) Misclassified; (G) Landslide bodies being segmented into multiple parts; (H) Discrimination capability against spectrally similar features.

Table 1. Sample statistics.

Test Area	Sample Composition	Training Set	Validation Set	Test Set	Total Number of Samples
Training area	Yangwa Village	1868	330	\	2198
Testing area	Majia–Goujia–Liugou Village	\	\	Region 2	11,550

Table 2. Model performance evaluation comparison.

Indicator	Traditional Training (500 Epochs)	Progressive Training (150 + 200 + 150 Epochs)	Relative Change
mAP50(M)	0.704	0.747	+6.1%
mAP50-95(M)	0.413	0.468	+13.3%
Inference speed (FPS)	45	43	−4.4%
Model size (MB)	87	87	0%

Table 3. Ablation experiments.

Model	mAP@0.5	mAP@0.5:0.95	Recall	Number of Parameters (M)	GFLOPs
YOLOv8 (Progressive Training)	0.689	0.403	0.69	3.0	7.6
YOLOv8 + P2 (Traditional Training)	0.704	0.413	0.73	3.2	8.7
YOLOv8 + P2 (Progressive Training)	0.747	0.468	0.81	3.7	10.1

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Bai, Z.; Ji, L.; Tang, H.; Qiu, J.; Kang, S.; Liu, C.; Bian, Z. Rapid Detection and Segmentation of Landslide Hazards in Loess Tableland Areas Using Deep Learning: A Case Study of the 2023 Jishishan Ms 6.2 Earthquake in Gansu, China. Remote Sens. 2025, 17, 2667. https://doi.org/10.3390/rs17152667

