Article

A High-Performance Model for Landslide Geological Hazard Detection, CDCS-YOLO

1 School of International Business, Xinjiang University, Shuimogou-District, Urumqi 830017, China
2 Xinjiang Naba Expressway Development Co., Ltd., Korla 841000, China
3 School of Traffic and Transportation Engineering, Xinjiang University, Shuimogou-District, Urumqi 830017, China
4 Institute of Geotechnical Engineering, Southeast University, Nanjing 211189, China
5 School of Geology and Mining Engineering, Xinjiang University, Shuimogou-District, Urumqi 830017, China
6 Xinjiang Key Laboratory of Green Construction and Maintenance of Transportation Infrastructure and Intelligent Traffic Control, Urumqi 830017, China
7 School of Architecture and Engineering, Xinjiang University, Shuimogou-District, Urumqi 830017, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Appl. Sci. 2026, 16(10), 4804; https://doi.org/10.3390/app16104804
Submission received: 18 April 2026 / Revised: 7 May 2026 / Accepted: 9 May 2026 / Published: 12 May 2026

Abstract

Although deep learning has been successfully used to detect landslide hazards in recent years, existing methods still face challenges due to the variety of landslide characteristics in different terrains and topographies. This study proposes a new framework for landslide detection by comparing various YOLO models. It employs deformable convolutional modules combined with GhostConv modules to enhance feature extraction for landslide targets. The framework uses a structured IoU loss function to optimize the alignment of actual and predicted frames in a directional sense. Additionally, it introduces the CoordAtt attention mechanism to accelerate model convergence and improve training efficiency. The experimental results demonstrate that the enhanced YOLO model (CDCS-YOLO), incorporating four key enhancement modules (Coordinate Attention, Deformable Convolutional Networks, the C3 Module/CSP Architecture and SIoU Loss), achieved a maximum mAP of 96.6%, an accuracy of 96.1%, and a frame rate of 142.6 FPS. Notably, it performed exceptionally well in soil landslide detection, achieving an average detection accuracy surpassing 90%. Based on the experimental results, we explored a morphological landslide classification method further as well as a multi-source differential monitoring strategy integrating UAV imagery, field surveys, ground-based LiDAR data, rainfall information and deformation indicators. The proposed method outperforms the baseline approach and is a promising solution for detecting landslides and geological hazards in Xinjiang.

1. Introduction

A landslide is one of the most common and destructive natural phenomena. It occurs when soil or rock on a slope becomes unstable—either entirely or in part—along weak planes or zones due to internal and external dynamic factors such as river erosion, groundwater activity, infiltration from heavy rainfall [1], seismic activity, or human-induced slope cutting [2,3]. There are numerous methods for monitoring landslide disasters, which can be categorized into three types. The first type emerged before remote sensing imagery was developed. It involves traditional manual disaster surveys, in which researchers primarily rely on field visits to collect landslide data. While this method allows for on-site verification of information, the data is usually unavailable until long after the disaster has occurred. This approach is costly and time-consuming, making it difficult to meet the demands of large-scale monitoring. The second category of methods emerged alongside the rapid development and widespread application of satellite remote sensing imagery. Many scholars have used traditional image processing techniques to detect landslides in remote sensing imagery [4]. The main methods are statistical and machine learning methods [5,6]. Since prior knowledge of disasters is primarily derived from remote sensing imagery and on-site investigations, which involve significant human interpretation, the resulting data lacks generalizability and transferability. The third category consists of landslide detection technologies based on deep learning [7]. In deep learning, a series of closely interconnected convolutional layers forms a convolutional neural network (CNN) [8]. Input image data is compressed into smaller feature maps through convolutional operations. These feature maps retain the input image’s feature information and are stored as high-dimensional signals with strong descriptive power.
In landslide susceptibility prediction, landslides are influenced by the environmental factors in their vicinity and by the landslide clustering effect [9]. Landslide features in remote sensing imagery can be categorized into three types: color features (tone, hue, and shading); morphological features (shape, texture, size, and pattern); and spatial location features (spatial coordinates and spatial distribution). The surveying process results in remote sensing images containing complex background information and numerous interfering factors. Factors such as rainfall, snowfall, vegetation cover, and shadows limit detection accuracy, posing significant challenges to detection performance and generalization capabilities [10]. Therefore, improving feature extraction and removing redundant information from images, so as to raise detection rates while maintaining accuracy, is a current bottleneck in research on intelligent prediction of landslide-prone areas.
With the continued development and application of machine vision technology, deep learning (DL) has been applied to landslide image recognition [11]. For example, Yongxin Li et al. [10] proposed a multi-label classification and annotation network for landslide detection. By leveraging bidirectional long short-term memory (LSTM) networks’ ability to model label dependencies, this network significantly improved landslide detection accuracy in the study area. Furthermore, Dianqing Yang et al. [12] proposed using Faster R-CNN with deformable convolutions for landslide detection in remote sensing images. After using batch normalization to mitigate the impact of batch size on the model, they optimized the extracted landslide features, thereby significantly improving the model’s accuracy. The LEB-YOLO model proposed by Yingjie Du et al. [13] significantly improves the efficiency of landslide object detection by simplifying the network architecture and reducing model weight and computational load. However, accuracy, speed, and the number of parameters in landslide detection models are interdependent and influenced by various factors [13]. For example, increasing the model depth can improve accuracy but reduces detection speed and increases the number of parameters. Conversely, reducing the number of parameters can lower detection accuracy. The number of parameters is also constrained by storage space [14]. Balancing accuracy, speed, and the number of parameters in landslide detection models to ensure high performance has become an urgent issue [15].
In summary, existing research on deep learning models for landslide hazard monitoring primarily relies on large amounts of geological data to predict landslide-prone areas, thereby enhancing early warning and prevention capabilities for landslide disasters [15]. However, in landslide susceptibility prediction, landslides are influenced by the surrounding environment and the clustering effect of landslides. This undoubtedly limits detection accuracy and poses significant challenges to detection performance and generalization ability. Furthermore, balancing accuracy, speed, and the number of parameters in landslide monitoring models to achieve higher performance remains a major research challenge in intelligent landslide susceptibility monitoring [16].
In an attempt to address these challenges, we sought to enhance the YOLO model by integrating four pivotal modules: Coordinate Attention, Deformable Convolutional Networks, the C3 module/CSP architecture and SIoU loss. The outcome was a novel detection model, CDCS-YOLO, named after the initials of these modules, which is designed to detect landslide hazards in the Ili Kazakh Autonomous Prefecture of Xinjiang. First, this study compares the performance of YOLOv4, YOLOv5 and YOLOv7 in terms of detection accuracy, speed and memory consumption. Following a thorough evaluation, YOLOv5s was chosen as the base architecture. This architecture combines the DCN module with the GhostConv [17,18] module to enhance the model’s ability to extract features from remote sensing images. Furthermore, introducing the attention mechanism of the SPPF module (CA) [19] and the structured IoU (SIoU) loss function accelerates the model’s training speed, improves the agreement rate between predicted and ground-truth bounding boxes, reduces computational complexity, and enhances object detection accuracy. Finally, given the significant differences in image features between soil and rock landslides, a differentiated landslide monitoring and management scheme based on the CDCS-YOLO model is proposed.

2. Materials and Methods

2.1. Literature Review and Experimental Steps

Object detection [20] is at the core of deep learning applications for landslide identification and has found widespread use across numerous fields, including remote sensing, image analysis, disaster detection, and image processing. In practical applications, such as slope landslide detection along mountainous roads, it is necessary to select appropriate deep learning object detection algorithms based on soil and rock characteristics. Building upon the successful experiences of image classification networks, such as LeNet-5, AlexNet, VGG, GoogleNet, and ResNet, existing research has developed two major categories of convolutional neural network (CNN)-based object detection models [21]: the region-proposal method and the regression method. The R-CNN series primarily represents object detection algorithms based on region proposals, while the YOLO series is the most representative of regression-based object detection algorithms [22,23].
Regarding region-based algorithms: The R-CNN series [1,24] is the primary representation of these algorithms [25]. A list of region-based algorithms is presented in Table 1. While these algorithms offer high detection accuracy, they generally perform poorly in real time. Region-based detection algorithms are gradually evolving into segmentation algorithms for precise object localization. These algorithms enable accurate localization and classification based on object contours. They are primarily used in scenarios requiring high precision [12,26].
Regarding regression-based algorithms: this category is primarily represented by the YOLO series [13,27,28]. The YOLOv1 algorithm recasts object detection as a single regression problem, thereby improving detection speed. However, it performs poorly at detecting small objects and generalizes poorly. Released in 2017, the YOLOv2 algorithm introduced an anchor mechanism that places anchors of preset sizes across the image grid. This improves the accuracy of object detection and localization, thereby addressing YOLOv1’s low detection accuracy [29,30]. The YOLOv2 algorithm uses DarkNet19 as its backbone network and employs k-means to determine anchor box sizes, significantly improving precision and recall while enhancing its ability to detect small objects. Redmon et al. [15] enhanced the YOLOv2 algorithm to improve detection speed. The YOLOv3 algorithm extracts image features using a Darknet53 backbone network and classifies objects into three categories: large, medium, and small. Object detection is implemented based on the corresponding feature layers. Proposed in 2020, the YOLOv4 algorithm incorporates various optimization techniques to improve upon YOLOv3. It uses CSP Darknet53 as its backbone network and employs a path aggregation network to fuse feature maps at different scales. The bottom-up and top-down pyramid structure captures richer feature information, further enhancing detection accuracy [31,32]. The YOLOv5s algorithm, proposed in the same year, aims to address the issues of YOLOv4, such as its large parameter count and high memory consumption, achieving a good balance between the number of parameters and computational complexity [33]. YOLOv7 offers higher accuracy and faster detection speed than other models in the YOLO series [34]. However, it has issues such as high memory consumption, high model complexity, and more stringent hardware requirements [35].
As shown in Table 2, regression-based algorithms excel primarily in detection speed, meeting current real-time requirements. However, their detection accuracy has typically been lower than that of region-based algorithms and has required significant improvement. With these improvements, the YOLO series currently stands as the preferred solution for object detection, demonstrating excellent performance in both detection accuracy and speed [10,36].
This paper introduces CDCS-YOLO, a landslide object detection model that surpasses baseline models in detection accuracy while offering significant memory and speed optimizations. The CDCS-YOLO model significantly improves the ability to identify landslides at an early stage. It can monitor large geographic areas effectively, providing timely alerts as soon as a disaster occurs and supporting relevant authorities in implementing emergency response measures. This reduces casualties and property damage. The method constructs a backbone network that combines GhostConv and Deformable Convolution (DCN) modules. This combination enhances the model’s ability to extract features across different types of landslides while reducing computational costs. The study also introduces a lightweight CoordAtt attention mechanism that maintains long-range dependencies and spatial features across channels. It uses an SIoU-based loss function to improve frame localization accuracy between ground-truth and predicted frames. This further improves the model’s detection speed and accuracy. The experiments in this study are divided into the following four steps and are based on relevant work in the literature review (Figure 1).
Step 1: Obtain remote sensing imagery of landslides. Pre-process the dataset and establish a prior framework for landslide targets based on the imagery to improve model performance during training.
Step 2: Conduct a preliminary comparison of the YOLO model series, evaluating the data in terms of accuracy, speed, and memory usage to determine the initial model architecture.
Step 3: Integrate the GhostConv and DCN modules into the backbone network to improve feature extraction for landslide targets. Through ablation experiments, compare the performance of the popular loss functions EIoU, CIoU, and SIoU in combination with the SENet, CBAM, and CA attention modules to identify the optimal modules for integration and construct the improved CDCS-YOLO model.
Step 4: Further validate the improved model at landslide sites in Xinjiang to confirm its suitability for wider use.

2.2. Framework Overview

Figure 2 shows the overall structure of our proposed method, which comprises three submodules: the backbone, neck, and head. The backbone and neck extract feature representations, which are then fused into multi-layer, multi-scale features. The head then generates the detection results. Our backbone comprises four core components: a feature extractor, a coordinate attention (CA) mechanism, GhostConv modules, and a deformable convolutional network (DCN).

2.3. Main Functional Modules

2.3.1. GhostConv Module

GhostConv is a lightweight convolutional module [37]. It rests on the observation that the redundant (mutually similar) features in a feature map contribute to good detection results for a given neural network. The module therefore generates only half of the output features with an ordinary convolution and derives the remaining half from them using cheaper operations with fewer parameters. Its module structure is shown in Figure 3. Drawing inspiration from this, we designed a backbone network composed of lightweight modules that, for the first time, combines GhostConv and DCN modules, thereby enhancing feature extraction while reducing computational cost.
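The parameter saving can be illustrated with a simple weight count. The sketch below is not the authors' implementation: the function names are hypothetical, biases are ignored, and the 5 × 5 depthwise kernel used for the cheap half is an assumption rather than a value given in this paper.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Number of weights in a k x k convolution without bias."""
    return (c_in // groups) * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int = 1, cheap_k: int = 5) -> int:
    """Number of weights in a Ghost-style block.

    Half of the output channels come from an ordinary convolution. The other
    half come from a depthwise convolution (one filter per channel) applied
    to that first half. cheap_k = 5 is an assumed kernel size.
    """
    half = c_out // 2
    primary = conv_params(c_in, half, k)
    cheap = conv_params(half, half, cheap_k, groups=half)
    return primary + cheap


standard = conv_params(128, 256, 3)
ghost = ghost_conv_params(128, 256, 3)
print(standard, ghost)
```

Under these assumptions, the Ghost-style block needs roughly half the weights of the ordinary convolution for the same number of output channels.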

2.3.2. CoordAtt Attention

The CA mechanism attends to both channel and spatial information, capturing global context while modelling long-range dependencies [38]. We integrate the CA mechanism into the backbone network, embedding positional information into channel attention. Features along the vertical and horizontal directions are encoded by two parallel processes and aggregated into two independent, direction-aware feature maps. This enables each feature map to capture long-range dependencies along its respective spatial dimension, preserving the target’s spatial information, improving recognition accuracy, and reducing computational complexity. This architecture is shown in Figure 4.
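The two parallel encodings can be sketched for a single channel as follows. This is a minimal illustration only: `directional_pool` is a hypothetical helper, and the later stages of coordinate attention (the convolutions and sigmoid gating that turn the pooled vectors into attention weights) are omitted.

```python
def directional_pool(feature_map):
    """Average one channel (a list of rows) along each spatial axis.

    Returns (per_row, per_col): per_row[i] is the mean of row i (pooling
    along the width), per_col[j] is the mean of column j (pooling along
    the height).
    """
    height = len(feature_map)
    width = len(feature_map[0])
    per_row = [sum(row) / width for row in feature_map]
    per_col = [sum(feature_map[i][j] for i in range(height)) / height
               for j in range(width)]
    return per_row, per_col


channel = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
rows, cols = directional_pool(channel)
print(rows, cols)
```

Each pooled vector keeps the exact position along one axis while summarising the other, which is how positional information is retained.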

2.3.3. SIoU

To improve localization accuracy, as reflected by the overlap between predicted and ground-truth bounding boxes, and ultimately to enhance recognition performance, we adopted the SIoU loss function [39] in this study.
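$$\mathrm{Loss}_{\mathrm{SIoU}} = 1 - \mathrm{IoU} + \frac{\Delta + \Omega}{2}, \tag{1}$$
Specifically:
$$\Omega = \sum_{t=w,h} \left(1 - e^{-\omega_t}\right)^{\theta} = \left(1 - e^{-\omega_w}\right)^{\theta} + \left(1 - e^{-\omega_h}\right)^{\theta}, \tag{2}$$
$$\Delta = \sum_{t=x,y} \left(1 - e^{-\gamma \rho_t}\right) = 2 - e^{-\gamma \rho_x} - e^{-\gamma \rho_y}, \tag{3}$$
$$\omega_w = \frac{\left|w - w^{gt}\right|}{\max\left(w, w^{gt}\right)}, \qquad \omega_h = \frac{\left|h - h^{gt}\right|}{\max\left(h, h^{gt}\right)}, \tag{4}$$
$$\rho_x = \left(\frac{b_{c_x}^{gt} - b_{c_x}}{c_w}\right)^{2}, \qquad \rho_y = \left(\frac{b_{c_y}^{gt} - b_{c_y}}{c_h}\right)^{2}, \qquad \gamma = 2 - \Lambda, \tag{5}$$
$$\Lambda = 1 - 2\sin^{2}\left(\arcsin\left(\frac{c_h}{\sigma}\right) - \frac{\pi}{4}\right) = \cos\left(2\left(\arcsin\left(\frac{c_h}{\sigma}\right) - \frac{\pi}{4}\right)\right). \tag{6}$$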
LossSIoU is the loss value of the SIoU loss function. IoU is the intersection-over-union ratio between the predicted and ground-truth bounding boxes. Δ represents the distance loss between the two bounding boxes. Ω represents the shape loss based on the width and height of the bounding boxes. (w, h) and (wgt, hgt) represent the width and height of the predicted and ground-truth bounding boxes, respectively. θ controls the weight given to the shape loss; its typical range is [4,24]. (cw, ch) represent the width and height, respectively, of the minimum bounding rectangle enclosing the ground-truth and predicted bounding boxes. In the angle term Λ, ch denotes the difference in height between the centre points of the ground-truth and predicted bounding boxes, and σ denotes the distance between those centre points.
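As a worked illustration of these terms, the following sketch evaluates the loss for a single pair of boxes. It is a plain-Python reading of the formulas rather than the training code used in this study: the function name, the (cx, cy, w, h) box layout, and the default θ = 4 are assumptions, and both boxes are assumed to have positive width and height.

```python
import math


def siou_loss(pred, gt, theta=4.0):
    """SIoU loss for two boxes given as (cx, cy, w, h).

    theta = 4 is an assumed value taken from the range quoted in the text.
    """
    (px, py, pw, ph), (gx, gy, gw, gh) = pred, gt

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px + pw / 2, gx + gw / 2) - max(px - pw / 2, gx - gw / 2))
    inter_h = max(0.0, min(py + ph / 2, gy + gh / 2) - max(py - ph / 2, gy - gh / 2))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + gw * gh - inter)

    # Width and height of the smallest rectangle enclosing both boxes.
    c_w = max(px + pw / 2, gx + gw / 2) - min(px - pw / 2, gx - gw / 2)
    c_h = max(py + ph / 2, gy + gh / 2) - min(py - ph / 2, gy - gh / 2)

    # Angle term Lambda, computed from the offset between the centre points.
    sigma = math.hypot(gx - px, gy - py)
    if sigma == 0.0:
        lam = 0.0  # centres coincide, so the distance term is zero anyway
    else:
        sin_alpha = min(1.0, abs(gy - py) / sigma)
        lam = math.cos(2.0 * (math.asin(sin_alpha) - math.pi / 4.0))
    gamma = 2.0 - lam

    # Distance term Delta.
    rho_x = ((gx - px) / c_w) ** 2
    rho_y = ((gy - py) / c_h) ** 2
    delta = 2.0 - math.exp(-gamma * rho_x) - math.exp(-gamma * rho_y)

    # Shape term Omega.
    omega_w = abs(pw - gw) / max(pw, gw)
    omega_h = abs(ph - gh) / max(ph, gh)
    omega = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (delta + omega) / 2.0


print(siou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))
print(siou_loss((1.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))
```

The loss is zero when the boxes coincide and grows as the predicted box drifts away from or changes shape relative to the ground truth.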

2.3.4. DCN

Although deformable convolutional network (DCN) modules have been applied in various fields, research on their use in landslide detection remains underdeveloped [40]. Consequently, we have combined the GhostConv module with the DCN module to form a backbone network for extracting deep features for landslide detection.
Convolutional neural network (CNN) [35] modules form a core part of the YOLO algorithm. However, the fixed shape of convolutional kernels in traditional CNN modules limits their ability to extract features from irregularly shaped objects. In detecting landslide geological hazards, the locations and morphologies of similar landslides often exhibit diverse distributions. This paper therefore adopts a DCN architecture combined with traditional CNN modules from YOLO, enhancing the model’s generalisability, and integrates a Coordinate Attention (CA) mechanism to strengthen its multi-scale feature extraction capabilities. Figure 5 and Figure 6 illustrate the feature extraction methods for traditional and deformable convolutions, respectively.
Taking a 3 × 3 deformable convolution as an example, the DCN computation process can be divided into two steps: (1) sampling the input feature map x using grid R; (2) performing a weighted sum of the sampled data points using weights w. Here, grid R defines the receptive field size and dilation of the convolution.
$$R = \{(-1,-1),\ (-1,0),\ \ldots,\ (0,1),\ (1,1)\} \tag{7}$$
Equation (7) defines a 3 × 3 convolution kernel with a dilation of 1.
$$y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n + \Delta p_n) \tag{8}$$
In Equation (8), y represents the output feature map, p0 denotes each position in y, and pn enumerates the positions in R. Δpn (n = 1, 2, …, N, where N is the number of positions in R) denotes the offset added to each position in R. As can be seen, DCN adds an offset to each sampling point of a traditional convolution; this offset is generated by applying a separate convolution to the input feature map and is typically not an integer.
$$x(p) = \sum_{q} G(q, p) \cdot x(q) \tag{9}$$
q enumerates all integer spatial positions in the feature map, and G(·) is a bilinear interpolation kernel. Since the positions after adding the offset are non-integer values and do not correspond to actual pixels in the feature map, the offset pixel values must be interpolated.
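The sampling and weighted-sum steps can be sketched for one output position as follows. This is a minimal single-channel illustration under stated assumptions: the function names are hypothetical, positions outside the feature map are treated as zero, and the offsets are passed in directly instead of being predicted by a convolution as in a real DCN layer.

```python
import math


def bilinear_sample(x, py, px):
    """Interpolate the 2-D list x at a fractional position (py, px).

    Only the four integer neighbours of the position have a non-zero
    bilinear weight; positions outside the map count as zero.
    """
    h, w = len(x), len(x[0])
    y0, x0 = math.floor(py), math.floor(px)
    value = 0.0
    for qy in (y0, y0 + 1):
        for qx in (x0, x0 + 1):
            if 0 <= qy < h and 0 <= qx < w:
                g = max(0.0, 1 - abs(py - qy)) * max(0.0, 1 - abs(px - qx))
                value += g * x[qy][qx]
    return value


# The regular 3 x 3 sampling grid R, listed row by row.
GRID = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def deform_conv_at(x, weights, offsets, p0):
    """Weighted sum over the grid, with each sampling point shifted by its offset."""
    total = 0.0
    for (dy, dx), w_n, (oy, ox) in zip(GRID, weights, offsets):
        total += w_n * bilinear_sample(x, p0[0] + dy + oy, p0[1] + dx + ox)
    return total


image = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0],
         [6.0, 7.0, 8.0]]
no_shift = [(0.0, 0.0)] * 9
print(deform_conv_at(image, [1.0] * 9, no_shift, (1, 1)))
```

With all offsets set to zero the result reduces to an ordinary 3 × 3 convolution at that position; non-zero offsets move the sampling points and so change the effective shape of the kernel.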
The improved network model is designed to enhance detection accuracy and robustness while reducing the number of model parameters and computational load. It is suitable for lightweight applications and can more effectively identify landslide geological hazards in complex operational environments.

3. Experimental Evaluation

3.1. Dataset and Data Pre-Processing

A map of geological hazard distribution in Xinjiang was created in ArcMap 10.8 using satellite imagery of the region downloaded with the All-in-One Map Downloader version 3.0 (Figure 7). Data points for landslide hazards were obtained from the Regional Highway Business Development Centre network. Landslides in Xinjiang are primarily concentrated in three major mountain ranges. Statistical data indicate that the Ili Kazakh Autonomous Prefecture alone accounts for over 2000 landslide sites. Due to geographical constraints, manual data collection is extremely challenging. Future research will focus on constructing a database based on existing high-resolution landslide remote sensing imagery to enable landslide detection. Due to the complexity of landslides in Xinjiang’s mountainous regions, subsequent practical testing of the model will use landslide data from this area to verify the model’s general applicability.
The performance of deep learning models largely depends on the volume of data they are exposed to during training. When processing large datasets, models can learn the mapping rules between data inputs and outputs. It is crucial to ensure that the model is exposed to a diverse and sufficient number of data instances, as this improves the adaptability and predictive accuracy of convolutional neural networks (CNNs) when handling different scenarios. In order to develop a model with comprehensive generalisability, a large number of representative training samples must be carefully collected and appropriate data preprocessing must be performed. According to Step 1, data preprocessing consists of three stages.

3.1.1. Data Organization

First, 933 available Sentinel-2 remote sensing images of landslides were collected from the Kaggle data platform and split into training and testing sets at a ratio of 8:2. Subsequently, the remote sensing images corresponding to road sections with high landslide risk were annotated for subsequent practical applications, based on the landslide disaster data system maintained by the relevant administrative units from 2018 to 2023.
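An 8:2 split of this kind can be sketched as below. The file names, the fixed random seed, and rounding the training share down are illustrative assumptions; the exact per-set counts used in the study are not stated.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle a copy of the items and cut it into training and testing lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the 933 images.
images = [f"landslide_{i:04d}.png" for i in range(933)]
train, test = split_dataset(images)
print(len(train), len(test))
```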

3.1.2. Sample Augmentation

Due to the relatively small size of the dataset in this study, the learning capabilities of the parameters and the network may not be fully realised, which could lead to negative effects such as overfitting. To address this issue, a series of innovative data augmentation strategies were implemented in accordance with the experimental requirements, with the aim of enhancing the model’s ability to generalise and mitigating the potential risk of overfitting during training.
First, we introduced a multidimensional data augmentation method that, unlike simple image rotation, translation or scaling, includes more complex image transformations such as HSV saturation enhancement. These operations significantly increase the diversity of the dataset while preserving the essential features of the images. Secondly, in terms of the implementation strategy for data augmentation, both online and offline image augmentation methods are employed simultaneously. Compared to traditional offline augmentation, this approach first uses offline augmentation to expand the dataset further, and then uses online augmentation to avoid the need for large amounts of storage space, while also ensuring that the data is transformed randomly during each model training session. Specifically, the data augmentation techniques applied during the input phase of the model ensure that the images seen by the model in each iteration possess unique characteristics. Finally, label smoothing was applied to improve training performance, suppress model overfitting, and enhance the robustness of the landslide detection algorithm to noise (Figure 8).
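Two of the operations mentioned above, HSV saturation enhancement and label smoothing, can be sketched per pixel and per label vector as follows. The function names, the scaling factor, and the smoothing value 0.1 are assumptions for illustration; the paper does not report the values it used.

```python
import colorsys


def scale_saturation(rgb, factor):
    """Scale the HSV saturation of one 8-bit RGB pixel and return the new RGB pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, max(0.0, s * factor))  # keep saturation inside [0, 1]
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


def smooth_labels(one_hot, eps=0.1):
    """Move a fraction eps of the probability mass evenly across all classes."""
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]


print(scale_saturation((200, 150, 150), 2.0))
print(smooth_labels([1.0, 0.0]))
```

In online augmentation the factor would be drawn at random for each image on every pass, so the model rarely sees exactly the same input twice.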

3.1.3. Tags

In this study, the collected images were systematically annotated using the minimum bounding box method to delineate target objects precisely, with each target being assigned a corresponding class label. Annotations were performed separately for the entire landslide body and for each individual landslide within a cluster. This approach improved annotation accuracy and ensured that the generated labelled data met the requirements of object detection models at different scales. PyCharm Community Edition 2022.1.4 (JetBrains s.r.o., Prague, Czech Republic) was then used to process the annotation files and convert them into computer-readable text files for model training.
The structure of the dataset, including its label files, is shown in Figure 9.
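The conversion of a bounding box into a text label can be sketched as follows. The paper does not specify the file layout, so the sketch assumes the plain-text format commonly used by YOLOv5 (class index followed by the normalised box centre, width, and height); the function name and the example values are hypothetical.

```python
def to_yolo_line(class_id, box, image_size):
    """Format one annotation as 'class cx cy w h' with values normalised to [0, 1].

    box is (x_min, y_min, x_max, y_max) in pixels; image_size is (width, height).
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


print(to_yolo_line(0, (100, 200, 300, 400), (640, 640)))
```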

3.2. Experimental Setup

Several experiments have been conducted using the dataset collected for this task. The model was trained on a GPU; the system configuration is detailed in Table 3 below. The experimental parameters are as follows: the Adam optimizer was used in place of SGD for algorithm optimization, with 100 training epochs and a batch size of 16. The minimum learning rate was set to 0.0001, and the maximum learning rate to 0.01. The weight decay value was set to 5 × 10⁻⁴.
We use precision and recall as evaluation metrics to evaluate the performance of the proposed method.
$$R = \frac{TP}{TP + FN}, \qquad P = \frac{TP}{TP + FP}, \qquad FA = 1 - P$$
Here, R denotes recall, P denotes precision, FA denotes the false alarm rate (the proportion of predicted positives that are incorrect), TP denotes true positives (i.e., landslide targets that are correctly detected), FP denotes false positives (i.e., non-landslide regions incorrectly detected as landslides), and FN denotes false negatives (i.e., landslide targets that the model fails to detect).
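A short worked example of these three metrics is given below. The function name and the counts are illustrative only and are not results from this study.

```python
def detection_metrics(tp, fp, fn):
    """Return recall, precision, and false alarm rate from detection counts.

    A ratio with an empty denominator is reported as 0.0.
    """
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return {"recall": recall, "precision": precision, "false_alarm": 1.0 - precision}


# Illustrative counts: 90 correct detections, 10 false detections, 30 missed targets.
print(detection_metrics(tp=90, fp=10, fn=30))
```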
Common models for object detection include the R-CNN and YOLO series mentioned earlier, along with their derivatives and improved versions. It is not feasible to compare every single model. However, according to the literature, YOLO series algorithms better meet the requirements for model accuracy, memory consumption, and detection speed. Therefore, we will use the YOLO series as a baseline and compare its optimised version with several representative models from recent years to evaluate the optimised model’s performance.

3.3. Model Training Parameters

In supervised learning, the performance of deep learning models is strongly influenced by hyperparameter tuning. Hyperparameter optimisation achieves an optimal balance between model convergence efficiency (training speed) and detection accuracy by systematically adjusting key parameters, such as the learning rate and batch size. This study uses the control-variable method to determine the optimal hyperparameter combination (see Table 4), ensuring that all comparison models undergo benchmark testing under identical conditions. This guarantees the scientific validity and comparability of the experimental conclusions.

3.3.1. Batch Size

The batch size (BS) is the number of samples drawn from the training set in a single training iteration; in this experiment, each iteration processes 16 images. The results show that when BS = 16, GPU memory usage remains stable at 85–92%, resulting in a 41% improvement in training efficiency compared to BS = 4 or BS = 8.

3.3.2. Learning Rate

As the core control parameter for weight updates, the learning rate must strike a balance between convergence speed and stability. If it is set too low (<1 × 10⁻³), training may settle in local optima, leading to overfitting. Conversely, if it is set too high (>1 × 10⁻²), gradient explosion may occur. This experiment adopts a staged learning rate: it is set to 0.01 during the freezing phase to accelerate feature extraction and reduced to 0.001 during the fine-tuning phase to allow precise parameter adjustment. A smooth transition is achieved through Adaptive Moment Estimation (Adam).
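The staged schedule can be written as a simple function of the epoch index. The two learning rates are those quoted above; the length of the freezing phase (`freeze_epochs = 50`) is a hypothetical value, since the paper does not state where the switch occurs.

```python
def staged_learning_rate(epoch, freeze_epochs=50, freeze_lr=0.01, finetune_lr=0.001):
    """Return the base learning rate for a zero-based epoch index.

    freeze_epochs is an assumed boundary between the two training phases.
    """
    return freeze_lr if epoch < freeze_epochs else finetune_lr


print([staged_learning_rate(e) for e in (0, 49, 50, 99)])
```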

3.3.3. Anchor Box

The standard k-means clustering method used to generate anchor boxes in the YOLO framework shows signs of bias when applied to the small-scale roadside slope dataset examined in this study.
Due to the concentration of target sizes, the anchor boxes generated by clustering lack diversity, leading to a significant decline in the model’s performance in detecting landslide targets of non-mainstream sizes. Comparative experiments have shown that using YOLO’s default multi-scale bounding box configuration effectively mitigates this issue, as its preset aspect ratio combinations are better suited to the diverse morphological characteristics of landslides and demonstrate greater robustness in cross-scale detection tasks.
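For reference, anchor clustering of the kind discussed here can be sketched as k-means over box widths and heights with 1 − IoU as the distance. This is a generic illustration with hypothetical function names and toy data, not the exact procedure used in the comparison; it assumes k is no larger than the number of boxes.

```python
def wh_iou(a, b):
    """IoU of two boxes given as (width, height) and aligned at one corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (w, h) pairs using 1 - IoU as distance; returns k anchors sorted by area."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    # Seed the anchors with boxes spread evenly across the area ranking.
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            groups[best].append(box)
        updated = []
        for anchor, group in zip(anchors, groups):
            if group:
                updated.append((sum(b[0] for b in group) / len(group),
                                sum(b[1] for b in group) / len(group)))
            else:
                updated.append(anchor)  # keep an anchor that attracted no boxes
        if updated == anchors:
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Toy box sizes in pixels: three small targets and three large ones.
boxes = [(10, 10), (12, 11), (11, 12), (100, 90), (95, 105), (105, 100)]
print(kmeans_anchors(boxes, k=2))
```

When nearly all boxes fall into one size range, the resulting anchors cluster around that range as well, which is the lack of diversity described above.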

4. Results

In accordance with established landslide classification principles, the landslide hazards identified in this study were assessed based on the following criteria: material type (soil landslides, rock landslides), movement type (sliding, falling, flowing), boundary clarity, surface texture, color contrast, degree of vegetation disturbance, and slope characteristics. In existing optical imagery, soil landslides typically exhibit fan-shaped or tongue-shaped forms, relatively smooth surfaces, strong color contrast, and obvious vegetation clearance. Rock landslides usually present as blocky or wedge-shaped features with rough textures and exposed rock surfaces, and are often characterised by shadows and angular debris deposits. These classification criteria are used to evaluate model performance and guide the development of monitoring strategies. Inspection of landslide disaster imagery revealed that both soil and rock landslides exhibit significant changes in their triggering factors before and after the slide, and that these changes manifest as distinct features in the images. Therefore, combining deep learning with traditional measurement methods could further accelerate the early identification of landslide disasters.
Based on the above data, we will present and discuss the results of the proposed landslide detection method, applying it to landslide-prone areas of Xinjiang. First, we compared different versions of the YOLO family of base models, pretraining them on the dataset we created, as detailed in Step 2 of the experimental procedure. The results are shown in Table 5.
As shown in Table 5, the baseline models YOLOv5 and YOLOv4 achieve higher frame rates than YOLOv7, with only minor differences in accuracy. Meanwhile, YOLOv5n, YOLOv5s, and YOLOv7-tiny use less memory. Overall, these three models offer significant advantages in terms of accuracy, memory usage, and detection speed, and their performance most closely matches the accuracy, memory, and speed requirements of landslide object detection tasks.
Based on these conclusions, we extended the YOLOv5s framework by incorporating recent mainstream attention mechanism modules. We then proceeded to the third step of the experiment, in which we optimised the model further using a more practical loss function. The results of the ablation experiments are shown in Figure 10, Figure 11 and Figure 12.
As shown in the scatter plot, the SIoU loss function achieves the highest training accuracy at epoch 90. In contrast, CIoU and EIoU do not achieve their highest training accuracy until epoch 110.
According to the radar chart, integrating the SIoU loss function with the CA attention mechanism improves the model’s performance, yielding an mAP@0.5 of 0.956, a recall of 0.908, and a precision of 0.937.
The model’s validity was verified using landslide remote sensing images from the validation set. The best-performing combination of attention mechanism and loss function identified in the ablation experiments was applied to Model IV, together with the DCN backbone network and the GhostConv module. The detection results are shown in Table 6.
The experimental results show that the recognition accuracy of the optimised model improved significantly over the initial model, rising from 67.3% (YOLOv5s in Table 5) to 96.1%. As shown in Table 6, the proposed model achieves a 1.3 percentage-point increase in accuracy, an 8.2 FPS (4.4%) increase in frame rate, and a 1.7 MB (10%) reduction in model size compared to the initial model. Taking these results into account, although the proposed model is slightly inferior to YOLOv5m in terms of accuracy, its smaller model size gives it an overall performance advantage over YOLOv5n.
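The comparison against a baseline row can be reproduced directly from Table 6. The following is a minimal sketch, assuming that the YOLOv5s row of Table 6 serves as the baseline; the dictionary layout and the function name `compare` are illustrative and not taken from the paper. Because the baseline used for the differences quoted above is not stated explicitly, the values computed this way need not coincide with them.

```python
# Values copied from Table 6 (precision in %, frame rate in FPS,
# model size as listed in the "Model Size" column).
TABLE_6 = {
    "YOLOv5s":   {"precision": 94.8, "fps": 136.4, "size": 15.8},
    "YOLOv5m":   {"precision": 96.8, "fps": 98.7,  "size": 43.4},
    "CDCS-YOLO": {"precision": 96.1, "fps": 142.6, "size": 14.2},
}


def compare(model, baseline, table=TABLE_6):
    """Return absolute and relative differences of `model` versus `baseline`."""
    result = {}
    for key in ("precision", "fps", "size"):
        diff = table[model][key] - table[baseline][key]
        result[key] = {
            "abs": round(diff, 2),
            "rel_pct": round(100.0 * diff / table[baseline][key], 2),
        }
    return result


# Assumed baseline: the YOLOv5s row of Table 6.
for metric, delta in compare("CDCS-YOLO", "YOLOv5s").items():
    print(metric, delta)
```

A positive difference in `precision` or `fps` and a negative difference in `size` both favour the proposed model.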
Spatial Pyramid Pooling (SPP) extracts features at different scales, which enhances the model’s ability to detect landslides of different sizes and improves recognition accuracy. The CA module applies average pooling separately along the width and height directions of the feature map, which embeds positional (coordinate) information into the channel attention and improves landslide recognition across different scenarios. The SIoU loss function, which accounts for angle, distance, and shape costs, is used for landslide localisation, thereby improving both training speed and inference accuracy. This analysis shows that a backbone network combining the GhostConv module, the deformable convolution module, the CA mechanism module and SPPF, trained with the SIoU loss function, significantly improves landslide detection performance, which supports the design choices made for the model. The test results are shown in Figure 13.
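To make the angle, distance, and shape terms concrete, the following is a minimal sketch of the SIoU loss for a single pair of boxes, following the commonly cited formulation (loss = 1 − IoU + (distance cost + shape cost) / 2, with the angle cost modulating the distance cost). It is not the authors' training code: the function name `siou_loss`, the (cx, cy, w, h) box format, and the shape exponent `theta = 4` are assumptions made for illustration.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """SIoU loss for two boxes given as (cx, cy, w, h) with positive w and h."""
    px, py, pw, ph = pred
    tx, ty, tw, th = target

    # Corner coordinates of both boxes.
    p_x1, p_y1, p_x2, p_y2 = px - pw / 2, py - ph / 2, px + pw / 2, py + ph / 2
    t_x1, t_y1, t_x2, t_y2 = tx - tw / 2, ty - th / 2, tx + tw / 2, ty + th / 2

    # Plain IoU.
    inter_w = max(0.0, min(p_x2, t_x2) - max(p_x1, t_x1))
    inter_h = max(0.0, min(p_y2, t_y2) - max(p_y1, t_y1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(p_x2, t_x2) - min(p_x1, t_x1)
    ch = max(p_y2, t_y2) - min(p_y1, t_y1)

    # Angle cost: 0 when the centres are aligned with an axis, 1 at 45 degrees.
    dx, dy = tx - px, ty - py
    sigma = math.hypot(dx, dy) + eps
    sin_alpha = min(abs(dy) / sigma, 1.0)
    angle_cost = 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost, weighted by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalises mismatch in width and height.
    omega_w = abs(pw - tw) / max(pw, tw)
    omega_h = abs(ph - th) / max(ph, th)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0


box = (50.0, 50.0, 20.0, 10.0)
print(siou_loss(box, box))                       # close to zero for identical boxes
print(siou_loss(box, (55.0, 50.0, 20.0, 10.0)))  # grows as the boxes drift apart
```

In this form the loss is close to zero for a perfect match and exceeds 1 once the boxes no longer overlap, since the distance cost remains positive even when the IoU is zero.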
Figure 13 illustrates four typical landslide hazard detection scenarios. From left to right, the scenarios depict areas with dense clusters of multiple landslides; narrow, elongated landslide bodies; mixed landslide clusters of varying scales; and low-light/shadow-obstructed scenarios. Of these four scenarios, the improved YOLOv5s, which was used as the base model for CDCS-YOLO, achieved the highest detection rate for small objects. It demonstrated optimal localisation accuracy and bounding box fit with no false positives, high classification reliability and robust environmental adaptability across challenging scenarios such as complex lighting, shadow occlusion and multi-scale mixed environments. Figure 7 shows the distribution of geological hazards in Xinjiang and indicates that landslides are primarily concentrated in the Ili Kazakh Autonomous Prefecture. We then moved on to the fourth step of our experiment, applying the trained model (CDCS-YOLO) to detect landslides in the mountainous areas of this region (see Figure 14). As shown in the figure, the landslide detection accuracy generally met expectations, further validating the model’s generalisability.

5. Discussion

5.1. Differentiated Landslide Prevention and Mitigation Measures

Analysis of the experimental results in this study reveals that UAV-based detection is significantly more effective for soil landslides than for rock landslides. Soil landslides exhibit more distinct colour and texture features in images, making them easier for deep learning models to capture. In contrast, rock landslides are harder to detect because of the fine texture of rock and the strong effects of shadows and lighting, which reduce the accuracy of the model’s localisation and classification. Relying solely on UAVs to inspect rock landslides can therefore lead to missed detections, or to difficulties in identifying early-stage cracks under certain terrain conditions. UAV inspection must thus be combined with other approaches to achieve more comprehensive early identification and warning of landslides.
To develop more effective early warning strategies, the interpretation of detection results must take into account the causative factors and triggering conditions of landslides. In the study area, landslide occurrence is closely related to topography, rock type, geological faults, freeze–thaw cycles, rainfall infiltration, groundwater activity, seismic disturbances, river erosion and human activity. Soil landslides are generally more sensitive to rainfall, soil moisture, pore water pressure and changes in groundwater levels, and they often exhibit distinct changes in colour, texture and vegetation cover in optical imagery. In contrast, rock landslides are more strongly influenced by joints, bedding planes, crack propagation, in situ stress, weathering and sudden external triggers, and their early deformation is often difficult to capture accurately using only UAV or satellite optical imagery. Consequently, UAVs need to be combined with other methods (such as ground-based LiDAR [34] and manual inspections [41]) to achieve more comprehensive early identification and early warning of landslides. Based on a summary of these characteristics, tailored slope monitoring methods for soil and rock landslides are given in Table 7 and Table 8.
Therefore, the proposed CDCS-YOLO model should be integrated into a differentiated monitoring and response framework. For high-risk earthen slopes, UAV imagery should be combined with on-site surveys, soil moisture monitoring, rainfall records, drainage system inspections and crack monitoring. For high-risk rock slopes, UAV inspections should be supplemented with ground-based LiDAR or total station monitoring, crack measurement device inspections, rockfall observations and emergency inspections following heavy rainfall or earthquakes. Mitigation measures may include surface and subsurface drainage, toe protection, retaining structures, anchor bolts, protective netting, crack sealing, vegetation restoration, traffic control and, when necessary, temporary road closures. Combining image recognition with engineering monitoring reduces the rate of missed detections, making this approach more practical for highway maintenance and emergency management.
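The differentiated framework described above can be expressed as a simple lookup from slope type to supplementary monitoring tasks. The sketch below is purely illustrative: the names `SUPPLEMENTARY_MONITORING` and `monitoring_plan` are hypothetical, and the rule that slopes not flagged as high-risk receive UAV screening only is an assumption, not something stated in the paper.

```python
# Supplementary measures for high-risk slopes, as listed in the text above.
SUPPLEMENTARY_MONITORING = {
    "soil": [
        "on-site survey",
        "soil moisture monitoring",
        "rainfall records",
        "drainage system inspection",
        "crack monitoring",
    ],
    "rock": [
        "ground-based LiDAR or total station monitoring",
        "crack measurement device inspection",
        "rockfall observation",
        "emergency inspection after heavy rainfall or earthquakes",
    ],
}


def monitoring_plan(slope_type, high_risk):
    """Return the list of monitoring tasks for one slope.

    Assumption: every slope is screened with UAV imagery and CDCS-YOLO,
    and only high-risk slopes receive the supplementary measures.
    """
    if slope_type not in SUPPLEMENTARY_MONITORING:
        raise ValueError(f"unknown slope type: {slope_type!r}")
    tasks = ["UAV imagery screened with CDCS-YOLO"]
    if high_risk:
        tasks.extend(SUPPLEMENTARY_MONITORING[slope_type])
    return tasks


print(monitoring_plan("rock", high_risk=True))
```

Keeping the mapping in a plain dictionary makes it straightforward to adjust the task lists per road section without changing the detection model.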

5.2. Methodological Framework and Limitations

As demonstrated by the experimental results in this paper, the CDCS-YOLO model significantly improves the early detection and identification of landslides on mountain roads in Xinjiang. First, unlike existing studies that improve the YOLO model along a single dimension (such as replacing only the backbone or adding attention mechanisms alone) [42,43,44], this study jointly optimises efficiency, accuracy and robustness. With a model size of 14.2 MB, an accuracy of 96.1% and a speed of 142.6 FPS, the model strikes a favourable balance between industrial deployability and detection performance. Second, the experiments indicate that early identification and monitoring of landslides with CDCS-YOLO also relies on data for various key influencing factors, in line with the differentiated strategy for soil and rock proposed by K. He et al. [25]. For soil landslides, the key parameters to monitor include soil moisture content, pore water pressure, groundwater level, surface displacement, crack development, precipitation and meteorological data, as well as slope deformation rates [41]. For rockfall disasters, the focus should instead be on joint displacement and deep-seated deformation, changes in in situ stress, groundwater dynamics, surface inclination and topographic deformation characteristics [45]. Combining multi-source monitoring data with image recognition technology can significantly improve landslide monitoring efficiency while ensuring accurate detection. Therefore, in line with the differentiated prevention and control measures proposed in Section 5.1, a comprehensive landslide monitoring system should use CDCS-YOLO image recognition supplemented by field surveys, UAV observations, LiDAR, rainfall monitoring and deformation monitoring, forming a monitoring and early warning system led by CDCS-YOLO.
The model nevertheless has limitations. In particular, its adaptability and generalisation capabilities need to be further enhanced when dealing with landslide data under varying geological conditions. Future improvements can be pursued in two areas. First, incorporating a broader range of landslide data would improve the model’s accuracy and reliability across diverse geological environments. Second, the model may struggle with complex backgrounds and noisy data; integrating advanced noise-suppression and background-modelling techniques would further improve detection accuracy.

6. Conclusions

In this study, we propose a novel landslide detection model called CDCS-YOLO. To address the complexity of landslide backgrounds, we use a Deformable Convolutional Network (DCN) module in conjunction with a GhostConv module. Ablation experiments show that the CA mechanism and the SIoU loss function are the most suitable choices for landslide monitoring. This approach enhances the extraction of landslide features and spatial localisation capabilities, thereby improving the accuracy of detecting landslides of varying angles, sizes, and shapes. Experimental results demonstrate that CDCS-YOLO outperforms the traditional YOLOv4, YOLOv5, and YOLOv7 models in both performance and accuracy. The model achieves an mAP of 96.6%, a precision of 96.1%, and a frame rate of 142.6 FPS with only a slight increase in the number of parameters, indicating a measurable gain in efficiency. Results from applying the model to the Ili Kazakh Autonomous Prefecture in Xinjiang show that landslide detection accuracy fluctuates minimally, further validating the model’s effectiveness.
The CDCS-YOLO model is not intended to replace geotechnical monitoring; rather, it is designed to provide a rapid, image-based screening tool to be used alongside other methods, such as UAV inspections, ground-based LiDAR, on-site surveys, rainfall data, groundwater monitoring and deformation monitoring. Future research will expand the dataset to include a wider range of topographic and climatic conditions in order to more accurately distinguish between soil and rock slides. It will also integrate rainfall thresholds, snowmelt indices and climate change scenarios in order to support the prediction of future landslides and the management of highway risk.
Although this method has achieved good results in landslide object detection, limitations in data scale and variations in the topography and terrain of landslide-prone areas mean that further research is needed. This research should focus on adaptability to different geographical features, changes in lighting conditions, and phenomena such as occlusion. At the same time, the ability to process large volumes of data must be considered to improve real-time detection while maintaining high accuracy. This study provides an important reference for the automatic identification of landslide geological hazards and paves the way for future research.

Author Contributions

Conceptualisation, Z.Y., X.D. and X.S.; methodology, F.A., Z.Y. and D.H.; software, Z.Y., F.A. and S.M.; formal analysis, X.S. and D.H.; investigation, Z.Y., X.D. and S.M.; resources, X.D.; data curation, X.D.; writing—original draft preparation, X.S., Z.Y., F.Q., Y.W., F.A., D.H., X.D. and S.M.; writing—review and editing, X.S., Z.Y., F.Q., Y.W., F.A., D.H., X.D. and S.M.; supervision, F.A., X.D. and S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research forms part of the interim results of the Xinjiang Key R&D Program Projects (grant number: 2022B03033-1), the Xinjiang Uygur Autonomous Region “Dr. Tianchi” Project, and the National Natural Science Foundation of China Regional Project (grant number: 52562045).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Fuerhaiti Ainiwaer was employed by Xinjiang Naba Expressway Development Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wang, J.; Chen, G.; Jaboyedoff, M.; Derron, M.H.; Fei, L.; Li, H.; Luo, X. Loess landslides detection via a partially supervised learning and improved Mask-RCNN with multi-source remote sensing data. CATENA 2023, 231, 107371.
  2. Van Eynde, E.; Dondeyne, S.; Isabirye, M.; Decker, J.; Poesen, J. Impact of landslides on soil characteristics: Implications for estimating their age. CATENA 2017, 157, 173–179.
  3. Vasanthi, P.; Mohan, L. Ensemble of ghost convolution block with nested transformer encoder for dense object recognition. Biomed. Signal Process. Control 2024, 88, 105645.
  4. Akosah, S.; Gratchev, I.; Kim, D.H.; Ohn, S.Y. Application of artificial intelligence and remote sensing for landslide detection and prediction: Systematic review. Remote Sens. 2024, 16, 2947.
  5. Wang, S.; Fan, Q.; Li, H. Latent landslide hazard recognition in Fang County using synthetic aperture radar interferometry and geological data. Front. Earth Sci. 2025, 13, 1531615.
  6. Wang, T.; Zhang, S. DSC-Ghost-Conv: A compact convolution module for building efficient neural network architectures. Multimed. Tools Appl. 2023, 83, 36767–36795.
  7. Wang, X.L.; Wang, S.; Cao, J.Q.; Wang, Y.S. Data-driven based Tiny-YOLOv3 method for front vehicle detection inducing SPP-Net. IEEE Access 2020, 8, 110227–110236.
  8. Wang, X.; Clague, J.; Crosta, G.; Sun, J.; Stead, D.; Qi, S.; Zhang, L. Relationship between the spatial distribution of landslides and rock mass strength, and implications for the driving mechanism of landslides in tectonically active mountain ranges. Eng. Geol. 2021, 292, 106281.
  9. Yin, W.; Niu, C.; Bai, Y.; Zhang, L.; Ma, D.; Zhang, S.; Zhou, X.; Xue, Y. An adaptive identification method for potential landslide hazards based on multisource data. Remote Sens. 2023, 15, 1865.
  10. Wei, J.P.; As’arry, A.; Rezali, K.A.M.; Yusoff, M.Z.M.; Ma, H.H.; Zhang, K.L. A review of YOLO algorithm and its applications in autonomous driving object detection. IEEE Access 2025, 13, 93688–93711.
  11. Li, Y.; Xin, Z.; Yuan, M. Landslide detection for remote sensing images using a multilabel classification network based on Bijie landslide dataset. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 9194–9213.
  12. Yang, D.; Mao, Y. Remote sensing landslide target detection method based on improved Faster R-CNN. J. Appl. Remote Sens. 2022, 16, 044521.
  13. Wu, Z.; Ma, P.; Zheng, Y.; Gu, F.; Liu, L.; Lin, H. Automatic detection and classification of land subsidence in deltaic metropolitan areas using distributed scatterer InSAR and oriented R-CNN. Remote Sens. Environ. 2023, 290, 113545.
  14. Xu, Z.; Shi, H.; Lin, P.; Ma, W. Intelligent on-site lithology identification based on deep learning of rock images and elemental data. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6511205.
  15. Wei, R.; Ye, C.; Sui, T.; Zhang, H.; Ge, Y.; Li, Y. A feature enhancement framework for landslide detection. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103521.
  16. Hu, J.; Zhang, Z.; Zhu, X.; Zhang, X.; Yang, S.; Huang, C.; Wang, W.; Li, X.; Hou, L.; Zhao, L. Geological hazard susceptibility assessment and forecasting analysis based on InSAR and C-L-A model. Int. J. Appl. Earth Obs. Geoinf. 2025, 143, 104840.
  17. Xiang, X.; Liu, M.; Zhang, S.; Wei, P.; Chen, B. Multi-scale attention and dilation network for small defect detection. Pattern Recognit. Lett. 2023, 172, 82–88.
  18. Xu, X.; Ding, Y.; Lv, Z.; Li, Z.; Sun, R. Optimized pointwise convolution operation by Ghost blocks. Electron. Res. Arch. 2023, 31, 3187–3199.
  19. Xu, X.; Zhang, T.; Zhang, X.; Zhang, W.; Ke, X.; Zeng, T. MambaShadowDet: A high-speed and high-accuracy moving target shadow detection network for video SAR. Remote Sens. 2025, 17, 214.
  20. Bouguettaya, A.; Zarzour, H.; Kechida, A.; Taberkit, A.M. Vehicle detection from UAV imagery with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6047–6067.
  21. Kim, E.; Hong, S.J.; Kim, S.Y.; Lee, C.H.; Kim, S.; Kim, H.J.; Kim, G. CNN-based object detection and growth estimation of plum fruit (Prunus mume) using RGB and depth imaging techniques. Sci. Rep. 2022, 12, 20796.
  22. Cao, J.; Bao, W.; Shang, H.; Yuan, M.; Cheng, Q. GCL-YOLO: A GhostConv-based lightweight YOLO network for UAV small object detection. Remote Sens. 2023, 15, 4932.
  23. Zhou, P.; Wang, P.; Cao, J.; Yin, Q. PRO-YOLOv4-tiny: Towards more balance between accuracy and speed in the detection of small targets in remotely sensed images. Remote Sens. Lett. 2023, 14, 947–959.
  24. Carneiro, A.T.S.; Coutinho, F.L.; Morimoto, C.H. Detection of visual pursuits using 1D convolutional neural networks. Pattern Recognit. Lett. 2024, 179, 45–51.
  25. He, K.M.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397.
  26. Yang, F.; Jiang, Z.; Ren, J.; Lv, J. Monitoring, prediction, and evaluation of mountain geological hazards based on InSAR technology. Sci. Program. 2022, 2022, 2227049.
  27. Ge, X.; Zhao, Q.; Wang, B.; Chen, M. Lightweight landslide detection network for emergency scenarios. Remote Sens. 2023, 15, 1085.
  28. Guo, P.; Beheshti, S.B.; Shokravi, M.; Behsad, A. Assessment of geological hazards in landslide risk using the analysis process method. Steel Compos. Struct. 2023, 47, 451–454.
  29. Hua, W.; Liu, L.; Sun, N.; Jin, X. A CA-based weighted clustering adversarial network for unsupervised domain adaptation PolSAR image classification. IEEE Geosci. Remote Sens. Lett. 2023, 20, 4014505.
  30. Huang, F.; Yang, Y.; Ma, G.; Rezania, M.; Chang, Z.; Catani, F.; Jiang, B.; Chen, X.; Guo, F. Real-time early warning of landslide disaster risks on major highways in Ganzhou City, China. J. Rock Mech. Geotech. Eng. 2026, in press.
  31. Li, S.S.; Guo, S.R.; Han, Z.L.; Kou, C.; Huang, B.C.; Luan, M.H. Aluminum surface defect detection method based on a lightweight YOLOv4 network. Sci. Rep. 2023, 13, 11077.
  32. Hussain, M. YOLO-v1 to YOLO-v8, the rise of YOLO and its complementary nature toward digital manufacturing and industrial defect detection. Machines 2023, 11, 677.
  33. Liang, H.; Gong, H.; Cong, L.; Zhang, M.; Tao, Z.; Liu, S.; Shi, J. Automated detection of airfield pavement damages: An efficient light-weight algorithm. Int. J. Pavement Eng. 2023, 24, 2247135.
  34. Zhou, J.; Yang, D.; Song, T.; Ye, Y.; Zhang, X.; Song, Y. Improved YOLOv7 models based on modulated deformable convolution and Swin Transformer for object detection in fisheye images. Image Vis. Comput. 2024, 144, 104966.
  35. Li, C.; Qu, Z.; Wang, S.; Liu, L. A method of cross-layer fusion multi-object detection and recognition based on improved Faster R-CNN model in complex traffic environment. Pattern Recognit. Lett. 2021, 145, 127–134.
  36. Li, R.; Shen, Y. YOLOSR-IST: A deep learning method for small target detection in infrared remote sensing images based on super-resolution and YOLO. Signal Process. 2023, 208, 108962.
  37. Su, J.; Qin, Y.; Jia, Z.; Hou, Y. PTCDet: Advanced UAV imagery target detection. Sci. Rep. 2024, 14, 27403.
  38. Su, Y.; Tan, W.; Dong, Y.; Xu, W.; Huang, P.; Zhang, J.; Zhang, D. Enhancing concealed object detection in active millimeter wave images using wavelet transform. Signal Process. 2024, 216, 109303.
  39. Sun, X.; Mo, T.; Song, J.; Wang, B. Deformable convolution kernel and residual learning assisted irregular seismic data interpolation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5905911.
  40. Trani, L.; Pagani, G.A.; Zanetti, J.P.P.; Chapeland, C.; Evers, L. DeepQuake: An application of CNN for seismo-acoustic event classification in the Netherlands. Comput. Geosci. 2022, 159, 104980.
  41. Dai, X.; Song, X.; Xing, L.; Han, D.; Li, S. Integrating SBAS-InSAR and machine learning for enhanced landslide identification and susceptibility mapping along the West Kunlun Highway. Appl. Sci. 2026, 16, 120.
  42. Liu, X.; Liu, B. A Hybrid Time Series Model for Predicting the Displacement of High Slope in the Loess Plateau Region. Sustainability 2023, 15, 5423.
  43. Ge, H.; Wang, L.; Liu, M.; Zhu, Y.; Zhao, X.; Pan, H.; Liu, Y. Two-Branch Convolutional Neural Network with Polarized Full Attention for Hyperspectral Image Classification. Remote Sens. 2023, 15, 848.
  44. Ohnishi, Y.; Nishiyama, S.; Yano, T.; Matsuyama, H.; Amano, K. A study of the application of digital photogrammetry to slope monitoring systems. Int. J. Rock Mech. Min. Sci. 2006, 43, 5.
  45. Zhu, F.; Zhu, X.; Huang, Z.; Ding, M.; Li, Q.; Zhang, X. Deep learning based data-adaptive descriptor for non-rigid multi-modal medical image registration. Signal Process. 2021, 183, 108023.
Figure 1. Experimental step.
Figure 2. Network structure diagram of CDCS-YOLO.
Figure 3. GhostConv Module Structure.
Figure 4. The coordinate attention network.
Figure 5. Extraction of features by conventional convolutional methods.
Figure 6. Deformable convolution method for feature extraction.
Figure 7. Distribution of geologic hazards in Xinjiang.
Figure 8. The examples of data enhancement.
Figure 9. Dataset catalogue structure.
Figure 10. Comparison of Different Attention Mechanisms on Different Loss Function Pairings.
Figure 11. Comparison of the accuracy of different modules.
Figure 12. Comparison of training effect graphs.
Figure 13. Comparison of the effect diagram of models.
Figure 14. Landslide Disaster Detection Effect Map of Xinjiang Mountainous Areas.
Table 1. Object Detection Algorithms Based on Candidate Regions.
Algorithm | Real-time capability | Advantage | Disadvantage
R-CNN | Absent | Introduces deep learning to target detection for the first time | Time-consuming to acquire a target area; no sharing of features
SPP-Net | Absent | Solves the problem of inconsistent input feature map sizes | Separation of the individual detection steps still requires multiple training sessions
Fast R-CNN | Absent | Uses the region-of-interest pooling layer structure | Using an external algorithm to extract the target candidate box is more time-consuming
Faster R-CNN | Moderate | Enables end-to-end detection and identification | Complex model; poor detection of small targets
Mask R-CNN | Moderate | Accurate segmentation; high detection accuracy | Instance segmentation is expensive
Table 2. Regression-based object detection algorithms.
Algorithm | Real-time capability | Advantage | Disadvantage
YOLOv1 | Excellent | Detection is converted to a regression problem and runs fast | Produces more positioning errors and lagging accuracy; weak generalization capabilities
SSD | Excellent | Combines regression and anchor box mechanisms | Loss of small target features
YOLOv2 | Excellent | Further increase in speed; increase in recall rate | Poor detection of small targets
RSSD | Excellent | Better detection of small targets | Complex model; average detection speed
YOLOv3 | Excellent | Suitable for multi-scale testing | Low model recall
YOLOv4 | Excellent | High accuracy of model detection | The model remains largely unchanged
YOLOv5 | Excellent | Small number of model parameters | The model remains largely unchanged
YOLOv7 | Excellent | Higher accuracy; faster detection | Large memory footprint; training and inference become much slower
Table 3. Hardware configuration.
Name | Parameter
CPU | Intel Core i5-13400
Hard disk | 1 T
GPU | NVIDIA RTX 4060
Memory | 8 GB × 2
Operating system | Windows 10
Programming language | Python 3.7
CUDA | 10.3
Table 4. Parameter Combinations.
Parameters | Details
resize | 640 × 640
batch size | 16
learning rate | lr0 = 0.01, lrf = 0.001
anchor box | [10, 13] [16, 30] [33, 23] [30, 61] [62, 45] [59, 119] [116, 90] [156, 198] [373, 326]
dataset | train:val = 8:2
epoch | 200
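In the YOLOv5 hyperparameter convention, lrf is a fraction of lr0, so the final learning rate is lr0 × lrf rather than lrf itself. The sketch below shows what the Table 4 values imply under the linear decay option of the YOLOv5 training script. Which schedule the authors actually used is not stated, so the linear form is an assumption, warm-up is ignored, and the function name `linear_lr` is illustrative.

```python
def linear_lr(epoch, epochs=200, lr0=0.01, lrf=0.001):
    """Learning rate at `epoch` for a linear decay from lr0 to lr0 * lrf.

    Defaults are the values listed in Table 4. The linear schedule itself
    is an assumption; warm-up epochs are not modelled.
    """
    factor = (1.0 - epoch / epochs) * (1.0 - lrf) + lrf
    return lr0 * factor


# Learning rate at the start, midpoint, and end of training.
for e in (0, 100, 200):
    print(e, linear_lr(e))
```

Under these assumptions the rate starts at lr0 and decays to lr0 × lrf by the final epoch.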
Table 5. YOLO series network test performance comparison.
Model | Precision | Recall | mAP@0.5 | Model Size | FPS
YOLOv4 | 68.2% | 59.1% | 65.7% | 52.5 M | 135
YOLOv5n | 66.4% | 58.9% | 62.5% | 3.7 M | 147
YOLOv5s | 67.3% | 59.9% | 64.7% | 14.9 M | 140
YOLOv5m | 67.5% | 60.2% | 65.0% | 40 M | 111
YOLOv7-tiny | 68.7% | 59.5% | 64.1% | 13.4 M | 95
YOLOv7 | 68.2% | 64.3% | 66.3% | 74.9 M | 66
Table 6. Optimised model comparison results.
Model | Precision | Recall | mAP@0.5 | FPS | Model Size
YOLOv4 | 94.5% | 92.2% | 96.1% | 131.8 | 54.5 M
YOLOv5n | 93.9% | 90.9% | 95.8% | 141.7 | 4.2 M
YOLOv5s | 94.8% | 92.0% | 96.0% | 136.4 | 15.8 M
YOLOv5m | 96.8% | 91.8% | 96.2% | 98.7 | 43.4 M
YOLOv7-tiny | 94.6% | 89.9% | 95.6% | 90.7 | 14.1 M
YOLOv7 | 95.6% | 91.8% | 96.1% | 58.1 | 77.3 M
CDCS-YOLO | 96.1% | 92.6% | 96.6% | 142.6 | 14.2 M
Table 7. Auxiliary Monitoring Methods for Earthen Slopes.
Data Types | Data Source | Solution | Output
Drone footage | Drone Inspection | Image Acquisition and Landslide Detection | Map of Landslide-Prone Areas
Ground sensor data | Ground-based detection equipment | Sensor Data Analysis and Displacement Monitoring | Slope Displacement and Deformation Rate
Precipitation and Meteorological Data | Weather Station | Comparative Analysis of Precipitation and Temperature | Analysis of the Correlation Between Weather and Landslides
Historical landslide data | Road Authority | A Review and Statistical Analysis of Historical Landslide Events | Spatio-temporal Distribution Map of High-Risk Landslide Areas
Table 8. Auxiliary Monitoring Methods for Rock Slopes.
Data Types | Data Source | Solution | Output
High-precision monitoring data | Sensor | Displacement Monitoring and Calculation of Rock Mass Displacement Rates | Displacement and Velocity
Drone imagery data | Drone Inspection | Image Acquisition in Deep Learning | Identification and Changes in Landslide Risks
Geological survey data | Field Exploration and Surveys | Analysis of Rock Shear Strength and Crack Distribution | Rock Mass Stability Assessment
Weather Data | Weather station monitoring data | Analysis of Meteorological Factors and Rock Stability | Assessment of the Impact of Extreme Climate Change on Landslides
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ye, Z.; Ainiwaer, F.; Han, D.; Song, X.; Qu, F.; Wang, Y.; Dai, X.; Ma, S. A High-Performance Model for Landslide Geological Hazard Detection, CDCS-YOLO. Appl. Sci. 2026, 16, 4804. https://doi.org/10.3390/app16104804
