Article

Insulator Defect Detection Algorithm Based on Improved YOLO11s in Snowy Weather Environment

1
Carbon Neutralization Advanced Technology Research Institute, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
2
State Grid Jiashan Power Supply Company, Jiaxing 314100, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Symmetry 2025, 17(10), 1763; https://doi.org/10.3390/sym17101763
Submission received: 10 August 2025 / Revised: 15 September 2025 / Accepted: 9 October 2025 / Published: 19 October 2025
(This article belongs to the Special Issue Symmetry and Asymmetry in Data Analysis)

Abstract

The intelligent transformation of power systems necessitates robust insulator condition detection to ensure grid safety. Existing methods, which rely primarily on manual inspection or conventional image processing, suffer from significantly degraded target identification and detection efficiency under extreme weather conditions such as heavy snowfall. To address this challenge, this paper proposes an enhanced YOLO11s detection framework integrated with image restoration technology, specifically targeting insulator defect identification in snowy environments. First, data augmentation and a FocalNet-based snow removal algorithm effectively enhance image clarity under snow conditions, enabling the construction of a high-quality training dataset. Next, the model architecture incorporates a dynamic snake convolution module to strengthen the perception of tubular structural features, while the MPDIoU loss function optimizes bounding box localization accuracy and recall. Comparative experiments demonstrate that the optimized framework significantly improves overall detection performance under complex weather compared to the baseline model, and it exhibits clear advantages over current mainstream detection models. This approach provides a novel technical solution for monitoring power equipment conditions in extreme weather, offering significant practical value for ensuring reliable grid operation.

1. Introduction

The advancement of intelligent power systems has elevated online monitoring technology to a critical role in ensuring grid security. As essential grid components, insulators require effective defect detection to maintain stable infrastructure operation. While researchers have proposed various detection methods using image processing, machine learning (ML), and deep learning (DL), traditional image processing and ML algorithms exhibit limited robustness in complex or noisy environments. Crucially, their detection accuracy degrades significantly under adverse weather conditions. In contrast, DL techniques automatically learn discriminative features, offering superior robustness and accuracy. Particularly in complex backgrounds, DL substantially improves detection performance, making DL-based object detection algorithms a prominent research focus for insulator monitoring.
Recently, deep learning algorithms have garnered significant attention for object detection in complex environments. Traditional methods, based on feature extraction and region proposal techniques, achieve high accuracy but incur substantial computational costs and exhibit poor real-time performance [1,2]. In contrast, single-stage detectors like the YOLO series employ end-to-end training, enhancing both detection speed and accuracy—particularly beneficial for embedded devices [3]. However, YOLO architectures exhibit significant performance degradation under harsh weather conditions. In snowy environments, for instance, snowflakes obscure targets and induce light scattering, resulting in feature loss and compromised detection accuracy.
In insulator defect detection, Li et al. [4] proposed LiteYOLO-ID, a lightweight model based on YOLOv5s, optimizing convolution modules and backbone networks to enhance accuracy. Subsequently, Li et al. [5] introduced a YOLOv8-based detection model incorporating multiple optimized modules to improve both precision and inference speed. Qi et al. [6] enhanced YOLOv5 by integrating DIoU-NMS and an IoU-based loss penalty, significantly boosting detection performance. Wang et al. [7] developed a YOLO-based UAV inspection algorithm for multi-object detection, offering valuable insights for intelligent power inspections. He et al. [8] proposed MFI-YOLO, which is optimized via GhostNet and ResPANet. This model enhances the detection accuracy of multiple insulator faults while reducing the model’s computational load and parameters. Li et al. [9] proposed A2MADA-YOLO, which integrates attention alignment and adversarial learning. It improves the accuracy and generalization of the YOLO series in insulator defect detection under foggy conditions without the need for labeled foggy weather data. Wang et al. [10] proposed the MCI-GLA plug-in suitable for the YOLO series, which addresses the scale and background issues in insulator detection and enhances the multi-scale feature learning capability and the detection accuracy of small defects, with the only drawback of increased computational cost.
Image desnowing, a critical subfield of image degradation restoration, has garnered significant attention recently. Notable methods include DesnowNet [11], HDCWNet [12], SMGARN [13], DDMSNet [14], JSTASR [15] and InvDSNet [16]. However, these approaches struggle to balance effective snow removal with the preservation of fine details—notably for insulator defect detection. Guo et al. [17] introduced a local sparse structure prior technique that accurately locates snowflakes but underperforms in complex backgrounds. Zhang et al. [18] proposed a UNet-based model with Window and Region Self-Attention (WSA/RSA) modules, achieving promising results across datasets yet remaining suboptimal for insulator-specific tasks.
Although substantial progress has been made in both domains, few studies integrate insulator detection with desnowing to form a robust solution for severe weather conditions. To bridge this gap, we propose an enhanced YOLO11s framework, incorporating adverse-weather data augmentation, a desnowing module, and innovative network architecture enhancements for efficient defect detection under adverse weather. The key contributions are as follows:
  • High-quality insulator images captured by UAVs were curated to establish the Insulator1600 dataset. To simulate severe weather conditions, synthetic snow artifacts were algorithmically introduced, generating the InsulatorSnow1600 dataset. Three state-of-the-art desnowing models were evaluated using SSIM and PSNR metrics, with FocalNet selected for its superior capability to restore image clarity and preserve structural details. The processed outputs formed the final InsulatorDeSnow1600 dataset.
  • The proposed framework enhances the YOLO11s baseline through two key innovations: First, a Dynamic Snake Convolution module was integrated to improve feature extraction for tubular structures inherent to insulators. Second, the MPDIoU loss function was incorporated to optimize bounding box regression, effectively balancing localization accuracy and recall performance.
  • Ablation studies and comparative experiments demonstrate that the enhanced YOLO11s model surpasses its baseline counterpart across all evaluation metrics. The proposed solution outperforms mainstream YOLO variants (v5s, v6s, v7-tiny, v8s, v9s, v10s, and v10m) while also achieving superior detection accuracy versus established models including Faster R-CNN, SSD, TOOD, and Deformable DETR.
Figure 1 illustrates the proposed harsh-weather insulator defect detection system. The pipeline initiates with UAV-based image acquisition of insulators. Captured images subsequently undergo snow occlusion removal via the FocalNet desnowing model, which effectively restores structural details. The processed images are then fed into our enhanced YOLO11s detector, which augments feature extraction through dynamic snake convolution and optimizes localization accuracy via a refined loss function. This integrated workflow facilitates robust insulator defect identification under adverse meteorological conditions.
It is pertinent to note that this work resonates with the core themes of symmetry and asymmetry in data analysis, particularly relevant for data analysis in industrial information. Under ideal conditions, insulator images may exhibit inherent geometric symmetry and balanced data distribution, facilitating standard detection models. However, snowy environments introduce severe asymmetry: snow occlusion and light scattering disrupt structural balance, create skewed feature distributions, and generate pervasive noise artifacts that act as challenges for outlier identification. This asymmetric degradation complicates machine learning and artificial intelligence models. The approach presented in this paper effectively addresses this asymmetry: (1) Synthetic snow augmentation models asymmetric data corruption; (2) the FocalNet desnowing module actively restores structural symmetry by removing occlusions; and (3) the Dynamic Snake Convolution leverages the target’s inherent symmetry for robust feature extraction. Thus, the integrated pipeline enhances model robustness against asymmetric data distributions prevalent in harsh environments.
The remainder of this paper is structured as follows: Section 2 details the dataset construction methodology and desnowing model selection. Section 3 introduces the YOLO11s architecture and our proposed enhancements. Section 4 presents comprehensive experimental evaluations using synthetic snowy insulator imagery. Section 5 concludes with key contributions and future research directions.

2. Datasets and Desnowing Models

2.1. Insulator Dataset Under Snowy Conditions

According to the synthetic snowy image formula proposed by Liu et al. [11], insulator images under snowy weather conditions can be modeled linearly as
$$S(x) = R(x)\bigl(1 - L(x)\bigr) + J(x)\,L(x),$$
where $S(x)$ represents the synthesized snowy image and $R(x)$ is the snow-free ground truth insulator image, selected from the public Roboflow dataset. A total of 1600 UAV-captured insulator images were manually selected to form a custom dataset named Insulator1600. $L(x)$ denotes the snow streak image from the public dataset Snow100K, and $J(x)$ is the chromatic aberration map of the snow layer [11]. Figure 2 provides an intuitive illustration of the synthesis process. The resulting synthetic dataset is named InsulatorSnow1600.
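The linear synthesis model above can be sketched in a few lines of NumPy. The helper below is illustrative only: the function name and the white-snow default used for $J(x)$ when no chromatic map is supplied are assumptions, not the paper's implementation.

```python
import numpy as np

def synthesize_snowy(clean, snow_mask, chromatic=None):
    """Composite a snow layer onto a clean image via the linear model
    S(x) = R(x)(1 - L(x)) + J(x)L(x).

    clean     : R(x), snow-free image, floats in [0, 1], shape (H, W, 3)
    snow_mask : L(x), snow-streak layer in [0, 1], shape (H, W) or (H, W, 3)
    chromatic : J(x), the appearance of the snow itself; defaults to pure
                white (an assumption for illustration)
    """
    if snow_mask.ndim == 2:
        snow_mask = snow_mask[..., None]      # broadcast mask over channels
    if chromatic is None:
        chromatic = np.ones_like(clean)       # white snow layer
    return clean * (1.0 - snow_mask) + chromatic * snow_mask
```

Where the mask is 0 the clean pixel passes through unchanged; where it is 1 the pixel is fully replaced by the snow layer, matching the two limiting cases of the formula.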

2.2. FocalNet Desnowing Network

The impact of snowy weather on computer vision systems mainly manifests in the dynamic and reflective properties of snowflakes. When snowflakes fall, their unstable motion patterns and changes in speed cause the background of the image to constantly shift, resulting in a large amount of random noise. This dynamic change in the background makes it challenging for object detection algorithms to accurately separate the target from complex scenes.
The reflective properties of snowflakes also negatively affect image quality under different lighting conditions. Snowflakes reflect a large amount of light, leading to overexposure or underexposure in the image, which affects contrast and brightness, further interfering with image preprocessing and feature extraction. In cases of larger snowflakes, image quality may deteriorate significantly, reducing the robustness and accuracy of the computer vision system. Furthermore, the granular structure of snowflakes sometimes closely resembles the texture of the target object. In low-resolution or blurred images, snowflakes may visually resemble the defect features of insulators, leading to misjudgments and misidentifications. For deep learning-based object detection algorithms (such as YOLO11), this phenomenon means that the algorithm requires more computational resources and time to distinguish the subtle differences between snowflakes and real defects.
Therefore, removing the interference from snowflakes is crucial to ensuring that the computer vision system can accurately detect the details of the target object. By effectively desnowing, visual disturbances caused by snowflakes can be eliminated, allowing detection algorithms to maintain high accuracy and robustness under adverse weather conditions, thereby enhancing the reliability of defect detection.
Considering factors such as algorithm complexity, use scenarios, and processing effects, this paper selects the FocalNet model proposed by Cui et al. [19], which effectively performs snow removal operations on images. The model structure diagram is shown in Figure 3.
FocalNet adopts an encoder-decoder architecture, which efficiently learns features at different levels. The network consists of three scales, with each scale’s encoder and decoder composed of multiple ResBlocks. For a low-quality image of size H × W × C , shallow features are first extracted using a 3 × 3 convolution and then passed through a three-scale symmetric structure, progressively transforming into restored features. The encoder starts with high resolution, gradually reducing spatial dimensions while increasing the number of channels. The decoder performs the reverse operation to restore the clean features. The features from both the decoder and encoder are concatenated and adjusted through a 1 × 1 convolution to modify the channel dimension. Finally, the predicted image is output through a 3 × 3 convolution and an image-level residual connection.
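The shape flow described above can be made concrete with a small helper that tracks how spatial resolution and channel width evolve across the three encoder scales. The starting width of 32 channels and the exact halving/doubling factors are assumptions for illustration; the actual FocalNet widths may differ.

```python
def encoder_shapes(h, w, c, scales=3):
    """Sketch of the three-scale encoder: spatial dimensions halve and the
    channel count doubles at each scale; the decoder mirrors this in reverse."""
    shapes = [(h, w, c)]
    for _ in range(scales - 1):
        h, w, c = h // 2, w // 2, c * 2
        shapes.append((h, w, c))
    return shapes
```

For a 256 × 256 input with 32 shallow-feature channels, this yields feature maps of 256 × 256 × 32, 128 × 128 × 64, and 64 × 64 × 128, which the decoder then upsamples back while the skip connections concatenate matching-scale encoder features.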

3. YOLO11 Algorithm Model Structure and Improvements

3.1. The Model Structure of YOLO11

YOLO11 is a single-stage object detection algorithm. It introduces the Channel-wise 3 × 3 Convolution with 2 × 2 Stride (C3k2) mechanism, replacing the original Channel-wise 2 × 2 Convolution with Fusion (C2f) module with the C3k2 module to enhance feature extraction capabilities. Additionally, a Cross Stage Partial with Pyramid Squeeze Attention (C2PSA) module is added after the Spatial Pyramid Pooling Fast (SPPF) module, further improving the model’s expressive power. In the decoupled head design, YOLO11 substitutes the convolutional operations within the classification and detection heads with depthwise separable convolutions, effectively reducing the number of parameters and computational load while maintaining high model performance. YOLO11 is available in five versions—n, s, m, l, and x—classified based on parameter size and complexity, allowing it to adapt to different hardware performance requirements. As shown in Figure 4, the model structure of YOLO11s is illustrated.

3.2. Dynamic Snake Convolution

Since insulators typically exhibit a cylindrical or tubular structure, and considering the uniqueness of this structure, this paper adopts the Dynamic Snake Convolution (DSConv) module proposed by Qi et al. [20]. This module replaces the original C3k2 module in the Neck part of YOLO11s. The DSConv module can adaptively focus on elongated or convoluted local structures, thereby accurately capturing the features of tubular structures. The structure of the Dynamic Snake Convolution module is shown in Figure 5.
In the DSConv, the standard convolution kernel is expanded along both the x and y axes. For a kernel size of 9, for the x direction, the position of each grid in the kernel K is represented as
$$K_{i \pm c} = (x_{i \pm c},\, y_{i \pm c}), \quad c = \{0, 1, 2, 3, 4\},$$
where $c$ denotes the horizontal distance from the center grid of $K$. Selecting the positions $K_{i \pm c}$ is a step-wise accumulation process: starting from the center grid, each position depends on the position of the preceding grid. Relative to $K_i$, the position $K_{i+1}$ is shifted by an offset $\Delta = \{\delta \mid \delta \in [-1, 1]\}$, and these offsets are accumulated (denoted by $\sum$) so that the deformed kernel remains a connected, line-like structure. The variation along the x-axis in the figure is as follows:
$$K_{i \pm c} = \begin{cases} (x_{i+c},\, y_{i+c}) = \bigl(x_i + c,\; y_i + \sum_{i}^{i+c} \Delta y\bigr), \\ (x_{i-c},\, y_{i-c}) = \bigl(x_i - c,\; y_i + \sum_{i-c}^{i} \Delta y\bigr). \end{cases}$$
Similarly, the variation along the y-axis is as follows:
$$K_{j \pm c} = \begin{cases} (x_{j+c},\, y_{j+c}) = \bigl(x_j + \sum_{j}^{j+c} \Delta x,\; y_j + c\bigr), \\ (x_{j-c},\, y_{j-c}) = \bigl(x_j + \sum_{j-c}^{j} \Delta x,\; y_j - c\bigr). \end{cases}$$
As shown in Figure 5, due to the variation in the two-dimensional direction, DSConv covers a 9 × 9 region during the deformation process. The design of DSConv aims to better adapt to elongated tubular structures, thereby enhancing the perception of key features.
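The step-wise accumulation of offsets can be illustrated with a small helper that computes the nine kernel positions deformed along the x-axis. The function name and the fact that offsets are supplied directly (rather than predicted by a learned offset branch, as in DSConv) are simplifications for illustration.

```python
def snake_kernel_x(x_i, y_i, d_right, d_left):
    """Coordinates of a 9-cell snake kernel deformed along the x-axis.

    d_right, d_left : per-step vertical offsets delta_y in [-1, 1], one per
                      grid cell on each side of the center (length 4 each).
    Returns a list of (x, y) positions ordered left to right; the y values
    are cumulative sums of the offsets, matching y_i + sum(delta_y).
    """
    center = [(x_i, y_i)]
    right = [(x_i + c, y_i + sum(d_right[:c])) for c in range(1, len(d_right) + 1)]
    left = [(x_i - c, y_i + sum(d_left[:c])) for c in range(1, len(d_left) + 1)]
    return left[::-1] + center + right
```

With all offsets zero the kernel degenerates to a standard straight 1 × 9 kernel; nonzero offsets let it bend to follow an elongated tubular contour while staying connected, since each cell moves at most one pixel relative to its neighbor.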
The Dynamic Snake Convolution module is placed in the 16th, 19th, and 22nd layers. The improved YOLO11s model structure is shown in Figure 6.

3.3. MPDIoU Loss Function

For the improvement of the loss function, this paper uses the MPDIoU proposed by Ma et al. [21]. MPDIoU is a novel boundary box similarity comparison metric based on the minimum point distance, which directly minimizes the distance between the top-left and bottom-right points of the predicted bounding box and the ground truth bounding box. MPDIoU incorporates all the relevant factors considered in existing loss functions, such as the overlapping or non-overlapping regions, center point distance, and aspect ratio deviation, while simplifying the computation process. The calculation steps are as follows:
$$d_1^2 = (x_1^{prd} - x_1^{gt})^2 + (y_1^{prd} - y_1^{gt})^2$$
$$d_2^2 = (x_2^{prd} - x_2^{gt})^2 + (y_2^{prd} - y_2^{gt})^2$$
$$MPDIoU = IoU - \frac{d_1^2}{w^2 + h^2} - \frac{d_2^2}{w^2 + h^2}$$
$$IoU = \frac{\left| A \cap A^{GT} \right|}{\left| A \cup A^{GT} \right|}$$
$$L_{MPDIoU} = 1 - MPDIoU,$$
where $(x_1^{gt}, y_1^{gt})$ and $(x_2^{gt}, y_2^{gt})$ represent the coordinates of the top-left and bottom-right corners of the ground truth box, respectively, and $(x_1^{prd}, y_1^{prd})$ and $(x_2^{prd}, y_2^{prd})$ those of the predicted box; $d_1$ represents the distance between the top-left corners of the ground truth and predicted boxes, and $d_2$ the distance between their bottom-right corners. $h$ and $w$ represent the height and width of the image, respectively. $IoU$ denotes the intersection over union, $A$ represents the area of the predicted box, $A^{GT}$ the area of the ground truth box, and $L_{MPDIoU}$ is the objective function used to minimize the MPDIoU loss. As shown in Figure 7, the illustration of MPDIoU and IoU calculation is provided.
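The calculation steps above translate directly into code. The sketch below is a plain-Python rendering for a single pair of axis-aligned boxes, not the batched tensor implementation used inside the training loop; the function name is ours.

```python
def mpdiou_loss(pred, gt, w, h):
    """MPDIoU loss for boxes given as (x1, y1, x2, y2) corner tuples.
    w, h are the width and height of the input image."""
    x1p, y1p, x2p, y2p = pred
    x1g, y1g, x2g, y2g = gt

    # intersection over union of the two boxes
    ix1, iy1 = max(x1p, x1g), max(y1p, y1g)
    ix2, iy2 = min(x2p, x2g), min(y2p, y2g)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (x2p - x1p) * (y2p - y1p)
    area_g = (x2g - x1g) * (y2g - y1g)
    iou = inter / (area_p + area_g - inter)

    # squared distances between matching corner points
    d1_sq = (x1p - x1g) ** 2 + (y1p - y1g) ** 2
    d2_sq = (x2p - x2g) ** 2 + (y2p - y2g) ** 2

    diag_sq = w ** 2 + h ** 2                  # normalizer w^2 + h^2
    mpdiou = iou - d1_sq / diag_sq - d2_sq / diag_sq
    return 1.0 - mpdiou
```

Because the two corner distances vanish only when the boxes coincide, the loss simultaneously penalizes center misalignment and aspect-ratio deviation without computing them separately.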

4. Simulation Experiments and Result Analysis

4.1. Experimental Environment Configuration

In this study, the InsulatorSnow1600 dataset was divided into training, validation, and test sets in a 7:2:1 ratio, with insulator defect types including breakage, flashover, self-explosion, and others. The 70% training set ensures sufficient samples for the model to learn from, enabling it to capture diverse defect features and avoid underfitting caused by inadequate data. The 20% validation set provides ample unseen data during training for hyperparameter tuning and monitoring the training process to prevent overfitting. The 10% test set is strictly reserved for final evaluation only. It represents completely unknown data that the model might encounter in real-world scenarios. This ratio is sufficient to provide an unbiased and reliable estimation of the model’s performance. The specific experimental configuration is shown in Table 1.
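A minimal sketch of the 7:2:1 split described above is given below; the paper does not state the shuffling scheme or random seed, so both are assumptions here.

```python
import random

def split_dataset(paths, seed=0):
    """Shuffle and split a list of sample paths into 70% train,
    20% validation, and 10% test subsets (sketch; seed is an assumption)."""
    paths = sorted(paths)            # deterministic base order
    rng = random.Random(seed)
    rng.shuffle(paths)
    n = len(paths)
    n_train, n_val = int(0.7 * n), int(0.2 * n)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])
```

Applied to the 1600-image dataset, this yields 1120 training, 320 validation, and 160 test samples, with the test split held out until final evaluation.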

4.2. Evaluation Metrics of the Experiment

To objectively evaluate the experimental results, this paper selects four commonly used evaluation metrics in the field of object detection: Precision, recall, mAP@0.5, and mAP@0.5:0.95. The calculation formulas are as follows [22]:
$$Precision = \frac{TP}{TP + FP} \times 100\%$$
$$Recall = \frac{TP}{TP + FN} \times 100\%$$
$$AP = \int_{0}^{1} P_{thre}(r)\, dr$$
$$mAP@0.5 = \frac{1}{N} \sum_{n=1}^{N} AP_n \Big|_{IoU_{thre} = 0.5}$$
$$mAP@0.5{:}0.95 = \frac{1}{N} \sum_{n=1}^{N} mAP_n \Big|_{IoU_{thre} = m}, \quad m \in \{0.5, 0.55, \ldots, 0.95\}$$
In object detection tasks, TP (True Positive), FP (False Positive), and FN (False Negative) refer to correctly detected boxes, incorrectly detected boxes, and missed detections, respectively. TP: the true label is positive and the model prediction is positive. FP: the true label is negative but the model prediction is positive. FN: the true label is positive but the model prediction is negative.
Average Precision (AP) is calculated by integrating the precision–recall curve and is used to assess the detection performance of the model. N represents the total number of classes. mAP@0.5 indicates the mean AP across all classes when the IoU threshold is set to 0.5. mAP@0.5:0.95 refers to the mean AP across all classes when the IoU threshold ranges from 0.5 to 0.95 (with a step size of 0.05). Higher values of mAP@0.5 and mAP@0.5:0.95 indicate better detection performance of the model.
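The AP integral above is evaluated in practice by numerically integrating a discretized precision-recall curve. The sketch below uses all-point interpolation, one common way of doing this (VOC/COCO tooling uses close variants); it is an illustration, not the exact routine used by the authors' evaluation code.

```python
import numpy as np

def average_precision(recall, precision):
    """Area under the precision-recall curve with all-point interpolation.
    recall and precision are parallel arrays sorted by increasing recall."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # make precision monotonically non-increasing from right to left
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # sum rectangle areas where recall changes
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```

mAP@0.5 then averages this AP over all classes at an IoU threshold of 0.5, and mAP@0.5:0.95 repeats the computation at thresholds from 0.5 to 0.95 in steps of 0.05 before averaging.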
Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) are two commonly used objective evaluation metrics in the field of image de-snowing.
P S N R is a metric used to measure the difference between two images and can be expressed by the following formula:
$$PSNR = 10 \log_{10}\!\left(\frac{MaxValue^2}{MSE}\right)$$
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2$$
The Mean Squared Error (MSE) is a metric used to calculate the difference in pixel values between two images. The values x i and y i represent the pixel values of the two images. MaxValue refers to the maximum possible pixel value in the image. A higher P S N R value indicates better quality of the de-snowed image, as it is closer to the ground truth snow-free image.
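The two definitions above combine into a short function; this is a straightforward NumPy rendering of the formulas, with the 8-bit maximum of 255 as the default peak value.

```python
import numpy as np

def psnr(x, y, max_value=255.0):
    """Peak Signal-to-Noise Ratio between a restored image x and its
    ground truth y, computed from the mean squared pixel error."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

For example, two 8-bit images whose pixels differ uniformly by 10 gray levels have MSE = 100 and hence a PSNR of about 28.1 dB.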
S S I M is based on the assumption that the human eye can extract structured information from images, making it more aligned with human visual perception compared to traditional methods. SSIM consists of three components: Contrast C, Luminance L, and Structure S.
$$L(X, Y) = \frac{2\mu_X \mu_Y + C_1}{\mu_X^2 + \mu_Y^2 + C_1}$$
$$C(X, Y) = \frac{2\sigma_X \sigma_Y + C_2}{\sigma_X^2 + \sigma_Y^2 + C_2}$$
$$S(X, Y) = \frac{\sigma_{XY} + C_3}{\sigma_X \sigma_Y + C_3},$$
where $\mu_X$ and $\mu_Y$ represent the mean pixel values of the two images being compared, $\sigma_X$ and $\sigma_Y$ their standard deviations, and $\sigma_{XY}$ the covariance between the images. $C_1$, $C_2$, and $C_3$ are three constants used to avoid division by zero, where $C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$. $L$ represents the range of pixel values in the image, equivalent to $MaxValue$ in Formula (15), where $L = 2^B - 1$ for a $B$-bit image. For 8-bit unsigned integer data (uint8), the maximum value of the image is typically 255; for floating-point data, the maximum value is 1. $K_1$ and $K_2$ are usually much smaller than 1, and in this paper, we set $K_1 = 0.01$ and $K_2 = 0.03$. The final SSIM formula can be expressed as follows:
$$SSIM(x, y) = [L(x, y)]^{\alpha} \cdot [C(x, y)]^{\beta} \cdot [S(x, y)]^{\gamma}.$$
Letting $\alpha = \beta = \gamma = 1$ and $C_3 = C_2 / 2$, the simplified SSIM formula is
$$SSIM(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}.$$
$SSIM(x, y) \in [0, 1]$, and the closer the SSIM value is to 1, the more similar the de-snowed image is to the ground truth snow-free image.
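The simplified formula can be computed globally over a whole image as below. Note that practical SSIM implementations apply this formula over local (often Gaussian-weighted) windows and average the results; the single-window version here is a sketch of the equation itself, with the function name and defaults as assumptions.

```python
import numpy as np

def ssim_global(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM following the simplified formula with
    alpha = beta = gamma = 1 and C3 = C2 / 2."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

An image compared with itself has covariance equal to its variance and identical means, so the numerator and denominator coincide and the score is exactly 1.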

4.3. Experimental Results and Analysis

Experiments were conducted using the InsulatorSnow1600 dataset, and the comparison results of several de-snowing models are shown in Table 2.
The experimental results show that FocalNet outperforms the other models on both de-snowing evaluation metrics. As shown in Figure 8, a comparison of the insulator images before and after de-snowing using the FocalNet model is presented. The de-snowed dataset is named InsulatorDeSnow1600.

4.4. Comparative Experiment

Experiments were conducted using the YOLO series models on the InsulatorDeSnow1600 dataset. mAP@0.5 and mAP@0.5:0.95 are important metrics for evaluating the model’s training performance. Figure 9 shows the variation in these two key metrics during the training process. To provide a clearer view of the changes, only the values at key points are displayed in the figures.
To more comprehensively evaluate the improved YOLO11s model, this paper uses precision, recall, mAP@0.5, and mAP@0.5:0.95 as evaluation metrics. Table 3 presents a comparison of the improved model with other series models.
The experimental results in Table 3 show that the improved YOLO11s performs best across all four evaluation metrics when the FocalNet de-snowing model is used. With YOLO11s as the base model, de-snowing with FocalNet and detecting with the improved YOLO11s yields superior results on all four metrics compared to using either of the other two de-snowing models. The improved YOLO11s outperforms the original YOLO11s by 0.9%, 3.9%, 3.3%, and 3% in precision, recall, mAP@0.5, and mAP@0.5:0.95, respectively, demonstrating the effectiveness of the improved model.

4.5. Visualization of Detection Results

To visually compare the detection performance before and after the improvement of the YOLO11s model, the visualization is presented for two common insulator defect types: insulator damage and flashover. As shown in Figure 10, compared to the original YOLO11s model, the improved model demonstrates better detection performance, with higher confidence in defect detection. Additionally, the original model suffered from missed detections under complex background conditions, whereas the improved model adapts better to complex backgrounds, reducing the missed detection rate.

4.6. Ablation Study

As shown in Table 4, the ablation study validates the improved YOLO11s model: introducing the MPDIoU loss function and the dynamic snake convolution module significantly improves detection performance.
From the experimental results in Table 4, Model D improves on Model A by 4.6%, 1%, 2.4%, and 1.7% in precision, recall, mAP@0.5, and mAP@0.5:0.95, respectively. For Model C, which adds the dynamic snake convolution module at layers 19 and 22, recall is 1% higher than that of Model D, but precision, mAP@0.5, and mAP@0.5:0.95 are 2.6%, 1.4%, and 2.7% lower, respectively.
Overall, Model D delivers the best and most balanced performance across all metrics. The MPDIoU loss function further optimizes bounding box regression accuracy, while the dynamic snake convolution module enhances the model's ability to capture complex shapes during feature extraction. These improvements make the model more stable and efficient in detecting insulator defects in complex environments.

5. Conclusions

This paper proposes an enhanced YOLO11s framework to address insulator defect detection challenges under adverse weather conditions. By integrating data preprocessing, image desnowing, and network optimization, the proposed approach significantly improves detection accuracy in snowy environments compared to the baseline YOLO11s model. Our methodology comprises three key components: (1) Construction of a weather-robust insulator dataset through data curation and synthetic snow augmentation, (2) implementation of FocalNet for snow removal and detail restoration, and (3) architectural enhancements to YOLO11s via dynamic snake convolution and MPDIoU loss for optimized feature extraction and bounding box regression. Comprehensive experiments demonstrate state-of-the-art performance across evaluation metrics. The enhanced model outperforms both classical detectors (Faster R-CNN, SSD, TOOD, and Deformable DETR) and YOLO series variants (v5s–v10m). Ablation studies confirm the model’s robustness in complex weather scenarios. This research advances insulator defect detection accuracy while providing foundational support for intelligent grid monitoring systems and autonomous inspection technologies.
The proposed method demonstrates excellent detection performance in snowy environments, but it still has certain limitations. The current desnowing algorithm may not handle some specific snow pattern variations ideally. Future work will focus on the following directions: developing a more lightweight desnowing network architecture to improve real-time performance; exploring multi-weather joint optimization strategies to enhance model generalization; investigating an adaptive desnowing-detection joint optimization framework to reduce dependency on preprocessing stages; and incorporating real-world images to alleviate the scarcity of extreme weather samples.

Author Contributions

Conceptualization, methodology, writing—original draft preparation, Z.D. and S.D.; formal analysis, writing—review and editing, Z.D. and S.D.; investigation, resources, and supervision, S.D. and Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (51977113) and the State Grid Zhejiang Electric Power Co., Ltd. Science and Technology Project (Research on Key Data Full Lifecycle Security Protection Technology for Intelligent Unmanned Inspection Terminals in Power Systems, 5211JX240001).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors sincerely thank the anonymous reviewers for their critical comments and suggestions for improving the manuscript.

Conflicts of Interest

Author Qingsheng Liu was employed by the State Grid Jiashan Power Supply Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
2. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28.
3. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
4. Li, D.; Lu, Y.; Gao, Q.; Li, X.; Yu, X.; Song, Y. LiteYOLO-ID: A lightweight object detection network for insulator defect detection. IEEE Trans. Instrum. Meas. 2024, 73, 1–12.
5. Li, Z.; Jiang, C.; Li, Z. An insulator location and defect detection method based on improved YOLOv8. IEEE Access 2024, 12, 106781–106792.
6. Qi, Y.; Sun, H. Defect detection of insulator based on YOLO network. In Proceedings of the 2024 9th International Conference on Electronic Technology and Information Science (ICETIS), Hangzhou, China, 17–19 May 2024; pp. 232–235.
7. Wang, Q.; Liao, Z.; Xu, M. Wire insulator fault and foreign body detection algorithm based on YOLO v5 and YOLO v7. In Proceedings of the 2023 IEEE International Conference on Electrical, Automation and Computer Engineering (ICEACE), Changchun, China, 26–28 December 2023; pp. 1412–1417.
8. He, M.; Qin, L.; Deng, X.; Liu, K. MFI-YOLO: Multi-fault insulator detection based on an improved YOLOv8. IEEE Trans. Power Deliv. 2023, 39, 168–179.
9. Li, J.; Zhou, H.; Lv, G.; Chen, J. A2MADA-YOLO: Attention alignment multiscale adversarial domain adaptation YOLO for insulator defect detection in generalized foggy scenario. IEEE Trans. Instrum. Meas. 2025, 74, 5011419.
10. Wang, Y.; Song, X.; Feng, L.; Zhai, Y.; Zhao, Z.; Zhang, S.; Wang, Q. MCI-GLA plug-in suitable for YOLO series models for transmission line insulator defect detection. IEEE Trans. Instrum. Meas. 2024, 73, 9002912.
11. Liu, Y.F.; Jaw, D.W.; Huang, S.C.; Hwang, J.N. DesnowNet: Context-aware deep network for snow removal. IEEE Trans. Image Process. 2018, 27, 3064–3073.
12. Chen, W.T.; Fang, H.Y.; Hsieh, C.L.; Tsai, C.C.; Chen, I.; Ding, J.J.; Kuo, S.Y. All snow removed: Single image desnowing algorithm using hierarchical dual-tree complex wavelet representation and contradict channel loss. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4196–4205.
13. Cheng, B.; Li, J.; Chen, Y.; Zeng, T. Snow mask guided adaptive residual network for image snow removal. Comput. Vis. Image Underst. 2023, 236, 103819.
14. Zhang, K.; Li, R.; Yu, Y.; Luo, W.; Li, C. Deep dense multi-scale network for snow removal using semantic and depth priors. IEEE Trans. Image Process. 2021, 30, 7419–7431.
15. Chen, W.T.; Fang, H.Y.; Ding, J.J.; Tsai, C.C.; Kuo, S.Y. JSTASR: Joint size and transparency-aware snow removal algorithm based on modified partial convolution and veiling effect removal. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 754–770.
16. Quan, Y.; Tan, X.; Huang, Y.; Xu, Y.; Ji, H. Image desnowing via deep invertible separation. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 3133–3144.
17. Guo, X.; Fu, X.; Zha, Z.J. Exploring local sparse structure prior for image deraining and desnowing. IEEE Signal Process. Lett. 2024, 32, 406–410.
18. Zhang, T.; Jiang, N.; Lin, J.; Lin, J.; Zhao, T. Desnowformer: An effective transformer-based image desnowing network. In Proceedings of the 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), Suzhou, China, 13–16 December 2022; pp. 1–5.
19. Cui, Y.; Ren, W.; Cao, X.; Knoll, A. Focal network for image restoration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 13001–13011.
20. Qi, Y.; He, Y.; Qi, X.; Zhang, Y.; Yang, G. Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 6070–6079.
21. Ma, S.; Xu, Y. MPDIoU: A loss for efficient and accurate bounding box regression. arXiv 2023, arXiv:2307.07662.
22. Wang, H.; Yang, Q.; Zhang, B.; Gao, D. Deep learning based insulator fault detection algorithm for power transmission lines. J. Real-Time Image Process. 2024, 21, 115.
23. Chen, W.T.; Huang, Z.K.; Tsai, C.C.; Yang, H.H.; Ding, J.J.; Kuo, S.Y. Learning multiple adverse weather removal via two-stage knowledge learning and multi-contrastive regularization: Toward a unified model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17653–17662.
24. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 21–37.
25. Feng, C.; Zhong, Y.; Gao, Y.; Scott, M.R.; Huang, W. TOOD: Task-aligned one-stage object detection. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; IEEE Computer Society: Washington, DC, USA, 2021; pp. 3490–3499.
26. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable transformers for end-to-end object detection. arXiv 2020, arXiv:2010.04159.
Figure 1. Flowchart of snow removal and detection for drone aerial images.
Figure 2. Synthesis of the insulator dataset under snowy weather conditions: (a) Snowy image S(x). (b) Snow stripes L(x). (c) Real image R(x).
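The synthesis in Figure 2 combines a clean image R(x) with a snow-stripe layer L(x) to produce the snowy image S(x). Below is a minimal sketch of one common per-pixel compositing model; the blending rule and the `composite_snow` helper are illustrative assumptions, not necessarily the exact procedure used in the paper.

```python
# Per-pixel alpha compositing of a snow layer over a clean image:
#   S(x) = a(x) * L(x) + (1 - a(x)) * R(x)
# where a(x) is the snow opacity at pixel x. This blending rule is an
# illustrative assumption, not necessarily the paper's synthesis procedure.

def composite_snow(clean, snow, alpha):
    """Blend a clean image R(x) with a snow-stripe layer L(x).

    All arguments are equal-length lists of floats in [0, 1],
    i.e. flattened single-channel images, kept simple on purpose.
    """
    return [min(1.0, a * s + (1.0 - a) * c)
            for c, s, a in zip(clean, snow, alpha)]

# A pixel fully covered by snow (alpha = 1) takes the snow value;
# an uncovered pixel (alpha = 0) keeps the clean value.
pixels = composite_snow([0.2, 0.8], [1.0, 1.0], [1.0, 0.0])
```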
Figure 3. The model structure of FocalNet.
Figure 4. The model structure of YOLO11s.
Figure 5. Dynamic snake convolution.
Figure 6. Improved YOLO11s model structure.
Figure 7. (a) Illustration of the MPDIoU calculation; (b) illustration of the IoU calculation.
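Figure 7 contrasts the MPDIoU and IoU calculations. Based on the formulation in reference [21], MPDIoU subtracts from the standard IoU the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared image diagonal; the regression loss is then 1 − MPDIoU. A minimal sketch (function and variable names are ours):

```python
def iou(a, b):
    """Standard IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def mpdiou(pred, gt, img_w, img_h):
    """IoU penalized by the squared top-left and bottom-right corner
    distances, normalized by the squared image diagonal."""
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, gt) - d1 / norm - d2 / norm
```

For identical boxes both metrics equal 1; unlike plain IoU, MPDIoU additionally penalizes corner displacement, which separates predictions that IoU alone would score identically.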
Figure 8. Insulator images before and after snow removal: (a) image1. (b) image2. (c) image3.
Figure 9. The variation in the two metrics during the training process: (a) Contrast curves of mAP@0.5. (b) Contrast curves of mAP@0.5:0.95.
Figure 10. Comparison of different defect detections before and after improving the YOLO11s model: (a,b) Comparison of broken defect detection; (c,d) comparison of flashover defect detection.
Table 1. Experimental environment configuration.
Laboratory Setting | Configuration Information
Running system | Ubuntu 20.04
CPU | 14 vCPU Intel(R) Xeon(R) Platinum 8362 CPU @ 2.80 GHz
GPU | NVIDIA GeForce RTX 3090 (24 GB)
Deep learning framework | PyTorch 1.10.1
Programming language | Python 3.9.20
CUDA | 11.3
Image size | 640 × 640
Training epochs | 1000
Learning rate | 0.0001
Weight decay | 0.0005
Batch size | 32
Optimizer | Adam
Classes | 7
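The settings in Table 1 map naturally onto the training arguments of the Ultralytics framework that YOLO11s ships with. A hedged sketch, assuming the standard `ultralytics` package is used; the dataset file name `insulators.yaml` is hypothetical, and argument names should be checked against the installed version:

```python
# Hyperparameters from Table 1, expressed as keyword arguments in the
# style of the Ultralytics training API (argument names follow current
# Ultralytics conventions; verify against the installed version).
train_args = dict(
    imgsz=640,           # input image size
    epochs=1000,         # training epochs
    lr0=1e-4,            # initial learning rate
    weight_decay=5e-4,   # weight decay
    batch=32,            # batch size
    optimizer="Adam",    # optimizer
)

# Typical launch (requires the `ultralytics` package, a GPU, and a
# dataset YAML; "insulators.yaml" is a hypothetical file name):
# from ultralytics import YOLO
# model = YOLO("yolo11s.pt")
# model.train(data="insulators.yaml", **train_args)
```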
Table 2. Comparison results of three snow removal models.
Type | Method | PSNR | SSIM
Desnowing | SMGARN [13] | 23.11 | 0.86
Desnowing | Chen et al. [23] | 25.65 | 0.83
Desnowing | FocalNet [19] | 33.77 | 0.93
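The PSNR values in Table 2 follow the standard definition 10·log10(MAX²/MSE) between the desnowed output and the clean reference (higher is better); a pure-Python sketch is below. SSIM, the table's other metric, additionally compares local luminance, contrast, and structure, and is omitted here for brevity.

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    flattened images; higher means the restored image is closer
    to the clean reference."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```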
Table 3. Comparison experiment.
Method | Desnowing | Detection | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%)
Faster-RCNN [2] | FocalNet | Faster-RCNN | 87.4 | 74.9 | 80.3 | 52.7
SSD [24] | FocalNet | SSD | 89.5 | 77.7 | 81.2 | 52.3
TOOD [25] | FocalNet | TOOD | 90.9 | 81.0 | 86.1 | 59.3
Deformable DETR [26] | FocalNet | Deformable DETR | 89.1 | 81.3 | 84.3 | 55.3
YOLOv5s | FocalNet | YOLOv5s | 90.2 | 82.7 | 84.5 | 54.3
YOLOv6s | FocalNet | YOLOv6s | 92.1 | 79.5 | 84.3 | 57.6
YOLOv7-tiny | FocalNet | YOLOv7-tiny | 92.6 | 83.8 | 86.8 | 56.1
YOLOv8s | FocalNet | YOLOv8s | 92.0 | 82.9 | 85.7 | 57.1
YOLOv9s | FocalNet | YOLOv9s | 92.0 | 83.3 | 86.4 | 58.9
YOLOv10s | FocalNet | YOLOv10s | 87.5 | 77.4 | 82.5 | 55.2
YOLOv10m | FocalNet | YOLOv10m | 82.2 | 79.5 | 82.6 | 55.6
YOLO11s | / | YOLO11s | 88.5 | 74.3 | 77.8 | 50.4
YOLO11s | FocalNet | YOLO11s | 92.6 | 81.1 | 85.3 | 57.1
YOLO11s | SMGARN | YOLO11s | 92.1 | 80.1 | 85.6 | 56.6
YOLO11s | Chen et al. | YOLO11s | 91.7 | 81.1 | 86.1 | 57.6
YOLOv12s | FocalNet | YOLOv12s | 89.8 | 79.7 | 85.7 | 56.1
Ours | FocalNet | YOLO11s | 93.5 | 85.0 | 88.6 | 60.1
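The mAP figures in Table 3 average, over classes, the area under the precision-recall curve built from confidence-ranked detections matched to ground truth at a fixed IoU threshold (0.5 for mAP@0.5; averaged over thresholds 0.5 to 0.95 for mAP@0.5:0.95). A toy all-point-interpolated AP for a single class; the IoU matching step is assumed done upstream, and the function name is ours:

```python
def average_precision(scored_hits, num_gt):
    """All-point-interpolated AP for one class.

    scored_hits: list of (confidence, is_true_positive) for every
    detection, already matched against ground truth at the chosen
    IoU threshold (e.g. 0.5).
    num_gt: number of ground-truth boxes for the class.
    """
    hits = sorted(scored_hits, key=lambda s: -s[0])  # rank by confidence
    tp = fp = 0
    points = []  # (recall, precision) after each ranked detection
    for _, is_tp in hits:
        tp, fp = tp + is_tp, fp + (not is_tp)
        points.append((tp / num_gt, tp / (tp + fp)))
    # Integrate precision over recall, using the interpolated precision
    # (best precision achieved at any recall >= the current one).
    ap, prev_recall = 0.0, 0.0
    for i, (recall, _) in enumerate(points):
        p_interp = max(prec for _, prec in points[i:])
        ap += (recall - prev_recall) * p_interp
        prev_recall = recall
    return ap
```

mAP is then the mean of this quantity over all defect classes (seven in this paper's dataset).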
Table 4. Ablation experiment.
Model | MPDIoU | DSConv (16 / 19 / 22) | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%)
A | | | 88.9 | 84.0 | 86.2 | 58.4
B | | | 89.7 | 84.8 | 86.8 | 58.4
  | | | 88.5 | 84.4 | 86.4 | 57.7
  | | | 91.6 | 83.0 | 87.0 | 58.3
C | | | 90.9 | 86.0 | 87.2 | 57.4
  | | | 91.3 | 84.6 | 88.0 | 58.5
  | | | 90.4 | 85.3 | 87.3 | 58.2
D | | | 93.5 | 85.0 | 88.6 | 60.1

Share and Cite

MDPI and ACS Style

Ding, Z.; Deng, S.; Liu, Q. Insulator Defect Detection Algorithm Based on Improved YOLO11s in Snowy Weather Environment. Symmetry 2025, 17, 1763. https://doi.org/10.3390/sym17101763
