LID-YOLO: A Lightweight Network for Insulator Defect Detection in Complex Weather Scenarios

Cao, Yangyang; Jin, Shuo; Liu, Yang

doi:10.3390/en19071640


by Yangyang Cao ¹, Shuo Jin ¹,* and Yang Liu ²

¹ Hubei Key Laboratory for High-Efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Wuhan 430068, China

² The Ultra High Voltage Branch Company, State Grid Xinjiang Electric Power Co., Ltd., Urumqi 830063, China

* Author to whom correspondence should be addressed.

Energies 2026, 19(7), 1640; https://doi.org/10.3390/en19071640

Submission received: 7 March 2026 / Accepted: 24 March 2026 / Published: 26 March 2026

(This article belongs to the Section F: Electrical Engineering)


Abstract

Ensuring the structural reliability of power transmission networks is a fundamental prerequisite for the stable operation of modern energy systems. To address the challenges posed by complex weather interference and the small scale of insulator defects during power line inspections, this paper proposes LID-YOLO, a lightweight insulator defect detection network. First, to mitigate image feature degradation caused by weather interference, we design the C3k2-CDGC module. By leveraging the input-adaptive characteristics of dynamic convolution and the spatial preservation properties of coordinate attention, this module enhances feature extraction capabilities and robustness in complex weather scenarios. Second, to address the detection challenges arising from the significant scale disparity between insulators and defects, we propose Detect-LSEAM, a detection head featuring an asymmetric decoupled architecture. This design facilitates multi-scale feature fusion while minimizing computational redundancy. Subsequently, we develop the NWD-MPDIoU hybrid loss function to balance the weights between distribution metrics and geometric constraints dynamically. This effectively mitigates gradient instability arising from boundary ambiguity and the minute size of insulator defects. Finally, we construct a synthetic multi-weather condition insulator defect dataset for training and validation. Compared to the baseline, LID-YOLO improves precision, recall, and mAP@0.5 by 1.7%, 3.6%, and 4.2%, respectively. With only 2.76 M parameters and 6.2 G FLOPs, it effectively maintains the lightweight advantage of the baseline, achieving an optimal balance between detection accuracy and computational efficiency for insulator inspections under complex weather conditions. This lightweight and robust framework provides a reliable algorithmic foundation for automated grid monitoring, supporting the continuous and resilient operation of modern energy systems.

Keywords: power line inspections; insulator defects; YOLO; object detection; complex weather

1. Introduction

Ensuring the safety and reliability of power transmission is a fundamental prerequisite for the stable operation of modern energy systems [1]. As the backbone of the power system, transmission and distribution networks are exposed to complex, harsh outdoor environments and must operate under such conditions for prolonged periods [2]. Within this critical energy infrastructure, insulators provide essential mechanical support and electrical insulation for overhead transmission lines [3]. Subject to factors such as lightning flashovers, metal corrosion, and external mechanical damage, insulators are prone to defects, including string drops, surface breakage, and flashover burns [4]. Failure to detect and address these latent hazards promptly can allow localized defects to precipitate a sharp decline in insulation performance and subsequent line tripping. In severe cases, it may even induce large-scale cascading grid failures [5]. Consequently, routine and accurate inspection of insulators on overhead transmission lines is crucial not only for ensuring a secure power supply but also for safeguarding the long-term stable operation of modern energy systems [6,7].

Driven by systemic challenges such as expanding grid scale, aging infrastructure, and increasingly frequent extreme weather events, comprehensive monitoring of cross-regional and long-distance transmission lines has become more critical than ever for safeguarding overall system reliability [8]. To address these demanding requirements, current power system inspections typically combine ground-based manual patrols with various advanced reliability diagnostic technologies. For instance, infrared and ultraviolet imaging are primary non-invasive tools for pinpointing invisible electrical faults such as thermal anomalies and corona discharges [9]. Ultrasonic testing provides deep acoustic penetration for detecting internal defects in power components [10]. Typical examples include metal fatigue in transmission hardware and material voids in composite insulators [11,12]. Light detection and ranging (LiDAR) leverages high-precision 3D point clouds to identify macroscopic mechanical deformations of grid infrastructure, such as tower tilting and line sagging [13,14]. However, these methods are often limited in detecting the aforementioned early-stage physical degradation in components such as insulators and are constrained by cost and inspection efficiency when deployed across large-scale transmission networks. Given that critical insulator failures primarily manifest as externally visible structural degradation, integrating edge-computing platforms, as exemplified by unmanned aerial vehicles (UAVs), with machine vision provides an efficient and cost-effective approach for acquiring and analyzing large volumes of high-resolution inspection images [15].

In response to the aforementioned engineering demands, vision-based insulator defect detection has evolved significantly from early traditional image processing to modern deep learning methods, substantially improving the perception of subtle defects [16]. However, as power grid inspections shift towards an “all-weather, full-coverage” model, image degradation induced by complex meteorological conditions has emerged as a critical bottleneck [17]. Table 1 compares the advantages and disadvantages of typical methods for detecting insulators and their defects.

As illustrated in Table 1, typical research still confronts critical technical bottlenecks when dealing with complex grid inspection scenarios. On the one hand, although defect-specific optimization methods perform well under ideal conditions, their feature representations lack targeted measures against severe meteorological noise, making it challenging to stably extract discriminative features of insulator defects in harsh environments. On the other hand, existing methods designed for environmental robustness exhibit their own limitations: cascaded restoration–detection frameworks often struggle with diverse meteorological degradations and incur high inference latency. Meanwhile, typical end-to-end robust models often entail relatively high computational costs that restrict their deployment on resource-constrained edge devices. Therefore, achieving both robustness against complex degraded environments and high-efficiency edge-side inference remains a core challenge that current detection models must urgently address for reliable power grid inspections and stable energy system operations.

In response to this engineering dilemma in power grid inspections, this paper proposes LID-YOLO, a lightweight and robust insulator defect detection network designed to achieve an optimal balance between computational efficiency and detection accuracy. The main contributions of this study are summarized as follows:

(1) To mitigate the feature extraction challenges induced by meteorological degradation in insulator defect detection, we propose the C3k2-CDGC adaptive feature extraction module. Under adverse weather conditions, insulator defect features are susceptible to distortion from heterogeneous degradation, compromising the reliability of autonomous grid inspections. This module leverages dynamic convolutions for kernel-level modulation of feature responses, enhancing generalization across diverse meteorological modalities. Furthermore, the integrated coordinate attention (CA) preserves critical spatial information, ensuring robust localization of subtle defects under complex environments. Meanwhile, group convolution is introduced to prevent the module from imposing an excessive computational burden, thereby ensuring the model’s lightweight design.
(2) To overcome the extreme scale variation between macroscopic insulator bodies and localized subtle defects in power grid inspection images, we design the Detect-LSEAM detection head based on an asymmetric decoupled architecture. This design integrates a lightweight separated and enhanced attention module (LSEAM) with multi-scale receptive fields to simultaneously capture both the macroscopic morphology of insulators and the fine textures of localized defects. The streamlined architecture optimizes overall computational efficiency, establishing a lightweight structural foundation.
(3) To address regression instability caused by blurred boundaries of small defects under severe weather conditions, we construct a hybrid loss function, NWD-MPDIoU, to improve the reliability of defect diagnosis. It dynamically fuses the metric advantage of normalized Wasserstein distance (NWD) in modeling small target distributions with the geometric constraint of minimum point distance intersection over union (MPDIoU). This mechanism improves the regression accuracy for small-scale and blurred-boundary insulator defects.

2. Methods

2.1. YOLOv11n Network Architecture

In modern energy systems, the reliability of overhead transmission lines is paramount. Edge devices such as unmanned aerial vehicles (UAVs) have become the primary platforms for autonomous power line inspections. However, these edge devices operate under strict size, weight, and power (SWaP) constraints [35]. Consequently, computationally intensive vision models are impractical for real-time power system inspections. YOLO-series algorithms have demonstrated widespread success in industrial inspection tasks due to their superior real-time inference capability and efficient architectural design [36,37]. Building upon these proven advantages, YOLOv11n is selected as the baseline framework to balance detection precision with hardware constraints. As the most lightweight variant in the YOLOv11 series, it facilitates highly efficient deployment due to its low computational footprint and minimal parameter count. The standard architecture of YOLOv11n is illustrated in Figure 1 [38].

Architecturally, YOLOv11n is composed of three fundamental components: a backbone for feature extraction, a neck for feature fusion, and a detection head for prediction. The backbone utilizes the C3k2 module—a cross-stage partial bottleneck architecture—for hierarchical feature extraction, followed by spatial pyramid pooling-fast (SPPF) and cross-channel partial spatial attention (C2PSA) to integrate multi-scale global context. The neck employs a path aggregation network (PAN) structure to facilitate effective cross-scale feature fusion and information flow. Finally, the detection head utilizes an anchor-free design to directly regress target coordinates and categories, while optimizing bounding box convergence through the CIoU loss function.

2.2. LID-YOLO Network Architecture

Although YOLOv11n provides an efficient computational baseline, its standard architecture exhibits limitations in handling the diverse feature degradation patterns induced by adverse meteorological conditions in power grid inspection images. Furthermore, the baseline lacks the multi-scale sensitivity required to manage the extreme physical scale disparity between macroscopic insulator bodies and localized microscopic defects during energy infrastructure inspections. Consequently, the accurate identification of subtle defects remains challenging, potentially compromising the overall reliability of automated power grid monitoring. To address these technical gaps, we propose the specifically engineered LID-YOLO model. While maintaining a lightweight footprint, LID-YOLO incorporates targeted architectural refinements to achieve robust insulator defect detection under complex meteorological interferences. The integrated architecture of LID-YOLO, featuring its specific enhancement modules, is illustrated in Figure 2.

The hierarchical feature extraction process initiates in the backbone network, where 640 × 640 input images are progressively downsampled through standard convolutional layers utilizing a 3 × 3 kernel and a stride of 2. To mitigate feature degradation induced by complex meteorological interferences, C3k2-CDGC modules are integrated across four distinct scales to enhance the network’s adaptive extraction capabilities for subtle defect features. After successive downsampling stages, the backbone employs SPPF and C2PSA modules to broaden the receptive field, ultimately yielding three scales of high-level semantic feature maps with dimensions of 80 × 80, 40 × 40, and 20 × 20.
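As a quick arithmetic check of the sizes quoted above, the sketch below assumes that the three output maps are taken after three, four, and five stride-2 stages. The stage counts are not stated in the text; they are inferred from the quoted sizes, and the function name is hypothetical.

```python
def feature_map_sizes(input_size=640, stride2_stages=(3, 4, 5)):
    """Side length of a square feature map after n stride-2 stages.

    The stage counts are an assumption inferred from the quoted sizes.
    """
    return [input_size // (2 ** n) for n in stride2_stages]


print(feature_map_sizes())  # [80, 40, 20]
```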

Building upon the extracted features, the neck module orchestrates multi-scale fusion through bidirectional pathways. In the top–down pathway, deep low-resolution features undergo a 2× upsampling before being concatenated with intermediate features from the backbone. Conversely, the bottom-up pathway downsamples shallow high-resolution features via standard convolutions to integrate them with higher-level semantic information. The C3k2-CDGC modules are specifically reintegrated within these fusion pathways to maintain feature consistency. This ensures that the feature representations of subtle defects effectively resist meteorological degradation during the cross-scale integration process. Through successive cross-scale sampling, convolutions, and concatenation, the neck robustly aggregates multi-level semantics, leveraging the C3k2-CDGC modules. The neck ultimately yields fused feature maps at three scales: 80 × 80 for tiny defects, 40 × 40 for medium targets, and 20 × 20 for larger macro-structures.

Finally, we propose the Detect-LSEAM detection head to strengthen multi-scale prediction, given the extreme size disparity between macroscopic insulator bodies and their microscopic defects in power system inspections. Specifically, the head receives the three fused feature maps from the neck, with dimensions of 80 × 80, 40 × 40, and 20 × 20, as its primary inputs. As a unified detection block, it embeds the LSEAM mechanism to adaptively refine the multi-scale features for final defect classification and localization. During the training phase, bounding box regression is further optimized through the proposed NWD-MPDIoU loss function to enhance localization accuracy for subtle defects.

2.2.1. C3k2-CDGC Module

In the standard YOLOv11n architecture, the C3k2 module utilizes a cross-stage partial (CSP) structure to facilitate efficient gradient propagation [39]. However, its core bottleneck block relies exclusively on standard 3 × 3 static convolutions. In practical outdoor power grid inspection environments, diverse meteorological scenarios and varying illumination exhibit highly dynamic and time-varying characteristics. Without an input-adaptive mechanism, static convolutions with fixed parameters exhibit limitations in accommodating diverse modalities of feature degradation. Furthermore, the limited 3 × 3 receptive field is highly susceptible to being overwhelmed by environmental noise when encountering localized rain or snow occlusions on insulators. The confluence of these structural limitations severely impedes the extraction of discriminative features for microscopic insulator defects in automated power line diagnostics.

To address these limitations, we propose the C3k2-CDGC module, the detailed structure of which is illustrated in Figure 3. Specifically, we engineer a dynamic group convolution and incorporate coordinate attention to architecturally reconstruct the core bottleneck block of the original C3k2 module.

By implementing the CSP partitioning strategy, the C3k2-CDGC module reconfigures the feature flow of the input data. Initially, the module splits the input feature map with $C_{in}$ channels into two parallel branches, each containing $C_{in}/2$ channels. One branch serves as a cross-stage shortcut, routed directly to the final concatenation layer to preserve original gradient information, while the main branch sequentially passes through $N$ redesigned bottleneck blocks for deep feature extraction. To optimize the integration of multi-level semantic information, the module aggregates feature maps from the shortcut branch, the initial state of the main branch, and the outputs of all $N$ bottleneck blocks. The concatenation layer then aggregates these feature maps along the channel dimension, forming a comprehensive feature representation with a total of $(N + 2) \times C_{in}/2$ channels. Within this redesigned bottleneck architecture, we introduce dynamic group convolution (DGC) to replace the standard convolution, as illustrated in Figure 4.

The DGC structure comprises a weight generation branch and a feature extraction path. In practical power grid inspections using edge-based platforms, adverse meteorological interferences often cause target features to degrade across diverse modalities and varying intensities. To address this challenge, the weight generation branch first employs global average pooling (GAPool) to aggregate global contextual statistics from the input features. Subsequently, it generates $M$ normalized weight coefficients $\pi_{m}$ ($M = 4$ in this study) via linear mapping, where each coefficient corresponds to a specific static convolution kernel $W_{m}$. These coefficients are used to perform a linear weighted aggregation in the parameter space, synthesizing a dynamic convolution kernel $W$ that adapts to current input features, as defined by:

$$W = \sum_{m=1}^{M} \pi_{m} W_{m} \tag{1}$$
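The kernel aggregation in Equation (1) can be sketched in a few lines of plain Python. This is only an illustration: the text does not say how the coefficients are normalized, so the softmax below is an assumption, and the kernel values and logits are hypothetical placeholders for learned quantities.

```python
import math


def softmax(logits):
    """Normalize raw scores into positive weights that sum to 1 (assumed normalization)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_kernels(weights, kernels):
    """Compute W = sum_m pi_m * W_m for equally shaped 2D kernels."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [
        [sum(p * k[r][c] for p, k in zip(weights, kernels)) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical example: M = 4 static 5x5 kernels filled with the constants 1..4.
static_kernels = [[[float(m)] * 5 for _ in range(5)] for m in range(1, 5)]
# Hypothetical output of the linear mapping applied to the pooled input features.
logits = [0.2, -1.0, 0.5, 0.1]
pi = softmax(logits)
dynamic_kernel = aggregate_kernels(pi, static_kernels)
```

Because the coefficients depend on the pooled input, a different image yields a different `dynamic_kernel`, while the four static kernels stay fixed after training.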

Notably, we set the spatial size of the static kernels $W_{m}$ to 5 × 5. This enlarged receptive field effectively mitigates localized noise blind spots induced by persistent rain and snow, which frequently obscure defect features, enabling the model to capture a more comprehensive morphology of insulator defects during power grid inspections. Furthermore, to accommodate the restricted computational resources of edge platforms, the synthesized dynamic kernel $W$ is implemented via group convolution (where the number of groups $g = 4$ in this study) within the feature extraction path to optimize the computational footprint. This design effectively manages the computational overhead while introducing input-adaptive filtering and enlarging the receptive field, promoting feature robustness and computational efficiency under complex weather conditions.

Although a larger receptive field improves noise robustness, it can blur the spatial location cues of tiny defects. To compensate for this loss of spatial precision, the module integrates the CA mechanism immediately after the feature extraction path [40]. Its structure is illustrated in Figure 5.

Unlike conventional attention mechanisms that may lose explicit spatial coordinate cues, CA explicitly encodes positional information into channel features. The module performs 1D global average pooling along the horizontal (X) and vertical (Y) directions on the input features with $C_{in}/2$ channels, generating direction-aware feature encodings. The two encoding tensors are then concatenated and fed into a shared convolution layer for feature interaction and channel reduction, producing a compact latent representation of size $(H + W) \times 1 \times C_{in}/4$. After splitting the latent representation into horizontal and vertical tensors, the module restores their channel dimension to $C_{in}/2$ via separate convolution transformations and sigmoid activations, thereby generating spatial attention weights. Finally, these weights are applied to the original features via a broadcasting mechanism. This process provides explicit coordinate guidance while suppressing environmental noise, enabling the model to focus on the insulator strings and their tiny defect regions within power grid images.
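The directional pooling and broadcast reweighting steps can be illustrated on a single-channel map. The sketch below is deliberately simplified and is not the module itself: the shared convolution, the channel reduction, and the two restoring convolutions are replaced by the identity (an assumption made to keep the example short), so the gates are just sigmoids of the pooled values.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def directional_pool(feature):
    """Average a single-channel H x W map along each axis.

    Returns (per_row, per_col): per_row[i] averages row i across the width,
    per_col[j] averages column j across the height.
    """
    height, width = len(feature), len(feature[0])
    per_row = [sum(row) / width for row in feature]
    per_col = [sum(feature[i][j] for i in range(height)) / height for j in range(width)]
    return per_row, per_col


def coordinate_reweight(feature):
    """Scale each position by the gate of its row and the gate of its column."""
    per_row, per_col = directional_pool(feature)
    # Assumption: the learned convolutions are omitted (identity transform).
    row_gate = [sigmoid(v) for v in per_row]
    col_gate = [sigmoid(v) for v in per_col]
    return [
        [feature[i][j] * row_gate[i] * col_gate[j] for j in range(len(col_gate))]
        for i in range(len(row_gate))
    ]


# Hypothetical 3x3 map with one strong response in the center.
feature = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
out = coordinate_reweight(feature)
```

The row and column gates together localize the response: a position is emphasized only when both its row and its column carry strong pooled activations.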

2.2.2. Detect-LSEAM

In practical transmission line inspections, insulator defect detection faces an extreme scale disparity. The insulator structure is large, whereas surface breakage defects are often tiny and concealed. For defects such as string drops, which are characterized by missing local features, the model cannot rely solely on local textures; instead, it must reason from contextual cues such as the periodicity and overall structural integrity of the insulator string. However, the baseline YOLOv11n head mainly uses single-scale convolution layers for feature mapping, limiting the model’s ability to extract fine-grained local features while capturing large-scale global structures.

To address this limitation, we propose the Detect-LSEAM detection head, whose detailed structure is illustrated in Figure 6. This design inherits an asymmetric decoupled architecture and integrates LSEAM to facilitate accurate cross-scale perception, ranging from microscopic breakage to macroscopic structural absence under complex backgrounds.

Given an input feature map of size $H \times W \times C_{in}$, Detect-LSEAM performs asymmetric initial feature mapping to meet the distinct requirements of classification and regression.

The regression branch relies on geometric cues such as edges for bounding box prediction; thus, a standard convolution is employed to preserve high-resolution spatial details. In contrast, the classification branch emphasizes abstract semantics and is less sensitive to spatial precision; therefore, depthwise separable convolution (DSConv) is adopted for feature extraction. This asymmetric design alleviates the computational pressure on edge platforms when processing high-resolution feature maps while maintaining localization accuracy. Following this stage, the channel dimensions of the feature maps from both branches are uniformly reduced, resulting in a unified output size of $H \times W \times 64$.

These unified feature maps are then fed into the LSEAM for multi-scale contextual enhancement. As shown in the dashed box in Figure 6, LSEAM routes the input $H \times W \times 64$ features into 3 × 3, 5 × 5, and 7 × 7 DSConv paths, respectively. These three branches perform spatial downsampling on the features at different scales, thereby achieving a hierarchical receptive field. Smaller kernels focus on capturing local textures of tiny breakage defects. In contrast, the 7 × 7 kernel covers a broader spatial extent to extract the global context of the insulator string, supporting the identification of large-scale structural absence.

Features from different scales are compressed into channel descriptors using adaptive average pooling (AAPool) and then fused by averaging. A fully connected network (FC) and a sigmoid function then generate global channel weights in the range of [0, 1]. These weights recalibrate the input features via element-wise multiplication, thereby suppressing noise from complex environmental backgrounds. The recalibrated robust features are then sent to the final convolutional layers and mapped to their corresponding classification and localization outputs.
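The channel recalibration described above can be sketched as follows. This is an illustration under stated assumptions rather than the LSEAM implementation: the three branch outputs are stand-ins for the DSConv paths, the fully connected network is reduced to a single hypothetical linear layer, and all function names are invented for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def global_average(channel_map):
    """Collapse one H x W channel to a single scalar descriptor."""
    return sum(sum(row) for row in channel_map) / (len(channel_map) * len(channel_map[0]))


def channel_weights(branch_outputs, fc_weight, fc_bias):
    """Average the per-branch channel descriptors, then apply FC + sigmoid."""
    descriptors = [[global_average(ch) for ch in branch] for branch in branch_outputs]
    n_channels = len(descriptors[0])
    fused = [sum(d[c] for d in descriptors) / len(descriptors) for c in range(n_channels)]
    return [
        sigmoid(sum(w * f for w, f in zip(fc_weight[c], fused)) + fc_bias[c])
        for c in range(n_channels)
    ]


def recalibrate(features, weights):
    """Multiply every value of channel c by weights[c]."""
    return [[[v * w for v in row] for row in ch] for ch, w in zip(features, weights)]


# Hypothetical input with 2 channels of size 2x2.
features = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
# Stand-ins for the 3x3, 5x5, and 7x7 branch outputs (assumed equal here).
branches = [features, features, features]
# Hypothetical FC parameters: identity weights and zero bias.
weights = channel_weights(branches, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
recalibrated = recalibrate(features, weights)
```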

2.2.3. NWD-MPDIoU

The baseline YOLOv11n model employs the complete intersection over union (CIoU) loss function for bounding box regression. Formally, it is defined as follows:

$$L_{\mathrm{CIoU}} = 1 - \mathrm{IoU} + \frac{\rho^{2}(b^{pred}, b^{gt})}{c^{2}} + \alpha v \tag{2}$$

where $\rho(b^{pred}, b^{gt})$ denotes the Euclidean distance between the center points of the predicted box $b^{pred}$ and the ground truth box $b^{gt}$, while $c$ represents the diagonal length of the smallest enclosing box covering both boxes. The term $\alpha v$ serves as the aspect ratio consistency penalty, defined as follows:

$$v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w^{pred}}{h^{pred}} \right)^{2} \tag{3}$$

$$\alpha = \frac{v}{1 - \mathrm{IoU} + v} \tag{4}$$

Here, $w^{pred}$ and $h^{pred}$ denote the width and height of the predicted box, while $w^{gt}$ and $h^{gt}$ represent the corresponding dimensions of the ground truth box.
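For reference, Equations (2) to (4) can be evaluated with the short sketch below. It assumes boxes in (x1, y1, x2, y2) corner format with positive width and height, which is a convention chosen for this example; the guard for a zero denominator in α is likewise an assumption. The example boxes are hypothetical and compare the same 2-pixel shift applied to a 4 × 4 box and to a 40 × 40 box.

```python
import math


def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def ciou_loss(pred, gt):
    """CIoU loss following Equations (2)-(4)."""
    overlap = iou(pred, gt)
    # rho^2: squared distance between the two box centers.
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
        + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    # c^2: squared diagonal of the smallest enclosing box.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # v: aspect-ratio consistency term.
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    denom = 1 - overlap + v
    alpha = v / denom if denom > 0 else 0.0  # guard for identical boxes (assumption)
    return 1 - overlap + rho2 / c2 + alpha * v


# The same 2-pixel horizontal shift applied to a small and a large box.
small_iou = iou((2, 0, 6, 4), (0, 0, 4, 4))
large_iou = iou((2, 0, 42, 40), (0, 0, 40, 40))
small_loss = ciou_loss((2, 0, 6, 4), (0, 0, 4, 4))
large_loss = ciou_loss((2, 0, 42, 40), (0, 0, 40, 40))
```

In this example the IoU of the small box drops to one third, while the large box keeps an IoU above 0.9 for the same shift.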

In insulator defect detection tasks, objects exhibit significant scale disparities: insulators are relatively large, whereas defects are typically small. For such small objects, although CIoU mitigates the vanishing gradient problem associated with non-overlapping boxes, it remains heavily dependent on geometric overlap and boundary distance, which causes drastic IoU fluctuations from even minor positional deviations. Due to this dependency, when dealing with targets possessing extremely limited pixel coverage, CIoU fails to provide smooth and stable regression gradients. Moreover, image degradation induced by weather conditions not only blurs the texture details of insulator surfaces and defects but also reduces edge contrast. In such scenarios, the geometric regression mechanism of CIoU proves inadequate for handling boundary ambiguity, ultimately compromising detection accuracy.

Motivated by these limitations, we propose a hybrid loss function named NWD-MPDIoU, which effectively integrates distribution metrics with geometric regression by employing a dynamic balancing coefficient. The total loss is formulated as follows:

$$L_{Total} = \beta \cdot L_{MPDIoU} + (1 - \beta) \cdot L_{NWD} \tag{5}$$

In this formulation, $L_{MPDIoU}$ and $L_{NWD}$ represent the geometric regression loss and the distribution metric loss, respectively. The balancing coefficient $\beta$ is dynamically defined as the explicit IoU value between the prediction and the ground truth ($\beta = \mathrm{IoU}$). Consequently, in scenarios characterized by low IoU, particularly for small or occluded objects, a smaller $\beta$ ensures that the loss function is primarily dominated by $L_{NWD}$. Specifically, a bounding box, denoted as $R = (x, y, w, h)$, is modeled as a two-dimensional Gaussian distribution $\mathcal{N}(\mu, \Sigma)$ defined as follows [41]:

$$\mu = \begin{bmatrix} x \\ y \end{bmatrix} \tag{6}$$

$$\Sigma = \begin{bmatrix} \frac{w^{2}}{4} & 0 \\ 0 & \frac{h^{2}}{4} \end{bmatrix} \tag{7}$$

Here, $(x, y)$ denote the center coordinates, while $w$ and $h$ represent the width and height of the bounding box, respectively. By exploiting the spatial continuity of 2D Gaussian distributions, this approach ensures that the similarity metric changes continuously with positional deviations, thereby yielding smoother regression gradients and avoiding the abrupt fluctuations inherent in geometric overlap calculations. Moreover, unlike methods requiring strict boundary alignment, NWD quantifies distribution similarity, which grants high tolerance for ambiguous boundaries and enables effective optimization for small objects even when geometric overlap is absent, effectively addressing the instability of CIoU. The calculation of $L_{NWD}$ is defined as:

$$L_{NWD} = 1 - \frac{1}{1 + \frac{W_{2}^{2}}{S}} \tag{8}$$

where $S$ represents the area of the ground truth box, and $W_{2}^{2}$ denotes the squared second-order Wasserstein distance between the Gaussian distributions of the predicted and ground truth boxes. Employing $S$ for normalization ensures scale invariance. This normalization effectively mitigates the impact of absolute scale differences, preventing the optimization process from being dominated by the large magnitude of Wasserstein distances associated with large insulators.
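For the axis-aligned Gaussians of Equations (6) and (7), the squared second-order Wasserstein distance has a standard closed form, written out below as a worked step. The subscripts $p$ and $g$ for the predicted and ground truth boxes are notation introduced here for illustration and do not appear in the original text.

```latex
W_{2}^{2}\left(\mathcal{N}_{p}, \mathcal{N}_{g}\right)
  = \left\| \mu_{p} - \mu_{g} \right\|_{2}^{2}
  + \left\| \Sigma_{p}^{1/2} - \Sigma_{g}^{1/2} \right\|_{F}^{2}
  = (x_{p} - x_{g})^{2} + (y_{p} - y_{g})^{2}
  + \frac{(w_{p} - w_{g})^{2}}{4} + \frac{(h_{p} - h_{g})^{2}}{4}
```

Every term is a smooth function of the box parameters, so the distance stays informative even when the two boxes do not overlap.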

As the IoU increases, the corresponding rise in $\beta$ amplifies the influence of $L_{MPDIoU}$. A high IoU indicates that the object’s coarse localization is established. At this stage, the model leverages the corner constraints provided by MPDIoU to refine the bounding box, thereby minimizing localization deviation. The formulation of $L_{MPDIoU}$ is given by [42]:

$$L_{MPDIoU} = 1 - \left( \mathrm{IoU} - \frac{d_{1}^{2}}{w_{c}^{2} + h_{c}^{2}} - \frac{d_{2}^{2}}{w_{c}^{2} + h_{c}^{2}} \right) \tag{9}$$

Here, $d_{1}^{2}$ denotes the squared Euclidean distance between the top-left corners of the predicted and ground truth boxes, and $d_{2}^{2}$ denotes the squared Euclidean distance between their bottom-right corners. The parameters $w_{c}$ and $h_{c}$ represent the width and height of the smallest enclosing box, serving as normalization factors for these distance metrics. Finally, Figure 7 presents a schematic comparison between the traditional CIoU and our proposed NWD-MPDIoU.

As depicted in Figure 7, compared to the rigid geometric penalties of CIoU, the proposed NWD-MPDIoU dynamically integrates two highly complementary mechanisms. The concentric ellipses represent the Gaussian distributions used by NWD to maintain optimization continuity for small or non-overlapping targets. Simultaneously, the corner distances illustrate the rigorous geometric constraints imposed by MPDIoU during late-stage alignment. This dynamic complementary mechanism effectively mitigates the geometric inflexibility of traditional CIoU, thereby enhancing the model’s detection robustness under extreme scale variations and adverse weather conditions for insulator defect recognition.
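A minimal sketch of the hybrid loss in Equations (5), (8), and (9) is given below. It assumes boxes in (x1, y1, x2, y2) corner format and uses the standard closed form of the squared Wasserstein distance for axis-aligned Gaussians; the function names are hypothetical, and β is treated as a plain number because the text does not say whether it is detached from the gradient during training.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def to_center(box):
    """Convert (x1, y1, x2, y2) to (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1


def nwd_loss(pred, gt):
    """Equation (8): 1 - 1 / (1 + W2^2 / S), with S the ground truth area."""
    px, py, pw, ph = to_center(pred)
    gx, gy, gw, gh = to_center(gt)
    w2_sq = (px - gx) ** 2 + (py - gy) ** 2 + (pw - gw) ** 2 / 4 + (ph - gh) ** 2 / 4
    return 1 - 1 / (1 + w2_sq / (gw * gh))


def mpdiou_loss(pred, gt):
    """Equation (9), normalized by the smallest enclosing box."""
    d1_sq = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d2_sq = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    wc = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hc = max(pred[3], gt[3]) - min(pred[1], gt[1])
    norm = wc ** 2 + hc ** 2
    return 1 - (iou(pred, gt) - d1_sq / norm - d2_sq / norm)


def hybrid_loss(pred, gt):
    """Equation (5) with beta set to the IoU of the two boxes."""
    beta = iou(pred, gt)
    return beta * mpdiou_loss(pred, gt) + (1 - beta) * nwd_loss(pred, gt)


# Hypothetical 4x4 ground truth and two non-overlapping predictions.
gt_box = (10, 10, 14, 14)
near = hybrid_loss((16, 10, 20, 14), gt_box)
far = hybrid_loss((20, 10, 24, 14), gt_box)
```

Both predictions have zero IoU, so the loss reduces to the NWD term, and it still decreases as the prediction moves toward the ground truth.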

3. Implementation and Analysis

3.1. Dataset

To address the scarcity of real-world power grid insulator defect datasets under complex weather conditions, we initially sourced 1788 high-quality real-world images from multiple publicly available datasets [43,44], captured under normal weather conditions, to serve as the foundation for a synthetic dataset. Specifically, Figure 8a details the number of instances across different categories in the real-world images, whereas Figure 8b illustrates the normalized width and height distribution of all annotated bounding boxes. As indicated by Figure 8b, the target instances in the dataset predominantly exhibit small scales and high aspect ratios. To provide a comprehensive profile of the image quality alongside these geometric characteristics, Figure 8c further details the spatial resolution distribution of the foundation real-world images: resolutions under 1200 px account for 21.73%, between 1200 and 1920 px account for 62.25%, and over 1920 px account for 16.02%. The results indicate that the image resolutions are primarily concentrated in the common range near HD/FHD, while also covering samples with both lower and higher resolutions.

We partitioned the dataset into training, validation, and testing sets with a ratio of 7:2:1. Subsequently, offline augmentation was employed by randomly applying one of four typical weather effects—rain, snow, fog, or low-light conditions—to the original samples, expanding the dataset to a total of 3576 images. The simulation methods for these meteorological scenarios are detailed as follows:

(1): Simulation of Rainy Conditions

To simulate the visual degradation induced by rain, an additive noise model was adopted. Rainy samples were synthesized by superimposing a layer of directional rain streaks onto the original images. The mathematical formulation is expressed as:

$$I_{\mathrm{rain}} = I_{\mathrm{original}} + \lambda \left( G_{\sigma} * M_{\mathrm{rain}} \right) + \delta \tag{10}$$

Here, $I_{\mathrm{rain}}$ and $I_{\mathrm{original}}$ denote the synthesized rainy image and the original image, respectively. $M_{\mathrm{rain}}$ represents the initial linear rain streak mask, which is generated using randomized orientations $\theta \in [70^{\circ}, 110^{\circ}]$ to model varying wind-driven rainfall. $G_{\sigma}$ denotes a Gaussian blur kernel with a standard deviation of $\sigma$, used to simulate the motion blur induced by high-velocity raindrops. The symbol $*$ represents the convolution operation. Additionally, $\lambda$ serves as the rain streak intensity coefficient, randomly sampled from [0.1, 0.4] to reflect diverse precipitation levels, and $\delta$ introduces random noise following a Gaussian (0, 0.01) distribution to model environmental lighting perturbations.
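As a minimal sketch of Equation (10), the following NumPy code synthesizes a rainy sample from an image scaled to [0, 1]. The function names, the streak count and length, the default σ = 1.0, the reading of 0.01 as the standard deviation of δ, and the final clipping step are illustrative assumptions; none of them is specified in the text.

```python
import numpy as np


def gaussian_blur(mask: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur G_sigma * M (assumes the mask is larger than the kernel)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()

    def blur_1d(v: np.ndarray) -> np.ndarray:
        return np.convolve(v, kernel, mode="same")

    rows = np.apply_along_axis(blur_1d, 1, mask)
    return np.apply_along_axis(blur_1d, 0, rows)


def rain_streak_mask(h, w, theta_deg, rng, n_streaks=200, length=15):
    """Binary mask M_rain of straight streaks at angle theta (hypothetical helper)."""
    mask = np.zeros((h, w), dtype=np.float64)
    theta = np.deg2rad(theta_deg)
    steps = np.arange(length)
    for _ in range(n_streaks):
        x0, y0 = rng.uniform(0, w), rng.uniform(0, h)
        xs = np.floor(x0 + steps * np.cos(theta)).astype(int)
        ys = np.floor(y0 + steps * np.sin(theta)).astype(int)
        inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
        mask[ys[inside], xs[inside]] = 1.0
    return mask


def add_rain(image: np.ndarray, rng: np.random.Generator, sigma: float = 1.0) -> np.ndarray:
    """I_rain = I_original + lambda * (G_sigma * M_rain) + delta, clipped to [0, 1]."""
    h, w = image.shape[:2]
    theta = rng.uniform(70.0, 110.0)   # streak orientation in degrees
    lam = rng.uniform(0.1, 0.4)        # rain streak intensity coefficient
    streaks = gaussian_blur(rain_streak_mask(h, w, theta, rng), sigma)
    if image.ndim == 3:
        streaks = streaks[..., None]   # broadcast over colour channels
    delta = rng.normal(0.0, 0.01, size=image.shape)  # assumed std = 0.01
    return np.clip(image + lam * streaks + delta, 0.0, 1.0)
```

Because the streak term is non-negative, the synthesized image is on average brighter than the input, which matches the additive character of the model.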

(2): Simulation of Snowy Conditions

Snowflakes are typically opaque or semi-transparent solid particles that cause spatial occlusion of insulators and their defects [45]. To mimic this characteristic, a physical occlusion model was employed to simulate the random spatial distribution of snowflakes. The synthesis formula is expressed as:

$$I_{\mathrm{snow}} = I_{\mathrm{original}} \left( 1 - M_{\mathrm{snow}} \right) + M_{\mathrm{snow}} \times L_{\mathrm{snow}} + \delta \tag{11}$$

where $M_{\mathrm{snow}}$ denotes the randomly generated snowflake mask, characterized by a coverage density $\eta \in [0.01, 0.05]$ to represent varying snowfall intensities. $L_{\mathrm{snow}}$ represents the snowflake luminance intensity, assigned values in the range [0.8, 1.0] to simulate the high reflectivity characteristic of snowflakes.
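The occlusion model in Equation (11) can be sketched as follows for an image scaled to [0, 1]. This is an illustration under stated assumptions: flakes are single pixels drawn independently with probability η, the luminance is sampled per pixel, δ is Gaussian noise with an assumed standard deviation of 0.01, and the result is clipped. Flake size and shape are not described in the text.

```python
import numpy as np


def add_snow(image: np.ndarray, rng: np.random.Generator):
    """I_snow = I_original * (1 - M_snow) + M_snow * L_snow + delta.

    Returns the synthesized image and the 2-D binary snowflake mask.
    """
    h, w = image.shape[:2]
    eta = rng.uniform(0.01, 0.05)                          # coverage density
    mask = (rng.random((h, w)) < eta).astype(np.float64)   # M_snow (assumed 1-pixel flakes)
    lum = rng.uniform(0.8, 1.0, size=(h, w))               # L_snow
    m, l_snow = mask, lum
    if image.ndim == 3:
        m, l_snow = mask[..., None], lum[..., None]        # broadcast over channels
    delta = rng.normal(0.0, 0.01, size=image.shape)        # assumed std = 0.01
    snowy = image * (1.0 - m) + m * l_snow + delta
    return np.clip(snowy, 0.0, 1.0), mask
```

Where the mask equals 1 the original pixel is fully replaced by a bright value, which is what distinguishes this occlusion model from the additive rain model.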

(3): Simulation of Foggy Conditions

Fog primarily induces contrast reduction and color shifts [46]. To simulate this visual degradation, the standard atmospheric scattering model was adopted, expressed as:

$$I_{\mathrm{fog}} = I_{\mathrm{original}} \cdot t + L_{\mathrm{atm}} \left( 1 - t \right) \tag{12}$$

where $I_{\mathrm{fog}}$ represents the synthesized foggy image, $L_{\mathrm{atm}}$ denotes the global atmospheric light sampled from [0.7, 0.9] to represent typical overcast illuminance, and $t$ refers to the medium transmission. Given that monocular images lack explicit depth information, a vertical gradient was employed as a proxy for depth $z$. Consequently, the transmission $t$ is formulated as $t = e^{-\tau z}$, where the scattering coefficient $\tau \in [1.5, 3.0]$ controls the fog density. This approach approximates the physical behavior that atmospheric scattering effects intensify as the observation distance between the optical sensor and the insulator target increases.
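A minimal sketch of Equation (12) with the vertical depth proxy is given below. The text does not state the direction or scale of the gradient, so the sketch assumes that $z$ falls linearly from 1 at the top row to 0 at the bottom row, and that the image is scaled to [0, 1]; the function name is hypothetical.

```python
import numpy as np


def add_fog(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """I_fog = I_original * t + L_atm * (1 - t), with t = exp(-tau * z)."""
    h = image.shape[0]
    tau = rng.uniform(1.5, 3.0)     # scattering coefficient
    l_atm = rng.uniform(0.7, 0.9)   # global atmospheric light
    z = np.linspace(1.0, 0.0, h)    # assumed depth proxy: top row is farthest
    t = np.exp(-tau * z)            # transmission per image row
    t = t.reshape((h,) + (1,) * (image.ndim - 1))  # broadcast over width/channels
    return image * t + l_atm * (1.0 - t)
```

With this convention the bottom row is left unchanged ($t = 1$), while the top row is pulled most strongly toward the atmospheric light.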

(4): Simulation of Low-light Conditions

To simulate visibility degradation and the loss of detail stemming from insufficient illumination during dusk, dawn, or heavily overcast conditions in power grid monitoring, linear scaling was applied to the original images. This process effectively replicates the imaging characteristics of optical sensor underexposure in edge devices. The mathematical model is expressed as:

$$I_{\mathrm{low}} = I_{\mathrm{original}} \cdot \gamma \tag{13}$$

where $I_{\mathrm{low}}$ represents the synthesized low-light image, and $\gamma$ denotes the luminance attenuation coefficient, randomly sampled from [0.3, 0.6] to emulate varying degrees of illumination deficiency without completely obliterating structural features.
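Note that $\gamma$ in Equation (13) is a linear gain rather than a gamma-correction exponent. A short sketch for 8-bit images follows; the rounding and the conversion back to `uint8` are assumptions about how the scaled image might be stored, and the function name is hypothetical.

```python
import numpy as np


def add_low_light(image_u8: np.ndarray, rng: np.random.Generator):
    """I_low = I_original * gamma for an 8-bit image; returns (image, gamma)."""
    gamma = rng.uniform(0.3, 0.6)                  # luminance attenuation coefficient
    dark = image_u8.astype(np.float32) * gamma     # linear scaling of every pixel
    return np.clip(np.rint(dark), 0, 255).astype(np.uint8), gamma
```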

As observed in Figure 9, the augmented samples replicate the visual characteristics of real-world meteorological conditions. Specifically, rainy images display clear directional streaks and motion blur, while snowy samples demonstrate realistic spatial occlusion and contrast reduction. Foggy scenarios exhibit a notable whitening effect with blurred boundaries, and low-light images effectively reflect the texture loss caused by underexposure.
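The 7:2:1 partition and the one-effect-per-image augmentation described in this subsection can be sketched with the standard library alone. The rounding rule, the fixed seed, the file names, and the choice to plan exactly one augmented copy per original image are assumptions made for illustration; the text only states the ratio and the final total of 3576 images.

```python
import random

WEATHER_EFFECTS = ("rain", "snow", "fog", "low_light")


def split_7_2_1(items, seed=0):
    """Shuffle and split into train/val/test with an approximate 7:2:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # remainder goes to the test set
    return train, val, test


def plan_augmentation(items, seed=0):
    """Assign one randomly chosen weather effect to each original image."""
    rng = random.Random(seed)
    return [(name, rng.choice(WEATHER_EFFECTS)) for name in items]


# Hypothetical file names standing in for the 1788 real-world images.
names = [f"img_{i:04d}.jpg" for i in range(1788)]
train, val, test = split_7_2_1(names)
plan = plan_augmentation(names)
print(len(train), len(val), len(test), len(names) + len(plan))
```

Generating each augmented copy inside the subset of its source image keeps an original and its weather-degraded version from landing in different subsets.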

3.2. Experimental Setup and Evaluation Metrics

All experiments were conducted on the Ubuntu 20.04 operating system. The implementation was developed using Python 3.10 and the PyTorch 2.3.0 deep learning framework. Training was accelerated by an NVIDIA GeForce RTX 4090 GPU configured with CUDA 12.1. The key hyperparameter configurations adopted for training are summarized in Table 2.

To comprehensively evaluate detection performance for insulator defects, precision (P), recall (R), and mean average precision (mAP) were employed as performance metrics. Additionally, to assess the model’s computational complexity and lightweight characteristics for edge deployment, parameters (Params) and floating point operations (FLOPs) were utilized. The specific calculations for precision and recall are defined as follows:

$$P = \frac{TP}{TP + FP} \tag{14}$$

$$R = \frac{TP}{TP + FN} \tag{15}$$

where TP (True Positive) denotes the number of correctly detected positive samples, FP (False Positive) refers to negative samples misclassified as positive, and FN (False Negative) represents missed positive samples. Furthermore, the mAP@0.5 metric assesses the overall detection performance across multiple classes and is formulated as:

$$\mathrm{mAP@0.5} = \frac{1}{k} \sum_{i=1}^{k} AP_{i}\left( \mathrm{IoU} = 0.5 \right) \tag{16}$$

where $k$ represents the total number of classes, and $AP_{i}$ denotes the Average Precision for the $i$-th class at an IoU threshold of 0.5.
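Equations (14) to (16) translate directly into code. In the sketch below the counts and per-class AP values are made-up numbers chosen only to exercise the functions; they are not results from this study, and the computation of each $AP_{i}$ from its precision-recall curve is left out.

```python
def precision(tp: int, fp: int) -> float:
    """P = TP / (TP + FP); returns 0.0 when nothing was predicted."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """R = TP / (TP + FN); returns 0.0 when there are no ground-truth objects."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mean_ap(ap_per_class: dict) -> float:
    """mAP@0.5 = mean of the per-class AP values measured at IoU = 0.5."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical counts and AP values, for illustration only.
print(precision(tp=90, fp=10))                   # 0.9
print(recall(tp=90, fn=30))                      # 0.75
print(mean_ap({"class_a": 0.8, "class_b": 0.9}))
```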

3.3. Ablation Studies and Detailed Analysis

To systematically validate how each proposed component of the LID-YOLO framework addresses specific bottlenecks in insulator defect detection under complex environments, we conducted comprehensive ablation studies under identical experimental settings. The quantitative results, which reflect the model’s reliability in identifying insulator defects, are detailed in Table 3. Here, Model A denotes the integration of the C3k2-CDGC feature extraction module; Model B represents the addition of the Detect-LSEAM detection head; and Model C indicates the adoption of the NWD-MPDIoU hybrid loss function.

As presented in Table 3, integrating the C3k2-CDGC module into the baseline model yields a 2.3% increase in mAP@0.5. This substantial improvement in recognition accuracy for insulator defects verifies that the proposed dynamic grouped convolution, combined with the coordinate attention mechanism, effectively extracts critical defect features when dealing with feature degradation under complex weather conditions. Furthermore, despite the introduction of dynamic convolution and coordinate attention mechanisms, the computational cost and parameter count only slightly increase by 0.4 G (FLOPs) and 0.1 M (Params), respectively. This efficiency is attributed to the module’s grouped structure, which partially offsets the additional computational overhead introduced by these dynamic and attention components.

When the Detect-LSEAM head is deployed independently, mAP@0.5 improves by 1.9%, accompanied by a notable rise in precision (from 87.9% to 89.2%). By utilizing the lightweight LSEAM mechanism, this module enhances the discriminative power between the macroscopic insulator body and its localized microscopic defects, while simultaneously reducing FLOPs by 0.6 G. This computational reduction further solidifies the model’s viability for UAV-based edge inference. Finally, training with the improved NWD-MPDIoU loss function leads to a 1.3% increase in mAP@0.5. This confirms that integrating distribution metrics with geometric constraints successfully addresses boundary ambiguity during bounding box regression, specifically mitigating the difficulty of isolating subtle breakage anomalies from the main insulator structure.

With the progressive integration of the proposed modules, the model’s defect detection capability exhibits a steady upward trend. Ultimately, the complete LID-YOLO model achieves an mAP@0.5 of 87.5%, representing a 4.2% improvement over the baseline YOLOv11n, alongside precision and recall gains of 1.7% and 3.6%, respectively. Furthermore, while the parameter count increases marginally by 0.17 M, the overall computational complexity (FLOPs) decreases from 6.4 G to 6.2 G, so the accuracy gains add no excessive computational or storage burden.

Collectively, these structural optimizations enable LID-YOLO to achieve higher detection accuracy under typical image degradation across various weather conditions without incurring severe computational penalties. This confirms the validity of the proposed architectural modifications for reliable insulator defect detection.

Building upon the architectural validations, we further examined the proposed NWD-MPDIoU loss function, which incorporates a linear dynamic weighting strategy ($\beta = \mathrm{IoU}$) by design. To validate the rationality of this configuration, a sensitivity analysis was conducted on $\beta$. We compared the adopted linear strategy ($\beta = \mathrm{IoU}$) against a fixed weight ($\beta = 0.5$) and two non-linear dynamic strategies ($\beta = \mathrm{IoU}^{2}$ and $\beta = \sqrt{\mathrm{IoU}}$), with the results presented in Table 4.

As shown in the table, the fixed weight strategy yielded the lowest mAP@0.5 of 85.5%, indicating that maintaining a static balance between distribution metrics and geometric constraints is suboptimal throughout the dynamic training process. Among the non-linear strategies, $\beta = \sqrt{\mathrm{IoU}}$ achieved a slightly higher recall (80.1%) but suffered a noticeable drop in precision (88.6%). This aggressive strategy heavily penalizes geometric misalignment even when the prediction deviates significantly from the ground truth, making the model overly sensitive to background noise induced by complex weather conditions. In practical power system operations, the resulting increase in false positives may trigger excessive false alarms, potentially leading to more follow-up inspections and a higher maintenance workload. Conversely, the conservative strategy $\beta = \mathrm{IoU}^{2}$ achieved the highest precision (91.1%) but a lower recall (78.5%), as it relies predominantly on the NWD loss unless the bounding boxes are highly overlapped. While this effectively filters out weather-induced artifacts, it weakens the geometric guidance for structurally ambiguous defects, resulting in missed detections of critical insulator faults.

Ultimately, the linear strategy demonstrated the most effective balance, yielding the highest mAP@0.5 of 87.5%. By dynamically shifting the optimization focus from distribution distance (NWD) in the early stages to geometric constraints (MPDIoU) in the later stages, it effectively mitigates the interference of image degradation while facilitating precise localization for tiny defects. This analysis corroborates the rationale behind adopting the linear dynamic strategy, demonstrating its capability to provide the necessary reliability and robustness for insulator defect detection in complex environments.
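To make the behavior of the four weighting strategies concrete, the sketch below evaluates $\beta$ at a low and a high overlap. It assumes the hybrid loss is the convex combination $(1 - \beta) L_{\mathrm{NWD}} + \beta L_{\mathrm{MPDIoU}}$, which is consistent with the description above (more weight on MPDIoU as IoU grows) but is not restated in this section; the function and dictionary names are hypothetical.

```python
import math

# beta as a function of IoU for the four strategies compared in Table 4
STRATEGIES = {
    "fixed": lambda iou: 0.5,
    "linear": lambda iou: iou,
    "squared": lambda iou: iou ** 2,
    "sqrt": lambda iou: math.sqrt(iou),
}


def hybrid_loss(l_nwd: float, l_mpdiou: float, iou: float, strategy: str = "linear") -> float:
    """Assumed form: (1 - beta) * L_NWD + beta * L_MPDIoU."""
    beta = STRATEGIES[strategy](iou)
    return (1.0 - beta) * l_nwd + beta * l_mpdiou


for iou in (0.25, 0.81):
    weights = {name: round(fn(iou), 4) for name, fn in STRATEGIES.items()}
    print(f"IoU={iou}: {weights}")
```

For any IoU strictly between 0 and 1, $\mathrm{IoU}^{2} < \mathrm{IoU} < \sqrt{\mathrm{IoU}}$. At IoU = 0.25, for example, the square-root strategy already places half of the weight on the geometric term, whereas the squared strategy places about 6% on it, which mirrors the aggressive and conservative behaviors discussed above.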

Having established the optimal internal configuration, we further investigated the overall efficacy of the proposed regression loss optimization in handling precise insulator defect localization. Table 5 presents the comparative results of the NWD-MPDIoU loss function against the baseline CIoU, GIoU, and MPDIoU.

Comparative analysis indicates that although GIoU achieves the highest recall of 80.5%, its overall mAP is limited to 86.8%. Similarly, the baseline CIoU demonstrates balanced metrics but yields a comparatively lower overall detection accuracy with an mAP of 86.1%. While the standard MPDIoU attains the peak precision of 90.8%, its recall is limited to 76.6%, suggesting a tendency to miss difficult targets such as concealed insulator defects. By incorporating the distribution metric of NWD, the proposed method effectively mitigates this limitation for small-scale ambiguous defects, boosting recall by 3.2% compared to MPDIoU. Although accompanied by a slight decrease in precision, this trade-off results in a more robust overall performance. Consequently, the proposed NWD-MPDIoU achieves the highest mAP of 87.5%, outperforming CIoU, GIoU, and the standard MPDIoU in insulator defect detection tasks under complex weather conditions.

To intuitively demonstrate the guiding effect of different loss functions on model training, the bounding box regression loss curves are visualized in Figure 10. As illustrated, the loss curve of NWD-MPDIoU exhibits a steep initial descent and stabilizes at a comparatively lower value. The variations in the loss curves validate the advantages of NWD-MPDIoU: it accelerates the regression convergence of boundary-ambiguous insulator defects via NWD in the early stages, while employing geometric constraints for fine-grained adjustment of the targets in the later training stages to achieve more accurate localization.

Beyond the bounding box regression optimization, to further evaluate the classification performance across different categories, Figure 11 compares the confusion matrices of YOLOv11n and LID-YOLO. The improved model exhibits higher values along the diagonal, indicating lower misclassification rates across multiple classes. The most notable improvements are observed in the ‘breakage’ and ‘flashover’ categories, with absolute increases of 0.09 and 0.05, respectively. As these categories were the baseline’s weakest points due to ambiguous visual features, this result demonstrates that the proposed method significantly improves the recognition capability for such challenging defects even when complex weather degrades image features, verifying the rationality of the algorithmic improvements.

3.4. Comparative Experiments

To evaluate LID-YOLO’s insulator defect detection under weather-induced visual degradation in power grids, we benchmarked it against mainstream lightweight models from the YOLO series and the Transformer-based RT-DETR-r18 [47]. All models were evaluated in the same experimental environment, with identical hyperparameters and the same dataset. The comparative results are presented in Table 6.

As presented in Table 6, the proposed LID-YOLO achieves an mAP@0.5 of 87.5%, significantly outperforming other lightweight YOLO models of comparable scale. Importantly, the model achieves a recall of 79.8%, exhibiting a notable advantage over the comparative lightweight YOLO variants in identifying concealed or tiny defects. In power system maintenance, a critical challenge is minimizing missed detections that can escalate into severe flashovers and grid-wide outages, while simultaneously avoiding excessive false positives that waste maintenance resources. Among the compared YOLO variants, LID-YOLO offers the best balance between these two requirements. This high recall, coupled with the highest overall mAP@0.5 among the YOLO models, indicates that the synergistic design of dynamic noise filtering and robust bounding box regression effectively intercepts subtle insulator faults without overwhelming the inspection system, thereby securing the operational resilience of the transmission network.

When compared to the larger YOLOv11s model, LID-YOLO achieves a 0.9% higher mAP@0.5 and a 2.0% higher recall, while reducing parameters and FLOPs by approximately 70.7% and 70.9%, respectively. This efficiency demonstrates that addressing extreme scale variations via targeted contextual enhancement is a more effective strategy than merely scaling up the network capacity. While the Transformer-based RT-DETR-r18 achieves the highest detection metrics, it incurs a substantial computational burden, requiring 57.0 G FLOPs and 19.9 M parameters. In contrast, the proposed model is only 0.3% lower in mAP@0.5, yet reduces FLOPs and parameters by approximately 89% and 86%, respectively, as the multi-scale contextual enhancement serves as a highly efficient alternative to heavy Transformer blocks for capturing necessary structural dependencies. This confirms that LID-YOLO achieves a favorable balance between computational efficiency and detection accuracy, making it more suitable for resource-constrained edge devices utilized in power line inspections. Figure 12 illustrates the Precision–Recall (PR) curves of all comparative models.
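The relative reductions quoted against RT-DETR-r18 can be checked with a few lines of arithmetic, using the 2.76 M parameters and 6.2 G FLOPs reported for LID-YOLO and the 19.9 M parameters and 57.0 G FLOPs stated above. The helper name is hypothetical.

```python
def reduction_pct(ours: float, reference: float) -> float:
    """Relative reduction of `ours` with respect to `reference`, in percent."""
    return 100.0 * (1.0 - ours / reference)


lid_yolo = {"params_M": 2.76, "flops_G": 6.2}
rt_detr_r18 = {"params_M": 19.9, "flops_G": 57.0}

flops_cut = reduction_pct(lid_yolo["flops_G"], rt_detr_r18["flops_G"])
params_cut = reduction_pct(lid_yolo["params_M"], rt_detr_r18["params_M"])
print(f"FLOPs reduction:  {flops_cut:.1f}%")   # 89.1%
print(f"Params reduction: {params_cut:.1f}%")  # 86.1%
```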

As depicted, the PR curve of LID-YOLO consistently envelops those of most lightweight YOLO variants, exhibiting a noticeably more gradual decline in precision as recall increases. This reveals that the model can successfully retrieve more challenging or weather-obscured insulator defects without inadvertently introducing a substantial number of false positives. Consequently, this robust PR performance visually corroborates the effectiveness of the proposed dynamic convolution and contextual enhancement mechanisms, reaffirming the model’s capability to minimize missed detections and secure the operational reliability of energy systems.

3.5. Visualization Results

To intuitively demonstrate the insulator defect detection performance of LID-YOLO under complex weather conditions, representative samples from various weather scenarios were selected for visual comparison, as shown in Figure 13.

Under normal lighting conditions, the majority of models accurately localized both insulators and their defects. An exception is YOLOv8n, which failed to detect the small breakage defect. In rainy and snowy scenarios, which are particularly prone to inducing severe high-frequency background noise and spatial occlusion, the feature extraction capabilities of the models are heavily tested. Consequently, YOLOv10n failed to detect both breakage defects on the insulator, while YOLOv5n, YOLOv8n, and YOLOv9t each missed one instance, and YOLOv11s failed to detect a flashover defect under snowy conditions. In contrast, YOLOv12n, RT-DETR-r18, and LID-YOLO exhibited robust performance, successfully detecting all targets despite the severe meteorological interference. In foggy environments, the whitening effect severely degrades the texture and color features on the insulator surface, making characteristics like flashovers much more difficult to discern. Under these conditions, half of the compared models failed to detect all flashover defects. However, LID-YOLO successfully localized all flashover regions, demonstrating the adaptability of the proposed loss function to such boundary-blurred defects. Under low-light conditions, all models, with the exception of YOLOv10n, successfully detected the small defects, with LID-YOLO achieving the highest confidence score.

In summary, the visual results reveal a prevalent tendency of missed detections among standard YOLO variants across various complex weather conditions. These visual results corroborate the issue of generally low recall rates among the YOLO models shown in Table 6. In practical energy systems, such missed detections are highly critical; undetected latent defects can rapidly deteriorate under continuous electrical stress, eventually triggering more severe accidents. Conversely, the proposed LID-YOLO effectively overcomes this bottleneck, achieving a high recall of 79.8%, second only to the computationally heavy RT-DETR-r18. Overall, these qualitative results confirm LID-YOLO’s robust adaptability for insulator defect detection under various weather-induced visual degradations typical of automated power grids and demonstrate that it can achieve better performance while maintaining the computational efficiency of the YOLO series.

To provide a deeper qualitative interpretation of the internal feature representations beyond the bounding box predictions, gradient-weighted class activation mapping (Grad-CAM) was employed to visualize class activation heatmaps. Figure 14 illustrates the comparative heatmaps generated by the baseline YOLOv11n and the proposed LID-YOLO under both normal conditions and the four complex weather scenarios.

The visualizations reveal that the baseline model generally struggles with severe feature dispersion. Its activation regions frequently extend to adjacent tower structures or drift towards environmental noise, resulting in discontinuous and patchy heatmaps on the target insulators. In power line inspections, such feature drift often leads to background overfitting, causing the model to generate false alarms or fail entirely when deployed in varying geographic corridors. In contrast, the proposed LID-YOLO consistently maintains an enhanced and robust attentional focus across all evaluated scenarios. Its high-activation regions align precisely with the actual insulator bodies and potential defect areas, effectively suppressing background clutter and environmental interference. This consistent activation stability visually reflects that the proposed modules not only enhance the overall feature representation capacity of the network but also effectively filter out variable environmental noise to anchor structural features under severe degradations. Consequently, this robust feature extraction capability directly supports more reliable insulator defect detection, thereby safeguarding the operational stability of energy transmission systems under complex weather conditions.

3.6. Data Balancing Strategy for Hard Examples

In the evaluation in Section 3.3, the proposed LID-YOLO model significantly improved the detection accuracy for the hard category ‘breakage’. However, because such hard samples produce relatively weak loss gradients, their contribution is easily overwhelmed by the large number of easy samples and noise.

This section discusses a data balancing technique that mitigates this issue and further improves LID-YOLO’s detection capability for hard targets. By duplicating the training images labeled as ‘breakage’ and applying geometric transformations, the learning frequency of these hard samples is increased. As shown in Table 7, this technique further enhances the model’s ability to identify hard samples, increasing the mAP@0.5 of the ‘breakage’ category from 78.2% to 83.6%, which in turn raises the overall mAP@0.5 to 89.4%.
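One way such a balancing step could be implemented is sketched below. It assumes YOLO-format labels (class id followed by normalized center x, center y, width, and height), uses a horizontal flip as the only geometric transformation, and treats class id 0 as ‘breakage’; all three choices, as well as the function names, are illustrative assumptions rather than details taken from the text. The image pixels would have to be flipped in the same way, which is omitted here.

```python
def hflip_yolo_labels(labels):
    """Mirror normalized YOLO boxes horizontally: x_center -> 1 - x_center."""
    return [(cls, 1.0 - xc, yc, w, h) for (cls, xc, yc, w, h) in labels]


def oversample_hard_class(samples, hard_class):
    """Append one flipped copy of every sample that contains `hard_class`.

    `samples` is a list of (image_name, labels) pairs.
    """
    extra = []
    for name, labels in samples:
        if any(cls == hard_class for cls, *_ in labels):
            extra.append((name + "_hflip", hflip_yolo_labels(labels)))
    return samples + extra


BREAKAGE = 0  # hypothetical class id
train_set = [
    ("a", [(0, 0.25, 0.50, 0.10, 0.20)]),   # contains a 'breakage' box
    ("b", [(1, 0.50, 0.50, 0.20, 0.20)]),   # no 'breakage' box
]
balanced = oversample_hard_class(train_set, BREAKAGE)
print([name for name, _ in balanced])        # ['a', 'b', 'a_hflip']
```

Applying the duplication to the training subset only leaves the validation and test distributions unchanged, so the reported metrics remain comparable.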

The confusion matrix after data balancing is shown in Figure 15. It can be observed that the true positive prediction rate for ‘breakage’ has increased to 0.80, reducing the probability of breakage being misclassified as background. The performance improvement on hard examples suggests that the LID-YOLO architecture is capable of capturing complex defect features when the training distribution is appropriately adjusted, thereby enhancing the overall robustness of the detection.

4. Conclusions

In this study, the LID-YOLO framework was proposed to advance automated insulator inspections by addressing the critical issues of weather-induced image degradation and the minute scale of physical defects. To meet the demands of power inspections, targeted architectural optimizations were introduced. Specifically, the proposed C3k2-CDGC module, Detect-LSEAM head, and NWD-MPDIoU loss function are integrated to comprehensively tackle the interference of severe weather, the extreme scale variations in targets, and the boundary ambiguity of tiny defects. Together, these optimizations establish a reliable algorithmic foundation for automated fault diagnosis in power transmission networks.

Experimental evaluations on a synthetic complex weather dataset demonstrate the superiority of the proposed method. LID-YOLO achieves an mAP@0.5 of 87.5%, outperforming the YOLOv11n baseline by 4.2%, with precision and recall increasing by 1.7% and 3.6%, respectively. In terms of computational efficiency, although the functional enhancements result in a marginal parameter increase of 0.17 M compared to the baseline, the overall FLOPs are reduced by 0.2 G, effectively preserving its lightweight advantage. This trade-off between diagnostic accuracy and computational efficiency makes LID-YOLO suitable for deployment on resource-constrained edge devices, satisfying a critical prerequisite for the automated monitoring and reliable operation of energy systems.

In future research, we plan to collect and annotate insulator defect images under real-world complex weather scenarios to further validate and enhance the model’s practical applicability. Additionally, we will further adapt and expand the current detection framework to incorporate other critical transmission components, thereby establishing a more comprehensive inspection model.

Author Contributions

Conceptualization, S.J. and Y.L.; methodology, Y.C.; software, Y.C.; validation, Y.C.; investigation, Y.C.; resources, S.J. and Y.L.; data curation, Y.L.; writing—original draft preparation, Y.C.; writing—review and editing, S.J. and Y.L.; supervision, S.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Yang Liu was employed by the Ultra High Voltage Branch Company, State Grid Xinjiang Electric Power Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Mejia-Ruiz, G.E.; Marasini, G.; Qu, Z.; Kundu, S.; Pushpak, S. Cybersecurity challenges in power networks with distributed energy resources: A comprehensive survey. Renew. Sustain. Energy Rev. 2025, 224, 116100.
2. Guddanti, K.P.; Bharati, A.K.; Nekkalapu, S.; Mcwherter, J.; Morris, S.L. A comprehensive review: Impacts of extreme temperatures due to climate change on power grid infrastructure and operation. IEEE Access 2025, 13, 49375–49415.
3. Sharma, P.; Saurav, S.; Singh, S. Object detection in power line infrastructure: A review of the challenges and solutions. Eng. Appl. Artif. Intell. 2024, 130, 107781.
4. Liu, J.; Hu, M.; Dong, J.; Lu, X. Summary of insulator defect detection based on deep learning. Electr. Power Syst. Res. 2023, 224, 109688.
5. Zhang, Q.; Zhang, J.; Li, Y.; Zhu, C.; Wang, G. ID-YOLO: A multi-module optimized algorithm for insulator defect detection in power transmission lines. IEEE Trans. Instrum. Meas. 2025, 74, 3505611.
6. Liu, Y.; Liu, D.; Huang, X.; Li, C. Insulator defect detection with deep learning: A survey. IET Gener. Transm. Distrib. 2023, 17, 3541–3558.
7. Alhassan, A.B.; Zhang, X.; Shen, H.; Xu, H. Power transmission line inspection robots: A review, trends and challenges for future research. Int. J. Electr. Power Energy Syst. 2020, 118, 105862.
8. Mejia-Ruiz, G.E.; Paternina, M.R.A.; Qu, Z.; Ahmed, S.; Konstantinou, C. Multiple ancillary services provision by optimal control of aggregated inverter-based resources. Int. J. Electr. Power Energy Syst. 2025, 171, 110953.
9. Zhao, X.; Zhao, Y.; Hu, S.; Wang, H.; Zhang, Y.; Ming, W. Progress in active infrared imaging for defect detection in the renewable and electronic industries. Sensors 2023, 23, 8780.
10. Zhou, D.; Chen, F.; Liang, J.; Zhang, Y.; Zheng, W.; Li, X. Battery defect detection using ultrasonic guided waves and a convolutional neural network model. J. Energy Storage 2025, 119, 116352.
11. Zheng, H.; Cai, Q.; Zheng, J.; Zhen, Z.; Zou, W.; Chen, J. Quantitative method for ultrasonic testing of lead seal defects in high voltage cable accessories. IEEE Access 2025, 13, 76047–76057.
12. Ziaja-Sujdak, A.; Nowak, T.; Ho, C.H.; Bobrowski, P. Ultrasonic testing of glass fiber-reinforced polymer composites used in high-voltage insulating components. IEEE Trans. Dielectr. Electr. Insul. 2024, 32, 83–91.
13. Li, Z.; Feiran, W.; Han, G.; Xinyang, G.; Shi, Z. Towards real-time spatial distance monitoring of power transmission lines using LiDAR point clouds and visual imaging. EAI Endorsed Trans. Energy Web 2024, 12, 1–18.
14. Liu, Y.; Zhao, X.; Jiao, Y.; Yang, X.; Xu, H. Method for real-time reconstruction of a transmission line based on the LiDAR point cloud data of a partial line segment. Sustain. Energy Technol. Assess. 2023, 57, 103180.
15. Wong, S.Y.; Choe, C.W.C.; Goh, H.H.; Low, Y.W.; Cheah, D.Y.S.; Pang, C. Power transmission line fault detection and diagnosis based on artificial intelligence approach and its development in UAV: A review. Arab. J. Sci. Eng. 2021, 46, 9305–9331.
16. Li, X.; Blancaflor, E.B. A review of image-based insulator defect detection algorithms for transmission lines. In Proceedings of the 2024 5th International Conference on Computer Vision, Image and Deep Learning (CVIDL), Zhuhai, China, 19–21 April 2024; pp. 718–724.
17. Song, Y.; Lu, Y. A review of unmanned visual target detection in adverse weather. Electronics 2025, 14, 2582.
18. Tan, P.; Li, X.; Xu, J.; Ma, J.; Wang, F.; Ding, J.; Fang, Y.; Ning, Y. Catenary insulator defect detection based on contour features and gray similarity matching. J. Zhejiang Univ. Sci. A 2020, 21, 64–73.
19. Mei, H.; Jiang, H.; Yin, F.; Wang, L.; Farzaneh, M. Terahertz imaging method for composite insulator defects based on edge detection algorithm. IEEE Trans. Instrum. Meas. 2021, 70, 4504310.
20. Zhang, Z.; Chen, H.; Huang, S. Detection of missing insulator caps based on machine learning and morphological detection. Sensors 2023, 23, 1557.
21. Liu, X.; Tian, H.; Wang, Y.; Jiang, F.; Zhang, C. Research on image segmentation algorithm and performance of power insulator based on adaptive region growing. J. Electr. Eng. Technol. 2022, 17, 3601–3612.
22. Surya Prasad, P.; Prabhakara Rao, B. Condition monitoring of 11 kV overhead power distribution line insulators using combined wavelet and LBP-HF features. IET Gener. Transm. Distrib. 2017, 11, 1144–1153.
23. Song, L.; Liang, Q.; Chen, H.; Hu, H.; Luo, Y.; Luo, Y. A new approach to optimize SVM for insulator state identification based on improved PSO algorithm. Sensors 2023, 23, 272.
24. Al Kharusi, K.; El Haffar, A.; Mesbah, M. Fault detection and classification in transmission lines connected to inverter-based generators using machine learning. Energies 2022, 15, 5475.
25. Wang, S.; Zou, X.; Zhu, W.; Zeng, L. Insulator defects detection for aerial photography of the power grid using you only look once algorithm. J. Electr. Eng. Technol. 2023, 18, 3287–3300.
26. Li, T.; Zhu, C.; Li, J.; Cao, H.; Bai, H. A real-time insulator condition detection model for UAV inspection based on FG-YOLO. Meas. Sci. Technol. 2025, 36, 056208.
27. Hu, J.; Wan, W.; Qiao, P.; Zhou, Y.; Ouyang, A. Power insulator defect detection method based on enhanced YOLOv7 for aerial inspection. Electronics 2025, 14, 408.
28. Xu, J.; Zhao, S.; Li, Y.; Song, W.; Zhang, K. MRB-YOLOv8: An algorithm for insulator defect detection. Electronics 2025, 14, 830.
29. Deng, S.; Chen, L.; He, Y. Insulator defect detection from aerial images in adverse weather conditions. Appl. Intell. 2025, 55, 365.
30. Ding, Z.; Deng, S.; Liu, Q. Insulator defect detection algorithm based on improved YOLO11s in snowy weather environment. Symmetry 2025, 17, 1763.
31. Yi, L.; Luo, L.; Wang, Y.; She, H.; Liu, J.; Dong, T.; Luo, S. Flaw detection of railway catenary insulator based on DP-YOLOv5 algorithm with bright and dark channel enhancement. Phys. Scr. 2024, 99, 126004.
32. Li, J.; Zhou, H.; Lv, G.; Chen, J. A2MADA-YOLO: Attention alignment multiscale adversarial domain adaptation YOLO for insulator defect detection in generalized foggy scenario. IEEE Trans. Instrum. Meas. 2025, 74, 5011419.
33. Li, J.; Wu, Y.; Zhu, S. Insulator defect detection in severe weather using improved YOLOv8. PLoS ONE 2025, 20, e0333175.
34. Zhang, Z.; Zhang, B.; Lan, Z.; Liu, H.; Li, D.; Pei, L.; Yu, W. FINet: An insulator dataset and detection benchmark based on synthetic fog and improved YOLOv5. IEEE Trans. Instrum. Meas. 2022, 71, 6006508.
35. Wang, Q.; Hu, Z.; Li, E.; Wu, G.; Yang, W.; Hu, Y.; Peng, W.; Sun, J. YOLOLS: A lightweight and high-precision power insulator defect detection network for real-time edge deployment. Energies 2025, 18, 1668.
36. Hussain, M. YOLO-v1 to YOLO-v8, the rise of YOLO and its complementary nature toward digital manufacturing and industrial defect detection. Machines 2023, 11, 677.
37. Terven, J.; Córdova-Esparza, D.M.; Romero-González, J.A. A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716.
38. Khanam, R.; Hussain, M. YOLOv11: An overview of the key architectural enhancements. arXiv 2024, arXiv:2410.17725.
39. Wang, C.; Liao, H.M.; Wu, Y.; Chen, P.; Hsieh, J.; Yeh, I. CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; pp. 390–391.
40. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722.
41. Wang, J.; Xu, C.; Yang, W.; Yu, L. A normalized Gaussian Wasserstein distance for tiny object detection. arXiv 2021, arXiv:2110.13389.
42. Ma, S.; Xu, Y. MPDIoU: A loss for efficient and accurate bounding box regression. arXiv 2023, arXiv:2307.07662.
43. Hu, Z.; Zhai, Y.; Zhao, Z.; Wang, Q.; Zhai, B.; Yang, K.; Hu, P. Towards defect detection of transmission line insulator: A dataset, benchmarks and challenges. In Proceedings of the 2025 10th International Conference on Power and Renewable Energy (ICPRE), Hangzhou, China, 19–22 September 2025; pp. 473–482.
44. Zheng, J.; Wu, H.; Zhang, H.; Wang, Z.; Xu, W. Insulator-defect detection algorithm based on improved YOLOv7. Sensors 2022, 22, 8801.
45. Shan, L.; Zhang, H.; Cheng, B. SGNet: Efficient snow removal deep network with a global windowing transformer. Mathematics 2024, 12, 1424.
46. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353.
Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs beat YOLOs on real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 16965–16974. [Google Scholar] [CrossRef]

Figure 1. Network architecture of YOLOv11n. The squares in the output image represent the predicted bounding boxes.

Figure 2. Network architecture of LID-YOLO. The squares represent the predicted bounding boxes.

Figure 3. Structure of the C3k2-CDGC module. The symbol ⊕ denotes element-wise addition.

Figure 4. Structure of the DGC. The symbol ⊗ denotes multiplication operations.

Figure 5. Structure of the CA mechanism.

Figure 6. Structure of the Detect-LSEAM.

Figure 7. Schematic comparison of calculation mechanisms between CIoU and NWD-MPDIoU. (a) CIoU; (b) the proposed NWD-MPDIoU. The concentric dashed ellipses represent 2D Gaussian distributions.

Figure 8. Distribution of the insulator defect dataset. (a) Instance counts across different categories; (b) scatter plot of normalized width and height for all bounding boxes; (c) spatial resolution distribution of the foundation images.

Figure 9. Comparison of images before and after data augmentation. (a) Simulated rainy conditions; (b) simulated snowy conditions; (c) simulated foggy conditions; (d) simulated low-light conditions. The red boxes mark the magnified views of the defect area.

Figure 10. Comparison of training box loss curves for different loss functions.

Figure 11. Confusion matrices. (a) Baseline YOLOv11n; (b) LID-YOLO.

Figure 12. Precision–Recall curves of different models.

Figure 13. Visual comparison of detection results across different models.

Figure 14. Comparison of heatmaps between YOLOv11n and LID-YOLO. Warmer colors (red) indicate regions of higher activation.

Figure 15. Confusion matrix of the LID-YOLO model after applying data balancing.

Table 1. Typical insulator defect detection methods and their limitations in power grid inspections.

Method Category	Representative Approaches	Key Advantages	Limitations
Traditional vision methods	Image matching [18]; Edge detection [19,20]; Threshold segmentation [21]; Hand-crafted features with SVM/AdaBoost classifiers [22,23,24]	Low computational overhead; facilitates edge deployment	Poor generalization; highly sensitive to illumination and background variations
Defect-specific optimization	Anchor optimization [25]; Contextual feature aggregation [26]; Multi-spectral and receptive field attention [27,28]	Enhanced feature extraction for small-scale targets; high detection precision for subtle defects under favorable weather conditions	Lacks targeted measures against meteorological interference
Weather-robust detection	Independent image restoration modules combined with object detectors (e.g., deraining [29], snow removal [30], low-light enhancement [31])	Significantly restores visual features for specific weather degradations	Error amplification across disjoint modules; limited scalability to multi-weather scenarios
Weather-robust detection	End-to-end architectures (e.g., adversarial domain adaptation [32], feature decoupling [33,34])	Joint optimization effectively avoids feature misalignment and error accumulation	Often requires considerable computational resources; presents challenges for direct deployment on edge devices

Table 2. Experimental hyperparameter settings.

Parameter	Value
Epoch	300
Batch Size	24
Initial Learning Rate	0.01
Weight Decay	0.0005
Momentum	0.937
Optimizer	SGD
Image Size	640 × 640
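
For readers who want to reproduce the setup, the values in Table 2 can be collected in a single configuration mapping. The sketch below is only an illustration: the key names follow the keyword arguments of the Ultralytics trainer, which is an assumption about the training framework, and the dataset file `insulator.yaml` in the commented call is a hypothetical placeholder.

```python
# Hyperparameters copied from Table 2.
# Key names mirror Ultralytics train() keyword arguments (an assumption;
# the table itself does not name the training framework).
TRAIN_ARGS = {
    "epochs": 300,           # Epoch
    "batch": 24,             # Batch Size
    "lr0": 0.01,             # Initial Learning Rate
    "weight_decay": 0.0005,  # Weight Decay
    "momentum": 0.937,       # Momentum
    "optimizer": "SGD",      # Optimizer
    "imgsz": 640,            # Image Size (640 x 640)
}

# Hypothetical usage, not executed here because it needs the ultralytics
# package and a dataset description file:
#   from ultralytics import YOLO
#   model = YOLO("yolo11n.yaml")
#   model.train(data="insulator.yaml", **TRAIN_ARGS)

for name, value in TRAIN_ARGS.items():
    print(f"{name:>12}: {value}")
```

Keeping the settings in one mapping makes it straightforward to log them alongside each run when comparing ablation variants.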

Table 3. Ablation experiment results.

A	B	C	P (%)	R (%)	mAP@0.5 (%)	FLOPs (G)	Params (M)
			87.9	76.2	83.3	6.4	2.59
√			88.2	78.9	85.6	6.8	2.69
	√		89.2	78.1	85.2	5.8	2.66
		√	89.3	75.9	84.6	6.4	2.59
√	√		90.0	78.8	86.1	6.2	2.76
√		√	90.4	78.6	86.3	6.8	2.69
	√	√	88.2	79.4	86.6	5.8	2.66
√	√	√	89.6	79.8	87.5	6.2	2.76
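
The headline gains quoted in the abstract can be checked directly against the first and last rows of Table 3. The short sketch below does that arithmetic; the variable names are illustrative. Note that column C changes neither FLOPs nor parameters, which is consistent with C being the loss function, though that reading is an inference from the numbers rather than something stated in the table.

```python
# Rows copied from Table 3: (P %, R %, mAP@0.5 %, FLOPs in G, Params in M)
baseline = (87.9, 76.2, 83.3, 6.4, 2.59)  # no component enabled
full = (89.6, 79.8, 87.5, 6.2, 2.76)      # A + B + C enabled

names = ("P", "R", "mAP@0.5", "FLOPs", "Params")

# Absolute change of the full model relative to the baseline.
delta = {n: round(f - b, 2) for n, b, f in zip(names, baseline, full)}

# Relative growth of the parameter count, in percent.
param_growth_pct = round(100 * (full[4] - baseline[4]) / baseline[4], 1)

print(delta)
print(f"parameter growth: {param_growth_pct}%")
```

The differences of 1.7, 3.6, and 4.2 percentage points match the abstract, while FLOPs drop by 0.2 G and the parameter count grows by 0.17 M.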

Table 4. Sensitivity analysis of the weighting coefficient β in the NWD-MPDIoU.

Methods	P (%)	R (%)	mAP@0.5 (%)
$β = 0.5$	89.9	78.3	85.5
$β = \mathrm{IoU}$	89.6	79.8	87.5
$β = \mathrm{IoU}^{2}$	91.1	78.5	87.0
$β = \sqrt{\mathrm{IoU}}$	88.6	80.1	87.1
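
To make the four β settings in Table 4 concrete, the following is a minimal sketch of an IoU-weighted blend of an MPDIoU term and an NWD term. It is not the authors' implementation. The blending form `β·(1 − MPDIoU) + (1 − β)·(1 − NWD)`, the choice of which term β multiplies, the NWD constant `c`, and all function names are assumptions made for illustration. The MPDIoU and NWD definitions follow the cited papers.

```python
import math

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def mpdiou(a, b, img_w, img_h):
    """IoU minus squared corner distances normalised by the image diagonal."""
    diag2 = img_w ** 2 + img_h ** 2
    d1 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2  # top-left corners
    d2 = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2  # bottom-right corners
    return iou(a, b) - d1 / diag2 - d2 / diag2

def nwd(a, b, c=12.8):
    """Normalised Gaussian Wasserstein distance; c is an assumed constant."""
    ga = ((a[0] + a[2]) / 2, (a[1] + a[3]) / 2, (a[2] - a[0]) / 2, (a[3] - a[1]) / 2)
    gb = ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2, (b[2] - b[0]) / 2, (b[3] - b[1]) / 2)
    w2 = math.sqrt(sum((p - q) ** 2 for p, q in zip(ga, gb)))
    return math.exp(-w2 / c)

def hybrid_loss(pred, gt, img_w, img_h, beta_mode="iou"):
    """Assumed blend: beta weights the geometric term, 1 - beta the NWD term."""
    i = iou(pred, gt)
    beta = {"const": 0.5, "iou": i, "iou2": i ** 2, "sqrt": math.sqrt(i)}[beta_mode]
    return beta * (1 - mpdiou(pred, gt, img_w, img_h)) + (1 - beta) * (1 - nwd(pred, gt))

# A tiny prediction that does not overlap its target at all.
pred, gt = (10, 10, 14, 14), (16, 16, 20, 20)
for mode in ("const", "iou", "iou2", "sqrt"):
    print(mode, round(hybrid_loss(pred, gt, 640, 640, mode), 4))
```

Under this assumed form, a non-overlapping pair has IoU = 0, so every IoU-derived β hands the whole loss to the NWD term, which still varies smoothly with the distance between boxes. That is the behaviour one would want for very small defects, where IoU-based terms alone give little gradient signal.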

Table 5. Performance comparison of different loss functions.

Methods	P (%)	R (%)	mAP@0.5 (%)
CIoU	90.0	78.8	86.1
GIoU	89.1	80.5	86.8
MPDIoU	90.8	76.6	86.6
NWD-MPDIoU	89.6	79.8	87.5

Table 6. Comparative experimental results.

Model	P (%)	R (%)	mAP@0.5 (%)	FLOPs (G)	Params (M)
YOLOv5n	86.9	75.5	83.2	5.8	2.18
YOLOv8n	86.8	77.5	84.3	6.8	2.69
YOLOv9t	89.1	77.0	85.1	6.4	1.73
YOLOv10n	85.7	77.9	83.9	6.5	2.27
YOLOv12n	88.6	76.3	84.0	6.3	2.55
YOLOv11s	90.8	77.8	86.6	21.3	9.41
RT-DETR-r18	90.8	81.1	87.8	57.0	19.9
Ours	89.6	79.8	87.5	6.2	2.76
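
One way to read Table 6 is to ask which models are not beaten on all three axes at once (higher mAP@0.5, fewer FLOPs, fewer parameters). The sketch below runs such a Pareto-dominance check on the tabulated values. The analysis is an added illustration rather than part of the reported experiments, and it ignores precision and recall.

```python
# (mAP@0.5 %, FLOPs in G, Params in M) copied from Table 6
models = {
    "YOLOv5n": (83.2, 5.8, 2.18),
    "YOLOv8n": (84.3, 6.8, 2.69),
    "YOLOv9t": (85.1, 6.4, 1.73),
    "YOLOv10n": (83.9, 6.5, 2.27),
    "YOLOv12n": (84.0, 6.3, 2.55),
    "YOLOv11s": (86.6, 21.3, 9.41),
    "RT-DETR-r18": (87.8, 57.0, 19.9),
    "Ours": (87.5, 6.2, 2.76),
}

def dominates(a, b):
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and better

# Models that no other entry dominates form the Pareto front.
front = sorted(
    name for name, m in models.items()
    if not any(dominates(other, m) for other in models.values())
)
print(front)

# Cost of the proposed model relative to RT-DETR-r18, in percent.
ours, rtdetr = models["Ours"], models["RT-DETR-r18"]
print(round(100 * ours[1] / rtdetr[1], 1), round(100 * ours[2] / rtdetr[2], 1))
```

On these numbers the proposed model is not dominated by any competitor, while YOLOv11s is dominated by it. RT-DETR-r18 keeps a 0.3-point mAP@0.5 lead, but at roughly nine times the FLOPs and seven times the parameters.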

Table 7. Performance comparison of the model before and after applying data balancing.

Methods	Class	P (%)	R (%)	mAP@0.5 (%)
Before Balancing	All	89.6	79.8	87.5
Before Balancing	Breakage	81.8	69.6	78.2
After Balancing	All	92.2	80.5	89.4
After Balancing	Breakage	84.7	74.6	83.6

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

