3.1. Dataset
To address the scarcity of real-world power grid insulator defect datasets under complex weather conditions, we initially sourced 1788 high-quality real-world images from multiple publicly available datasets [43,44], captured under normal weather conditions, to serve as the foundation for a synthetic dataset. Specifically, Figure 8a details the number of instances across different categories in the real-world images, whereas Figure 8b illustrates the normalized width and height distribution of all annotated bounding boxes. As indicated by Figure 8b, the target instances in the dataset predominantly exhibit small scales and high aspect ratios. To provide a comprehensive profile of the image quality alongside these geometric characteristics, Figure 8c further details the spatial resolution distribution of the foundation real-world images: resolutions under 1200 px account for 21.73%, between 1200 and 1920 px account for 62.25%, and over 1920 px account for 16.02%. The results indicate that the image resolutions are primarily concentrated in the common range near HD/FHD, while also covering samples with both lower and higher resolutions.
We partitioned the dataset into training, validation, and testing sets with a ratio of 7:2:1. Subsequently, offline augmentation was employed by randomly applying one of four typical weather effects—rain, snow, fog, or low-light conditions—to the original samples, expanding the dataset to a total of 3576 images. The simulation methods for these meteorological scenarios are detailed as follows:
- (1) Simulation of Rainy Conditions
To simulate the visual degradation induced by rain, an additive noise model was adopted. Rainy samples were synthesized by superimposing a layer of directional rain streaks onto the original images. The mathematical formulation is expressed as:
$$I_{rain} = I + \alpha \, (M_r \otimes G_{\sigma}) + n$$
Here, $I_{rain}$ and $I$ denote the synthesized rainy image and the original image, respectively. $M_r$ represents the initial linear rain streak mask, which is generated using randomized orientations to model varying wind-driven rainfall. $G_{\sigma}$ denotes a Gaussian blur kernel with a standard deviation of $\sigma$, used to simulate the motion blur induced by high-velocity raindrops. The symbol $\otimes$ represents the convolution operation. Additionally, $\alpha$ serves as the rain streak intensity coefficient, randomly sampled from [0.1, 0.4] to reflect diverse precipitation levels, and $n$ introduces random noise following a Gaussian distribution $\mathcal{N}(0, 0.01)$ to model environmental lighting perturbations.
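As an illustration of the additive rain model described above, the following NumPy sketch draws a streak mask, blurs it with a Gaussian kernel, scales it by an intensity sampled from [0.1, 0.4], and adds Gaussian noise. It is a minimal sketch rather than the implementation used in the experiments: the function names, the streak count and length, the orientation range of ±30°, the default blur width, and the reading of 0.01 as a standard deviation are all assumptions made for this example.

```python
import numpy as np

def gaussian_blur(img2d, sigma):
    """Separable Gaussian blur built from 1-D convolutions (NumPy only)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img2d)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def rain_streak_mask(h, w, n_streaks, length, angle_deg, rng):
    """Binary mask of straight streaks; the angle is measured from the vertical."""
    mask = np.zeros((h, w), dtype=np.float32)
    theta = np.deg2rad(angle_deg)
    dy, dx = np.cos(theta), np.sin(theta)
    ys = rng.integers(0, h, n_streaks)
    xs = rng.integers(0, w, n_streaks)
    for t in range(length):
        yy = np.round(ys + t * dy).astype(int)
        xx = np.round(xs + t * dx).astype(int)
        ok = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)  # drop out-of-frame pixels
        mask[yy[ok], xx[ok]] = 1.0
    return mask

def add_rain(img, rng, sigma=1.0):
    """img: float array in [0, 1] with shape (H, W, 3)."""
    h, w = img.shape[:2]
    alpha = rng.uniform(0.1, 0.4)        # intensity range stated in the text
    angle = rng.uniform(-30.0, 30.0)     # assumed orientation range
    mask = rain_streak_mask(h, w, max(1, h * w // 200), 15, angle, rng)
    streaks = gaussian_blur(mask, sigma)  # sigma is an assumed value
    noise = rng.normal(0.0, 0.01, img.shape)  # 0.01 read as std (assumption)
    return np.clip(img + alpha * streaks[..., None] + noise, 0.0, 1.0)
```

The final clipping step is not part of the stated model; it is added here only to keep the result a valid image.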
- (2) Simulation of Snowy Conditions
Snowflakes are typically opaque or semi-transparent solid particles that cause spatial occlusion of insulators and their defects [45]. To mimic this characteristic, a physical occlusion model was employed to simulate the random spatial distribution of snowflakes. The synthesis formula is expressed as:
$$I_{snow} = I \cdot (1 - M_s) + L_s \cdot M_s$$
where $I_{snow}$ and $I$ are the synthesized snowy image and the original image, $M_s$ denotes the randomly generated snowflake mask, characterized by a coverage density $\rho$ to represent varying snowfall intensities, and $L_s$ represents the snowflake luminance intensity, assigned values in the range [0.8, 1.0] to simulate the high reflectivity characteristic of snowflakes.
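The occlusion model for snow can be sketched in a few lines of NumPy: pixels selected by a random mask are replaced by a bright value in [0.8, 1.0], and all other pixels are left untouched. This is a minimal sketch under stated assumptions. The function name `add_snow`, the single-pixel flake shape, the per-pixel luminance sampling, and the default `density` value are hypothetical choices for this example, since the coverage density actually used is not reproduced here.

```python
import numpy as np

def add_snow(img, rng, density=0.02):
    """img: float array in [0, 1] with shape (H, W, 3).

    density is a hypothetical coverage value (fraction of occluded pixels).
    Returns the snowy image and the binary snowflake mask.
    """
    h, w = img.shape[:2]
    mask = (rng.random((h, w)) < density).astype(img.dtype)  # 1 where a flake falls
    luminance = rng.uniform(0.8, 1.0, size=(h, w))           # range stated in the text
    m = mask[..., None]                                       # broadcast over channels
    out = img * (1.0 - m) + luminance[..., None] * m          # occlude, do not add
    return out, mask
```

Unlike the additive rain model, the masked pixels here fully replace the underlying content, which is what produces spatial occlusion of small defects.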
- (3) Simulation of Foggy Conditions
Fog primarily induces contrast reduction and color shifts [46]. To simulate this visual degradation, the standard atmospheric scattering model was adopted, expressed as:
$$I_{fog}(x) = I(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)$$
where $I_{fog}(x)$ represents the synthesized foggy image at pixel $x$, $I(x)$ is the original image, $A$ denotes the global atmospheric light sampled from [0.7, 0.9] to represent typical overcast illuminance, and $t(x)$ refers to the medium transmission. Given that monocular images lack explicit depth information, a vertical gradient was employed as a proxy for depth $d(x)$. Consequently, the transmission $t(x)$ is formulated as $t(x) = e^{-\beta d(x)}$, where the scattering coefficient $\beta$ controls the fog density. This approach approximates the physical behavior that atmospheric scattering effects intensify as the observation distance between the optical sensor and the insulator target increases.
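The atmospheric scattering model with a vertical-gradient depth proxy can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `add_fog`, the default scattering coefficient, and the direction of the gradient (top of the frame treated as far, bottom as near) are assumptions introduced for this example.

```python
import numpy as np

def add_fog(img, rng, beta=1.5):
    """img: float array in [0, 1] with shape (H, W, 3).

    beta is a hypothetical scattering coefficient; larger means denser fog.
    """
    h = img.shape[0]
    airlight = rng.uniform(0.7, 0.9)                # global atmospheric light, range from the text
    # Depth proxy: 1 at the top row (assumed far), 0 at the bottom row (assumed near).
    depth = np.linspace(1.0, 0.0, h)[:, None, None]
    transmission = np.exp(-beta * depth)            # exponential attenuation with depth
    return img * transmission + airlight * (1.0 - transmission)
```

Because the output is a convex combination of the original pixel and the atmospheric light, regions with low transmission drift toward a uniform gray-white, which matches the whitening effect expected of fog.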
- (4) Simulation of Low-light Conditions
To simulate visibility degradation and the loss of detail stemming from insufficient illumination during dusk, dawn, or heavily overcast conditions in power grid monitoring, linear scaling was applied to the original images. This process effectively replicates the imaging characteristics of optical sensor underexposure in edge devices. The mathematical model is expressed as:
$$I_{low} = \gamma \cdot I$$
where $I_{low}$ represents the synthesized low-light image, $I$ is the original image, and $\gamma$ denotes the luminance attenuation coefficient, randomly sampled from [0.3, 0.6] to emulate varying degrees of illumination deficiency without completely obliterating structural features.
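The linear scaling used for low-light synthesis is simple enough to show directly. The sketch below applies it to an 8-bit image; the function name `add_low_light` and the rounding back to `uint8` are assumptions of this example rather than details taken from the paper.

```python
import numpy as np

def add_low_light(img_u8, rng):
    """img_u8: uint8 array with shape (H, W, 3). Returns (darkened image, coefficient)."""
    gamma = rng.uniform(0.3, 0.6)                        # attenuation range from the text
    dark = np.rint(img_u8.astype(np.float32) * gamma)    # scale every channel equally
    return np.clip(dark, 0, 255).astype(np.uint8), gamma
```

With an 8-bit image, scaling by a coefficient in [0.3, 0.6] also coarsens the quantization of dark regions, which is one way the loss of fine texture under underexposure shows up in practice.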
As observed in Figure 9, the augmented samples replicate the visual characteristics of real-world meteorological conditions. Specifically, rainy images display clear directional streaks and motion blur, while snowy samples demonstrate realistic spatial occlusion and contrast reduction. Foggy scenarios exhibit a notable whitening effect with blurred boundaries, and low-light images effectively reflect the texture loss caused by underexposure.
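The 7:2:1 partition and the one-effect-per-image offline augmentation described at the beginning of this subsection can be sketched with the standard library alone. The helper names, the fixed seed, the rounding rule for the split sizes, and the choice to keep each augmented copy in the same subset as its source image are assumptions of this sketch; the text only states the ratio, the four effects, and the final total of 3576 images.

```python
import random

EFFECTS = ("rain", "snow", "fog", "low_light")

def split_indices(n, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle image indices and cut them into train/val/test subsets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def augmentation_plan(indices, seed=0):
    """Assign exactly one randomly chosen weather effect to every original image."""
    rng = random.Random(seed)
    return [(i, rng.choice(EFFECTS)) for i in indices]

train, val, test = split_indices(1788)
plan = {name: augmentation_plan(subset)
        for name, subset in (("train", train), ("val", val), ("test", test))}
total = 1788 + sum(len(p) for p in plan.values())  # originals plus one copy each
```

Under these assumptions the subsets contain 1252, 358, and 178 original images, and adding one augmented copy per original doubles the total to 3576, in agreement with the figure reported above.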
3.3. Ablation Studies and Detailed Analysis
To systematically validate how each proposed component of the LID-YOLO framework addresses specific bottlenecks in insulator defect detection under complex environments, we conducted comprehensive ablation studies under identical experimental settings. The quantitative results, which reflect the model’s reliability in identifying insulator defects, are detailed in Table 3. Here, Model A denotes the integration of the C3k2-CDGC feature extraction module; Model B represents the addition of the Detect-LSEAM detection head; and Model C indicates the adoption of the NWD-MPDIoU hybrid loss function.
As presented in Table 3, integrating the C3k2-CDGC module into the baseline model yields a 2.3% increase in mAP@0.5. This substantial improvement in recognition accuracy for insulator defects verifies that the proposed dynamic grouped convolution, combined with the coordinate attention mechanism, effectively extracts critical defect features when dealing with feature degradation under complex weather conditions. Furthermore, despite the introduction of dynamic convolution and coordinate attention mechanisms, the computational cost and parameter count only slightly increase by 0.4 G (FLOPs) and 0.1 M (Params), respectively. This efficiency is attributed to the module’s grouped structure, which partially offsets the additional computational overhead introduced by these dynamic and attention components.
When the Detect-LSEAM head is deployed independently, mAP@0.5 improves by 1.9%, accompanied by a notable rise in precision (from 87.9% to 89.2%). By utilizing the lightweight LSEAM mechanism, this module enhances the discriminative power between the macroscopic insulator body and its localized microscopic defects, while simultaneously reducing FLOPs by 0.6 G. This computational reduction further solidifies the model’s viability for UAV-based edge inference. Finally, training with the improved NWD-MPDIoU loss function leads to a 1.3% increase in mAP@0.5. This confirms that integrating distribution metrics with geometric constraints successfully addresses boundary ambiguity during bounding box regression, specifically mitigating the difficulty of isolating subtle breakage anomalies from the main insulator structure.
With the progressive integration of the proposed modules, the model’s defect detection capability exhibits a steady upward trend. Ultimately, the complete LID-YOLO model achieved an mAP@0.5 of 87.5%, representing a 4.2% improvement over the baseline YOLOv11n, alongside precision and recall gains of 1.7% and 3.6%, respectively. Furthermore, while the parameter count increases marginally by 0.17 M, the overall computational complexity (FLOPs) decreases from 6.4 G to 6.2 G, ensuring that the model does not introduce excessive computational and storage burdens overall.
Collectively, these structural optimizations enable LID-YOLO to achieve higher detection accuracy under typical image degradation across various weather conditions without incurring severe computational penalties. This confirms the validity of the proposed architectural modifications for reliable insulator defect detection.
Building upon the architectural validations, we further examined the proposed NWD-MPDIoU loss function, which incorporates a linear dynamic weighting strategy by design. To validate the rationality of this configuration, a sensitivity analysis was conducted on the weighting coefficient. We compared the adopted linear strategy against a fixed weight and two non-linear dynamic strategies, with the results presented in Table 4.
As shown in the table, the fixed weight strategy yielded the lowest mAP@0.5 of 85.5%, indicating that maintaining a static balance between distribution metrics and geometric constraints is suboptimal throughout the dynamic training process. Among the non-linear strategies, the aggressive one achieved a slightly higher recall (80.1%) but suffered a noticeable drop in precision (88.6%). This strategy heavily penalizes geometric misalignment even when the prediction deviates significantly from the ground truth, making the model overly sensitive to background noise induced by complex weather conditions. In practical power system operations, the resulting increase in false positives may trigger excessive false alarms, potentially leading to more follow-up inspections and a higher maintenance workload. Conversely, the conservative strategy achieved the highest precision (91.1%) but a lower recall (78.5%), as it relies predominantly on the NWD loss unless the bounding boxes are highly overlapped. While this effectively filters out weather-induced artifacts, it weakens the geometric guidance for structurally ambiguous defects, resulting in missed detections of critical insulator faults.
Ultimately, the linear strategy demonstrated the most effective balance, yielding the highest mAP@0.5 of 87.5%. By dynamically shifting the optimization focus from distribution distance (NWD) in the early stages to geometric constraints (MPDIoU) in the later stages, it effectively mitigates the interference of image degradation while facilitating precise localization for tiny defects. This analysis corroborates the rationale behind adopting the linear dynamic strategy, demonstrating its capability to provide the necessary reliability and robustness for insulator defect detection in complex environments.
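To make the interplay between the two terms concrete, the sketch below combines the commonly used definitions of NWD and MPDIoU into a single regression loss. It is a hedged illustration only. The weight is assumed here to equal the IoU between prediction and ground truth, which is consistent with the description of a loss that leans on NWD when boxes overlap poorly and on MPDIoU when they overlap well, but the exact weighting formula of LID-YOLO is not reproduced in this section. The function names and the NWD constant `c = 12.8` are likewise assumptions.

```python
import math

def iou(a, b):
    """Boxes are (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def mpdiou(a, b, img_w, img_h):
    """IoU minus normalized squared distances between matching corners."""
    d1 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2   # top-left corners
    d2 = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2   # bottom-right corners
    norm = img_w ** 2 + img_h ** 2
    return iou(a, b) - d1 / norm - d2 / norm

def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between boxes modeled as 2-D Gaussians."""
    cxa, cya, wa, ha = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2, a[2] - a[0], a[3] - a[1]
    cxb, cyb, wb, hb = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2, b[2] - b[0], b[3] - b[1]
    w2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2 + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2
    return math.exp(-math.sqrt(w2) / c)

def hybrid_loss(pred, gt, img_w, img_h, c=12.8):
    lam = iou(pred, gt)  # ASSUMED linear weight; not confirmed by the source
    return (1 - lam) * (1 - nwd(pred, gt, c)) + lam * (1 - mpdiou(pred, gt, img_w, img_h))
```

With this assumed weighting, a prediction that does not overlap the target at all is driven purely by the NWD term, which still provides a gradient for tiny boxes, while a nearly aligned prediction is refined almost entirely by the corner-distance penalty of MPDIoU.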
Having established the optimal internal configuration, we further investigated the overall efficacy of the proposed regression loss optimization in handling precise insulator defect localization. Table 5 presents the comparative results of the NWD-MPDIoU loss function against the baseline CIoU, GIoU, and MPDIoU.
Comparative analysis indicates that although GIoU achieves the highest recall of 80.5%, its overall mAP is limited to 86.8%. Similarly, the baseline CIoU demonstrates balanced metrics but yields a comparatively lower overall detection accuracy with an mAP of 86.1%. While the standard MPDIoU attains the peak precision of 90.8%, its recall is limited to 76.6%, suggesting a tendency to miss difficult targets such as concealed insulator defects. By incorporating the distribution metric of NWD, the proposed method effectively mitigates this limitation for small-scale ambiguous defects, boosting recall by 3.2% compared to MPDIoU. Although accompanied by a slight decrease in precision, this trade-off results in a more robust overall performance. Consequently, the proposed NWD-MPDIoU achieves the highest mAP of 87.5%, outperforming CIoU, GIoU, and the standard MPDIoU in insulator defect detection tasks under complex weather conditions.
To intuitively demonstrate the guiding effect of different loss functions on model training, the bounding box regression loss curves are visualized in Figure 10. As illustrated, the loss curve of NWD-MPDIoU exhibits a steep initial descent and stabilizes at a comparatively lower value. The variations in the loss curves validate the advantages of NWD-MPDIoU: it accelerates the regression convergence of boundary-ambiguous insulator defects via NWD in the early stages, while employing geometric constraints for fine-grained adjustments of the targets in the later training stages to achieve more precise localization.
Beyond the bounding box regression optimization, to further evaluate the classification performance across different categories, Figure 11 compares the confusion matrices of YOLOv11n and LID-YOLO. The improved model exhibits higher values along the diagonal, indicating lower misclassification rates across multiple classes. The most notable improvements are observed in the ‘breakage’ and ‘flashover’ categories, with absolute increases of 0.09 and 0.05, respectively. As these categories were the baseline’s weakest points due to ambiguous visual features, this result demonstrates that the proposed method significantly improves the recognition capability for such challenging defects even when complex weather degrades image features, verifying the rationality of the algorithmic improvements.
3.4. Comparative Experiments
To evaluate LID-YOLO’s insulator defect detection under weather-induced visual degradation in power grids, we benchmarked it against mainstream models. The comparative models include mainstream lightweight models from the YOLO series and the Transformer-based RT-DETR-r18 [47]. All models were evaluated under identical experimental environments and hyperparameters, using the same dataset. The comparative results are presented in Table 6.
As presented in Table 6, the proposed LID-YOLO achieves an mAP@0.5 of 87.5%, significantly outperforming other lightweight YOLO models of comparable scale. Importantly, the model achieves a recall of 79.8%, exhibiting a notable advantage over the comparative lightweight YOLO variants in identifying concealed or tiny defects. In power system maintenance, a critical challenge is minimizing missed detections that can escalate into severe flashovers and grid-wide outages, while simultaneously avoiding excessive false positives that waste maintenance resources. Compared to other YOLO variants, LID-YOLO successfully achieves this optimal balance. This high recall, coupled with the highest overall mAP@0.5, indicates that the synergistic design of dynamic noise filtering and robust bounding box regression effectively intercepts subtle insulator faults without overwhelming the inspection system, thereby securing the operational resilience of the transmission network.
When compared to the larger YOLOv11s model, LID-YOLO achieves a 0.9% higher mAP@0.5 and a 2.0% higher recall, while reducing parameters and FLOPs by approximately 70.7% and 70.9%, respectively. This efficiency demonstrates that addressing extreme scale variations via targeted contextual enhancement is a more effective strategy than merely scaling up the network capacity. While the Transformer-based RT-DETR-r18 achieves the highest detection metrics, it incurs a substantial computational burden, requiring 57.0 G FLOPs and 19.9 M parameters. In contrast, the proposed model is only 0.3% lower in mAP@0.5, yet reduces FLOPs and parameters by approximately 89% and 86%, respectively, as the multi-scale contextual enhancement serves as a highly efficient alternative to heavy Transformer blocks for capturing necessary structural dependencies. This confirms that LID-YOLO achieves a favorable balance between computational efficiency and detection accuracy, making it more suitable for resource-constrained edge devices utilized in power line inspections.
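The relative savings quoted against RT-DETR-r18 follow from a simple ratio. As a worked check using only figures stated in this section (57.0 G FLOPs for RT-DETR-r18 and 6.2 G FLOPs for LID-YOLO), the helper below, whose name is chosen for this example, reproduces the reported reduction of approximately 89%.

```python
def relative_reduction(reference, proposed):
    """Percentage by which `proposed` is smaller than `reference`."""
    return (reference - proposed) / reference * 100.0

# FLOPs in G, both values taken from the text above.
flops_reduction_pct = relative_reduction(57.0, 6.2)
print(f"FLOPs reduction vs. RT-DETR-r18: {flops_reduction_pct:.1f}%")
```

The same function applies to the parameter counts and to the YOLOv11s comparison once the corresponding values from Table 6 are substituted.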
Figure 12 illustrates the Precision–Recall (PR) curves of all comparative models.
As depicted, the PR curve of LID-YOLO consistently envelops those of most lightweight YOLO variants, exhibiting a noticeably more gradual decline in precision as recall increases. This reveals that the model can successfully retrieve more challenging or weather-obscured insulator defects without inadvertently introducing a substantial number of false positives. Consequently, this robust PR performance visually corroborates the effectiveness of the proposed dynamic convolution and contextual enhancement mechanisms, reaffirming the model’s capability to minimize missed detections and secure the operational reliability of energy systems.
3.5. Visualization Results
To intuitively demonstrate the insulator defect detection performance of LID-YOLO under complex weather conditions, representative samples from various weather scenarios were selected for visual comparison, as shown in Figure 13.
Under normal lighting conditions, the majority of models accurately localized both insulators and their defects. An exception is YOLOv8n, which failed to detect the small breakage defect. In rainy and snowy scenarios, which are particularly prone to inducing severe high-frequency background noise and spatial occlusion, the feature extraction capabilities of the models are heavily tested. Consequently, YOLOv10n failed to detect both breakage defects on the insulator, while YOLOv5n, YOLOv8n, and YOLOv9t each missed one instance, and YOLOv11s failed to detect a flashover defect under snowy conditions. In contrast, YOLOv12n, RT-DETR-r18, and LID-YOLO exhibited robust performance, successfully detecting all targets despite the severe meteorological interference. In foggy environments, the whitening effect severely degrades the texture and color features on the insulator surface, making characteristics like flashovers much more difficult to discern. Under these conditions, half of the compared models failed to detect all flashover defects. However, LID-YOLO successfully localized all flashover regions, demonstrating the adaptability of the proposed loss function to such boundary-blurred defects. Under low-light conditions, all models, with the exception of YOLOv10n, successfully detected the small defects, with LID-YOLO achieving the highest confidence score. In summary, the visual results reveal a prevalent tendency of missed detections among standard YOLO variants across various complex weather conditions. These visual results corroborate the issue of generally low recall rates among the YOLO models shown in Table 6. In practical energy systems, such missed detections are highly critical; undetected latent defects can rapidly deteriorate under continuous electrical stress, eventually triggering more severe accidents. Conversely, the proposed LID-YOLO effectively overcomes this bottleneck, achieving a high recall of 79.8%, second only to the computationally heavy RT-DETR-r18. Overall, these qualitative results confirm LID-YOLO’s robust adaptability for insulator defect detection under various weather-induced visual degradations typical of automated power grids and demonstrate that it can achieve better performance while maintaining the computational efficiency of the YOLO series.
To provide a deeper qualitative interpretation of the internal feature representations beyond the bounding box predictions, gradient-weighted class activation mapping (Grad-CAM) was employed to visualize class activation heatmaps.
Figure 14 illustrates the comparative heatmaps generated by the baseline YOLOv11n and the proposed LID-YOLO under both normal conditions and the four complex weather scenarios.
The visualizations reveal that the baseline model generally struggles with severe feature dispersion. Its activation regions frequently extend to adjacent tower structures or drift towards environmental noise, resulting in discontinuous and patchy heatmaps on the target insulators. In power line inspections, such feature drift often leads to background overfitting, causing the model to generate false alarms or fail entirely when deployed in varying geographic corridors. In contrast, the proposed LID-YOLO consistently maintains an enhanced and robust attentional focus across all evaluated scenarios. Its high-activation regions align precisely with the actual insulator bodies and potential defect areas, effectively suppressing background clutter and environmental interference. This consistent activation stability visually reflects that the proposed modules not only enhance the overall feature representation capacity of the network but also effectively filter out variable environmental noise to anchor structural features under severe degradations. Consequently, this robust feature extraction capability directly supports more reliable insulator defect detection, thereby safeguarding the operational stability of energy transmission systems under complex weather conditions.
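For readers unfamiliar with how such heatmaps are obtained, the core Grad-CAM computation can be sketched independently of any detection framework. The function below assumes that the activations of one chosen layer and the gradients of a target score with respect to those activations have already been extracted; how that score is defined for a detector, and which layer is visualized, are not specified here and are left open in this sketch. The function name and the max-normalization step are choices made for this example.

```python
import numpy as np

def grad_cam(activations, gradients):
    """activations, gradients: arrays of shape (K, H, W) for a single image.

    Returns an (H, W) heatmap scaled to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))              # one importance weight per channel
    cam = np.tensordot(weights, activations, axes=1)   # weighted sum over the K channels
    cam = np.maximum(cam, 0.0)                         # keep only positive evidence (ReLU)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

Channels whose gradients are negative on average are suppressed by the final rectification, so a heatmap that stays concentrated on the insulator body indicates that the positively contributing channels respond mainly to the target rather than to the background.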