YOLOv8-IDX: Optimized Deep Learning Model for Transmission Line Insulator-Defect Detection

Farooq, Umer; Yang, Fan; Shahzadi, Maryam; Ali, Umar; Li, Zhimin

doi:10.3390/electronics14091828

Open Access Communication

by Umer Farooq ¹, Fan Yang ¹,*, Maryam Shahzadi ², Umar Ali ¹ and Zhimin Li ¹

¹ State Key Laboratory of Power Transmission Equipment Technology, School of Electrical Engineering, Chongqing University, Chongqing 400044, China

² Department of Computer Science, The University of Faisalabad, Faisalabad 38000, Pakistan

* Author to whom correspondence should be addressed.

Electronics 2025, 14(9), 1828; https://doi.org/10.3390/electronics14091828

Submission received: 6 March 2025 / Revised: 15 April 2025 / Accepted: 23 April 2025 / Published: 29 April 2025

Abstract:

Efficient insulator-defect detection in transmission lines is crucial for ensuring the reliability and safety of power systems. This study introduces YOLOv8-IDX (You Only Look Once v8: Insulator Defect eXtensions), an enhanced deep learning (DL)-based model designed specifically for detecting defects in transmission line insulators. The model builds upon the YOLOv8 framework, incorporating advanced modules, such as C3k2 in the backbone for enhanced feature extraction and C2fCIB in the neck for improved contextual understanding. These modifications aim to address the challenges of detecting small and complex defects under diverse environmental conditions. The results demonstrate that YOLOv8-IDX outperforms the baseline YOLOv8 in mean Average Precision (mAP) by 4.7% and 3.6% on the IDID and CPLID datasets, respectively, and achieves F1 scores of 93.2 and 97.2 on the same datasets. These findings underscore the model’s potential in automating power line inspections, reducing manual effort, and minimizing maintenance-related downtime. In conclusion, YOLOv8-IDX represents a step forward in leveraging DL and AI for smart grid applications, with implications for enhancing the reliability and efficiency of power transmission systems. Future work will focus on extending the model to multi-class defect detection and real-time deployment using UAV platforms.

Keywords: insulator-defect detection; deep learning; object detection; YOLOv8-IDX; UAVs; power line inspection

1. Introduction

Electricity transmission through power grids relies on high-voltage lines supported by power towers and electrical insulators. While insulators do not conduct electricity, they play a crucial role in suspending transmission lines above the ground and preventing grounding. These components must withstand supply voltage stresses and bear the mechanical load of transmission lines, making them integral to the safety and stability of high-voltage systems [1,2,3,4,5]. Insulators facilitate electrical separation and support during transmission by serving as insulating controls on power towers. However, prolonged exposure to outdoor environments subjects them to erosion caused by sunlight, rain, and snow, which can degrade their performance [6]. In addition to environmental factors, insulators are vulnerable to overvoltage shocks from lightning strikes, switching operations, mechanical loads, and the weights of wires and metal accessories. These stressors increase the likelihood of self-blasting, breakage, and other defects that reduce their service lifespans [7]. Defective or aged insulators can compromise the operational integrity of the power grid, leading to regional failures and significant economic losses if not detected and addressed promptly [6,7].

Detecting insulator defects is critical for maintaining transmission line safety. Traditionally, manual inspections have been employed; however, these methods are labor-intensive, time-consuming, and pose safety risks to personnel [8]. The adoption of drone-based inspection technology has significantly improved inspection efficiency. Drones equipped with high-resolution cameras can capture extensive visual data on power lines, enabling the timely identification of defects [9]. Transmission line detection is crucial for UAV inspections; however, it faces noise issues and segmentation limitations. To address these, a lightweight detection network has been proposed that integrates a Feature Pyramid Network (FPN) with the Hough transform for better edge detection and uses a multi-scale output structure to improve performance, together with a dedicated dataset developed to enhance UAV-based inspections [10]. This advancement reduces the burden on inspection teams and enhances the quality of power line monitoring. Traditional image-processing techniques, such as Hough transforms and edge detection, and earlier learning-based detectors, such as convolutional neural networks (CNNs), region-based convolutional neural networks (R-CNN) [11], Faster R-CNN [12], and earlier versions of You Only Look Once (YOLO) [13] such as v3 and v4, have been used for defect identification, but they struggle to handle the noise and complex backgrounds typical of aerial images. Furthermore, inconsistent defect characteristics often pose challenges for these methods. In recent years, deep learning has emerged as a robust alternative, offering improved feature extraction and real-time detection capabilities [1,2,3,4,5].

Deep-learning-based methods for insulator-defect detection are generally categorized into one-stage and two-stage object detection algorithms. One-stage models are known for their efficiency and high frames-per-second (FPS) output, while two-stage models prioritize accuracy, albeit at a higher computational cost. The recent introduction of transformer-based detection methods has further enhanced this field by providing end-to-end solutions for defect detection [14,15,16]. These advancements mark a significant step toward improving the reliability and longevity of high-voltage power systems.

Various studies have addressed the challenges of detecting insulator defects in transmission lines using YOLO-based approaches. One approach minimizes the adverse effects of uneven lighting on YOLOv5-based insulator-defect detection through a combination of multi-scale gradient-domain guided filtering and two-dimensional adaptive gamma transformation [17]. Another solution integrated ResNet-18 with YOLOv5-X in a hybrid model to enable the detection of faulty components in transmission lines using a single UAV system [18]. The uncertainty in defect detection was mitigated by employing a Gaussian function preceding the inspection head of YOLOX [19]. In addition, a high-precision object detection model based on YOLOX was used to improve fault identification in high-voltage transmission line inspections [20]. These efforts demonstrate the applicability of models such as SDD, YOLOv4, YOLOv5, and YOLOX for transmission line defect detection. However, practical experiments have revealed that these methods require high-quality images, struggle to detect small insulators and defects in complex backgrounds, and require enhancements in both detection accuracy and efficiency. Currently, there is limited research on the application of the YOLOv8 model for insulator-defect detection. Moreover, while the model shows promise, its accuracy still requires improvement, particularly when dealing with the challenges posed by complex transmission line backgrounds and small defect targets.

This work introduces two key modifications to the YOLOv8 model for insulator-defect detection, enhancing its accuracy, robustness, and efficiency. The first modification integrates the C3k2 module, replacing the C2f module. The C3k2 module improves feature aggregation by combining multi-scale extraction with a cross-stage partial network (CSPNet)-inspired feature partitioning, ensuring better hierarchical feature integration and reducing the vanishing gradient problem. Additionally, the K2 component refines the feature maps across scales, improving spatial awareness for detecting small defects. The lightweight structure optimizes the computational efficiency by balancing high accuracy with faster inference times. The second modification adds a Compact Inverted Block (CIB) to the C2f module, improving the detection of small defects. The CIB adaptively selects and fuses relevant features, bypassing unnecessary features to reduce redundant computations. This enhances feature utilization, speeds up convergence, and improves the detection of subtle defects in transmission lines. The combination of CIB and C2f strengthens the model’s accuracy, efficiency, and robustness in real-world defect detection.

The remainder of this paper is organized as follows: Section 2 reviews previous work on transmission line insulator-defect detection based on deep learning. Then, in Section 3, we describe the proposed methodology. In Section 4, we present the experimental verification and discuss our results. Finally, in Section 5, we summarize our work and propose future research plans.

2. Related Work

Object detection involves predicting a bounding box to indicate an object’s category and location. Similarly, insulator and insulator-defect detection focuses on identifying insulators by surrounding them with bounding boxes and classifying their categories or defect types. Early efforts in insulator defect detection relied on a combination of computer vision and traditional machine learning techniques, which depended heavily on the use of hand-crafted features. These approaches are time-intensive and require expert input [21,22]. In recent years, deep-learning-based detectors have gained prominence for insulator detection tasks [23,24,25,26]. These detectors can be broadly categorized into one-stage and two-stage frameworks. One-stage detectors, such as those in the YOLO family [13,27,28,29], prioritize detection speed, while two-stage detectors, including R-CNN and its derivatives [11,30,31], offer higher accuracy by incorporating an additional refinement step.

One-stage detectors have been extensively explored for the detection of insulator defects. Yang et al. introduced a lightweight backbone into the YOLOv3 architecture, leveraging MobileNet [32] and spatial pyramid pooling [33] to identify missing-cap insulators [24]. Similarly, lightweight YOLOv4 models have been proposed to balance the detection accuracy and speed for insulator detection [23]. These approaches employ MobileNet as a backbone replacement to optimize computational efficiency. Han et al. enhanced the TinyYOLOv4 framework by integrating a self-attention module into a Feature Pyramid Network (FPN), improving channel-level feature fusion and feature representation [23,34]. YOLOv5 has also been utilized in insulator-defect detection research, with multiple versions evaluated to identify the most suitable architecture [35]. Gao et al. improved YOLOv5 by incorporating a triplet attention module to enhance the detection of small defects [36]. Lan et al. further augmented YOLOv5 with a Convolutional Block Attention Module (CBAM) to capture more channel and spatial context information [37]. Two-stage object detection frameworks have also been applied to insulator detection tasks. Faster R-CNN, for instance, has been employed to generate region proposals that are refined in a second-stage network for precise defect localization [26,38]. Tao et al. modeled insulator-defect detection as a two-level task, combining two Faster RCNNs—one with a VGG16 backbone for localization and another for detecting defective regions [25]. Zhong et al. extended the Faster R-CNN pipeline to handle arbitrarily oriented insulator localization [39].

In addition to detection frameworks, segmentation approaches have been explored for insulator-defect detection and have demonstrated promising results. Li et al. proposed a cascaded framework that incorporated an improved U-Net with an attention mechanism for global insulator detection and local defect segmentation [40,41]. Efficient channel attention (ECA-Net) was introduced as the U-Net encoder, further enhancing the segmentation performance [42]. Yu et al. refined the SINet architecture by integrating fine-grained textures and improving the positioning network for defect segmentation [5]. Antwi-Bekoe et al. utilized an instance segmentation framework that combines detection and mask branches for instance-level segmentation [43]. Xuan et al. introduced a squeeze-excitation module in the backbone and a spatial attention module to enhance insulator mask prediction, achieving excellent segmentation results [44]. Building on these advancements, this work modifies the YOLOv8 model by leveraging segmentation techniques to enhance its accuracy, robustness, and efficiency. The integration of the C3k2 module improves feature aggregation by combining multi-scale extraction with CSPNet-inspired feature partitioning, ensuring better hierarchical feature integration and reducing the vanishing gradient problem. Additionally, the K2 component refines the feature maps across scales, improving spatial awareness and enabling the efficient detection of small defects. Furthermore, incorporating a Compact Inverted Block (CIB) enhances the model’s ability to adaptively select and fuse relevant features, reducing redundant computations and improving the detection of subtle defects. These modifications significantly enhance the model’s accuracy, efficiency, and robustness in real-world defect identification and localization.

3. Methodology

3.1. YOLOv8 Architecture

A comparison between the YOLOv5 (v6.0) and YOLOv8 architectures highlights significant advancements in design, optimization, and feature extraction, which help justify the selection of YOLOv8 as the baseline for this research. In terms of the backbone, YOLOv5 employs C3 modules based on CSPNet for feature partitioning and aggregation. While this approach effectively captures hierarchical features, it tends to introduce computational overhead due to the complexity of its module configurations. Additionally, YOLOv5 lacks advanced mechanisms for progressive feature fusion, which may limit its performance when detecting smaller objects. In contrast, YOLOv8 uses more streamlined C2f modules, which enhance feature extraction through a split-and-merge structure. This design improves efficiency, reduces memory usage, and facilitates a better gradient flow during training. These enhancements make YOLOv8 more suitable for applications that require both accuracy and high computational efficiency.

The head architecture also highlights the evolution from YOLOv5 to YOLOv8. YOLOv5 relies on concatenation operations, followed by C3 modules for multi-scale detection at the P3, P4, and P5 levels. While reliable for detecting objects of varying sizes, this design may not fully optimize small object detection. In contrast, YOLOv8 incorporates C2f modules into the detection head, thereby improving the handling of multi-scale features and spatial awareness. This refinement allows for better localization and recognition of objects across scales, particularly for smaller or more subtle defects. Additionally, YOLOv8, as shown in Figure 1, simplifies the processing pipeline, achieving an effective balance between performance and computational cost. YOLOv8 offers several key advantages over YOLOv5. Its lightweight architecture enables faster inference times, making it highly suitable for real-time applications, such as insulator-defect detection. The C2f modules enhance feature aggregation and spatial awareness, thereby improving the model’s ability to detect small and subtle defects. Furthermore, YOLOv8’s modular design supports scalability and the seamless integration of additional enhancements, such as the C3k2 and CIB modules introduced in this research. Given these improvements, YOLOv8 provides a more robust foundation for insulator-defect detection. Its modernized architecture ensures high accuracy, efficiency, and adaptability, aligning with the goal of achieving reliable detection while maintaining real-time capabilities. These qualities position YOLOv8 as a superior baseline to YOLOv5.

3.2. YOLOv8-IDX

The YOLOv8-IDX model shown in Figure 2 represents a significant enhancement over the baseline YOLOv8 architecture, particularly for detecting insulator defects. This improved model introduces key modifications to the backbone and head to increase precision, feature utilization, and adaptability to real-world challenges. The backbone of YOLOv8-IDX retains the initial convolutional pipeline for low-level feature extraction, which ensures efficiency in handling the earlier layers. However, a critical upgrade is the inclusion of the C3k2 module instead of the last C2f module at the deepest level of the backbone. The C3k2 module was added to address the specific challenges of detecting small and intricate defects, which are common in insulator images. This module enhances feature aggregation by leveraging multi-scale feature integration combined with CSPNet-inspired feature partitioning, thereby enabling the model to capture hierarchical and spatially consistent features at various resolutions. The K2 component within the C3k2 module further improves spatial awareness, making it particularly effective for identifying fine-grained anomalies, even in challenging conditions, such as low contrast or occlusions.

The head of YOLOv8-IDX also incorporates a transformative change by replacing the C2f module at the largest detection scale with the C2fCIB module. The Compact Inverted Block (CIB) embedded in this module dynamically selects and fuses features based on spatial and contextual cues that are essential for accurate defect localization. This mechanism reduces redundancy in feature selection, enhances computational efficiency, and improves the model’s ability to detect subtle anomalies, particularly in scenarios involving overlapping objects or low contrast. Meanwhile, the C2f modules are retained for smaller detection scales to maintain the baseline’s strength in efficient multi-scale feature aggregation and refinement. The rationale for adding the C2fCIB module is to enhance the model’s ability to differentiate and localize defects accurately, even when they are partially occluded or have low contrast. By combining these enhancements, YOLOv8-IDX achieves superior feature representation and defect detection performance compared to the baseline. The introduction of the C3k2 module strengthens the model’s ability to process complex, large-scale features, while the inclusion of C2fCIB improves feature utilization for large-scale object detection. These modifications not only improve the accuracy but also ensure computational efficiency, making the model well-suited for real-world defect detection applications. The overall architecture maintains a balance between lightweight processing and high precision, establishing YOLOv8-IDX as a robust and scalable solution for insulator-defect detection.

3.2.1. C3k2

The C3k2 module is a pivotal enhancement of the YOLOv8-IDX model, designed to address critical challenges in feature aggregation and representation, particularly for applications requiring fine-grained defect detection. Its introduction at the deepest level of the backbone significantly improves the model’s capability to process complex and hierarchical features. One of the key innovations of the C3k2 module is its ability to integrate multi-scale feature extraction with CSPNet-inspired feature partitioning. This ensures that features from different resolutions are effectively combined, resulting in richer and more discriminative feature maps. Such multi-scale integration is essential for capturing both the global context and fine details, which is especially beneficial for identifying small and intricate defects in transmission line insulators. The K2 component of the module adds another layer of refinement by improving spatial awareness at different scales. This enhances the ability of the model to maintain consistency in spatial information, which is crucial for detecting small anomalies that might otherwise be overlooked in traditional architectures. This capability is particularly significant in scenarios involving occlusions, low contrast, or environmental noise, in which precise spatial alignment is critical.

Additionally, the lightweight nature of the C3k2 module ensures that these enhancements do not come at the cost of increased computational overhead. By optimizing feature aggregation and partitioning, the module strikes a balance between accuracy and efficiency, thereby enabling faster inference without compromising performance. This makes the C3k2 module an ideal choice for real-time applications in resource-constrained environments. In summary, the C3k2 module plays a crucial role in enhancing the YOLOv8-IDX model by improving hierarchical feature integration, refining spatial consistency, and maintaining computational efficiency. Its design directly addresses the challenges of small and complex defect detection, making it a cornerstone of the model’s advancement over the baseline architecture.

The module leverages the CSPNet-inspired approach of feature partitioning and recombination. Let the input feature map be $X \in \mathbb{R}^{H \times W \times C}$, where $H$, $W$, and $C$ represent the height, width, and number of channels, respectively.

The feature map $X$ is split into two partitions:

$$X_{1}, X_{2} = \mathrm{Split}(X) \tag{1}$$

where $X_{1}$ and $X_{2}$ are subsets of the input features with dimensions $H \times W \times \frac{C}{2}$. Splitting $X$ into two equal partitions allows them to be processed independently. This is a key step in CSPNet-inspired partitioning, where each partition can be processed differently, enabling the network to focus on distinct patterns within the feature map. This partitioning is crucial for enhancing both feature extraction efficiency and computational efficiency.

Each partition is processed independently through a series of convolutions:

$$Y_{1} = f_{\mathrm{Conv}}(X_{1}), \quad Y_{2} = f_{\mathrm{Conv}}(X_{2}) \tag{2}$$

Each partition, $X_{1}$ and $X_{2}$, is processed through a series of convolutions denoted by $f_{\mathrm{Conv}}$. The aim here is to refine each partition by learning local patterns, which is particularly important for detecting fine details in insulator defects. The convolution operation is essentially a feature extractor that detects local textures and structures.

The multi-scale extraction incorporates $k = 2$, which applies convolutional kernels of size $3 \times 3$ and $5 \times 5$ in parallel to improve spatial awareness:

$$Y_{1}^{'} = f_{\mathrm{Conv}_{3 \times 3}}(Y_{1}), \quad Y_{2}^{'} = f_{\mathrm{Conv}_{5 \times 5}}(Y_{2}) \tag{3}$$

Using kernels of size $3 \times 3$ and $5 \times 5$ in parallel allows the network to capture both small and large spatial patterns simultaneously. The parallel convolution layers ensure that the model can learn features at multiple scales, which is important for capturing defects of varying sizes.

These processed features are concatenated and passed through additional layers for feature aggregation:

$$Z = \mathrm{Concat}(Y_{1}^{'}, Y_{2}^{'}) \tag{4}$$

Once the features from both partitions are processed, they are concatenated. This concatenation allows the model to combine information from both partitions, facilitating richer feature maps. The goal is to merge the extracted information in a way that enhances the model’s overall understanding of the input data.

$$Z^{'} = f_{\mathrm{Conv}_{1 \times 1}}(Z) \tag{5}$$

After concatenation, a 1 × 1 convolution is applied to further refine and aggregate the features. The purpose of this operation is to reduce the dimensionality of the concatenated feature map while retaining the most important information. This process reduces computational overhead and helps the model focus on the most relevant features for defect detection. This approach ensures hierarchical feature extraction and aggregation, improving the network’s capability to handle small and large-scale defect features while maintaining computational efficiency.
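As a rough illustration of Equations (1)–(5), the following sketch traces tensor shapes and convolution weight counts through the split, the parallel 3 × 3 and 5 × 5 branches, the concatenation, and the 1 × 1 fusion. It is a bookkeeping exercise in plain Python rather than the authors' implementation: the function name, the omission of the intermediate convolutions of Equation (2), the absence of biases and normalization, and the example size (H = W = 20, C = 256) are all assumptions made for illustration.

```python
def c3k2_shape_trace(height, width, channels):
    """Trace shapes and weight counts through Equations (1)-(5).

    Illustrative assumptions: 'same' padding (spatial size preserved), no
    biases or normalization layers, each branch keeps its channel count,
    and the intermediate f_Conv stage of Equation (2) is omitted.
    """
    if channels % 2 != 0:
        raise ValueError("channels must be even to split into two equal halves")
    half = channels // 2

    # Eq. (1): split along the channel axis into two H x W x C/2 partitions.
    x1_shape = (height, width, half)
    x2_shape = (height, width, half)

    # Eq. (3): one 3x3 branch and one 5x5 branch, run in parallel.
    params_3x3 = 3 * 3 * half * half
    params_5x5 = 5 * 5 * half * half

    # Eq. (4): channel-wise concatenation restores C channels.
    z_shape = (height, width, x1_shape[2] + x2_shape[2])

    # Eq. (5): a 1x1 convolution mixes the concatenated channels.
    params_1x1 = 1 * 1 * channels * channels

    return {
        "partition_shapes": (x1_shape, x2_shape),
        "z_shape": z_shape,
        "params_3x3": params_3x3,
        "params_5x5": params_5x5,
        "params_1x1": params_1x1,
        "params_total": params_3x3 + params_5x5 + params_1x1,
    }


# Hypothetical feature-map size, chosen only for illustration.
trace = c3k2_shape_trace(height=20, width=20, channels=256)
print(trace)
```

Under these assumptions, the two half-width branches together need 557,056 weights, compared with 1,638,400 for a single 5 × 5 convolution over all 256 channels, which gives a sense of why partitioning helps keep the module lightweight.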

Reduced Vanishing Gradient Problem:

By partitioning and fusing features, the module reduces the effective path length of gradients during backpropagation, expressed as:

$$\frac{\partial L}{\partial W} = \sum_{i = 1}^{N} \frac{\partial L}{\partial Z_{i}} \cdot \frac{\partial Z_{i}}{\partial W} \tag{6}$$

where $L$ is the loss, $Z_{i}$ represents intermediate layers, and $W$ represents weights. Equation (6) expresses the gradient as a sum of contributions through the intermediate features, which is how partitioning and fusing features mitigates the vanishing gradient problem. In deep networks, backpropagated gradients can diminish as they move backward through the layers, making training difficult. The shorter paths in this architecture, due to feature partitioning, help maintain gradient flow, which leads to more effective learning and faster convergence.
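The intuition behind Equation (6) can be illustrated with a toy calculation in which each gradient path contributes a product of per-layer factors, and the total gradient is the sum over paths. This is only a sketch: the per-layer factor of 0.5 and the path depths of 8 and 2 are hypothetical values, not measurements from the model.

```python
def path_gradient(per_layer_factor, depth):
    """Gradient magnitude along one path: a product of `depth` equal factors."""
    return per_layer_factor ** depth


def total_gradient(per_layer_factor, path_depths):
    """Sum the contributions of all paths, mirroring the sum in Equation (6)."""
    return sum(path_gradient(per_layer_factor, d) for d in path_depths)


# Hypothetical values: every layer scales the gradient by 0.5.
deep_only = total_gradient(0.5, [8])         # a single path through 8 layers
with_shortcut = total_gradient(0.5, [8, 2])  # same path plus a 2-layer path

print(deep_only, with_shortcut)
```

In this toy setting the short path dominates the sum, so the gradient reaching the weights stays far larger than it would through the deep path alone.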

3.2.2. C2fCIB

The C2fCIB module combines the strengths of the C2f structure with the Compact Inverted Block (CIB), resulting in a robust and efficient framework for feature extraction and processing. The integration of CIB within C2f enables the model to leverage a compact inverted bottleneck structure. This structure enhances computational efficiency by expanding the feature dimensions in the intermediate layers and refining them using depth-wise convolutions, effectively capturing spatial details while maintaining a lightweight design.

The CIB further employs adaptive feature fusion by dynamically selecting and refining relevant features while bypassing redundant information. This mechanism not only improves feature utilization but also accelerates convergence during training, making the model more effective in scenarios that require real-time performance. By retaining identity shortcuts, the CIB ensures a seamless feature flow across layers, enabling robust gradient propagation and preserving essential information for accurate defect detection. In the context of the C2fCIB module, the combination of hierarchical feature processing in C2f with the compact and adaptive design of the CIB significantly enhances the model’s capacity to detect subtle and small-scale defects. This synergy ensures a balance between high accuracy and computational efficiency, making it a pivotal innovation for improving the baseline YOLOv8 architecture for insulator-defect detection.

The C2fCIB module combines the functionality of the C2f structure with a Compact Inverted Block (CIB) to enhance feature refinement and utilization.

C2f Structure:

The C2f approach processes the input $X$ as follows. The input is first split into $n$ partitions:

$$X_{i} = \mathrm{Split}(X), \quad i = 1, 2, \dots, n \tag{7}$$

Splitting the input $X$ into $n$ partitions allows the network to process different parts of the input independently, ensuring that no information is lost during the feature extraction process. By using multiple partitions, the model can learn from various regions in the data in parallel.

Here, each partition $X_{i} \in \mathbb{R}^{H \times W \times \frac{C}{n}}$ is processed through a bottleneck layer:

$$Y_{i} = f_{\mathrm{Bottleneck}}(X_{i}) \tag{8}$$

Bottleneck layers reduce the computational cost by projecting $C$ channels into $\frac{C}{e}$, where $e$ is the bottleneck expansion ratio. This reduces computational costs without significantly sacrificing performance, especially for real-time applications.

Once the partitions are processed independently, they are concatenated:

$$Z = \mathrm{Concat}(Y_{1}, Y_{2}, \dots, Y_{n}) \tag{9}$$

The purpose of this concatenation is to bring together the features learned from each partition, allowing the model to integrate different perspectives of the input data and create a richer feature map.

$$Z^{'} = f_{\mathrm{Conv}_{1 \times 1}}(Z) \tag{10}$$

A final convolution with a 1 × 1 kernel is applied after the concatenation step. This operation helps merge the feature maps into a unified representation, which is crucial for making final predictions about the detected defects.
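The data flow of Equations (7)–(10) can be sketched on the channel vector of a single spatial location, where a 1 × 1 convolution reduces to a matrix-vector product over channels. The helper names below are hypothetical, and the doubling function is only a stand-in for the bottleneck of Equation (8), not the actual layer.

```python
def split_channels(x, n):
    """Eq. (7): split a channel vector into n equal partitions."""
    if len(x) % n != 0:
        raise ValueError("channel count must be divisible by n")
    size = len(x) // n
    return [x[i * size:(i + 1) * size] for i in range(n)]


def pointwise_conv(x, weight):
    """Eq. (10): at one location, a 1x1 conv is a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def c2f_like(x, n, branch_fn, fuse_weight):
    parts = split_channels(x, n)                # Eq. (7)
    processed = [branch_fn(p) for p in parts]   # Eq. (8), stand-in branch
    z = [v for p in processed for v in p]       # Eq. (9): concatenation
    return pointwise_conv(z, fuse_weight)       # Eq. (10)


def double(partition):
    """Hypothetical stand-in for f_Bottleneck: scales every channel by 2."""
    return [2.0 * v for v in partition]


pixel = [1.0, 2.0, 3.0, 4.0]                    # C = 4 channels at one pixel
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

kept = c2f_like(pixel, 2, double, identity)     # 1x1 conv that keeps channels
summed = c2f_like(pixel, 2, double, [[1.0, 1.0, 1.0, 1.0]])  # 4 -> 1 channel

print(kept, summed)
```

The second call shows how the 1 × 1 fusion can also change the channel count, here collapsing four channels into one.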

Compact Inverted Block (CIB):

The CIB further enhances feature refinement by employing an inverted bottleneck structure, in which the feature dimensions are expanded first before applying depth-wise convolutions:

$$X^{'} = f_{\mathrm{Expand}}(X) \tag{11}$$

where $f_{\mathrm{Expand}}$ projects $C$ channels to $C \cdot e$. Expanding the features allows the model to capture richer information while the depth-wise convolutions refine this information in a more computationally efficient manner.

Depth-wise convolution for spatial refinement is then applied:

$$Y = f_{\mathrm{DepthwiseConv}}(X^{'}) \tag{12}$$

This operation is more efficient than a traditional convolution because it operates on each channel separately, allowing the model to refine spatial features at a much lower computational cost.
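To give a sense of the savings behind Equations (11) and (12), the sketch below counts convolution weights for an expand, depth-wise, and project sequence and compares the depth-wise step with a standard convolution over the same expanded features. The formulas are the usual weight counts without biases. The example values (C = 64, e = 2, a 3 × 3 kernel) and the final projection back to C channels are assumptions for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_conv_params(channels, k):
    """Weights of a depth-wise k x k convolution: one filter per channel."""
    return k * k * channels


def pointwise_conv_params(c_in, c_out):
    """Weights of a 1x1 (pointwise) convolution."""
    return c_in * c_out


def inverted_block_params(channels, expansion, k=3):
    """Expand (Eq. 11), depth-wise refine (Eq. 12), then project back to C."""
    hidden = channels * expansion
    expand = pointwise_conv_params(channels, hidden)
    refine = depthwise_conv_params(hidden, k)
    project = pointwise_conv_params(hidden, channels)  # assumed projection
    return {"expand": expand, "depthwise": refine, "project": project,
            "total": expand + refine + project}


# Hypothetical sizes chosen only for illustration.
block = inverted_block_params(channels=64, expansion=2, k=3)
dense_on_expanded = standard_conv_params(128, 128, 3)

print(block, dense_on_expanded)
```

With these numbers, the depth-wise step uses 1152 weights where a standard 3 × 3 convolution on the same 128 expanded channels would use 147,456.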

Feature fusion then uses a pointwise convolution and a gating mechanism, denoted by $\alpha$:

$$Z = \alpha \cdot f_{\mathrm{PointwiseConv}}(Y) + (1 - \alpha) \cdot X \tag{13}$$

Here, $\alpha$ dynamically controls how much of the original input feature map and the newly learned feature map contribute to the final output. This dynamic selection helps the network focus on the most important features for accurate defect detection.

This gating mechanism can be represented as:

$$\alpha = \sigma(W_{\alpha}^{T} X + b_{\alpha}) \tag{14}$$

where $\sigma$ is a sigmoid activation, $W_{\alpha}$ are learnable weights, and $b_{\alpha}$ is the bias term. The gating mechanism works by applying a sigmoid activation function to a linear transformation of the input. The result is a value between 0 and 1, which determines the weight of each feature in the final output. This adaptability ensures that only the most relevant features are passed forward.
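A minimal sketch of Equations (13) and (14) on a single channel vector is given below. It treats the gate as one scalar per location, which is an assumption, since the text does not state whether the gate is scalar or per-channel. The weights, bias, and feature values are hypothetical, and the refined features stand in for the output of the pointwise convolution.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def gate(x, w_alpha, b_alpha):
    """Eq. (14): alpha = sigmoid(w_alpha . x + b_alpha), a value in (0, 1)."""
    return sigmoid(sum(w * v for w, v in zip(w_alpha, x)) + b_alpha)


def gated_fusion(x, refined, alpha):
    """Eq. (13): blend refined features with the original input."""
    return [alpha * r + (1.0 - alpha) * v for r, v in zip(refined, x)]


# Hypothetical values: zero weights and bias give a neutral gate of 0.5.
x = [1.0, -1.0]
refined = [3.0, 1.0]          # stand-in for f_PointwiseConv(Y)
alpha = gate(x, w_alpha=[0.0, 0.0], b_alpha=0.0)
fused = gated_fusion(x, refined, alpha)

print(alpha, fused)
```

A gate near 1 passes mostly the refined features, while a gate near 0 falls back to the identity path, which is the behavior the text describes as dynamic selection.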

Adaptive Feature Selection:

The CIB adaptively selects relevant features by minimizing redundancy, ensuring efficient feature utilization:

$$Z = \sum_{i = 1}^{n} \left( a_{i} \cdot f_{\mathrm{Conv}}(X_{i}) + \beta_{i} \cdot X_{i} \right) \tag{15}$$

where $a_{i}$ and $\beta_{i}$ are learnable parameters for feature weighting, which adjust how the features are combined. This adaptive fusion mechanism allows the model to fine-tune its feature representation, ensuring that redundancy is minimized and that only the most important features are used for defect detection.

4. Experimentation and Results

4.1. Dataset

4.1.1. IDID Dataset

Lewis et al. [45] developed the Insulator Defect Image Dataset (IDID¹), a collection of high-resolution images of insulator chains with defective components. The dataset, consisting of 1596 images, is accessible on IEEE DataPort. It is divided into three subcategories: intact insulators, broken insulator shells, and insulators damaged by flashovers, with a class balance of 34.5%, 19.5%, and 50%, respectively, for model training. Table 1 shows the dataset division for training, validation, and testing. Representative images with successful detections on the IDID dataset by YOLOv8-IDX are shown in Figure 3.

4.1.2. CPLID Dataset

Raimundo et al. [46] introduced the China Power Line Insulator Dataset (CPLID²), a publicly available resource hosted on IEEE DataPort. This dataset contains a diverse collection of images, including 600 UAV-captured images of normal insulators and 248 synthetically generated images of defective insulators. The class balance for defects and insulators is 17% and 83%, respectively, for model training. Table 1 shows the dataset division for training, validation, and testing. Defective images are created by isolating small segments of the original insulator images through the TVSeg algorithm. Affine transformations are then employed to augment the dataset, producing numerous variations of the original mask images. These augmented images are further integrated into diverse backgrounds, such as urban landscapes, rivers, fields, and mountainous regions, thereby enhancing the dataset’s versatility. Representative images with successful detections on the CPLID dataset using YOLOv8-IDX are shown in Figure 4.

4.2. Evaluation Metrics and Hyperparameters

The evaluation of insulator-defect detection performance is typically based on widely used object detection metrics, including precision (P), recall (R), F1 score, and mean Average Precision (mAP). These metrics provide a comprehensive understanding of the model’s ability to correctly identify and localize the defects. Below, we provide the equations for these metrics, which are derived from the key detection outcomes: true positives (TP), false positives (FP), and false negatives (FN).

P = \frac{TP}{TP + FP}

(16)

R = \frac{TP}{TP + FN}

(17)

F1 = \frac{2 \times P \times R}{P + R}

(18)

\mathrm{mAP}@0.5 = \frac{1}{C} \sum_{i=1}^{C} AP_{i}

(19)

In these equations, TP refers to correctly identified defects, FP represents incorrectly identified or false detections, and FN indicates the actual defects that were not detected. Precision measures the accuracy of the model predictions, while recall evaluates the ability of the model to detect all relevant defects in the image. The F1 score combines precision and recall into a single value (their harmonic mean), and the mAP provides a holistic measure by averaging the per-class AP over all C classes.
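As a minimal sketch of Equations (16) to (19), the following Python snippet computes the four metrics from detection counts. The counts and per-class AP values are hypothetical, chosen only for illustration, and the function names are not taken from the authors' code.

```python
from typing import Sequence


def precision(tp: int, fp: int) -> float:
    """Eq. (16): fraction of predicted defects that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Eq. (17): fraction of actual defects that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Eq. (18): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_average_precision(ap_per_class: Sequence[float]) -> float:
    """Eq. (19): mean of the per-class AP values over C classes."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts and AP values (not results from the paper).
tp, fp, fn = 90, 10, 30
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
map50 = mean_average_precision([0.9, 0.8, 0.7])
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f} mAP@0.5={map50:.3f}")
```

The zero-denominator guards are a design choice for the sketch; evaluation toolkits may handle empty prediction sets differently.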

The YOLOv8-IDX model was trained in a robust computational environment provided by Google Colab, utilizing an NVIDIA A100 GPU. This powerful GPU offers 40 GB of dedicated VRAM, allowing for the efficient handling of high-resolution images and large batch sizes during training. The system’s approximately 83.5 GB of RAM further supports the computational requirements, ensuring the smooth execution of data preprocessing, augmentation, and model training tasks. This environment enabled optimal performance and accelerated training of the YOLOv8 model, ensuring high precision and accuracy in detecting and classifying objects. The hyperparameters for training are listed in Table 2.
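For readers who want to reproduce a comparable setup, the sketch below maps the Table 2 hyperparameters onto training keyword arguments. It assumes the Ultralytics Python package is used for training, which the text does not state explicitly; the file names `yolov8-idx.yaml` and `idid.yaml` are hypothetical placeholders for a model definition and a dataset description.

```python
# Values copied from Table 2. The keyword names follow the Ultralytics
# training arguments, which is an assumption about the tooling used.
TRAIN_ARGS = {
    "imgsz": 640,            # Input Size
    "batch": 16,             # Batch Size
    "epochs": 100,           # Epochs
    "optimizer": "SGD",      # Optimizer
    "lr0": 0.001,            # Learning Rate (initial)
    "weight_decay": 0.0005,  # Weight Decay
    "momentum": 0.937,       # Momentum (for SGD)
}


def train(model_cfg: str, data_cfg: str):
    """Launch training; both file names are hypothetical placeholders."""
    # Imported lazily because ultralytics is an optional third-party package.
    from ultralytics import YOLO

    model = YOLO(model_cfg)
    return model.train(data=data_cfg, **TRAIN_ARGS)


if __name__ == "__main__":
    print(TRAIN_ARGS)
    # train("yolov8-idx.yaml", "idid.yaml")  # uncomment once both files exist
```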

4.3. YOLOv5 and YOLOv8 Experiments

This section presents a detailed analysis of the YOLOv8-IDX model’s performance compared to other mainstream models, specifically focusing on two datasets, IDID and CPLID, with training times of 46 min and 10 min, respectively. The performance metrics include precision (P), recall (R), mean average precision at an IoU threshold of 0.5 (mAP@0.5), mean average precision at an IoU threshold of 0.95 (mAP@0.95), and F1 score. These metrics provide a comprehensive understanding of the model’s capabilities in terms of both detection accuracy and robustness. Table 3 shows the results of the comparison between YOLOv8-IDX and the other models.
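To make the role of the IoU threshold concrete, the following sketch computes the intersection over union of two boxes and checks whether a prediction would count as a true positive at a loose and at a strict threshold. The boxes are hypothetical and use corner coordinates (x1, y1, x2, y2); this is an illustration, not the evaluation code used in the study.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corner format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float) -> bool:
    """A prediction matches a ground-truth box if IoU >= threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical boxes: the prediction is shifted 10 px to the right.
truth = (0.0, 0.0, 100.0, 100.0)
pred = (10.0, 0.0, 110.0, 100.0)

print(f"IoU = {iou(pred, truth):.3f}")
print("TP at 0.5: ", is_true_positive(pred, truth, 0.5))
print("TP at 0.95:", is_true_positive(pred, truth, 0.95))
```

A 10% horizontal shift already fails the strict threshold, which is why scores at high IoU thresholds are sensitive to boundary accuracy.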

On the IDID dataset, YOLOv8-IDX outperformed all other models across all evaluation metrics, demonstrating its superior capability for defect detection and classification.

YOLOv8-IDX achieved a precision score of 0.941, surpassing the baseline YOLOv8 model’s precision score of 0.914. This improvement highlights the enhanced ability of YOLOv8-IDX to minimize false positives, which is critical in real-world defect detection scenarios. Furthermore, its recall score of 0.925 marked a substantial improvement compared with YOLOv8’s 0.880, indicating that YOLOv8-IDX was better at identifying all relevant instances in the dataset. This high recall minimizes the likelihood of missed detections, which is especially crucial for safety-critical applications. The mAP@0.5 score of 0.951 showcases YOLOv8-IDX’s robust performance in identifying defects with a high degree of confidence. This result not only outperformed YOLOv8 (0.904) but also models like YOLOv8+Transformer (0.939), reflecting the efficacy of the modifications introduced in YOLOv8-IDX’s architecture. Additionally, the mAP@0.95 score of 0.654 demonstrates its ability to maintain accuracy even under stringent IoU thresholds, surpassing the baseline YOLOv8 score of 0.624 and YOLOv7 score of 0.553. This capability is indicative of the model’s precision in predicting object boundaries, which is a critical aspect of defect localization. The F1 score of 0.932, the highest among all models, underscores YOLOv8-IDX’s well-rounded performance by achieving a superior balance between precision and recall. It outperformed YOLOv8+Ghost (0.902) and YOLOv8+Transformer (0.923), further affirming its capability to excel in complex and diverse detection scenarios. The results demonstrate that the integration of innovative components, such as the C3k2 module in the backbone and C2fCIB in the neck, significantly enhances the detection performance. The consistent improvements across all metrics confirm the robustness and reliability of YOLOv8-IDX, making it a promising model for defect detection in challenging environments.

On the CPLID dataset, YOLOv8-IDX continued to demonstrate its superior performance, achieving the best scores on all evaluation metrics. This dataset, characterized by its complexity and variability, highlights the robustness and adaptability of YOLOv8-IDX. The model achieved a precision score of 0.968, which is a notable improvement over YOLOv8 (0.935) and YOLOv8+Ghost (0.951). This high precision underscores YOLOv8-IDX’s effectiveness in reducing false positives, which is critical in industrial applications where accuracy is paramount. Additionally, the recall score of 0.977 was the highest among all evaluated models, significantly outperforming YOLOv8 (0.944) and YOLOv7 (0.905). This exceptional recall demonstrates YOLOv8-IDX’s ability to detect nearly all relevant instances, ensuring comprehensive coverage in defect identification tasks. The mAP@0.5 score of 0.990 highlights YOLOv8-IDX’s ability to maintain high detection accuracy across object classes. This score surpassed those of YOLOv8+Ghost (0.962) and YOLOv8+GhostBottleneck (0.942), reflecting its superiority in identifying objects with varying levels of complexity. Furthermore, the mAP@0.95 score of 0.833 represents a significant improvement over YOLOv8 (0.776) and YOLOv7 (0.740), demonstrating its precise localization capabilities even under stricter IoU thresholds. The F1 score of 0.972, the highest on the CPLID dataset, reflects YOLOv8-IDX’s balanced and comprehensive performance. By surpassing YOLOv8 (0.939) and YOLOv8+GhostBottleneck (0.930), YOLOv8-IDX has proven to be the most effective model for detection tasks requiring both high precision and recall. These results validate YOLOv8-IDX’s ability to generalize effectively across datasets with different characteristics. Its advanced architectural enhancements ensure both accuracy and reliability, making it a robust solution for challenging defect detection scenarios. The consistent performance improvements over the baseline YOLOv8 and its enhanced variants highlight the impact of the design innovations introduced in YOLOv8-IDX.

Overall, YOLOv8-IDX demonstrates state-of-the-art performance on both the IDID and CPLID datasets. The integration of novel architectural components has significantly enhanced its detection capabilities, making it highly effective for real-world applications in defect detection and classification. These results affirm that YOLOv8-IDX sets a new benchmark for object detection tasks in this domain.
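As a quick arithmetic check, the sketch below recomputes the mAP@0.5 gains over the baseline YOLOv8 from the Table 3 values. It shows that the 4.7% and 3.6% figures quoted in the abstract correspond to absolute differences in percentage points; the helper name is illustrative only.

```python
# (baseline YOLOv8, YOLOv8-IDX) mAP@0.5 values taken from Table 3.
MAP50 = {
    "IDID": (0.904, 0.951),
    "CPLID": (0.954, 0.990),
}


def gain_points(baseline: float, improved: float) -> float:
    """Absolute gain in percentage points, rounded to one decimal."""
    return round((improved - baseline) * 100, 1)


for dataset, (baseline, improved) in MAP50.items():
    print(f"{dataset}: +{gain_points(baseline, improved)} points mAP@0.5")
```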

4.4. YOLOv8-IDX Experiments and Ablation Study

To further understand the impact of different architectural choices, an ablation study was conducted on YOLOv8-IDX. This study evaluated the contributions of specific components, including the backbone layers (C3k2) and head layers (C2fCIB), to the model’s overall performance. Metrics such as precision (P), recall (R), mAP@0.5, mAP@0.95, and F1 score were analyzed for both the IDID and CPLID datasets, and the experiment results are shown in Table 4.

The backbone of YOLOv8-IDX employs multiple configurations of C3k2 layers at different scales (i.e., 256, 512, and 1024). The ablation study revealed that incorporating all three scales of C3k2 layers significantly enhanced the model’s feature extraction capability. For instance, on the IDID dataset, the inclusion of all backbone layers resulted in a mAP@0.5 score of 0.951, compared to 0.934 when only a single scale of C3k2 was used. This underscores the importance of a multi-scale backbone in capturing the diverse features of defects.

Similarly, for the CPLID dataset, using the full backbone configuration improved the precision from 0.962 to 0.968, while mAP@0.95 changed from 0.844 to 0.833 (a slight decrease). These results highlight the role of a comprehensive backbone in achieving strong performance on diverse datasets.

The C2fCIB layers in the neck/head of YOLOv8-IDX were pivotal in refining the model’s detection capabilities. The ablation study demonstrated that removing or simplifying the C2fCIB layers led to a significant drop in performance. For example, on the IDID dataset, the mAP@0.95 score decreased from 0.654 to 0.625 when the head layers were reduced. A similar trend was observed for the CPLID dataset, where the F1 score decreased from 0.972 to 0.967 under the same conditions.

These results highlight that C2fCIB layers play a crucial role in improving both localization precision and detection robustness. By combining compact inverted blocks with advanced feature fusion techniques, the head layers effectively integrate information from different scales, enhancing the model’s overall performance. The synergy between the backbone (C3k2) and head (C2fCIB) layers was also investigated. The results indicate that the simultaneous use of a full-scale backbone and optimized head layer yielded the best results. On the IDID dataset, this configuration achieved the highest mAP@0.5 of 0.951 and F1 score of 0.932, whereas, on the CPLID dataset, the same configuration resulted in a precision of 0.968 and mAP@0.95 of 0.833.

This finding suggests that the interaction between the backbone and head layers is vital for maximizing the detection accuracy and robustness of YOLOv8-IDX. The multi-scale backbone provides rich features, while the C2fCIB-based head efficiently processes these features to deliver precise detections. Overall, the ablation study highlights the importance of each architectural component of YOLOv8-IDX. The combination of C3k2 layers in the backbone and C2fCIB layers in the head was instrumental in achieving a superior performance. These results provide valuable insights into the design principles that underlie YOLOv8-IDX’s success and set a benchmark for future research in object detection.

The F1-confidence curve shown in Figure 5 provides critical insights into the performance of YOLOv8-IDX on the IDID dataset. The F1 score, which represents the harmonic mean of precision and recall, is a key metric for evaluating the balance between false positives and false negatives in defect detection. The curve illustrates how the F1 score varies with different confidence thresholds for the three categories: “Pollution-Flashover,” “Broken,” and “Insulator.” For “Pollution-Flashover,” the F1 score starts low but steadily increases, peaking at a moderate confidence threshold and declining sharply at higher thresholds. The “Broken” class exhibits a more consistent trend, achieving a higher F1 score across most confidence levels. The “Insulator” class demonstrates the most robust performance, with an F1 score consistently close to its peak. The overall F1 score for all classes reaches a maximum value of 0.93 at a confidence threshold of 0.388, as highlighted by the bold curve. This optimal threshold strikes the best balance between precision and recall for the IDID dataset. These results highlight the ability of YOLOv8-IDX to effectively detect and classify insulator defects, making it a reliable model for power transmission line inspections. Further optimization could target improving the F1 score for individual defect categories to enhance the overall robustness. Figure 6 provides the graphical results of YOLOv8-IDX on the IDID dataset.

The F1-confidence curve provides an insightful evaluation of the YOLOv8-IDX performance on the CPLID dataset, as shown in Figure 7. This curve relates the F1 score to various confidence thresholds, illustrating the trade-offs between precision and recall across the entire range of predictions. The thick blue line represents the overall performance across all classes, while individual lines show the performance for specific classes: “defect” and “insulator.” The overall F1 score peaks at 0.97 when the confidence threshold is set to 0.693. This indicates that the model achieves the best balance of precision and recall at this threshold. The model maintains a high F1 score (above 0.8) across a wide range of confidence levels, showcasing its robustness and reliability in predictions. When examining individual classes, the “defect” class demonstrates a sharper decline in the F1 score as confidence increases. This suggests that predictions for defects are more sensitive to changes in the confidence threshold. In contrast, the “insulator” class exhibits a smoother decline, indicating that the model makes consistent predictions for this category over a broad range of confidence levels. Overall, the performance of YOLOv8-IDX on the CPLID dataset is excellent, and the model is both precise and robust. Its high F1 score across classes highlights its suitability for this task, making it a reliable choice for detecting and categorizing defects and insulators in this domain. Figure 8 provides the graphical results of YOLOv8-IDX for the CPLID dataset.
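The threshold selection behind an F1-confidence curve can be sketched as a simple sweep: for each confidence threshold, keep only detections at or above it, count TP, FP, and FN, and record the F1 score. The detections below are hypothetical toy data, and the helper names are not from the authors' code; the sketch only illustrates how a peak such as the ones reported for Figures 5 and 7 is located.

```python
from typing import List, Tuple

Detection = Tuple[float, bool]  # (confidence, matches a ground-truth object)


def f1_at_threshold(dets: List[Detection], n_truth: int, thr: float) -> float:
    """F1 when only detections with confidence >= thr are kept."""
    kept = [ok for conf, ok in dets if conf >= thr]
    tp = sum(kept)
    fp = len(kept) - tp
    fn = n_truth - tp
    denom = 2 * tp + fp + fn  # F1 = 2TP / (2TP + FP + FN), same as Eq. (18)
    return 2 * tp / denom if denom else 0.0


def best_threshold(dets: List[Detection], n_truth: int, steps: int = 100):
    """Return (best F1, threshold); ties go to the higher threshold."""
    thresholds = [i / steps for i in range(steps + 1)]
    return max((f1_at_threshold(dets, n_truth, t), t) for t in thresholds)


# Hypothetical detections for an image set with 5 ground-truth objects.
dets = [(0.95, True), (0.90, True), (0.80, True), (0.60, False),
        (0.55, True), (0.30, False), (0.20, False)]
best_f1, best_thr = best_threshold(dets, n_truth=5)
print(f"best F1 = {best_f1:.2f} at confidence threshold {best_thr:.2f}")
```

Low thresholds admit false positives and high thresholds discard true positives, so the F1 score typically rises and then falls, as in the curves described above.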

Figure 6 and Figure 8 illustrate how measures such as train loss, val loss, precision, recall, mAP@0.5, and mAP@0.5:0.95 vary as the number of epochs rises. Both the training and validation losses drop over time, suggesting that the model is learning steadily. The model’s improved detection capability is demonstrated by the increasing trends in precision, recall, mAP@0.5, and mAP@0.5:0.95. These outcomes demonstrate the effectiveness of YOLOv8-IDX in detecting insulator defects in a variety of situations.

Real-World Deployment

We successfully deployed YOLOv8-IDX in a real-world UAV-based inspection of transmission line insulators, demonstrating its effectiveness in detecting defects under diverse environmental conditions. The original imagery used in this study to test the model in real-world settings was provided by a large-scale Chinese power transmission company’s footage in the southeastern region of China. The model performed well even against small insulator defects, complex backgrounds, and challenging lighting conditions, such as cloudy weather and low illumination, as shown in Figure 9. Its ability to distinguish defects despite these variations highlights its robustness and adaptability in practical scenarios, which is superior to that of previous work. Moreover, during testing, YOLOv8-IDX achieved an average inference speed of 16.2 ms per image, ensuring real-time processing capabilities, which is crucial for UAV-based automated inspection. These results confirm the model’s potential for efficient and accurate defect detection in field deployments.
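The reported latency can be converted into throughput with simple arithmetic, as sketched below. The 30 frames-per-second camera rate is a hypothetical value used only to illustrate a real-time budget, and the sketch assumes the 16.2 ms figure covers the whole per-image pipeline, which the text does not specify.

```python
LATENCY_MS = 16.2  # average per-image inference time reported above
CAMERA_FPS = 30    # hypothetical UAV video frame rate (not from the paper)


def frames_per_second(latency_ms: float) -> float:
    """Throughput implied by a fixed per-image latency."""
    return 1000.0 / latency_ms


fps = frames_per_second(LATENCY_MS)
budget_ms = 1000.0 / CAMERA_FPS   # time available per frame at the camera rate
keeps_up = LATENCY_MS <= budget_ms

print(f"throughput: {fps:.1f} images/s")
print(f"per-frame budget at {CAMERA_FPS} FPS: {budget_ms:.1f} ms")
print(f"keeps up with the camera: {keeps_up}")
```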

5. Discussion

The YOLOv8-IDX model represents a significant advancement in insulator-defect detection, demonstrating enhanced performance compared to the baseline YOLOv8 architecture. Key modifications, including the introduction of the C3k2 and C2fCIB modules, have proven effective in improving feature aggregation, spatial awareness, and computational efficiency, making them well-suited for real-world applications. The C3k2 module, with its ability to capture hierarchical and spatially consistent features, is particularly valuable for detecting small and intricate defects that might otherwise be overlooked under challenging conditions. Meanwhile, the C2fCIB module improves feature utilization, enhancing detection accuracy, particularly in scenarios with overlapping objects or low contrast.

However, despite these advancements, a few aspects merit further exploration. The performance of the current model is primarily based on datasets collected under controlled conditions, namely the IDID and CPLID datasets. While these datasets offer valuable insights, they do not fully encompass the challenges posed by complex environmental factors, such as extreme weather conditions (e.g., fog and rain), varying levels of image quality, and various types of insulators (e.g., color or shape differences). Because these conditions are not represented in the datasets, our model is limited to detecting damage in insulators that are similar in color and shape to those in the publicly available datasets used in this study. These factors can potentially impact the model’s performance in practical, real-world scenarios, suggesting the need for future work to diversify datasets with more varied environmental conditions. To expand the existing dataset or collect a new dataset for future model training, we recommend using a good-quality drone camera, stabilizing the drone to avoid blurred images, avoiding glare in the collected images, and flying the drone around 2–4 m from the power line so that images can be collected without digital zoom and retain better pixel quality. Dataset collection should be conducted on a day with good visibility, and images of the damaged insulators should be captured from as many angles as possible in order to train the model on all possible scenarios of broken insulators or flashovers.

In terms of deployment, while YOLOv8-IDX currently operates on pre-recorded images and videos, real-time applications in a drone-based system would require further development. Specifically, future work should focus on integrating the model with drone controllers for direct video input, which would enhance the reliability and operational efficiency. Currently, commercially available drones do not have open transmission channels for modifications, limiting the ability to implement real-time inference on the fly.

Additionally, although the model was designed with efficiency in mind, optimization for lightweight devices, such as drones with limited computational resources, remains an area for improvement. Future research could explore reducing model complexity without sacrificing detection accuracy, thereby enabling more efficient deployment on resource-constrained platforms.

Overall, YOLOv8-IDX provides a robust solution for insulator-defect detection; however, its practical deployment in diverse real-world environments and systems still requires addressing these limitations. Expanding datasets, enhancing model efficiency, and refining deployment strategies will further improve the model’s robustness and applicability in various contexts.

6. Conclusions

This research presents YOLOv8-IDX, a specialized deep learning model for detecting insulator defects in transmission lines. By integrating the C3k2 module in the backbone for enhanced feature extraction and the C2fCIB module in the neck for improved contextual understanding, YOLOv8-IDX demonstrated significant improvements in accuracy and reliability compared with the baseline models. The model was evaluated using two benchmark datasets: the IDID dataset and the CPLID dataset. On the IDID dataset, YOLOv8-IDX achieved precision, recall, and mean Average Precision (mAP) scores of 0.941, 0.925, and 0.951, respectively. Similarly, the model attained precision, recall, and mAP scores of 0.968, 0.977, and 0.990 on the CPLID dataset. These results highlight the model’s robustness in identifying diverse defect types across varying environmental conditions.

The superior performance of YOLOv8-IDX can be attributed to its ability to extract high-quality features and maintain contextual integrity, making it particularly suitable for the challenges of insulator-defect detection. The proposed model has significant potential for real-world applications, including automated inspection systems for power transmission networks, minimizing manual inspection costs, and enhancing grid reliability. Future research will focus on extending YOLOv8-IDX to support multi-class defect detection, real-time UAV-based deployment for large-scale inspections, and training the model on a new dataset collected according to the points discussed in the discussion section.

Author Contributions

Conceptualization, U.F. and F.Y.; methodology, U.F.; software, U.F.; validation, U.F., F.Y., M.S. and U.A.; formal analysis, U.F.; investigation, U.F.; resources, U.F. and M.S.; data curation, U.F. and M.S.; writing—original draft preparation, U.F., U.A. and M.S.; writing—review and editing, U.F., M.S., U.A. and Z.L.; visualization, U.F., M.S. and U.A.; supervision, F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the science and technology project of China Southern Power Grid Corporation, “Multispectral Imaging and State Intelligent Detection System for Power Transmission and Transformation Equipment Based on High Performance edge computing”, project number 090000KK52220019.

Data Availability Statement

The public datasets that are used for YOLOv8-IDX model training and validation can be found on the links given below. IDID dataset: https://ieee-dataport.org/competitions/insulator-defect-detection. CPLID dataset: https://dx.doi.org/10.21227/qtxb-2s61.

Acknowledgments

The author sincerely appreciates the kind supervision of Fan Yang and the valuable suggestions provided by him after the review.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Stefenon, S.F.; Corso, M.P.; Nied, A.; Perez, F.L.; Yow, K.-C.; Gonzalez, G.V.; Leithardt, V.R.Q. Classification of insulators using neural network based on computer vision. IET Gener. Transm. Distrib. 2022, 16, 1096–1107.
2. Stefenon, S.F.; Singh, G.; Souza, B.J.; Freire, R.Z.; Yow, K.-C. Optimized hybrid YOLOu-Quasi-ProtoPNet for insulators classification. IET Gener. Transm. Distrib. 2023, 17, 3501–3511.
3. Liu, Y.; Liu, D.; Huang, X.; Li, C. Insulator defect detection with deep learning: A survey. IET Gener. Transm. Distrib. 2023, 17, 3541–3558.
4. Fortes, M.Z.; Ferreira, V.H.; Zanghi, R. Fault Diagnosis in Transmission Lines: Trends and Main Research Areas. IEEE Lat. Am. Trans. 2015, 13, 3324–3332.
5. Yu, J.; Liu, K.; He, M.; Qin, L. Insulator defect detection: A detection method of target search and cascade recognition. Energy Rep. 2021, 7, 750–759.
6. Hao, Y.; Liang, W.; Yang, L.; He, J.; Wu, J. Methods of image recognition of overhead power line insulators and ice types based on deep weakly-supervised and transfer learning. IET Gener. Transm. Distrib. 2022, 16, 2140–2153.
7. Peng, S.; Ding, L.; Li, W.; Sun, W.; Li, Q. Research on intelligent recognition method for self-blast state of glass insulator based on mixed data augmentation. High Volt. 2023, 8, 668–681.
8. Ahmed, M.D.F.; Mohanta, J.C.; Sanyal, A. Inspection and identification of transmission line insulator breakdown based on deep learning using aerial images. Electr. Power Syst. Res. 2022, 211, 108199.
9. Yang, Z.; Xu, Z.; Wang, Y. Bidirection-Fusion-YOLOv3: An Improved Method for Insulator Defect Detection Using UAV Image. IEEE Trans. Instrum. Meas. 2022, 71, 1–8.
10. Hu, J.; He, J.; Guo, C. End-to-End Powerline Detection Based on Images from UAVs. Remote Sens. 2023, 15, 1570.
11. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-Based Convolutional Networks for Accurate Object Detection and Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 142–158.
12. Lu, X.; Jiang, C.; Ma, Z.; Li, H.; Liu, Y. A Simple and Effective Surface Defect Detection Method of Power Line Insulators for Difficult Small Objects. Comput. Mater. Contin. 2024, 79, 373–390.
13. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
14. Li, F.; Zhang, H.; Liu, S.; Guo, J.; Ni, L.M.; Zhang, L. DN-DETR: Accelerate DETR Training by Introducing Query DeNoising. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 13609–13617.
15. Liu, S.; Li, F.; Zhang, H.; Yang, X.; Qi, X.; Su, H.; Zhu, J.; Zhang, L. DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR. arXiv 2022, arXiv:2201.12329.
16. Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs Beat YOLOs on Real-time Object Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974.
17. Li, Y.; Ni, M.; Lu, Y. Insulator defect detection for power grid based on light correction enhancement and YOLOv5 model. Energy Rep. 2022, 8, 807–814.
18. Souza, B.J.; Stefenon, S.F.; Singh, G.; Freire, R.Z. Hybrid-YOLO for classification of insulators defects in transmission lines based on UAV. Int. J. Electr. Power Energy Syst. 2023, 148, 108982.
19. Dai, Z. Uncertainty-aware accurate insulator fault detection based on an improved YOLOX model. Energy Rep. 2022, 8, 12809–12821.
20. Li, Z.; Zhang, Y.; Wu, H.; Suzuki, S.; Namiki, A.; Wang, W. Design and Application of a UAV Autonomous Inspection System for High-Voltage Power Transmission Lines. Remote Sens. 2023, 15, 865.
21. Guo, L.; Liao, Y.; Yao, H.; Chen, J.; Wang, M. An Electrical Insulator Defects Detection Method Combined Human Receptive Field Model. J. Control Sci. Eng. 2018, 2018, 2371825.
22. Zuo, D.; Hu, H.; Qian, R.; Liu, Z. An insulator defect detection algorithm based on computer vision. In Proceedings of the 2017 IEEE International Conference on Information and Automation (ICIA), Macau, China, 18–20 July 2017; pp. 361–365.
23. Xing, Z.; Chen, X. Lightweight algorithm of insulator identification applicable to electric power engineering. Energy Rep. 2022, 8, 353–362.
24. Yang, L.; Fan, J.; Song, S.; Liu, Y. A light defect detection algorithm of power insulators from aerial images for power inspection. Neural Comput. Appl. 2022, 34, 17951–17961.
25. Tao, X.; Zhang, D.; Wang, Z.; Liu, X.; Zhang, H.; Xu, D. Detection of Power Line Insulator Defects Using Aerial Images Analyzed With Convolutional Neural Networks. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 1486–1498.
26. Kang, G.; Gao, S.; Yu, L.; Zhang, D. Deep Architecture for High-Speed Railway Insulator Surface Defect Detection: Denoising Autoencoder With Multitask Learning. IEEE Trans. Instrum. Meas. 2018, 68, 2679–2690.
27. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
28. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
29. Redmon, J.; Farhadi, A.J.A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
30. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
31. Ren, S.; He, K.; Girshick, R.B.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 39, 1137–1149.
32. Howard, A.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
33. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
34. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
35. Feng, Z.; Guo, L.; Huang, D.; Li, R. Electrical Insulator Defects Detection Method Based on YOLOv5. In Proceedings of the 2021 IEEE 10th Data Driven Control and Learning Systems Conference (DDCLS), Suzhou, China, 14–16 May 2021; pp. 979–984.
36. Gao, J.; Chen, X.; Lin, D. Insulator Defect Detection Based on improved YOLOv5. In Proceedings of the 2021 5th Asian Conference on Artificial Intelligence Technology (ACAIT), Haikou, China, 29–31 October 2021; pp. 53–58.
37. Wang, T.; Zhai, Y.; Li, Y.; Wang, W.; Ye, G.; Jin, S. Insulator Defect Detection Based on ML-YOLOv5 Algorithm. Sensors 2023, 24, 204.
38. Zhao, W.; Xu, M.; Cheng, X.; Zhao, Z. An Insulator in Transmission Lines Recognition and Fault Detection Model Based on Improved Faster RCNN. IEEE Trans. Instrum. Meas. 2021, 70, 1–8.
39. Zhong, J.; Liu, Z.; Yang, C.; Wang, H.; Gao, S.; Nunez, A. Adversarial Reconstruction Based on Tighter Oriented Localization for Catenary Insulator Defect Detection in High-Speed Railways. IEEE Trans. Intell. Transp. Syst. 2020, 23, 1109–1120.
40. Li, X.; Su, H.; Liu, G. Insulator Defect Recognition Based on Global Detection and Local Segmentation. IEEE Access 2020, 8, 59934–59946.
41. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015, Proceedings, Part III 18; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
42. Han, G.; Zhang, M.; Wu, W.; He, M.; Liu, K.; Qin, L.; Liu, X. Improved U-Net based insulator image segmentation method based on attention mechanism. Energy Rep. 2021, 7, 210–217.
43. Antwi-Bekoe, E.; Liu, G.; Ainam, J.-P.; Sun, G.; Xie, X. A deep learning approach for insulator instance segmentation and defect detection. Neural Comput. Appl. 2022, 34, 7253–7269.
44. Xuan, Z.; Ding, J.; Mao, J. Intelligent Identification Method of Insulator Defects Based on CenterMask. IEEE Access 2022, 10, 59772–59781.
45. Lewis, D.; Kulkarni, P. Insulator Defect Detection. 2021. Available online: https://ieee-dataport.org/competitions/insulator-defect-detection (accessed on 15 January 2024).
46. Raimundo, A. Insulator Data Set—Chinese Power Line Insulator Dataset (CPLID). 2020. Available online: https://ieee-dataport.org/open-access/insulator-data-set-chinese-power-line-insulator-dataset-cplid (accessed on 15 January 2024).

Figure 1. Basic Architecture of YOLOv8.

Figure 2. Architecture of YOLOv8-IDX.

Figure 3. Successful detections on the IDID dataset by YOLOv8-IDX.

Figure 4. Successful detections on the CPLID dataset by YOLOv8-IDX.

Figure 5. F1-confidence curve of YOLOv8-IDX over IDID dataset.

Figure 6. Precision, recall, and mAP as YOLOv8-IDX training progresses over 100 epochs on the IDID dataset.

Figure 7. F1-confidence curve of YOLOv8-IDX over CPLID dataset.

Figure 8. Precision, recall, and mAP as YOLOv8-IDX training progresses over 100 epochs on the CPLID dataset.

Figure 9. Real-world deployment of YOLOv8-IDX.

Table 1. Division of datasets used for model training.

Dataset	Total	Training	Validation	Test
IDID ¹	1596	1296	144	160
CPLID ²	848	592	166	87

IDID ¹: https://ieee-dataport.org/competitions/insulator-defect-detection, accessed on 22 April 2025; CPLID ²: https://dx.doi.org/10.21227/qtxb-2s61.
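
As a quick arithmetic check on Table 1, the sketch below sums the three splits for each dataset and computes each split's share. The dictionary layout and function name are illustrative and not taken from the paper. With the values as printed, the splits sum to 1600 for IDID and 845 for CPLID, which differs slightly from the stated totals of 1596 and 848; the source does not explain the difference.

```python
# Split sizes copied from Table 1; the dictionary layout and names are illustrative.
SPLITS = {
    "IDID": {"total": 1596, "train": 1296, "val": 144, "test": 160},
    "CPLID": {"total": 848, "train": 592, "val": 166, "test": 87},
}


def split_summary(name):
    """Return (sum of splits, stated total minus sum, share of each split)."""
    row = SPLITS[name]
    parts = {key: row[key] for key in ("train", "val", "test")}
    summed = sum(parts.values())
    # Shares are computed against the summed splits, not the stated total.
    shares = {key: round(value / summed, 3) for key, value in parts.items()}
    return summed, row["total"] - summed, shares


for dataset in SPLITS:
    summed, gap, shares = split_summary(dataset)
    print(dataset, summed, gap, shares)
```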

Table 2. Training hyperparameters.

Parameters	Values
Input Size	640
Batch Size	16
Epochs	100
Optimizer	SGD
Learning Rate	0.001
Weight Decay	0.0005
Momentum (for SGD)	0.937
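
The settings in Table 2 could be passed to a training run roughly as sketched below. This assumes the Ultralytics YOLOv8 Python package, which the paper does not explicitly name; the argument names (`imgsz`, `batch`, `epochs`, `optimizer`, `lr0`, `weight_decay`, `momentum`) are that package's names for the listed parameters, and the model and dataset file paths are hypothetical placeholders.

```python
# Table 2 values keyed by Ultralytics train() argument names.
# Assumption: the paper does not state which framework or argument names were used.
TRAIN_ARGS = {
    "imgsz": 640,            # Input Size
    "batch": 16,             # Batch Size
    "epochs": 100,           # Epochs
    "optimizer": "SGD",      # Optimizer
    "lr0": 0.001,            # Learning Rate (initial)
    "weight_decay": 0.0005,  # Weight Decay
    "momentum": 0.937,       # Momentum (for SGD)
}


def train(model_yaml, data_yaml):
    """Train with the Table 2 settings; both paths are hypothetical placeholders."""
    # Imported lazily so the configuration can be inspected without the package installed.
    from ultralytics import YOLO

    model = YOLO(model_yaml)
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

A call would look like `train("yolov8-idx.yaml", "idid.yaml")`, where both file names stand in for a model definition and a dataset description that are not part of the source.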

Table 3. Results comparison of YOLOv8-IDX with other models.

Model	IDID P	IDID R	IDID mAP@0.5	IDID F1	IDID mAP@0.95	CPLID P	CPLID R	CPLID mAP@0.5	CPLID F1	CPLID mAP@0.95
YOLOv3	0.776	0.750	0.786	0.762	0.485	0.794	0.805	0.817	0.799	0.651
YOLOv4	0.800	0.784	0.810	0.791	0.470	0.831	0.838	0.854	0.834	0.679
YOLOv5	0.826	0.717	0.766	0.767	0.495	0.857	0.883	0.921	0.869	0.708
YOLOv7	0.881	0.850	0.892	0.865	0.553	0.902	0.905	0.876	0.903	0.740
YOLOv8	0.914	0.880	0.904	0.896	0.624	0.935	0.944	0.954	0.939	0.776
YOLOv8-Ghost	0.924	0.882	0.920	0.902	0.629	0.951	0.900	0.962	0.924	0.804
YOLOv8+Transformer	0.927	0.919	0.939	0.923	0.635	0.947	0.908	0.921	0.927	0.766
YOLOv8+Bottleneck	0.914	0.886	0.912	0.899	0.625	0.926	0.910	0.925	0.917	0.762
YOLOv8+GhostBottleneck	0.917	0.922	0.935	0.919	0.644	0.928	0.933	0.942	0.930	0.787
YOLOv8-IDX	0.941	0.925	0.951	0.932	0.654	0.968	0.977	0.990	0.972	0.833
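
The F1 and mAP figures in Table 3 can be cross-checked with a few lines of arithmetic. The sketch below assumes F1 is the harmonic mean of the tabulated precision and recall, F1 = 2PR/(P + R); the paper may instead read F1 off the F1-confidence curve. Under that assumption the CPLID value for YOLOv8-IDX is reproduced (0.972), and the IDID value comes out at 0.933 against the reported 0.932, a gap that could come from rounding of P and R. The mAP@0.5 gains over the YOLOv8 baseline come out at 4.7 and 3.6 percentage points, matching the abstract. Function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (P, R, mAP@0.5) copied from the YOLOv8 and YOLOv8-IDX rows of Table 3.
TABLE3 = {
    "IDID": {"YOLOv8": (0.914, 0.880, 0.904), "YOLOv8-IDX": (0.941, 0.925, 0.951)},
    "CPLID": {"YOLOv8": (0.935, 0.944, 0.954), "YOLOv8-IDX": (0.968, 0.977, 0.990)},
}


def map_gain_points(dataset):
    """mAP@0.5 gain of YOLOv8-IDX over YOLOv8, in percentage points."""
    base = TABLE3[dataset]["YOLOv8"][2]
    ours = TABLE3[dataset]["YOLOv8-IDX"][2]
    return round((ours - base) * 100, 1)


for dataset, rows in TABLE3.items():
    p, r, _ = rows["YOLOv8-IDX"]
    print(dataset, round(f1_score(p, r), 3), map_gain_points(dataset))
```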

Table 4. Ablation experiments for YOLOv8-IDX.

Backbone 256	Backbone 512	Backbone 1024	Head 256	Head 512	Head 1024	IDID P	IDID R	IDID mAP@0.5	IDID F1	IDID mAP@0.95	CPLID P	CPLID R	CPLID mAP@0.5	CPLID F1	CPLID mAP@0.95
-	C3k2	-	-	-	C2fCIB	0.901	0.928	0.934	0.914	0.648	0.962	0.980	0.987	0.970	0.844
-	-	C3k2	-	C2fCIB	-	0.920	0.926	0.941	0.922	0.657	0.946	0.965	0.978	0.955	0.802
-	C3k2	-	-	C2fCIB	-	0.920	0.924	0.937	0.921	0.652	0.957	0.974	0.986	0.965	0.791
C3k2	-	-	C2fCIB	-	-	0.925	0.907	0.936	0.915	0.646	0.970	0.976	0.987	0.972	0.821
C3k2	C3k2	C3k2	C2fCIB	C2fCIB	C2fCIB	0.934	0.860	0.911	0.895	0.607	0.961	0.976	0.985	0.968	0.771
C3k2	C3k2	C3k2	-	-	-	0.914	0.886	0.912	0.899	0.625	0.964	0.971	0.984	0.967	0.774
-	-	C3k2	-	-	C2fCIB	0.941	0.925	0.951	0.932	0.654	0.968	0.977	0.990	0.972	0.833

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

