Deep Learning-Based Visual Analytics for Efficiency and Safety Optimization in Power Infrastructure

Afanaseva, Olga Vladimirovna; Tulyakov, Timur Faritovich; Shaimardanov, Artur Airatovich

doi:10.3390/eng7030135

Open Access Article

by Olga Vladimirovna Afanaseva ¹,*, Timur Faritovich Tulyakov ¹ and Artur Airatovich Shaimardanov ²

¹ Department of System Analysis and Control, Empress Catherine II Saint Petersburg Mining University, 199106 Saint Petersburg, Russia

² Integral Data, Hong Kong, China

* Author to whom correspondence should be addressed.

Eng 2026, 7(3), 135; https://doi.org/10.3390/eng7030135

Submission received: 16 February 2026 / Revised: 10 March 2026 / Accepted: 13 March 2026 / Published: 15 March 2026

(This article belongs to the Section Electrical and Electronic Engineering)

Abstract

The paper presents a comprehensive deep learning-based framework for automated visual inspection of overhead power line infrastructure using unmanned aerial vehicles. Traditional manual and helicopter inspections are costly, time-consuming, and hazardous for maintenance personnel. The proposed approach integrates UAV imaging with advanced computer vision models such as YOLOv8, EfficientDet-D2, and Faster R-CNN to automatically detect defects in critical components, including insulators, conductors, and transmission towers. Several open datasets (InsPLAD, TTPLA, MPID) were used for training and validation, ensuring robustness under diverse lighting and environmental conditions. Experimental results demonstrate that YOLOv8 achieved the best performance, reaching 88.5% mAP@0.5 with real-time inference capabilities (over 50 FPS on GPU). The system significantly enhances inspection efficiency, allowing for a threefold increase in coverage capacity and an up to 70% reduction in defect remediation time. The integration of AI-powered visual analytics with maintenance and SCADA systems enables a shift from reactive to predictive maintenance, improving the safety, reliability, and resilience of power transmission infrastructure.

Keywords: deep learning; UAV inspection; computer vision; YOLOv8; power line defects; predictive maintenance; visual analytics; power infrastructure; automated inspection; smart grid

1. Introduction

Aging power transmission infrastructure demands regular inspections to prevent failures, but traditional manual methods are costly, slow, and hazardous [1]. Inspectors must climb towers and work near high-voltage lines, risking life and limb [2]. Routine patrols via foot or helicopter are labor-intensive and expensive, yet still prone to human error. Automated drone-based inspection offers a safer alternative: unmanned aerial vehicles (UAVs) can capture high-resolution imagery of overhead power lines (OPLs) without exposing workers to danger [3]. Moreover, replacing manual review with computer vision (CV) analysis dramatically improves efficiency [4]. Recent studies report that using AI-driven drones can cut inspection labor and operational costs by roughly 30–50% compared to legacy methods [5]. This efficiency gain enables more frequent surveys and higher data quality, paving the way for predictive maintenance strategies instead of reactive repairs. By collecting detailed visual data on a routine basis, utilities can detect early signs of component degradation and fix issues before they escalate into outages or safety hazards.

The convergence of drones and deep learning thus represents a paradigm shift in grid asset management. High-quality aerial images and videos collected by UAVs serve as the primary data source for automated inspection [6]. Modern computer vision algorithms can sift through these massive image troves to pinpoint defects such as cracked insulators, corroded fittings, or frayed conductors. This approach effectively decouples the inspection process into two stages: data collection (via UAVs covering large spans quickly) and data analysis (via AI models). The result is a scalable pipeline where human experts are augmented—or even replaced—by AI for the tedious task of defect identification [7]. By leveraging deep learning, the inspection process becomes safer (since humans remain on the ground), faster (drones can cover more line length per hour than climbing crews), and more consistent (AI applies the same criteria to every image, reducing missed detections).

To provide a visual overview of the proposed approach, Figure 1 illustrates the conceptual architecture of the AI-assisted UAV inspection system. The framework integrates drone-based data acquisition, deep learning-driven defect detection, and automated feedback to SCADA and maintenance platforms, creating a closed-loop process for predictive maintenance of power transmission infrastructure.

As shown in Figure 1, the proposed concept forms an integrated visual analytics pipeline that links aerial data acquisition with automated defect recognition and maintenance decision-making. By establishing a continuous feedback loop between UAV imaging, deep learning analysis, and SCADA-based maintenance management, the framework enables proactive rather than reactive grid maintenance. This transition toward predictive monitoring significantly enhances both operational safety and inspection efficiency.

The reliability of power transmission systems critically depends on the timely detection of defects in key overhead power line (OPL) components. Failures such as broken insulators or loose fittings can lead to severe consequences, including large-scale outages and even wildfires [8,9]. However, traditional inspection approaches face significant scalability challenges: transmission networks often span thousands of kilometers across remote and difficult terrain [10,11]. Many utilities still rely on periodic visual inspections performed by linemen or helicopter crews, which are not only hazardous but also prone to missing subtle early-warning signs due to human fatigue or limited viewing angles.

A data-driven, automated monitoring framework can effectively overcome these limitations [12]. Drones equipped with RGB and thermal cameras can operate beyond visual line of sight (BVLOS), capturing high-resolution imagery of towers, conductors, insulators, and auxiliary components from multiple perspectives. These images can then be analyzed by AI algorithms in near-real time to identify and localize anomalies. The result is a dramatic increase in inspection throughput: a recent case study reported a threefold improvement in coverage capacity when combining UAV and AI-based analysis compared with manual inspection [13]. Moreover, integrating automated defect detection into maintenance workflows has been shown to reduce average repair times by up to 70% [14], as faults are detected earlier and corrective actions can be initiated automatically.

The present study leverages the convergence of drone-based imaging and state-of-the-art computer vision to enable scalable monitoring of power transmission infrastructure [15,16]. We propose a deep learning pipeline that processes UAV imagery and generates structured defect reports. The proposed approach emphasizes the computer vision aspects—from data handling and annotation, through model training (using advanced detectors such as YOLOv8, EfficientDet, and Faster R-CNN), to visualization and performance evaluation [17,18]. By demonstrating robust automated detection of structural defects in overhead power line components, the system aims to enhance grid reliability and reduce dependence on hazardous manual inspections [19,20]. Moreover, the developed framework can be integrated with existing utility asset management platforms—for instance, linking detected defects with supervisory control and data acquisition (SCADA) or maintenance dispatch systems—thereby closing the loop from detection to corrective action [21,22]. Ultimately, such integration fosters the transition toward predictive maintenance, where data analytics anticipate failures and support a self-healing smart grid [23].

The main objective of this research is to develop and validate a deep learning-based defect detection pipeline for overhead power line infrastructure [24,25]. The study focuses on visual diagnostics of key OPL components—including insulators, conductors, towers, and fittings—and their associated defects such as cracks [26,27], corrosion, and missing elements, using high-resolution drone imagery as the primary data source [28]. The proposed framework [29,30] encompasses the complete workflow: compiling and annotating a comprehensive multi-source dataset, training state-of-the-art object detection models, and rigorously evaluating their performance in identifying defects under realistic operational conditions.

The remainder of this paper is structured as follows: Section 2 outlines the methodology, describing the datasets used, annotation and preprocessing procedures, model architectures, training configurations, and evaluation metrics. Section 3 presents the experimental results, comparing model performance through both quantitative indicators and qualitative detection examples. Section 4 discusses the implications of the findings, technical challenges—including small-object detection, illumination variability, and edge deployment—and integration prospects with broader smart grid systems. Finally, Section 5 summarizes the main contributions and provides an outlook for future research toward fully autonomous power grid inspection.

2. Materials and Methods

To provide a clearer understanding of the proposed research workflow, this section describes the overall methodology adopted for automated power line inspection using deep learning. The developed system consists of a complete visual analytics pipeline—from image acquisition to maintenance integration—designed to optimize both efficiency and safety in monitoring high-voltage infrastructure.

The process begins with unmanned aerial vehicles capturing high-resolution imagery of overhead power lines. The collected data then undergo several preprocessing and annotation stages before being analyzed by deep neural networks trained to recognize and classify different types of components and defects. The resulting detections are subsequently aggregated into structured reports that can be linked to maintenance management or supervisory control systems.

To illustrate this workflow, Figure 2 presents the overall architecture of the proposed system and highlights how each stage is connected within the end-to-end inspection process.

As shown in Figure 2, the system can be divided into five main functional modules:

UAV Image Capture—drones equipped with RGB or thermal cameras collect detailed imagery of towers, conductors, and insulators under various environmental conditions.
Data Preprocessing—raw images are filtered, resized, and annotated to ensure consistency across datasets and facilitate training.
Deep Learning Model—modern detectors such as YOLOv8, EfficientDet-D2, and Faster R-CNN analyze the visual data to identify both components and potential defects.
Defect Detection and Visualization—the models produce bounding boxes and class labels indicating the type and severity of each anomaly.
SCADA/Maintenance Integration—detected issues are transmitted to the utility’s asset management system, where maintenance teams can prioritize and schedule repairs.

To further clarify the interaction between these modules, the inspection workflow operates as a sequential data-processing pipeline. During inspection flights, UAVs capture high-resolution images together with geospatial metadata along the transmission corridor. The collected imagery is transmitted either to an onboard edge-computing module or to a ground processing station where preprocessing operations are applied, including image normalization, resizing, and annotation alignment. The preprocessed images are then analyzed by trained deep learning models that perform automated component and defect detection. The resulting predictions (bounding boxes, class labels, and confidence scores) are aggregated and linked to the geographic locations of the inspected infrastructure. Finally, detected anomalies are transferred to maintenance management or SCADA systems, allowing operators to prioritize inspection findings and initiate corrective actions.
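The aggregation step at the end of this workflow can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the `Detection` and `ImageRecord` types, their field names, and the 0.5 confidence cut-off are assumptions introduced here for clarity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str                                   # e.g. "insulator-broken"
    confidence: float                            # detector score in [0, 1]
    box_xyxy: Tuple[float, float, float, float]  # pixel coordinates


@dataclass
class ImageRecord:
    image_id: str
    latitude: float                              # from UAV geospatial metadata
    longitude: float
    detections: List[Detection]


def build_report(records, min_confidence=0.5):
    """Flatten per-image detections into geo-tagged findings, most confident first."""
    findings = []
    for rec in records:
        for det in rec.detections:
            if det.confidence < min_confidence:
                continue  # hypothetical cut-off; tune per deployment
            findings.append({
                "image_id": rec.image_id,
                "latitude": rec.latitude,
                "longitude": rec.longitude,
                "label": det.label,
                "confidence": det.confidence,
                "box_xyxy": list(det.box_xyxy),
            })
    findings.sort(key=lambda f: f["confidence"], reverse=True)
    return findings
```

A list of such findings is the kind of structured record that could then be serialized and passed to a maintenance or SCADA interface.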

This modular architecture ensures flexibility and scalability: each block can be independently updated or replaced as new models and data sources become available. Section 2.1 details the datasets used for training and evaluating the proposed system.

2.1. Datasets for Power Line Inspection

A major hurdle in automated power line inspection has been the lack of large, public datasets reflecting real-world conditions. We leverage several recently released open datasets to train and validate our models, ensuring coverage of diverse OPL components and defect types:

InsPLAD (Inspection of Power Line Assets Dataset): a comprehensive benchmark introduced in 2023, consisting of 10,607 high-resolution UAV images of actual power lines. InsPLAD covers 17 unique power line components (e.g., insulators, dampers, conductors, towers, fittings) and includes images of components both in healthy condition and with six defect types. The defects encompass common issues like metal corrosion, broken parts, and even bird nests on structures. Notably, the inclusion of non-structural anomalies (e.g., bird nests) highlights that inspection must consider environmental factors, not just engineering failures. InsPLAD was specifically designed for three tasks: object detection (localizing components, evaluated by Average Precision), defect classification [31,32] (determining defect type vs. normal, evaluated by balanced accuracy), and anomaly detection (distinguishing defective vs. normal via AUROC). The dataset poses significant real-world vision challenges, including multi-scale objects, occlusions, cluttered backgrounds, varied viewpoints, and lighting differences [33,34]. These characteristics make InsPLAD well suited for training robust models that generalize to field conditions. We utilize InsPLAD’s detection annotations (bounding boxes around each asset) and defect labels in our pipeline.

TTPLA (Transmission Towers and Power Lines Aerial dataset): an aerial image dataset introduced for detecting and segmenting transmission towers (TTs) and power lines. TTPLA contains 1100 images (3840 × 2160 resolution) with 8987 labeled instances of towers and lines, captured from various view angles. It supports object detection and instance segmentation tasks, providing pixel-wise annotations for lines and towers. In addition, TTPLA includes classification of each tower’s structure type (e.g., steel lattice vs. concrete or wood pole), which is relevant for vulnerability analysis of different tower materials [35]. We mainly use TTPLA to augment our training data for tower detection, ensuring that our models learn to recognize towers of different types and from different perspectives. The diverse backgrounds and altitudes in TTPLA help models handle the wide variation in tower appearance in drone imagery.

To illustrate the use of TTPLA data in training, Figure 3 shows an example of YOLOv8 learning to detect transmission towers and power lines in defect-free aerial imagery. The figure highlights how the model localizes structural elements such as towers and conductors under varying perspectives and backgrounds.

MPID (Merged Public Insulator Dataset): a specialized dataset focusing on insulators, compiled by merging multiple public image sets. MPID was introduced in 2025 to address the need for a diverse, high-quality dataset of insulators for UAV-based inspection [36,37]. It comprises 4807 images with 7850 annotated insulators of three types: glass, porcelain, and composite. By integrating images from different regions, times, and conditions, MPID captures a wide range of scenarios, including various lighting (morning, noon, dusk), weather (clear, cloudy), and vantage points (distant views, full-frame shots, close-ups). This diversity makes MPID valuable for training models that are robust to scale and appearance variance in insulators [38]. Each insulator instance in MPID is labeled with a tight bounding box and a type label (material). Some images contain insulators with anomalies (defects) [39,40] and others contain only intact insulators, enabling defect detection research.

VSB Power Line Fault Detection (Kaggle): unlike the above image datasets, VSB is an open dataset of electrical sensor signals rather than images. Included for completeness, this Kaggle competition dataset contains high-frequency voltage waveforms (800,000 samples over 20 ms per instance) from power lines, labeled for partial discharge events (insulation faults) [41,42]. While outside the scope of purely visual inspection, we note that integrating non-visual data like VSB’s electrical signals can complement vision-based systems. In practice, thermal images and corona discharge sensors are also used in power line inspection [43]. However, in this paper, we focus on visual analytics, using VSB only to emphasize the multimodal nature of modern grid monitoring. Our defect detection pipeline is designed to eventually incorporate such data (see Section 4), though our experiments center on image-based detection.
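As a quick check of the scale of the VSB signals, the sampling rate implied by 800,000 samples over 20 ms can be worked out directly. The snippet is only an arithmetic sketch. The 50 Hz grid frequency is an assumption added here and is not stated in the description above.

```python
SAMPLES_PER_SIGNAL = 800_000   # from the dataset description
WINDOW_MS = 20                 # duration of one recorded signal, in milliseconds
GRID_FREQUENCY_HZ = 50         # assumption: a 50 Hz grid

# Integer arithmetic avoids floating-point rounding in the rate.
sampling_rate_hz = SAMPLES_PER_SIGNAL * 1000 // WINDOW_MS
sample_period_ns = 1_000_000_000 / sampling_rate_hz
grid_cycles_per_signal = GRID_FREQUENCY_HZ * WINDOW_MS / 1000

print(f"{sampling_rate_hz / 1e6:.0f} MHz, {sample_period_ns:.0f} ns per sample, "
      f"{grid_cycles_per_signal:.0f} grid cycle(s) per signal")
```

Under these assumptions each signal is sampled at 40 MHz and spans a single grid cycle, which gives a sense of how different this modality is from image data.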

To further demonstrate the defect-oriented training process, Figure 4 presents an example of YOLOv8 learning to detect missing insulator caps. The image illustrates how the model adapts to localized structural anomalies and distinguishes defective components [44,45] from normal ones within the same scene.

Similarly, Figure 5 shows a training example focused on identifying corrosion on metallic fittings and conductor joints. This type of defect represents one of the most frequent visual anomalies in overhead power line components and serves as a key validation case for the model’s robustness under varied lighting and texture conditions.

We also draw on a few specialized datasets [46] for additional training examples of specific components or defects. IDID (Insulator Defect Image Dataset) is a collection of high-quality images focusing on suspension insulator defects, including flashover burns and broken shells. We use IDID to enrich our insulator defect examples. Similarly, the Chinese Power Line Insulator Dataset (CPLID) provides thermal and visible images of insulators with missing caps and other faults [47], which we employ for data augmentation in defect training. For tower detection, in addition to TTPLA, we use some images from a pylon component dataset that labels tower plates, bolts, etc. [48,49], to help detect small sub-components (e.g., tower ID signs). All these sources contribute to a robust aggregated dataset for our deep learning models.

In this study, several complementary public datasets were combined to ensure sufficient coverage of different power line components and defect types. The InsPLAD dataset served as the primary source of training and evaluation data due to its large number of annotated UAV images and diverse defect categories. The TTPLA dataset was mainly used to improve the detection of transmission towers and power lines in aerial imagery with high spatial resolution (3840 × 2160). The MPID dataset contributed additional annotated examples of insulators of different materials (glass, porcelain, and composite), improving model robustness to variations in appearance and scale. Finally, smaller specialized datasets such as IDID and CPLID were incorporated primarily for data augmentation purposes to increase the representation of rare defect types such as broken shells or missing caps.

For the training pipeline, images from these datasets were merged into a unified dataset with a consistent annotation format. The aggregated dataset was then split into training (70%), validation (15%), and testing (15%) subsets, while ensuring that images from the same UAV flight or depicting the same infrastructure elements were assigned exclusively to a single subset to prevent data leakage. This strategy allowed the models to learn from diverse visual conditions while maintaining a reliable evaluation protocol.
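A leakage-free split of this kind can be sketched by assigning whole groups, rather than individual images, to each subset. The example below is illustrative only: it assumes each image carries a group identifier (a flight ID or an asset ID), and the greedy fill strategy and the function name `grouped_split` are choices made here, not details taken from the study.

```python
import random
from collections import defaultdict


def grouped_split(image_to_group, ratios=(0.70, 0.15, 0.15), seed=0):
    """Split images into train/val/test so that no group spans two subsets."""
    groups = defaultdict(list)
    for image, group in image_to_group.items():
        groups[group].append(image)

    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)      # reproducible shuffle

    total = len(image_to_group)
    targets = [r * total for r in ratios]       # desired image count per subset
    names = ["train", "val", "test"]
    splits = {name: [] for name in names}

    idx = 0
    for gid in group_ids:
        # Move on once the current subset has reached its target size;
        # the last subset receives whatever remains.
        while idx < len(names) - 1 and len(splits[names[idx]]) >= targets[idx]:
            idx += 1
        splits[names[idx]].extend(groups[gid])
    return splits
```

Because groups are kept intact, the resulting proportions only approximate 70/15/15 when group sizes are uneven. That is usually an acceptable price for a reliable evaluation protocol.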

2.2. Data Annotation and Preprocessing

Accurate and consistent annotations are critical to achieving high model performance. All images were labeled with tight bounding boxes around each target object—either a structural component or a detected defect—following established best practices. Each bounding box was carefully adjusted to fully enclose the object of interest while minimizing the inclusion of background pixels to preserve signal quality [50]. Partially visible components (for instance, an insulator appearing only halfway within the frame) were annotated whenever the object could be reliably identified, thereby teaching the model to recognize objects even under partial visibility. Occluded components were also labeled: when one object obstructed another, the hidden instance was annotated according to its estimated full extent.

For defect labeling, an instance-level annotation strategy was applied. When a defect such as corrosion or cracking was present, the affected area was either marked as a separate region or assigned a defect-class tag within the corresponding component box, depending on model requirements [51,52]. This dual annotation strategy ensured both localization precision and semantic clarity [53]. Throughout the dataset, annotation consistency was strictly maintained by adhering to a unified set of labeling rules and definitions.

To ensure uniformity and reproducibility across experiments, all annotations were standardized using the YOLO format for object detection training. In this representation, each image is accompanied by a corresponding text file that contains one line per annotated object, following the structure: <class_id> <x_center> <y_center> <width> <height>—where all coordinates are normalized relative to the image dimensions [54,55]. The YOLO format was selected due to its lightweight structure, efficient parsing, and compatibility with real-time detection frameworks [56]. Although it does not inherently support rotated bounding boxes or polygonal regions, this limitation was negligible for our task, as most defects are naturally aligned with their parent components.
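The normalization used by the YOLO label format can be made concrete with a small conversion helper. This is a sketch under the assumption that source annotations are pixel-coordinate corner boxes (`x_min, y_min, x_max, y_max`). The function names are hypothetical.

```python
def to_yolo_line(class_id, box_xyxy, img_w, img_h):
    """Convert a pixel-space corner box to a normalized YOLO label line."""
    x_min, y_min, x_max, y_max = box_xyxy
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def from_yolo_line(line, img_w, img_h):
    """Parse a YOLO label line back to (class_id, pixel-space corner box)."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = float(xc), float(yc), float(w), float(h)
    box = (
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )
    return int(class_id), box
```

For a 1000 × 800 image, a box with corners (100, 200) and (300, 400) of class 0 becomes the line `0 0.200000 0.375000 0.200000 0.250000`.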

For models requiring COCO-style annotations, such as Faster R-CNN and EfficientDet, the data were automatically converted into JSON format, preserving bounding box coordinates and category labels [57,58]. To maintain consistency between datasets and models, a unified labeling scheme was used—for instance, class 0 = insulator, class 1 = tower, and so forth.
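The conversion to COCO-style JSON mainly involves de-normalizing the coordinates and switching from a center-based box to a top-left-based `[x, y, width, height]` box. The sketch below assumes YOLO labels have already been parsed into tuples. The function name and the input layout are hypothetical, and only the two class names given above are included.

```python
import json

CLASS_NAMES = {0: "insulator", 1: "tower"}  # remaining classes omitted here


def yolo_to_coco(images):
    """images: list of dicts with file_name, width, height, and labels,
    where labels is a list of (class_id, x_center, y_center, w, h) tuples
    in normalized YOLO coordinates."""
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for i, n in CLASS_NAMES.items()],
    }
    ann_id = 1
    for img_id, img in enumerate(images, start=1):
        w_px, h_px = img["width"], img["height"]
        coco["images"].append({"id": img_id, "file_name": img["file_name"],
                               "width": w_px, "height": h_px})
        for class_id, xc, yc, w, h in img["labels"]:
            box_w, box_h = w * w_px, h * h_px
            x_min = xc * w_px - box_w / 2
            y_min = yc * h_px - box_h / 2
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": class_id,
                # COCO boxes are [top-left x, top-left y, width, height] in pixels
                "bbox": [round(x_min, 2), round(y_min, 2),
                         round(box_w, 2), round(box_h, 2)],
                "area": round(box_w * box_h, 2),
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


example = yolo_to_coco([{"file_name": "tower_001.jpg", "width": 1000, "height": 800,
                         "labels": [(0, 0.2, 0.375, 0.2, 0.25)]}])
coco_json = json.dumps(example, indent=2)
```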

A structured label ontology was defined to encompass all relevant component categories (insulator, conductor, spacer damper, pole/tower, arcing horn, etc.) and defect types (corrosion, crack, broken part, bird nest) [59,60]. Because certain defects are specific to particular components (for example, “broken strand” applies only to conductors), two labeling strategies were considered: treating defects as separate classes or as attributes of the corresponding components [61]. In the YOLOv8 configuration, we adopted the former approach, defining each defective component as a distinct class (e.g., “insulator-broken” vs. “insulator-normal”), thereby enabling one-stage detection of both the object type and its condition within a single inference pass.
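The "defect as a separate class" strategy amounts to enumerating only the valid component and condition pairs and giving each pair its own class index. The mapping below is a hypothetical excerpt for illustration. The actual ontology used in the study is not reproduced here.

```python
# Hypothetical excerpt: which conditions are meaningful for which component.
VALID_CONDITIONS = {
    "insulator": ["normal", "broken", "corrosion"],
    "conductor": ["normal", "broken-strand"],
    "tower": ["normal", "corrosion", "bird-nest"],
}


def build_class_map(valid_conditions):
    """Return {"component-condition": class_id} for every valid pair."""
    names = [f"{component}-{condition}"
             for component, conditions in valid_conditions.items()
             for condition in conditions]
    return {name: class_id for class_id, name in enumerate(names)}


def split_label(class_name):
    """Recover (component, condition); assumes component names contain no hyphen."""
    component, _, condition = class_name.partition("-")
    return component, condition


CLASS_MAP = build_class_map(VALID_CONDITIONS)
```

Enumerating only valid pairs keeps the class count small (eight classes in this excerpt) and avoids meaningless combinations such as a broken strand on an insulator.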

Prior to model training, all images underwent a standardized preprocessing pipeline to ensure consistency and optimal model performance [62,63]. The primary step involved resizing each image [64,65] to a uniform resolution of 640 × 640 pixels, which corresponds to the default input size for YOLOv8 small models and is widely adopted across contemporary object detection frameworks [66]. Resizing was performed with aspect ratio preservation using letterboxing to prevent geometric distortion; any additional padding introduced during this process was filled with a neutral gray color to minimize visual bias.
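The geometry of letterboxing can be sketched without any imaging library: the image is scaled by a single factor so that its longer side fits the target, and the remainder is padded symmetrically. The helper names below are hypothetical, and the code computes only sizes and offsets, not the pixel operations themselves.

```python
def letterbox_params(img_w, img_h, target=640):
    """Return (scale, new_w, new_h, pad_x, pad_y) for an aspect-preserving resize."""
    scale = min(target / img_w, target / img_h)   # one factor for both axes
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)
    pad_x = (target - new_w) / 2                  # gray padding on left and right
    pad_y = (target - new_h) / 2                  # gray padding on top and bottom
    return scale, new_w, new_h, pad_x, pad_y


def map_box_to_letterbox(box_xyxy, scale, pad_x, pad_y):
    """Map a pixel-space corner box from the original image to the letterboxed one."""
    x1, y1, x2, y2 = box_xyxy
    return (x1 * scale + pad_x, y1 * scale + pad_y,
            x2 * scale + pad_x, y2 * scale + pad_y)
```

For a 3840 × 2160 frame, this yields a 640 × 360 image with 140 pixels of padding above and below. Annotations must be shifted by the same offsets so that boxes stay aligned with the resized content.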

The selected resolution of 640 pixels represents a practical balance between spatial detail and computational efficiency. It provides sufficient granularity to detect small-scale anomalies, such as fine cracks or missing fittings, while maintaining real-time inference capability on embedded edge devices. For comparison, EfficientDet-D2 models were trained using their recommended input resolution of approximately 768 × 768 pixels, which was occasionally adjusted to align with the YOLOv8 configuration for fairness across experiments. In contrast, Faster R-CNN models were trained with images scaled to 1024 pixels on the longer side, leveraging multi-scale training to enhance robustness against varying object sizes and viewing distances.

We also performed image augmentation to improve model robustness. Augmentations included random horizontal flips (simulating viewing the line from opposite sides), random rotations up to 5–10° (to mimic slight camera tilt variations) [67,68], and photometric tweaks: adjusting brightness and contrast, slight blurring or sharpening, and adding random noise. For instance, we randomly applied brightness/contrast changes in the range of ±20% to account for different lighting conditions (bright noon sun vs. overcast). We also used color jitter and histogram equalization occasionally to handle glare or shadowy images. One effective augmentation was mosaic blending (for YOLOv8), which combines four images into one during training—this exposes the model to varied contexts and more objects per image, aiding small-object detection. We set mosaic probability to 0.5 in YOLO’s training hyperparameters. Additionally, copy-paste augmentation was employed for rare defect classes, e.g., copying a cropped image of a bird nest and pasting it onto another tower image, then labeling it, to artificially increase bird nest examples. This technique helps in cases where certain defects are underrepresented but not too context-dependent.
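Two of these augmentations, the horizontal flip and the ±20% brightness/contrast change, are simple enough to sketch in plain Python. The example assumes YOLO-format labels and pixel intensities already scaled to [0, 1]. The function names and the exact brightness/contrast formulas are illustrative choices, not the ones built into any particular framework.

```python
import random


def hflip_yolo(labels):
    """Mirror normalized YOLO boxes horizontally; only x_center changes."""
    return [(c, 1.0 - x, y, w, h) for c, x, y, w, h in labels]


def adjust_brightness_contrast(pixels, brightness=0.0, contrast=0.0):
    """Apply fractional brightness and contrast changes to intensities in [0, 1]."""
    out = []
    for p in pixels:
        v = p * (1.0 + brightness)                # scale overall intensity
        v = (v - 0.5) * (1.0 + contrast) + 0.5    # stretch around mid-gray
        out.append(min(1.0, max(0.0, v)))         # clamp to the valid range
    return out


def random_augment(labels, pixels, rng, flip_prob=0.5, max_delta=0.2):
    """Flip with probability flip_prob and jitter brightness/contrast within ±max_delta.
    Only the labels are mirrored here; a real pipeline must mirror the image too."""
    if rng.random() < flip_prob:
        labels = hflip_yolo(labels)
    pixels = adjust_brightness_contrast(
        pixels,
        brightness=rng.uniform(-max_delta, max_delta),
        contrast=rng.uniform(-max_delta, max_delta),
    )
    return labels, pixels
```

Note that geometric augmentations must always be applied to the image and its labels together, whereas photometric ones leave the labels unchanged.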

All augmentation was done on the fly during training using the frameworks’ built-in capabilities. For example, the Ultralytics YOLOv8 library applies random scaling, translation, flipping, and color augmentations by default each epoch [69]. We did not apply excessive augmentation that could create unrealistic images—each augmentation type was used in moderation (e.g., flip or rotate at most 50% of the time, moderate noise only). By augmenting, we effectively increased the variety of training data without needing to collect thousands more real images. Finally, we normalized pixel values to the range [0, 1] and, for some models, subtracted the ImageNet mean and applied standard deviation normalization if required by the backbone (this was the case for EfficientDet’s EfficientNet backbone). No manual image cropping was done except to remove excessive sky or ground in some images—most images were already focused on the power line structures.
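The two normalization steps mentioned here can be written out for a single RGB pixel. The mean and standard deviation are the standard ImageNet channel statistics. The function names are hypothetical, and a real pipeline would apply the same arithmetic to whole tensors.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # per-channel mean (R, G, B)
IMAGENET_STD = (0.229, 0.224, 0.225)   # per-channel standard deviation


def scale_to_unit(rgb_uint8):
    """Map 8-bit channel values to the range [0, 1]."""
    return tuple(c / 255.0 for c in rgb_uint8)


def standardize(rgb_unit, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Subtract the channel mean and divide by the channel standard deviation."""
    return tuple((c - m) / s for c, m, s in zip(rgb_unit, mean, std))


white = standardize(scale_to_unit((255, 255, 255)))
black = standardize(scale_to_unit((0, 0, 0)))
```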

2.3. Deep Learning Models for Defect Detection

We implemented and compared three modern CNN-based detection architectures, each representing a different trade-off between speed and accuracy: YOLOv8, EfficientDet-D2, and Faster R-CNN. All three models aim to automatically localize and classify defects or defective components in input images, but they differ in design philosophy (one-stage vs. two-stage, anchor-based vs. anchor-free, etc.) [70,71]. We describe each model and any customizations made for our task:

YOLOv8, a 2023 release in the “You Only Look Once” family, was chosen for its excellent real-time performance and ease of deployment. YOLOv8 is an anchor-free, one-stage detector that predicts object bounding boxes and class probabilities in one forward pass. It builds on YOLOv5’s general architecture but introduces several improvements. Notably, YOLOv8 replaces the C3 modules of the CSP-Darknet backbone with the new C2f module (a Cross-Stage Partial bottleneck with two convolutions). This change makes the network lighter and enhances feature fusion, yielding a richer gradient flow during training. The result is improved accuracy and faster inference compared to YOLOv5. YOLOv8 also includes an extended PAN-FPN (Path Aggregation Network with Feature Pyramid Network) neck for multi-scale feature aggregation, plus an anchor-free detection head (meaning it directly predicts box coordinates without predefined anchor boxes) [72]. We used the YOLOv8s variant (small model) for most experiments, which has roughly 11 million parameters and can run efficiently on GPUs and even Jetson-class devices. Its CSP-based backbone, combined with multi-scale feature aggregation, supports small-object detection, which is crucial for spotting tiny defects like missing cotter pins or hairline cracks. High inference speed is a key advantage: YOLO models are known to exceed 30 FPS on modest hardware. This aligns with the need for real-time processing on drones or edge computers [73,74]. Additionally, YOLOv8’s training pipeline is highly optimized and supports built-in augmentations such as mosaic; the automatic anchor-size learning of earlier YOLO versions is no longer needed because v8 is anchor-free. We fine-tuned YOLOv8 to output multiple classes: both the component type and defect type were encoded in the class label, as discussed in Section 2.2. In some runs, we trained a segmentation-enabled YOLOv8 (YOLOv8-seg) to get pixel masks for certain defects (like rust patches), but our primary results focus on the bounding box outputs.
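The idea behind anchor-free prediction can be illustrated with a simplified decoding step: instead of adjusting a predefined anchor box, the head predicts the distances from a grid-cell center to the four sides of the box. The sketch below is a deliberately reduced version of that idea. It omits the distribution-based regression used in practice, and the function name is hypothetical.

```python
def decode_anchor_free(cell_x, cell_y, ltrb, stride):
    """Decode one prediction into a pixel-space corner box.

    cell_x, cell_y: integer grid-cell indices on a feature map
    ltrb: predicted distances (left, top, right, bottom) in grid units
    stride: downsampling factor of that feature map (e.g. 8, 16, or 32)
    """
    center_x = (cell_x + 0.5) * stride   # cell center in image pixels
    center_y = (cell_y + 0.5) * stride
    left, top, right, bottom = (d * stride for d in ltrb)
    return (center_x - left, center_y - top, center_x + right, center_y + bottom)


# A cell at column 10, row 5 on the stride-8 map, predicting an asymmetric box.
box = decode_anchor_free(10, 5, (1.0, 2.0, 3.0, 4.0), stride=8)
```

Because no anchor shapes are fixed in advance, the same head can fit very thin objects such as conductors and compact ones such as fittings without anchor tuning.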

EfficientDet [75] is a family of detectors known for their compound scaling approach: the model depth, width, and input resolution are scaled in a balanced way to generate models D0 through D7 with increasing capacity. We selected EfficientDet-D2 as a representative mid-sized model (~8 M parameters) that often achieves high accuracy at a moderate speed. EfficientDet uses an EfficientNet backbone (which itself was optimized for parameter efficiency) and introduces a BiFPN (Bi-directional Feature Pyramid Network) for multi-scale feature fusion. The BiFPN iteratively refines feature maps from different levels with weighted connections, allowing information to flow top-down and bottom-up efficiently. This architecture is well-suited for our task since defects can appear at different scales. For example, a corrosion spot might only be ~20 pixels on a large insulator image (small object), whereas a bird nest on a tower crossarm could span hundreds of pixels (large object). EfficientDet’s multi-scale design helps detect both. We initialized EfficientDet-D2 with pre-trained weights (trained on COCO) and then fine-tuned on our power line data [76]. One modification we made was adjusting the anchor box scales and aspect ratios to better fit elongated objects like insulators. While EfficientDet normally relies on a default anchor configuration, we observed better recall by adding a slightly wider anchor for very thin, long objects (like conductors) and a smaller square anchor for tiny defects. EfficientDet is a one-stage, anchor-based detector, and we trained it with the standard focal loss for classification and smooth L1 loss for box regression, as in the original paper. In practice, EfficientDet-D2 runs slower than YOLOv8 on the same GPU, but can still achieve near-real-time performance (we measured ~15 FPS on a Tesla T4 for 640 × 640 images). Its accuracy on benchmarks is often higher than that of YOLO variants for certain tasks, so we wanted to see whether that holds for fine-grained power line defects.
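The weighted connections in BiFPN can be illustrated with the "fast normalized fusion" rule from the EfficientDet paper, in which each input feature map gets a learnable non-negative weight and the weights are normalized by their sum. The sketch below applies the rule to flat lists of numbers standing in for feature maps. It omits the resizing and convolution that surround the fusion in the real network.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors as sum_i(w_i * f_i) / (eps + sum_j w_j).

    features: list of equal-length lists (stand-ins for feature maps)
    weights: one learnable scalar per input; negatives are clipped to zero
    """
    clipped = [max(0.0, w) for w in weights]      # ReLU keeps weights non-negative
    norm = sum(clipped) + eps                     # eps avoids division by zero
    length = len(features[0])
    return [
        sum(w * f[k] for w, f in zip(clipped, features)) / norm
        for k in range(length)
    ]


coarse = [1.0, 1.0, 1.0]   # e.g. an upsampled higher-level feature
fine = [3.0, 3.0, 3.0]     # e.g. the same-level input feature
fused = fast_normalized_fusion([coarse, fine], weights=[1.0, 1.0])
```

With equal weights the result is close to the plain average of the inputs. During training the weights shift so that more informative scales contribute more.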

As a strong two-stage detector, we include Faster R-CNN [77] for its high localization accuracy. Faster R-CNN first uses a Region Proposal Network (RPN) to generate candidate regions (region proposals) that likely contain objects, and then classifies and refines those proposals in a second stage [78,79]. This extra step can yield higher precision, especially for small or densely packed objects, at the cost of inference speed. We used a modernized implementation from Detectron2, with a ResNet-50 backbone and Feature Pyramid Network (FPN) for multi-scale proposals. The RPN was configured with anchors across 5 scales and 3 aspect ratios (including some especially small anchors of ~16 pixels for tiny defects). We also experimented with an improved Faster R-CNN variant in which we integrated some ideas from recent research: for example, using Soft-NMS (soft non-maximum suppression) to reduce missed overlapping detections, and increasing the number of top RPN proposals to 300 per image to improve recall. One report in the literature achieved 81.8% AP on insulator detection using an improved Faster R-CNN with a VGG-16 backbone and optimized NMS, demonstrating the potential of this approach. In our case, Faster R-CNN serves as a high-accuracy baseline: we expect it to catch some defects that one-stage models miss, albeit at only ~5–10 FPS on GPU (our measurements) [80,81]. We trained Faster R-CNN with multi-class labels (the same classes as YOLO/EfficientDet). During training, each mini-batch contained two images (due to memory constraints, given the heavy model), and we trained for more epochs to compensate. The loss function was standard cross-entropy for classification and smooth L1 for bounding boxes; we also enabled class-balanced sampling because defect classes were much rarer than normal components. That is, we ensured that each batch contained at least one ROI proposal of a defect, so that the classifier was not overwhelmed by negatives.
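Soft-NMS differs from standard NMS in that overlapping boxes have their scores decayed rather than being removed outright, which helps when two real objects overlap. The sketch below implements the Gaussian variant in plain Python. The `sigma` and `score_threshold` values are common defaults assumed here, not settings reported for the study.

```python
import math


def iou(a, b):
    """Intersection over union of two corner boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: returns a list of (box, decayed_score), best first."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            # The more a box overlaps the selected one, the more its score shrinks.
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((box, score))
        remaining = decayed
    return kept


result = soft_nms(
    boxes=[(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
    scores=[0.9, 0.8, 0.7],
)
```

In this example the second box overlaps the first heavily, so it survives with a reduced score and is ranked below the distant third box instead of being discarded.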

In addition to these primary models, we note that numerous model optimizations have been proposed specifically for power line inspection. For instance, researchers have developed custom YOLO-based models such as Line-YOLO that integrate deformable convolutions and attention mechanisms to boost defect detection performance [82]. In one example, adding a deformable convolutional neck (DCNv3) to YOLOv8-nano improved mAP@0.5 by 2.9% with fewer parameters than the baseline [83]. Another variant added BiFPN and EMA attention to YOLOv8, gaining 6.2% mAP and 14 FPS in inference speed. These results illustrate the industry's focus on balancing speed and accuracy [84]. While we did not implement those specific variants, our pipeline is flexible enough to accommodate such improvements. Our YOLOv8 model can be seen as analogous to these highly optimized one-stage detectors, which are favored for real-time UAV use.

To ensure a meaningful comparison between the evaluated models, all architectures were trained and evaluated using the same aggregated dataset and identical training–validation–testing splits described in Section 2.1. Each model was implemented using widely adopted reference configurations and recommended input resolutions based on the original implementations. This approach reflects common practice in applied computer vision studies, where models are typically evaluated under their standard operational settings rather than enforcing identical architectural constraints that could disadvantage particular methods. Consequently, the comparison focuses on the practical trade-offs between detection accuracy, robustness, and computational efficiency when the models are applied to UAV-based inspection tasks.

2.4. Training Procedure

We trained all models using the PyTorch deep learning framework (versions 1.13–2.0) with GPU acceleration. Where possible, we used existing implementations: YOLOv8 via the Ultralytics repository, EfficientDet via the EfficientDet PyTorch implementation, and Faster R-CNN via Detectron2. Training was conducted on a workstation with an NVIDIA RTX 3090 GPU (24 GB VRAM) for most experiments; some heavy experiments were also run on cloud V100 GPUs.

To accelerate convergence and enhance generalization, all models were initialized using transfer learning with pre-trained weights [85]. Specifically, YOLOv8 was warm-started from the official YOLOv8s checkpoint trained on the MS COCO dataset, which provides a strong foundation of generic visual features applicable across domains. The model was subsequently fine-tuned on the domain-specific dataset to adapt these features to power line components and defect patterns. Similarly, EfficientDet-D2 was initialized with COCO pre-trained weights, while Faster R-CNN utilized the ResNet50-FPN checkpoint provided by Detectron2, also trained on MS COCO.

Transfer learning significantly reduced the required training time and improved the final detection accuracy by reusing low-level visual representations, such as edges, shapes, and textures, already captured in the pre-trained models [86,87]. During the initial training phase, the earliest convolutional layers (up to conv2 in ResNet and the first stage of the YOLO backbone) were frozen to preserve these general-purpose features. After several warm-up epochs, all layers were progressively unfrozen and fine-tuned with a reduced learning rate to allow gradual adaptation to the target domain without destabilizing the learned weights.

The models were trained using standard hyperparameter configurations, followed by moderate tuning to optimize stability and convergence [88]. For YOLOv8, the initial learning rate was set to 0.01 and decayed according to a cosine annealing schedule over 100 epochs. A short warm-up phase was introduced during the first three epochs, where the learning rate was reduced to 0.005, which improved gradient stability and prevented early divergence.
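The schedule described above can be written as a small function. This is a minimal sketch: the base rate (0.01), warm-up rate (0.005), warm-up length (3 epochs), and total length (100 epochs) come from the text, while the final learning rate of zero and the flat shape of the warm-up are assumptions made for illustration.

```python
import math

def learning_rate(epoch, total_epochs=100, base_lr=0.01,
                  warmup_epochs=3, warmup_lr=0.005, final_lr=0.0):
    """Learning rate for a 0-indexed epoch: flat warm-up, then cosine decay.

    final_lr and the flat warm-up are assumptions; the text gives only
    base_lr, warmup_lr, the warm-up length, and the number of epochs.
    """
    if epoch < warmup_epochs:
        return warmup_lr
    # Progress runs from 0 at the first post-warm-up epoch to 1 at the last epoch.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs - 1)
    return final_lr + 0.5 * (base_lr - final_lr) * (1.0 + math.cos(math.pi * progress))

if __name__ == "__main__":
    for epoch in (0, 2, 3, 50, 99):
        print(epoch, round(learning_rate(epoch), 5))
```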

The batch size was configured as 16 images per GPU for YOLOv8 and EfficientDet-D2, while for Faster R-CNN, it was effectively smaller (2 images per iteration) due to the higher memory footprint of the two-stage architecture. Optimization strategies also differed by model family: AdamW was employed for YOLOv8 and EfficientDet, with a weight decay coefficient of 5 × 10⁻⁴ to enhance generalization, whereas Stochastic Gradient Descent (SGD) with momentum = 0.9 was applied to Faster R-CNN, following conventional practices for region-based detectors.

The choice of optimizer was guided by empirical observations during training: AdamW provided smoother loss convergence and faster stabilization for the one-stage models, whereas SGD exhibited superior robustness for the more complex two-stage R-CNN framework.

The training of all models continued until performance convergence was observed on the validation set. The YOLOv8 model exhibited the fastest convergence, completing training in 100 epochs (approximately 50,000 iterations, given the dataset size and batch configuration). The Mean Average Precision (mAP) metric reached a stable plateau around epoch 80, indicating early saturation. In contrast, EfficientDet-D2 required approximately 120 epochs to achieve full stabilization, which can be attributed to the higher complexity of its BiFPN feature fusion architecture and more gradual gradient propagation.

The Faster R-CNN model, characterized by its two-stage detection framework, demonstrated the slowest training dynamics. It was trained for approximately 24,000 iterations, corresponding to roughly 300 epochs over 8000 training images (with two images per iteration). The optimal model checkpoint was selected based on the highest validation mAP achieved during training.

To prevent overfitting, an early stopping criterion was applied: if no improvement in validation mAP was observed over 10 consecutive epochs, training was automatically terminated and the best-performing checkpoint retained for further evaluation.
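The early stopping rule can be sketched as a small helper class. `EarlyStopping` is a hypothetical name, and the sketch assumes one validation mAP value per epoch and that any increase counts as an improvement; neither detail is specified in the text.

```python
class EarlyStopping:
    """Stop when validation mAP has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta      # assumed: any increase counts as improvement
        self.best_map = float("-inf")
        self.best_epoch = -1
        self.stale_epochs = 0

    def update(self, epoch, val_map):
        """Record one validation result; return True if training should stop."""
        if val_map > self.best_map + self.min_delta:
            self.best_map = val_map
            self.best_epoch = epoch     # the checkpoint from this epoch would be kept
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience
```

In a training loop, `update` would be called after each validation pass, and the checkpoint saved at `best_epoch` would be the one retained for evaluation.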

The datasets were partitioned into three subsets: training, validation, and testing. To prevent information leakage caused by near-duplicate imagery, all images originating from the same UAV flight or depicting the same transmission tower were assigned exclusively to a single subset. Approximately 70% of the images were used for model training, 15% for validation and hyperparameter tuning, and the remaining 15% for final testing. The held-out test set was reserved strictly for the quantitative evaluation of model performance, as reported in Section 3.
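A group-aware split of this kind can be sketched as follows. The function name, the greedy assignment strategy, and the use of a single group identifier per image (a flight or tower ID) are assumptions for illustration; the text specifies only the grouping constraint and the approximate 70/15/15 proportions.

```python
import random
from collections import defaultdict

def split_by_group(image_groups, fractions=(0.70, 0.15, 0.15), seed=0):
    """Assign whole groups (e.g., one flight or one tower) to train/val/test.

    image_groups maps image name -> group id. Groups are shuffled and then
    handed out greedily so that each subset approaches its target image share.
    """
    groups = defaultdict(list)
    for image, group in image_groups.items():
        groups[group].append(image)

    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    names = ("train", "val", "test")
    targets = [fraction * len(image_groups) for fraction in fractions]
    splits = {name: [] for name in names}

    for gid in group_ids:
        # Put the group into the subset that is furthest below its target size.
        deficits = [targets[i] - len(splits[names[i]]) for i in range(len(names))]
        chosen = names[deficits.index(max(deficits))]
        splits[chosen].extend(groups[gid])
    return splits
```

Because groups are never divided, the resulting proportions are only approximate when group sizes vary, which is consistent with the "approximately 70%" wording above.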

In addition, a dedicated challenge subset was curated to qualitatively assess model robustness [89] under adverse or nonstandard conditions, including extreme camera angles, severe occlusions, and low-light or night-time imagery. This subset enabled stress-testing of the detection models beyond the typical operational domain, providing valuable insights into their generalization capabilities.

Each model was trained using an appropriate composite loss function designed to balance localization accuracy, classification confidence, and robustness to class imbalance.

For YOLOv8, the overall optimization objective was defined as a weighted sum of three components—the bounding box regression loss, objectness loss, and classification loss—formulated as Equation (1):

L_{total} = \lambda_{box} L_{CIoU} + \lambda_{obj} L_{obj} + \lambda_{cls} L_{cls}. \qquad (1)

Here, L_{CIoU} represents the Complete IoU loss for bounding box regression, L_{obj} denotes the objectness confidence term, and L_{cls} corresponds to the binary cross-entropy loss for class prediction.

To promote more precise localization, the weighting factor for the bounding box component was slightly increased from 1.0 to 1.2, as even minor coordinate deviations can result in missing small defects.

The EfficientDet-D2 model employed a focal loss function for classification, with parameters α = 0.25 and γ = 1.5, combined with a Smooth L1 loss for bounding box regression. Default configurations were retained since they provided stable and well-balanced gradients.
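For reference, the focal loss for a single binary prediction can be written in a few lines. This is a sketch of the standard per-anchor formulation with the α and γ values quoted above; it is not taken from the EfficientDet implementation used in the experiments, and the clipping constant `eps` is an assumption added for numerical safety.

```python
import math

def focal_loss(prob, target, alpha=0.25, gamma=1.5, eps=1e-7):
    """Binary focal loss for one prediction.

    prob   : predicted probability of the positive class (0..1)
    target : 1 for a positive anchor, 0 for background
    """
    prob = min(max(prob, eps), 1.0 - eps)          # avoid log(0)
    p_t = prob if target == 1 else 1.0 - prob      # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    print("easy positive:", focal_loss(0.9, 1))
    print("hard positive:", focal_loss(0.1, 1))
```

The factor (1 − p_t)^γ shrinks the loss of well-classified examples, so the abundant easy background anchors contribute less to the gradient than the rare hard ones.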

For Faster R-CNN, training utilized a standard cross-entropy loss for object classification and Smooth L1 loss for bounding box regression, along with an additional RPN objectness loss for region proposal generation.

To address the strong class imbalance between defective and non-defective instances, a weighting factor of 2× was applied to the defect class in the classification loss. This adjustment reduced the model’s bias toward the dominant “no defect” category, thereby improving sensitivity to rare fault instances.
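The effect of the 2× weighting can be illustrated with a class-weighted cross-entropy. This is a minimal sketch assuming a two-class layout ordered as [no defect, defect] and a weighted-mean normalization; the actual class list and the normalization used in the experiments are not stated in the text.

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of per-sample cross-entropy terms.

    probs         : list of per-sample probability lists (each sums to 1)
    labels        : list of true class indices
    class_weights : weight per class, e.g., [1.0, 2.0] for [no defect, defect]
    """
    total, weight_sum = 0.0, 0.0
    for sample_probs, label in zip(probs, labels):
        weight = class_weights[label]
        total += -weight * math.log(max(sample_probs[label], 1e-12))
        weight_sum += weight
    return total / weight_sum

if __name__ == "__main__":
    probs = [[0.8, 0.2], [0.4, 0.6]]     # illustrative predictions, not measured data
    labels = [0, 1]
    print(weighted_cross_entropy(probs, labels, [1.0, 1.0]))
    print(weighted_cross_entropy(probs, labels, [1.0, 2.0]))
```

With the defect weight set to 2, an error on a defect sample contributes twice as much to the loss as the same error on a non-defective sample.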

Model performance was periodically evaluated on the validation set to monitor training progress and prevent overfitting. Validation was conducted every five epochs, and several key metrics were recorded, including mAP@0.5, mAP@0.5:0.95 (following the COCO evaluation protocol), precision, recall, and the corresponding loss components. These metrics provided continuous feedback on both localization and classification quality throughout training.

A confusion matrix was also maintained to analyze inter-class misclassifications and identify categories that were frequently confused [90,91]. For instance, during the early training stages, the model occasionally misclassified rust corrosion as surface dirt or stains on insulators due to their visual similarity. This observation informed subsequent data refinements—additional examples of clean insulators were augmented with synthetic noise to better distinguish surface artifacts from true defects.

Training progress and performance trends were visualized using TensorBoard, which provided dynamic plots of precision–recall curves, loss trajectories, and metric evolution across epochs. The gradual improvement of precision–recall curves indicated enhanced calibration of model confidence scores and overall convergence stability [92].

All experiments were conducted using a high-performance computing environment to ensure training efficiency and reproducibility. The majority of model training was performed on a single NVIDIA GPU, which provided sufficient computational capacity for most configurations. For YOLOv8 and EfficientDet-D2, the complete training process over 100 epochs required approximately 4–6 h, whereas Faster R-CNN training extended to around 8 h, reflecting its smaller effective batch size and the higher computational complexity of its two-stage detection pipeline.

To further accelerate training, multi-GPU experiments were carried out on an NVIDIA DGX node using four GPUs in a distributed data-parallel configuration. Under this setup, the effective batch size increased to 64 images, reducing YOLOv8 training time to approximately 2 h while maintaining stable convergence.

All experiments were implemented in Python 3.9 using the PyTorch deep learning framework (versions 1.13–2.0). YOLOv8 models were trained via the Ultralytics Python API (the train() method of the YOLO model class), while EfficientDet-D2 was trained using custom PyTorch scripts. For Faster R-CNN, the Detectron2 framework was employed with modified configuration files to accommodate the specific dataset and loss weighting settings.

2.5. Evaluation Metrics and Analysis Tools

To objectively assess the models, we employ standard detection metrics as well as task-specific measurements:

The Average Precision (AP) for each class is defined as Equation (2):

AP_{i} = \int_{0}^{1} P_{i}(R) \, dR, \qquad (2)

and the Mean Average Precision (mAP) is given by Equation (3):

mAP = \frac{1}{N} \sum_{i=1}^{N} AP_{i}, \qquad (3)

where P_{i}(R) is the precision–recall curve of class i, and N is the number of classes.
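Equations (2) and (3) can be approximated numerically from a sampled precision–recall curve. The sketch below uses all-point interpolation, which is one common convention; the interpolation scheme used by the evaluation tools in this study is not stated, so this choice is an assumption.

```python
def average_precision(recalls, precisions):
    """Area under a precision-recall curve using all-point interpolation.

    recalls must be sorted in increasing order, with matching precisions.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas over every recall step.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

def mean_average_precision(ap_per_class):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

For a toy curve with precision 1.0 up to recall 0.5 and precision 0.5 up to recall 1.0, the function returns 0.5 × 1.0 + 0.5 × 0.5 = 0.75.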

Precision (P), recall (R), and F1-score are computed as Equations (4)–(6):

P = \frac{TP}{TP + FP}, \qquad (4)

R = \frac{TP}{TP + FN}, \qquad (5)

F_{1} = 2 \frac{P \times R}{P + R}, \qquad (6)

where TPs (True Positives) denote the number of correctly detected objects, FPs (False Positives) refer to incorrectly detected objects (false alarms), and FNs (False Negatives) represent missed detections where existing defects were not identified.
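Equations (4)–(6) translate directly into code. The counts in the usage example are invented for illustration and are not results from this study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

if __name__ == "__main__":
    # Illustrative counts: 90 correct detections, 10 false alarms, 30 misses.
    print(precision_recall_f1(tp=90, fp=10, fn=30))
```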

To evaluate spatial accuracy, we use Intersection over Union (IoU) from Equation (7):

IoU(B_{p}, B_{gt}) = \frac{|B_{p} \cap B_{gt}|}{|B_{p} \cup B_{gt}|}. \qquad (7)

Here, B_{p} and B_{gt} denote the predicted and ground-truth bounding boxes. A detection is correct if IoU > τ, typically τ = 0.5.
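Equation (7) can be implemented for axis-aligned boxes as shown below. The (x1, y1, x2, y2) corner format is an assumption made for this sketch; annotation formats differ between datasets.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_correct(pred_box, gt_box, tau=0.5):
    """Apply the IoU > tau criterion to one prediction/ground-truth pair."""
    return iou(pred_box, gt_box) > tau
```

Two 10 × 10 boxes offset by 5 pixels horizontally overlap in a 5 × 10 region, giving IoU = 50 / 150 ≈ 0.33, which would not count as a correct detection at τ = 0.5.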

The Mean Average Precision (mAP) metric serves as the principal quantitative measure for evaluating object detection performance. In this study, mAP is reported at an IoU threshold of 0.5 (denoted as AP@0.5), where a detection is considered correct if the predicted bounding box overlaps with the ground truth by at least 50%. Additionally, we compute mAP@0.5:0.95, following the COCO evaluation protocol, which averages precision across multiple IoU thresholds ranging from 0.5 to 0.95 in increments of 0.05. This provides a more comprehensive assessment of detection accuracy under increasingly strict localization criteria.

The mAP metric integrates both precision and recall across all confidence thresholds, effectively representing the area under the precision–recall curve. Higher mAP values indicate that the model achieves a favorable balance between detecting the majority of true defects (high recall) and minimizing false detections (high precision). On the test dataset, YOLOv8 achieved the highest mAP values (as detailed in Section 3), reaching approximately 0.70 AP on specific InsPLAD classes, while EfficientDet-D2 and Faster R-CNN demonstrated slightly lower but comparable results.

While mAP provides an overall assessment of detection performance, individual metrics such as precision (P) and recall (R) offer more detailed insights into the model’s behavior [93,94]. Precision represents the proportion of correctly predicted defect instances among all detections, whereas recall measures the proportion of actual defects that were successfully identified by the model. Both metrics are evaluated at a fixed IoU threshold of 0.5 and at the confidence level that yields the optimal F₁-score on the validation set.

High precision indicates a low rate of false alarms, which is crucial for avoiding unnecessary maintenance interventions, while high recall reflects the model’s ability to minimize missed detections—an essential factor in ensuring operational safety and reliability [95]. The goal is to achieve a balanced trade-off between these two measures. In our experiments, YOLOv8 achieved approximately a precision of 0.92 and a recall of 0.88 at the confidence threshold maximizing the F₁-score. In comparison, Faster R-CNN demonstrated a slightly higher recall (≈0.90) but lower precision (≈0.85), indicating that it tended to generate more candidate detections, thereby increasing the likelihood of false positives.

The F₁-score serves as a consolidated indicator of detection performance by combining precision and recall into a single balanced metric [96]. It is particularly useful for evaluating how well a model maintains equilibrium between avoiding false alarms and minimizing missed detections. During validation, we report the maximum F₁-score obtained by varying the model’s confidence threshold, which reflects the optimal trade-off between these two metrics.

In practical deployment, the operating threshold is typically chosen near this peak value to achieve balanced performance under real-world conditions. The F₁-score also facilitates comparative analysis among models: those exhibiting large disparities between precision and recall generally yield a lower maximum F₁, while models maintaining both metrics at consistently high levels demonstrate more reliable and stable detection behavior.
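Selecting the operating threshold at the F₁ peak can be sketched as a sweep over detections sorted by confidence. The function name and input layout are hypothetical; the sketch assumes each detection has already been marked as a true or false positive by IoU matching.

```python
def best_f1_threshold(scores, is_true_positive, num_ground_truth):
    """Sweep confidence thresholds and return (threshold, precision, recall, f1).

    scores           : confidence of each detection
    is_true_positive : True if that detection matched a ground-truth box
    num_ground_truth : total number of ground-truth objects
    """
    ranked = sorted(zip(scores, is_true_positive), key=lambda d: d[0], reverse=True)
    best = (None, 0.0, 0.0, 0.0)
    tp = fp = 0
    for score, matched in ranked:
        # Lowering the threshold to `score` admits this detection.
        if matched:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_ground_truth
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        if f1 > best[3]:
            best = (score, precision, recall, f1)
    return best
```

The returned threshold is the lowest confidence that is still accepted at the F₁ peak; detections scoring below it would be discarded at deployment time.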

For multi-class detection tasks—such as distinguishing among various defect types—a confusion matrix was generated to compare the predicted and ground-truth class labels across all detected instances [97,98]. This matrix provides a detailed view of model performance by highlighting systematic misclassifications and identifying categories that are most frequently confused.

For example, in our experiments, several cases of corroded hardware were incorrectly classified as non-defective but dirty components, while small bird nests on insulators were occasionally mislabeled as corrosion due to their similar color and texture. Such insights are critical for understanding the model’s limitations and guiding targeted improvements in training data or augmentation strategies.

A representative confusion matrix is presented in Section 3 to illustrate typical error patterns and to support the qualitative analysis of defect-type separability.
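A confusion matrix of this kind can be built from matched label pairs in a few lines. The class names and labels in the usage example are invented for illustration and are not data from this study.

```python
from collections import Counter

def confusion_matrix(true_labels, pred_labels, classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, pred_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

def most_confused_pair(matrix, classes):
    """Return the off-diagonal (true, predicted, count) with the largest count."""
    best = (None, None, 0)
    for i, true_class in enumerate(classes):
        for j, pred_class in enumerate(classes):
            if i != j and matrix[i][j] > best[2]:
                best = (true_class, pred_class, matrix[i][j])
    return best

if __name__ == "__main__":
    classes = ["corrosion", "dirt", "bird_nest"]
    true = ["corrosion", "corrosion", "corrosion", "dirt", "bird_nest", "bird_nest"]
    pred = ["corrosion", "dirt", "dirt", "dirt", "corrosion", "bird_nest"]
    matrix = confusion_matrix(true, pred, classes)
    print(matrix)
    print(most_confused_pair(matrix, classes))
```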

Given the requirement for real-time processing in onboard UAV applications, each model’s inference speed was evaluated in frames per second (FPS) on both high-performance and edge-computing hardware. Specifically, FPS was measured on a desktop-class NVIDIA RTX 3090 GPU and a low-power NVIDIA Jetson Xavier NX device to assess the models’ deployability under different computational constraints.

As expected, YOLOv8 demonstrated the highest inference speed, exceeding 50 FPS on the RTX 3090 and achieving approximately 10 FPS on the Jetson platform for the standard input resolution used in this study. EfficientDet-D2 achieved around 18 FPS on the high-end GPU, while Faster R-CNN was the slowest, operating at roughly 5 FPS due to its two-stage architecture and heavier computational load.
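A throughput measurement of this kind can be sketched with a wall-clock timer. `measure_fps` is a hypothetical helper, and `infer` stands for any callable that processes one frame; the number of warm-up calls is an assumption.

```python
import time

def measure_fps(infer, frames, warmup=5):
    """Average frames per second of `infer` over `frames`, after a warm-up."""
    for frame in frames[:warmup]:
        infer(frame)                      # warm-up calls are not timed
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")

if __name__ == "__main__":
    # Stand-in workload: a 1 ms sleep instead of a real detector.
    print(round(measure_fps(lambda frame: time.sleep(0.001), list(range(50))), 1))
```

When timing GPU inference, the device should be synchronized before each clock reading (for example with `torch.cuda.synchronize()` in PyTorch), because kernel launches are asynchronous and would otherwise make the measured time too short.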

In addition to speed, model complexity—expressed in terms of parameter count and memory footprint—was also considered to contextualize performance differences [99,100]. These results highlight the trade-off between accuracy and computational efficiency, emphasizing YOLOv8 as the most suitable model for real-time, edge-based power line inspection scenarios.

In addition to quantitative metrics, qualitative visual analysis was performed to further assess model performance and interpret detection behavior. Predicted bounding boxes were overlaid on test images to visually inspect the accuracy and completeness of defect localization. These visual outputs included color-coded bounding box plots to distinguish between component and defect classes, as well as confidence heatmaps highlighting regions of high model certainty.

Furthermore, precision–recall (PR) curves were generated for each class to identify potential weaknesses; for example, the “bird nest” class exhibited a lower area under the curve (AUC), reflecting its greater visual variability and detection difficulty. For detailed error analysis, representative cases of false positives (e.g., shadows misinterpreted as corrosion) and false negatives (e.g., small cracks not detected) were examined and visualized by comparing predictions against ground truth annotations.

Such qualitative evaluations are essential to ensure that model predictions align with real-world defect characteristics, providing deeper insight into error sources and guiding further improvements in dataset design and model training.

In the specific context of power line maintenance, an additional task-oriented metric termed inspection efficiency was introduced to quantify the practical benefits of automation. This measure expresses the effective distance of power lines inspected per hour by the AI-based system compared with traditional manual inspection [101]. The value is derived from drone flight speeds and AI processing frame rates. For instance, with a typical drone covering approximately 5 km of power line per hour and the vision model analyzing five frames per second with sufficient spatial overlap, the automated system achieves an effective inspection capacity of roughly 25 km per hour, significantly exceeding the manual inspection rate of less than 1 km per hour.

Although not a conventional computer vision metric, inspection efficiency provides a practical indicator of the system’s real-world impact in terms of operational scalability and labor reduction.

Additionally, we evaluated the true negative rate—the proportion of images correctly identified as having no detectable defects—to ensure that the system maintains reliability [102] in non-defective scenarios and avoids unnecessary false alerts during large-scale inspections.

All metrics are computed on the held-out test set comprising scenes not seen during training. Where available, we compare our results to published benchmarks (e.g., values in the literature on the InsPLAD detection task or similar tasks) [103,104]. For instance, where an InsPLAD paper reports AP for YOLOv5 or Cascade R-CNN, we use those values as reference points. To keep the evaluation protocol consistent, we use the COCO definition of mAP (averaged over IoU thresholds from 0.5 to 0.95) and additionally report AP50 for easier comparison with papers that report only that metric.

In summary, our evaluation framework not only assesses accuracy but also considers speed and reliability, which are equally vital for a deployable power line defect detection system.

3. Results

3.1. Model Performance Comparison

After training the three models on the consolidated dataset, we evaluated each on the test set (containing a mix of towers, lines, insulators, and defects from various conditions). Table 1 presents a comparison of their performance metrics. The table includes mAP at 0.5 and 0.5:0.95, precision, recall, and inference speed.

To better illustrate the quantitative comparison presented in Table 1, Figure 6 visualizes the main evaluation metrics of all three models. It can be observed that YOLOv8 outperforms both EfficientDet-D2 and Faster R-CNN in terms of overall balance between precision and recall, achieving the highest F₁-score and mAP values. EfficientDet-D2 shows the highest precision (94.3%), whereas Faster R-CNN achieves slightly higher recall (90.2%) but at the cost of lower precision.

To provide an additional visual comparison of the models’ quantitative performance, Figure 7 presents a heatmap of the four primary evaluation metrics. The color intensity highlights the relative strengths of each model: YOLOv8 maintains a balanced performance across all metrics, EfficientDet-D2 achieves the highest precision, and Faster R-CNN attains the highest recall. This visualization further confirms YOLOv8’s superior trade-off between accuracy and speed for real-time defect detection.

From the above results, YOLOv8 emerges as the top performer in terms of balanced accuracy and speed. It achieved an mAP@0.5 of ~88.5%, i.e., its average precision, averaged over classes, is close to 89% when a 50% IoU overlap is required. Its mAP@0.5:0.95 (which heavily penalizes localization errors) was ~53%, the highest among the models. YOLOv8’s precision was slightly lower than EfficientDet’s (92% vs. 94%), indicating a few more false positives, while its recall (~88%) was close to the highest value, obtained by Faster R-CNN (~90%), meaning it missed few actual defects. The net effect is that YOLOv8 had the highest F1 of 0.897, indicating the best precision–recall trade-off. Crucially, YOLOv8 ran at >50 FPS on a single GPU, comfortably in real time, and at about 10 FPS on the edge device, which is sufficient for onboard analysis given that drone video is often 30 FPS and not every frame needs analysis. These findings reinforce YOLO’s reputation as a real-time detector that still achieves high accuracy, making it well-suited for deployment in inspection drones where both speed and accuracy matter.

EfficientDet-D2 showed strong precision—the highest of the three, at 94%, meaning it produced very few false detections (it is conservative in declaring something a defect). Its recall (82%) was a bit lower, indicating it missed more defects than YOLO or Faster R-CNN. This could be due to its reliance on anchor boxes; a few tiny defects might have slipped through if they were not well-covered by anchors. The mAPs were around 85% (AP50) and 50.8% (AP50:95), only marginally behind YOLOv8. EfficientDet’s F1 was about 0.877, slightly lower primarily due to recall. In practice, EfficientDet might be a good choice when false alarms need to be absolutely minimized—its high precision means that when it flags a defect, one can be very confident it is truly an issue. However, its inference speed (~18 FPS on GPU) is much lower than YOLO’s, and on the Jetson, it struggled at 4 FPS (which might be borderline for real-time unless frames are skipped or the model is further pruned). The model size of EfficientDet-D2 is also larger (in terms of parameter count ~10 M, but thanks to EfficientNet, it is still reasonably light in memory at ~30 MB). Considering its accuracy was close to YOLO’s, EfficientDet is a viable model if one can afford a bit more latency.

Faster R-CNN, as expected, had the highest recall (90.2%), meaning it found the most defects (few misses), but its precision (85.1%) was lowest—it generated more false positives. In terms of mAP, it reached ~84% at IoU 0.5, and ~48.5% on the stricter metric. So it was roughly 4–5 points mAP behind YOLOv8. This gap might be because our Faster R-CNN did not have advanced bells and whistles like cascade or focal loss that could improve it. Interestingly, the recall advantage suggests that the two-stage process helped it pick up some difficult instances that others missed; however, it might also mean it tended to double-detect some objects or confuse small background patterns for potential defects, hence lowering precision. The biggest drawback is speed: at ~5 FPS on a powerful GPU, Faster R-CNN is not suitable for live drone feed analysis (and it practically cannot run on the Jetson except at <1 FPS, so we would offload to a base station if using it). Thus, Faster R-CNN might be used as an offline analysis tool to double-check images after a flight, rather than in-flight processing. Its performance confirms the typical trade-off: two-stage detectors can be very thorough (high recall) but slower and more complex.

Overall, all models achieved AP50 values in the mid-to-high 80s, indicating that most ground-truth defects were detected. The AP50:95 values, roughly 48–53%, reflect that precise localization at high IoU is challenging, which is not surprising given the small objects and ambiguous boundaries. These values are in line with what has been reported in the literature on similar tasks (for instance, an improved YOLOv4 model achieved ~93.8% precision and 97.3% recall on insulator defect images, which is comparable to our best results when considering different dataset specifics).

It should be noted that small differences in detection performance between deep learning models may partly reflect stochastic variability inherent to training procedures, including random initialization, data shuffling, and augmentation processes. In the present study, the reported results correspond to the best-performing checkpoints obtained during training using a consistent validation protocol. The relatively large training dataset and stable convergence behavior observed during training reduce the impact of random variability on the reported metrics.

Moreover, the comparative analysis does not rely solely on marginal differences in mAP values but also considers additional indicators such as precision–recall balance and inference efficiency, which are particularly relevant for real-time UAV-based inspection scenarios. Future work may further investigate statistical variability by performing repeated training runs and reporting confidence intervals for the evaluated models.

3.2. Efficiency Gains and Inspection Capacity

One of the goals of automating OPL inspection is to improve the speed and scale of coverage. Based on our results and known drone capabilities, we can estimate the efficiency improvements:

With YOLOv8 running on a drone at ~10 FPS, the system can analyze frames in near-real time as the drone flies. Suppose a drone travels along a line at 5 m/s (18 km/h) while capturing video. With a typical overlap, analyzing 2–3 frames per second may suffice (since consecutive video frames have high redundancy). At 3 FPS analysis, that means processing roughly one frame every 1.7 m of flight. If each frame covers, say, 10–20 m of line (depending on camera FOV and altitude), the system effectively inspects ~20–30 m of line per frame. This yields an inspection speed on the order of 50–90 m per second in terms of line length covered when considering the continuous video feed and the analysis window. That translates to about 3–5 km per minute, or up to 200–300 km per hour theoretically. In practice, the drone’s battery and flight plan limit actual throughput (a drone might cover perhaps 10–20 km in a single flight). But clearly, an AI-assisted drone can examine dozens of kilometers of lines in the time a manual crew might inspect a few kilometers. This is a transformative increase in capacity: a roughly 3× or more increase in inspection capacity has been noted in real deployments. Our system’s ability to process imagery quickly is a key enabler of that.
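The basic quantities in this estimate follow from flight speed and analysis rate, and can be checked with a short script. The helper names are hypothetical, and `views_per_point` is an additional derived quantity introduced here to illustrate frame redundancy; it does not appear in the original estimate.

```python
def kmh(metres_per_second):
    """Convert a speed from m/s to km/h."""
    return metres_per_second * 3.6

def frame_spacing_m(speed_mps, analysed_fps):
    """Distance flown between two consecutive analysed frames."""
    return speed_mps / analysed_fps

def views_per_point(frame_footprint_m, speed_mps, analysed_fps):
    """Approximate number of analysed frames that see the same point on the line."""
    return frame_footprint_m / frame_spacing_m(speed_mps, analysed_fps)

if __name__ == "__main__":
    speed, fps = 5.0, 3.0                       # values taken from the text
    print("speed (km/h):", kmh(speed))
    print("frame spacing (m):", round(frame_spacing_m(speed, fps), 2))
    print("views per point, 10 m footprint:", round(views_per_point(10.0, speed, fps), 1))
```

At 5 m/s and 3 analysed frames per second, the spacing is 5/3 ≈ 1.7 m, so a 10 m footprint places each point of the line in about six analysed frames, which supports the redundancy argument above.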

To illustrate the relative efficiency gains of the proposed approach, Figure 8 compares inspection capacity and remediation time across traditional manual, helicopter-based, and AI-assisted UAV methods. The results highlight a substantial improvement in both inspection throughput and maintenance responsiveness when deep learning-based automation is applied.

We also consider remediation time. Traditionally, after a manual inspection, it might take days or weeks for reports to be compiled and maintenance crews dispatched, during which the defect remains. With automated detection, findings can be reported immediately (even mid-flight). As noted earlier, case studies have found an up to 70% reduction in defect remediation time when using automated drone inspections integrated with maintenance systems.

In summary, the deep learning pipeline substantially boosts inspection productivity: more kilometers covered per hour, more frequent inspections possible (since the cost per inspection goes down), and faster turnaround from finding a problem to fixing it. A conservative estimate from our deployment simulations is that one drone team using our AI system could replace several ground teams, inspecting 3× the line length in the same time with a ~50% lower cost. This frees up skilled linemen to focus on actual repairs and high-level analysis rather than tedious patrolling. The following Section 4 will delve into how these technical results translate into field implementation considerations and remaining challenges.

4. Discussion

4.1. Technical Performance and Insights

The results demonstrate that deep learning models can effectively detect a variety of defects in overhead power lines from drone imagery. Our YOLOv8-based detector, in particular, achieved high accuracy (mAP > 88%) and real-time speeds, validating the choice of modern one-stage CNNs for this task. This confirms trends in recent research where YOLO architectures dominate aerial inspection applications due to their speed/accuracy balance. EfficientDet also proved competent, showing that an optimized anchor-based model can nearly match YOLOv8’s accuracy while maintaining precision. Faster R-CNN’s strong recall suggests that there is still value in two-stage approaches for finding a few remaining difficult-to-find defects, but its practicality is limited by speed.

One notable insight is that small-object detection (a critical challenge in power line inspection) was handled best by YOLOv8. The model’s use of a P2 feature map (an additional fine-resolution feature layer) and its anchor-free mechanism likely helped it pick up very small objects like splices or bolts. This aligns with design choices seen in YOLOv8-P2 variants that explicitly enhance small-object granularity. EfficientDet’s BiFPN also contributed to multi-scale detection, but, perhaps due to its reliance on fewer, scaled anchors, it had slightly lower recall on the smallest defects. A practical takeaway is that anchor-free detectors with multi-level features are advantageous for inspections where defect size can vary from a few pixels (e.g., a crack) to large (e.g., an entire missing component).
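To make the small-object argument concrete, the following minimal sketch computes how many feature-map cells a defect of a given pixel size spans at each pyramid level. The strides (4, 8, 16, 32 for P2 to P5) follow the usual detector convention; the 640-pixel input and the 12-pixel defect are illustrative assumptions, not values reported in this study.

```python
# Strides of the pyramid levels commonly labelled P2..P5 in detection networks.
PYRAMID_STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32}


def cells_spanned(object_px: float, stride: int) -> float:
    """Return how many feature-map cells an object of object_px pixels spans."""
    return object_px / stride


def grid_size(input_px: int, stride: int) -> int:
    """Return the side length of the feature map for a square input."""
    return input_px // stride


if __name__ == "__main__":
    input_px = 640   # assumed network input size
    defect_px = 12   # assumed size of a small defect (e.g., a bolt)
    for level, stride in PYRAMID_STRIDES.items():
        print(level, grid_size(input_px, stride), cells_spanned(defect_px, stride))
```

Under these assumptions, the 12-pixel defect covers three cells on P2 but less than half a cell on P5, which is consistent with the intuition that a finer level helps with very small targets.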

The confusion between certain defect classes (e.g., corrosion vs. other surface dirt) suggests that additional modalities might improve accuracy. For instance, an RGB image might not easily distinguish rust from mud on an insulator, but a thermal image could show whether there is abnormal heating (if corrosion leads to partial discharge heating). Likewise, ultraviolet or corona cameras can pick up partial discharges not visible in RGB. While our pipeline focused on RGB, incorporating thermal imagery, as some studies have done, can significantly boost defect detection reliability (e.g., YOLO models have reached >96% mAP on thermal images for certain power line issues, outperforming their visible-light results). Multimodal integration is discussed in Section 4.3.

Another observation is the benefit of data volume and variety. The models performed well on classes where we had many examples (insulators, towers), but struggled more on rare cases (flashover burns). The introduction of large datasets like InsPLAD (10k images) and MPID (multi-thousand insulators) was crucial for this success. This underscores that the field is moving past the era of sparse data into one of increasingly robust benchmarks. It will be important to continually expand these datasets, perhaps through community efforts, to include new defect types (e.g., gunshot damage to insulators, a real but uncommon occurrence, is not yet well-represented in datasets). More data will also allow the training of even larger models (e.g., transformer-based detectors or next-gen YOLO versions) without overfitting.

The evaluation of efficiency indicates that deployment is feasible. A YOLOv8 model can run on an edge device aboard a drone with tolerable frame rates, enabling onboard AI. Onboard processing is valuable for autonomy—the drone can potentially react in real time (e.g., circle back to examine a detected hotspot, or avoid collision if a power line is detected in its path for navigation [105]). The alternative is streaming video to the ground for analysis, which requires robust communication links. A hybrid approach can be used: basic object detection onboard (for immediate safety, obstacle avoidance, etc.), and high-resolution imagery saved for detailed cloud-based analysis [106,107]. Our pipeline can be deployed either way. We note that running models in the cloud allows the use of heavier models (like an ensemble of YOLO and Faster R-CNN) to maximize detection, if latency is less of an issue (e.g., analyzing after the flight).
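One simple way to combine the outputs of two detectors in an offline ensemble is to pool their boxes and apply class-wise non-maximum suppression. The sketch below illustrates that idea only; the tuple layout, the 0.5 IoU threshold, and the function names are assumptions for illustration and do not describe the ensemble configuration of any particular deployment.

```python
from typing import List, Tuple

# A detection is (x1, y1, x2, y2, score, label); this layout is an assumption.
Detection = Tuple[float, float, float, float, float, str]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(dets_a: List[Detection], dets_b: List[Detection],
                     iou_thr: float = 0.5) -> List[Detection]:
    """Pool two detectors' outputs and apply class-wise greedy NMS."""
    pooled = sorted(dets_a + dets_b, key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        # Keep a box unless a higher-scoring box of the same class overlaps it.
        if all(det[5] != k[5] or iou(det, k) < iou_thr for k in kept):
            kept.append(det)
    return kept
```

Pooling keeps any box that only one model found, which favors recall; more elaborate schemes such as weighted box fusion would average overlapping boxes instead of discarding the weaker one.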

It should also be noted that the evaluated models differ in architectural design, recommended input resolutions, and computational complexity. In this study, the models were trained using their typical reference configurations in order to reflect realistic deployment scenarios for UAV-based inspection systems. While enforcing identical input resolutions or computational budgets could provide additional normalization for purely methodological comparisons, such constraints may also disadvantage architectures that were specifically designed for different efficiency–accuracy trade-offs. Therefore, the present comparison emphasizes practical performance under commonly used configurations. A more controlled ablation analysis under standardized computational budgets could be explored in future work to further investigate the relative contribution of architectural design and training configurations.

4.2. Challenges and Limitations

Despite the promising results, several technical and practical challenges remain that must be addressed to ensure reliable large-scale deployment of the proposed system.

Small, Occluded, or Complex Defects. Certain defect types are inherently difficult to detect. For example, a single broken strand in a conductor cable may be nearly invisible against the sky background in aerial imagery. Occlusions also present a major limitation: vegetation overgrowth can obscure components, concealing defects beneath leaves or branches [108]. Although partially occluded objects were annotated, a completely hidden element cannot be identified by any vision-based algorithm. Integrating complementary sensing modalities such as LiDAR could help “see through” foliage and provide geometric cues for structural analysis. Complex visual backgrounds—such as trees or buildings near the lines—can also induce false positives; for instance, the model occasionally misinterpreted a tree branch as a cracked insulator. Improving context awareness through models that account for spatial relationships between components (e.g., ensuring an “insulator” is physically connected to a tower or conductor) could mitigate such errors.
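The spatial-relationship check mentioned above could, in its simplest form, be a post-processing filter that keeps an insulator detection only when it lies next to a detected tower or conductor. The following is a minimal sketch under that assumption; the box format, the 10-pixel margin, and the function names are hypothetical and were not part of the evaluated pipeline.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def boxes_touch(a: Box, b: Box, margin: float = 0.0) -> bool:
    """True if box a, grown by margin on every side, overlaps box b."""
    return (a[0] - margin <= b[2] and a[2] + margin >= b[0]
            and a[1] - margin <= b[3] and a[3] + margin >= b[1])


def filter_by_context(insulators: List[Box], supports: List[Box],
                      margin: float = 10.0) -> List[Box]:
    """Keep insulator boxes that lie next to at least one tower/conductor box."""
    return [ins for ins in insulators
            if any(boxes_touch(ins, sup, margin) for sup in supports)]
```

A detection on an isolated tree branch would be dropped by such a filter, at the cost of also dropping true insulators whose supporting structure was missed by the detector.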

Lighting and Weather Variability. Outdoor inspection conditions are highly variable, involving strong sunlight, shadows, glare, rain, fog, or haze [109,110]. While the training pipeline incorporated basic lighting augmentations, extreme cases—such as direct sunlight, lens flare, or heavy fog—can still degrade performance. Nighttime inspections represent a separate domain that may benefit from thermal or infrared imaging, which can easily detect overheated joints. Model robustness under adverse weather could be further enhanced by augmenting data with synthetic weather effects (rain, fog) or by integrating image restoration and enhancement techniques (e.g., dehazing filters) prior to inference. Recent work on power line image restoration under adverse conditions [111] could be incorporated to improve visual clarity before analysis.
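Synthetic fog is often generated with the atmospheric scattering model I = J·t + A·(1 − t), where J is the clear image, A the airlight, and t = exp(−β·d) the transmission. The sketch below applies it to a grayscale image stored as nested lists; the uniform scene depth and the parameter names are simplifying assumptions, and this is not the augmentation code used in our training pipeline.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale image, values in [0, 255]


def add_fog(image: Image, beta: float, depth: float,
            airlight: float = 255.0) -> Image:
    """Blend each pixel toward the airlight using I = J*t + A*(1 - t)."""
    # Assumption: every pixel is at the same depth, so t is a single scalar.
    t = math.exp(-beta * depth)
    return [[pixel * t + airlight * (1.0 - t) for pixel in row] for row in image]
```

Larger values of beta or depth push t toward zero, so the image fades toward the airlight; a per-pixel depth map would give more realistic results than the uniform depth assumed here.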

Real-Time Constraints on Drones. Although the edge deployment of our models is technically feasible, onboard computing resources are also required for flight control, obstacle avoidance, and navigation [112]. Consequently, even a model achieving 10 FPS may require further optimization. Techniques such as model pruning and quantization can substantially reduce inference latency with minimal loss of accuracy [113,114]. Preliminary experiments with YOLOv8 quantized to FP16 or INT8 confirmed notable speedups while maintaining detection quality. A practical hybrid strategy may involve using a lightweight model (e.g., YOLOv8-nano) as a first-pass detector to flag frames of interest, followed by more detailed analysis by a larger model either onboard or in the cloud. Periodically transmitting selected key frames for remote inference could further balance accuracy and efficiency.
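For readers unfamiliar with INT8 quantization, the arithmetic behind the symmetric per-tensor variant is sketched below on a plain list of weights. This is a generic illustration of the technique, not the export path used for the YOLOv8 experiments; deployment toolchains additionally calibrate activations and often use per-channel scales.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]
```

The reconstruction error of each weight is bounded by half the scale, which is why accuracy loss tends to be small when the weight distribution has no extreme outliers.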

Class Imbalance and Rare Events. Certain fault types, such as severed conductors or collapsed towers, are extremely rare in practice and thus underrepresented in the training data. The absence of such examples limits the model’s ability to recognize these failure modes. Synthetic data generation (e.g., image compositing, digital manipulation, or the inclusion of historical fault imagery) could be used to simulate these scenarios. Alternatively, anomaly detection approaches can complement classification-based methods by learning the distribution of normal patterns and flagging deviations. This concept was explored in the InsPLAD dataset, where autoencoders and one-class classifiers were used to identify previously unseen anomalies such as missing tower elements.
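The principle of learning the distribution of normal patterns and flagging deviations can be illustrated with a much simpler stand-in than an autoencoder: a per-feature z-score model fitted on defect-free samples only. The class below is a hypothetical sketch; the feature vectors, the threshold of 3.0, and all names are assumptions for illustration.

```python
import statistics
from typing import List, Sequence


class NormalPatternModel:
    """Per-feature mean and standard deviation fitted on defect-free samples."""

    def fit(self, normal: Sequence[Sequence[float]]) -> "NormalPatternModel":
        cols = list(zip(*normal))
        self.means: List[float] = [statistics.fmean(c) for c in cols]
        # Guard against zero spread so the z-score stays finite.
        self.stds: List[float] = [statistics.pstdev(c) or 1.0 for c in cols]
        return self

    def score(self, x: Sequence[float]) -> float:
        """Largest absolute z-score across features."""
        return max(abs(v - m) / s for v, m, s in zip(x, self.means, self.stds))

    def is_anomaly(self, x: Sequence[float], threshold: float = 3.0) -> bool:
        """Flag a sample whose score exceeds the threshold."""
        return self.score(x) > threshold
```

Because the model never sees defect examples during fitting, it can flag failure modes that are absent from the training data, although it cannot name the defect type.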

Integration with Existing Workflows. Real-world deployment requires seamless integration with utility maintenance and asset management systems [115,116]. The AI pipeline must produce structured outputs such as inspection reports listing each detected defect, its GPS location or tower ID, and suggested maintenance priority. Although severity estimation was not included in the current work, it is essential for prioritizing repairs (e.g., distinguishing light from heavy corrosion). A rule-based or hybrid expert system could automatically assign severity levels based on component type and defect characteristics. Maintaining low false-positive rates is also critical for operator trust; initially, all AI-generated detections should be verified by human experts until system reliability is fully established.
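A structured output of the kind described above might look like the following sketch, which pairs a defect record with a rule table for maintenance priority. The field names, the rule table, and the priority labels are hypothetical; actual rules would have to come from the utility's maintenance standards.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

# Hypothetical priority rules keyed by (component, defect).
PRIORITY_RULES = {
    ("insulator", "crack"): "high",
    ("insulator", "contamination"): "low",
    ("conductor", "broken_strand"): "high",
    ("tower", "corrosion"): "medium",
}


@dataclass
class DefectRecord:
    tower_id: str
    latitude: float
    longitude: float
    component: str
    defect: str
    confidence: float
    priority: str = "review"  # default when no rule matches


def assign_priority(record: DefectRecord) -> DefectRecord:
    """Look up the priority for this component/defect pair."""
    key = (record.component, record.defect)
    record.priority = PRIORITY_RULES.get(key, "review")
    return record


def to_report(records: List[DefectRecord]) -> str:
    """Serialize prioritized records as JSON for a maintenance system."""
    return json.dumps([asdict(assign_priority(r)) for r in records], indent=2)
```

Unmatched combinations fall back to "review", which keeps a human expert in the loop for cases the rule table does not cover.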

Regulatory and Safety Considerations. Autonomous UAV inspections must comply with aviation regulations, particularly for Beyond Visual Line of Sight (BVLOS) operations. The reliability of AI-based detection contributes to demonstrating system safety—for example, through detect-and-avoid capabilities that prevent collisions with power lines or other aircraft. Although defect detection and navigation are separate tasks, they share perception components: the same vision model identifying power lines can also support safe flight control [117,118]. Data security represents another regulatory concern, as inspection imagery may contain sensitive infrastructure information. Processing data locally on the drone minimizes cybersecurity risks associated with network transmission.

Model Interpretability. Deep learning models are often criticized as “black boxes.” In safety-critical domains such as power utilities, operators may require interpretability to understand the reasoning behind detections [119]. Explainable AI (XAI) methods—such as saliency maps or Grad-CAM visualizations—can highlight image regions that influenced a prediction, helping engineers verify and trust AI decisions. Although not implemented in the current work, such methods could increase transparency and operator confidence. In practice, visual defect detection lends itself well to manual verification, as humans can readily confirm detections from annotated imagery.

Model Maintenance and Continuous Learning. As new data become available, periodic retraining or fine-tuning is necessary to maintain model accuracy [120,121]. The proposed pipeline could be extended into a continuous learning framework, where newly confirmed defect samples are fed back into the training set. To prevent catastrophic forgetting, strategies such as rehearsal of older samples or incremental learning algorithms may be employed. The models must also adapt to new equipment types introduced by utilities; for example, newly designed insulators may differ from those in the training data. Alternatively, integrating anomaly detection modules could enable the system to flag unknown components or defect types without prior retraining.
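Rehearsal can be as simple as mixing every newly confirmed sample with a random subset of older training samples before each fine-tuning round. The sketch below shows one such mixing step; the replay ratio, the use of file names as samples, and the function name are assumptions for illustration.

```python
import random
from typing import List, Sequence


def build_finetune_set(new_samples: Sequence[str], old_samples: Sequence[str],
                       replay_ratio: float = 0.5, seed: int = 0) -> List[str]:
    """Mix all new samples with a random subset of older ones.

    replay_ratio is the number of replayed old samples per new sample.
    """
    rng = random.Random(seed)
    n_replay = min(len(old_samples), round(len(new_samples) * replay_ratio))
    replay = rng.sample(list(old_samples), n_replay)
    mixed = list(new_samples) + replay
    rng.shuffle(mixed)
    return mixed
```

A higher replay ratio protects older classes better but lengthens each fine-tuning round, so the value would need to be tuned against the available compute.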

4.3. Toward an Integrated Smart Grid Solution

Looking beyond just defect detection, such a vision system can be a component in a larger predictive maintenance ecosystem for the smart grid. For example, detection results can be fed into asset management databases and correlated with other data (load, environmental conditions) to predict remaining life of components [122,123]. Integration with SCADA was mentioned; indeed, if SCADA indicates a certain line had a fault trip and our drone inspection finds a burned insulator at a location, the system can automatically pinpoint the cause of the fault. Conversely, our detected defects (like a hot splice) can be preemptively addressed before SCADA even picks up a fault. This convergence of operational technology with AI analytics is where the grid is heading.
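The correlation between a SCADA fault trip and an inspection finding could be sketched as a lookup by line identifier and time window. The record types, field names, and the seven-day window below are hypothetical; they do not describe an actual SCADA interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class FaultEvent:  # hypothetical SCADA trip record
    line_id: str
    time: datetime


@dataclass
class Detection:  # hypothetical inspection finding
    line_id: str
    tower_id: str
    defect: str
    time: datetime


def likely_cause(fault: FaultEvent, detections: List[Detection],
                 window: timedelta = timedelta(days=7)) -> Optional[Detection]:
    """Return the finding on the same line closest in time to the trip."""
    candidates = [d for d in detections
                  if d.line_id == fault.line_id
                  and abs(d.time - fault.time) <= window]
    return min(candidates, key=lambda d: abs(d.time - fault.time), default=None)
```

Such a match is only a candidate explanation; an engineer would still need to confirm that the detected defect is consistent with the recorded fault type.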

Edge vs. Cloud Deployment. A distributed architecture is likely optimal. In the near future, we could have edge AI on drones handling immediate tasks and filtering data, and heavier cloud AI performing detailed analysis on the collected dataset [124]. For instance, a drone’s onboard YOLO might flag five spots on a line as suspicious during flight; those images are sent to a cloud server where a more powerful model (maybe a Vision Transformer or an ensemble) analyzes them in depth to confirm the defects and perhaps even estimate the amount of time before failure might occur. This edge–cloud split maximizes efficiency and safety, allowing critical quick decisions onboard the drone and deeper insights back at the base.

Multi-Modal Fusion. As noted, combining visible, thermal, and LiDAR data is a natural next step. Drones can carry infrared cameras to find hot joints (which indicate high resistance, possibly due to looseness or corrosion)—the YOLO model could be extended to detect anomalies in thermal images or even fuse the features (some researchers have already shown YOLO achieving >96% mAP on thermal for detecting overheated components). LiDAR can map the 3D geometry of lines and vegetation; integrating it can enable detection of encroachment (trees growing too close to lines), which is a leading cause of outages. In fact, one could train a model specifically for vegetation detection around lines [125]. The fusion of multimodal data will likely yield far more reliable systems, as each sensor covers the other’s blind spots.
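Once LiDAR points are labeled as vegetation or conductor, encroachment detection reduces to a clearance check. The sketch below measures the distance from each vegetation point to a span modeled as a straight segment between two attachment points; ignoring conductor sag, the coordinate convention, and the function names are simplifying assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres


def point_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    # Projection parameter, clamped so the closest point stays on the segment.
    t = 0.0 if denom == 0 else max(
        0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def encroaching_points(vegetation: List[Point], span_start: Point,
                       span_end: Point, clearance_m: float) -> List[Point]:
    """Return vegetation points closer to the span than the clearance."""
    return [p for p in vegetation
            if point_to_segment(p, span_start, span_end) < clearance_m]
```

A production system would replace the straight segment with a catenary fitted to the conductor points, since sag can bring the line several metres closer to the canopy at mid-span.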

Standardization and Benchmarks. The power industry would benefit from standard benchmarks for these algorithms, similar to COCO in general vision. InsPLAD is a great start for a public benchmark. Our work used it and others, but as a community, we should define metrics and standard testbeds [126,127]. This ensures researchers can compare methods fairly and drive progress. Moreover, regulatory bodies might require proof of performance—having standard metrics (like “99% detection rate for critical defects”) will help in that regard.

Limitations of Our Study. While we endeavored to be comprehensive, one limitation is that our test set, though diverse, may not include every edge case scenario. Real-world deployment might encounter issues like image transmission loss, vastly different camera angles (if drones tilt), or unknown defect types. Another limitation is that we focused on static image analysis. In reality, using video context could improve detection—e.g., tracking an object across frames to confirm a defect (temporal consistency check). We did not leverage temporal data, treating frames independently [128]. This is an area for improvement: integrating a tracking algorithm or a recurrent network that processes sequences could reduce false positives and fill in detections under occlusion (if a defect is not visible in one frame but appears in the next, a tracker can maintain it).
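A temporal consistency check of the kind suggested above can be approximated by linking boxes across consecutive frames by IoU and confirming only those seen a minimum number of times. The following sketch is deliberately simpler than a full tracker: it drops a track as soon as one frame misses it, so it does not bridge occlusions. The thresholds and names are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def confirmed_by_track(frames: List[List[Box]], min_hits: int = 3,
                       iou_thr: float = 0.3) -> List[Box]:
    """Return latest boxes of tracks seen in at least min_hits consecutive frames."""
    tracks: list = []  # each entry is [last_box, consecutive_hits]
    for boxes in frames:
        unmatched = list(boxes)
        next_tracks = []
        for last_box, hits in tracks:
            best = max(unmatched, key=lambda b: iou(last_box, b), default=None)
            if best is not None and iou(last_box, best) >= iou_thr:
                unmatched.remove(best)
                next_tracks.append([best, hits + 1])
        # Boxes that matched no existing track start new tracks.
        next_tracks.extend([b, 1] for b in unmatched)
        tracks = next_tracks
    return [box for box, hits in tracks if hits >= min_hits]
```

Requiring several consecutive hits suppresses one-frame false alarms; tolerating short gaps, for example with a motion model, would be needed to maintain a detection through brief occlusions.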

Finally, in terms of the cost–benefit tradeoff, drones and AI are investments. However, given the reduction in labor and prevention of failures, the ROI is generally positive. Still, utilities will weigh the cost of deploying many drones, training personnel, etc. The good news is drone costs are coming down, and AI software [129] like ours can be reused across many deployments, so scaling is economically feasible. There may also be organizational resistance to adopting AI over human inspection—thus, our system is designed not to eliminate humans, but to augment them [130]. The final decision and verification can remain with human engineers, but their workload is drastically reduced by filtering the torrent of images to a short list of likely issues.

Recent research in power systems increasingly emphasizes the integration of machine learning models with operational decision-making and optimization workflows. Rather than using learning models purely as standalone analytical tools, modern approaches aim to combine predictive capabilities with system-level optimization and operational constraints. For example, recent studies have demonstrated how learning-based frameworks can be integrated with energy storage operation strategies, cyber-security monitoring, and network-constrained unit commitment optimization in power systems [131,132,133]. These approaches highlight a broader trend toward closed-loop AI-enabled decision support systems where predictive models interact directly with optimization algorithms and operational planning tools.

Within this context, the visual inspection framework proposed in this study can be viewed as a complementary component of such integrated smart grid architectures. By providing automated, data-driven diagnostics of transmission infrastructure, the developed computer vision pipeline can supply reliable asset condition information that can be further incorporated into higher-level maintenance planning, asset management, and system operation optimization processes. Consequently, the proposed approach contributes to the emerging paradigm of AI-assisted infrastructure monitoring as part of a larger intelligent and self-adaptive power system ecosystem.

4.4. Future Work

Building on this work, future research directions include the following:

Improved Models: Exploring transformer-based detectors or hybrid CNN-transformer models for possibly better defect recognition. Also, lightweight model research (like distilling YOLOv8 into an even smaller network) to push onboard inference to 30+ FPS on microcomputers.

On-the-Fly Active Learning: Have drones adapt their route based on detections (e.g., if a defect is found, hover to get more angles or closer shots for confirmation).

3D Reconstruction: Use drone images to create a 3D model of the line (via photogrammetry or SLAM) and then analyze the model for defects and measurements (sag, clearance). This could find issues like slack spans or tilting poles, which single images might miss.

Collaboration between Swarm Drones: Multiple drones covering different segments and sharing data to cover large territories quickly, coordinated by an AI system.

Human–AI Interaction: Developing intuitive user interfaces (like AR headsets that linemen could use in the field to see AI annotations on equipment in real time).

Ultimately, the vision is a fully autonomous inspection pipeline: drones launch on a schedule, scan the lines, AI identifies problems, and maintenance is dispatched proactively. Achieving this reliably will mark a significant milestone in power grid management, helping transition the industry from today’s periodic manual checks to tomorrow’s continuous, AI-driven monitoring.

5. Conclusions

In this work, we presented a comprehensive computer vision pipeline for automated defect detection in overhead power line infrastructure using deep learning. By leveraging modern object detection models (YOLOv8, EfficientDet, Faster R-CNN) and training on newly available UAV image datasets (InsPLAD, TTPLA, MPID, etc.), we demonstrated that AI can identify critical components and defects with high accuracy. Our best model (YOLOv8) detected ~88% of defects at 92% precision, including subtle issues like insulator cracks, corrosion, and missing hardware, all in real time on drone-capable hardware. These results represent a significant step towards replacing or augmenting risky manual inspections with AI-powered visual analytics.

We detailed the visual analytics pipeline from data annotation (using tools like LabelImg/Roboflow with rigorous labeling standards) to model training (transfer learning on pre-trained weights, hyperparameter tuning on domain data) to evaluation (mAP, confusion analysis, speed tests). Such detail and transparency are crucial for translating research into practice, especially in safety-critical industries. We also incorporated visualizations of model outputs, highlighting both successes (e.g., accurately locating a tiny bird nest on a tower) and typical failure modes (e.g., false alarms in challenging lighting). This helps build intuition and trust in the system’s operation.

Our approach achieves notable efficiency gains: A drone outfitted with our system can inspect large stretches of line much faster and more frequently than human crews, with the potential to reduce operational costs by 30–50% and fault remediation times by up to 70%. This paves the way for truly predictive maintenance in power distribution—finding and fixing weaknesses before they cause outages. The ability to integrate detections with utility maintenance systems (SCADA, work order management) means AI can not only detect but also directly trigger maintenance actions, enabling a more resilient grid that can adapt to issues in near real time.

We discussed remaining challenges, including small-object detection limits, variable environmental conditions, and the need for multimodal data fusion (e.g., adding thermal imaging for hot-spot detection). We also outlined future enhancements such as edge–cloud distributed processing, multimodal sensor integration, and continuous learning. Progress in deep learning for power line inspection is accelerating: in the past year alone, new datasets and improved YOLO-based models have pushed performance higher. Our work keeps pace with these developments and relies on recent peer-reviewed studies so that its methods and references are up to date.

In conclusion, the high-performing CV pipeline presented here demonstrates real-world feasibility for automated OPL defect detection. The convergence of drones, deep learning, and domain-specific datasets has made it possible to monitor the health of power lines more reliably and safely than ever before. By deploying systems like this, utility companies can transition from time-based maintenance schedules to data-driven predictive maintenance, improving reliability and reducing downtime. The implications extend beyond just finding defects—a network of AI-powered inspection drones could become an integral part of tomorrow’s smart grid, continuously surveying and “self-healing” the grid by informing repairs proactively. Our contributions—from dataset utilization to model benchmarking—lay the groundwork for such implementations. As the technology matures, we anticipate widespread adoption in the power industry, leading to more resilient infrastructure and a more secure supply of electricity for our communities. The marriage of computer vision and power engineering exemplified here is a prime example of AI driving innovation in critical infrastructure, ultimately helping to keep the lights on.

Author Contributions

Conceptualization, T.F.T. and O.V.A.; methodology, T.F.T.; software, T.F.T. and A.A.S.; data curation, T.F.T. and O.V.A.; formal analysis, T.F.T.; investigation, T.F.T. and O.V.A.; visualization, T.F.T.; validation, O.V.A.; resources, A.A.S.; writing—original draft preparation, T.F.T.; writing—review and editing, O.V.A.; supervision, O.V.A.; project administration, O.V.A. and A.A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this study are primarily based on publicly available sources. In particular, the InsPLAD, TTPLA, MPID, and CPLID datasets are openly accessible through the repositories and references cited in the manuscript. These datasets provide sufficient information to reproduce the experimental setup and model evaluation presented in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Faisal, M.A.A.; Mecheter, I.; Qiblawey, Y.; Fernandez, J.H.; Chowdhury, M.E.H.; Kiranyaz, S. Deep Learning in Automated Power Line Inspection: A Review. Appl. Energy 2025, 385, 125507.
DJI Enterprise. Drone Powerline Inspection: Benefits and Applications; White Paper; DJI Enterprise: Shenzhen, China, 2025; Available online: https://enterprise.dji.com/ (accessed on 3 November 2025).
JOUAV. Drone for Power Line Inspection: The Benefits, Applications, and More; JOUAV: Chengdu, China, 2024; Available online: https://www.jouav.com/industry/power-line-inspection (accessed on 3 November 2025).
Labrador Rivas, A.E.; Abrão, T. Faults in Smart Grid Systems: Monitoring, Detection and Classification. Electr. Power Syst. Res. 2020, 189, 106602.
Averroes.ai. Drones vs. Manual Inspections: Comparison and Best Use Cases; Averroes.ai: London, UK, 2024; Available online: https://www.averroes.ai/ (accessed on 3 November 2025).
Birdseye Aerial Drones. Drones vs. Traditional Inspections: How UAV Technology Is Enhancing Industrial Safety; Birdseye Aerial Drones: San Diego, CA, USA, 2024; Available online: https://birdseyeaerialdrones.com/ (accessed on 3 November 2025).
Vieira e Silva, A.L.B.; Felix, H.C.; Simões, F.P.M.; Teichrieb, V.; Santos, M.M.d.; Santiago, H.; Sgotti, V.; Neto, H.L. InsPLAD: A Dataset and Benchmark for Power Line Asset Inspection in UAV Images. Int. J. Remote Sens. 2023, 44, 7294–7320.
Ahmed, M.D.F.; Mohanta, J.C.; Sanyal, A.; Yadav, P.S. Path Planning of Unmanned Aerial Systems for Visual Inspection of Power Transmission Lines and Towers. IETE J. Res. 2024, 70, 3259–3279.
Kukharova, T.; Maltsev, P.; Abramkin, S.; Novozhilov, I. Analysis of Modern Challenges and Technological Solutions in Natural Gas Production at Fields with Complex Geological Structure: A Review. Resources 2026, 15, 32.
Ahmed, M.F.; Mohanta, J.C. Autonomous Site Inspection of Power Transmission Line Insulators with Unmanned Aerial Vehicle System. Electr. Power Compon. Syst. 2024, 1–24.
Kleshnia, V.A.; Kukharova, T.; Fedosov, I.S.; Tsapleva, V.V. Modeling of Pressure Control System in Oil Wells Accounting for Reservoir Non-Homogeneity. In Proceedings of the 2025 International Conference on Control and Technical Systems (CTS), Saint Petersburg, Russia, 17–19 September 2025; pp. 64–67.
Felix, H.C. MPID: An Open-Source Insulator Dataset for UAV Inspection of Power Lines Through Computer Vision; GitHub: San Francisco, CA, USA, 2022; Available online: https://github.com/phd-benel/MPID (accessed on 3 November 2025).
Howard, A.; Dane, S.; Vantuch, T. VSB Power Line Fault Detection; Kaggle: San Francisco, CA, USA, 2018; Available online: https://www.kaggle.com/competitions/vsb-power-line-fault-detection (accessed on 3 November 2025).
EPRI. Insulator Defect Detection; IEEE DataPort: Piscataway, NJ, USA, 2021; Available online: https://ieee-dataport.org/competitions/insulator-defect-detection (accessed on 3 November 2025).
Li, Z.; Zhang, Y.; Wu, H.; Suzuki, S.; Namiki, A.; Wang, W. Design and Application of a UAV Autonomous Inspection System for High-Voltage Power Transmission Lines. Remote Sens. 2023, 15, 865.
Martirosyan, A.V.; Martirosyan, K.V.; Chernyshev, A.B. Investigation of Popov’s Lines’ Limiting Position to Ensure the Process Control Systems’ Absolute Stability. In Proceedings of the 2023 International Conference on System Control and Modeling (SCM), Saint Petersburg, Russia, 24–26 May 2023; pp. 69–72.
Martirosyan, K.V.; Chernyshev, A.B.; Martirosyan, A.V. Application of Bayes Networks in the Design of the Information System “Mineral Water Deposit”. In Proceedings of the 2023 International Conference on System Control and Modeling (SCM), Saint Petersburg, Russia, 24–26 May 2023; pp. 236–239.
Asadulagi, M.-A.M.; Fedorov, M.C.; Trushnikov, V.E. Control Methods of Mineral Water Wells. In Proceedings of the 2023 International Conference on Control and Technical Systems (CTS), Saint Petersburg, Russia, 21–23 September 2023; pp. 152–155.
Ilyushin, Y.V.; Boronko, E.A. Analysis of Energy Sustainability and Problems of Technological Process of Primary Aluminum Production. Energies 2025, 18, 2194.
Nazarychev, A.; Iliev, I.; Manukian, D.; Beloev, H.; Suslov, K.; Beloev, I. Review of Operating Conditions, Diagnostic Methods, and Technical Condition Assessment to Improve Reliability and Develop a Maintenance Strategy for Electrical Equipment. Energies 2025, 18, 5832.
Semenova, T.; Sokolov, I. Theoretical Substantiation of Risk Assessment Directions in the Development of Fields with Hard-to-Recover Hydrocarbon Reserves. Resources 2025, 14, 64.
Hosseini, M.M.; Umunnakwe, A.; Parvania, M.; Tasdizen, T. Intelligent Damage Classification and Estimation in Power Distribution Poles Using Unmanned Aerial Vehicles and Convolutional Neural Networks. IEEE Trans. Smart Grid 2020, 11, 3325–3333.
Landwehr, J.P.; Kühl, N.; Walk, J.; Gnädig, M. Design Knowledge for Deep-Learning-Enabled Image-Based Decision Support Systems: Evidence from Power Line Maintenance Decision-Making. Bus. Inf. Syst. Eng. 2022, 64, 707–728.
Li, D.; Wang, X.; Zhang, J.; Ji, Z. Automated Deep Learning System for Power Line Inspection Image Analysis and Processing: Architecture and Design Issues. Glob. Energy Interconnect. 2023, 6, 614–633.
Liu, X.; Miao, X.; Jiang, H.; Chen, J. Data Analysis in Visual Power Line Inspection: An In-Depth Review of Deep Learning for Component Detection and Fault Diagnosis. Annu. Rev. Control 2020, 50, 253–277.
Luo, Y.; Yu, X.; Yang, D.; Zhou, B. A Survey of Intelligent Transmission Line Inspection Based on Unmanned Aerial Vehicle. Artif. Intell. Rev. 2023, 56, 173–201.
Maduako, I.; Igwe, C.F.; Abah, J.E.; Onwuasaanya, O.E.; Chukwu, G.A.; Ezeji, F.; Okeke, F.I. Deep Learning for Component Fault Detection in Electricity Transmission Lines. J. Big Data 2022, 9, 81.
Matikainen, L.; Lehtomäki, M.; Ahokas, E.; Hyyppä, J.; Karjalainen, M.; Jaakkola, A.; Kukko, A.; Heinonen, T. Remote Sensing Methods for Power Line Corridor Surveys. ISPRS J. Photogramm. Remote Sens. 2016, 119, 10–31.
Erdemir, G. Applying AI to Power Line Inspection: Recent Developments. J. Adv. Artif. Intell. 2023, 1, 141–153.
Yang, L.; Fan, J.; Liu, Y.; Li, E.; Peng, J.; Liang, Z. A Review on State-of-the-Art Power Line Inspection Techniques. IEEE Trans. Instrum. Meas. 2020, 69, 9350–9365.
Bellou, E.; Pisica, I.; Banitsas, K. Real-Time Object Detection on High-Voltage Powerlines Using an Unmanned Aerial Vehicle (UAV). In Proceedings of the 58th International Universities Power Engineering Conference (UPEC), Dublin, Ireland, 30 August–1 September 2023; pp. 1–6.
Bellou, E.; Pisica, I.; Banitsas, K. Aerial Inspection of High-Voltage Power Lines Using YOLOv8 Real-Time Object Detector. Energies 2024, 17, 2535.
Cantieri, A.; Ferraz, M.; Szekir, G.; Teixeira, M.A.; Lima, J.; Oliveira, A.S.; Wehrmeister, M.A. Cooperative UAV-UGV Autonomous Power Pylon Inspection: An Investigation of Cooperative Outdoor Vehicle Positioning Architecture. Sensors 2020, 20, 6384.
Chang, A.; Liang, Y.; Jiang, M.; Li, X.; Chen, Z.; Liu, R. Application of UAV Intelligent Management and Control Platform on Overhead Transmission Lines Inspection. In Forthcoming Networks and Sustainability in the IoT Era; Al-Turjman, F., Rasheed, J., Eds.; Springer International Publishing: Cham, Switzerland, 2022; Volume 130, pp. 251–260.
Hui, X.; Bian, J.; Zhao, X.; Tan, M. Vision-Based Autonomous Navigation Approach for Unmanned Aerial Vehicle Transmission-Line Inspection. Int. J. Adv. Robot. Syst. 2018, 15, 1729881417752821.
Lopez Lopez, R.; Batista Sanchez, M.J.; Perez Jimenez, M.; Arrue, B.C.; Ollero, A. Autonomous UAV System for Cleaning Insulators in Power Line Inspection and Maintenance. Sensors 2021, 21, 8488.
Siddiqui, Z.A.; Park, U. A Drone Based Transmission Line Components Inspection System with Deep Learning Technique. Energies 2020, 13, 3348.
Xie, X.; Liu, Z.; Xu, C.; Zhang, Y. A Multiple Sensors Platform Method for Power Line Inspection Based on a Large Unmanned Helicopter. Sensors 2017, 17, 1222.
Bao, W.; Ren, Y.; Wang, N.; Hu, G.; Yang, X. Detection of Abnormal Vibration Dampers on Transmission Lines in UAV Remote Sensing Images with PMA-YOLO. Remote Sens. 2021, 13, 4134.
Han, G.; Wang, R.; Yuan, Q.; Zhao, L.; Li, S.; Zhang, M.; He, M.; Qin, L. Typical Fault Detection on Drone Images of Transmission Lines Based on Lightweight Structure and Feature-Balanced Network. Drones 2023, 7, 638.
He, Y.; Wu, R.; Dang, C. Low-Power Portable System for Power Grid Foreign Object Detection Based on the Lightweight Model of Improved YOLOv7. IEEE Access 2025, 13, 125301–125312.
Huang, Y.; Jiang, L.; Han, T.; Xu, S.; Liu, Y.; Fu, J. High-Accuracy Insulator Defect Detection for Overhead Transmission Lines Based on Improved YOLOv5. Appl. Sci. 2022, 12, 12682.
Liu, C.; Wu, Y.; Liu, J.; Sun, Z. Improved YOLOv3 Network for Insulator Detection in Aerial Images with Diverse Background Interference. Electronics 2021, 10, 771.
Liu, C.; Ma, L.; Sui, X.; Guo, N.; Yang, F.; Yang, X.; Huang, Y.; Wang, X. YOLO-CSM-Based Component Defect and Foreign Object Detection in Overhead Transmission Lines. Electronics 2023, 13, 123.
Liu, Z.; Wu, G.; He, W.; Fan, F.; Ye, X. Key Target and Defect Detection of High-Voltage Power Transmission Lines with Deep Learning. Int. J. Electr. Power Energy Syst. 2022, 142, 108277.
Lu, Y.; Li, D.; Li, D.; Li, X.; Gao, Q.; Yu, X. A Lightweight Insulator Defect Detection Model Based on Drone Images. Drones 2024, 8, 431.
Tao, X.; Zhang, D.; Wang, Z.; Liu, X.; Zhang, H.; Xu, D. Insulator Data Set—Chinese Power Line Insulator Dataset (CPLID); IEEE DataPort: Piscataway, NJ, USA, 2020; Available online: https://ieee-dataport.org/open-access/insulator-data-set-chinese-power-line-insulator-dataset-cplid (accessed on 3 November 2025).
Meng, Y.; Tang, Y.; Huang, X.; Wang, H.; Zhu, J.; Tang, W.; Chen, L. A Lightweight Insulator Detection Methodology for UAVs in Power Line Inspection. J. Circuits Syst. Comput. 2024, 33, 2450069. [Google Scholar] [CrossRef]
Peng, H.; Liang, M.; Yuan, C.; Ma, Y. EDF-YOLOv5: An Improved Algorithm for Power Transmission Line Defect Detection Based on YOLOv5. Electronics 2023, 13, 148. [Google Scholar] [CrossRef]
Qiang, H.; Tao, Z.; Ye, B.; Yang, R.; Xu, W. Transmission Line Fault Detection and Classification Based on Improved YOLOv8s. Electronics 2023, 12, 4537. [Google Scholar] [CrossRef]
Qiu, Z.; Zhu, X.; Liao, C.; Shi, D.; Qu, W. Detection of Transmission Line Insulator Defects Based on an Improved Lightweight YOLOv4 Model. Appl. Sci. 2022, 12, 1207. [Google Scholar] [CrossRef]
Ru, C.; Zhang, S.; Qu, C.; Zhang, Z. The High-Precision Detection Method for Insulators’ Self-Explosion Defect Based on the Unmanned Aerial Vehicle with Improved Lightweight ECA-YOLOX-Tiny Model. Appl. Sci. 2022, 12, 9314. [Google Scholar] [CrossRef]
Sun, H.; Shen, Q.; Ke, H.; Duan, Z.; Tang, X. Power Transmission Lines Foreign Object Intrusion Detection Method for Drone Aerial Images Based on Improved YOLOv8 Network. Drones 2024, 8, 346. [Google Scholar] [CrossRef]
Wang, S.; Tan, W.; Yang, T.; Zeng, L.; Hou, W.; Zhou, Q. High-Voltage Transmission Line Foreign Object and Power Component Defect Detection Based on Improved YOLOv5. J. Electr. Eng. Technol. 2024, 19, 851–866. [Google Scholar] [CrossRef]
Yu, Z.; Lei, Y.; Shen, F.; Zhou, S.; Yuan, Y. Research on Identification and Detection of Transmission Line Insulator Defects Based on a Lightweight YOLOv5 Network. Remote Sens. 2023, 15, 4552. [Google Scholar] [CrossRef]
Zhang, X.; Zhang, J.; Jia, X. An Enhanced SL-YOLOv8-Based Lightweight Remote Sensing Detection Algorithm for Identifying Broken Strands in Transmission Lines. Appl. Sci. 2024, 14, 7469. [Google Scholar] [CrossRef]
Zheng, J.; Wu, H.; Zhang, H.; Wang, Z.; Xu, W. Insulator-Defect Detection Algorithm Based on Improved YOLOv7. Sensors 2022, 22, 8801. [Google Scholar] [CrossRef] [PubMed]
Ilyushin, Y.V.; Boronko, E.A. Development of a Mathematical Model of the Electromagnetic Field Formation Process Based on System Analysis Methods. Mathematics 2026, 14, 399. [Google Scholar] [CrossRef]
Gong, Y.; Zhou, W.; Wang, K.; Wang, J.; Wang, R.; Deng, H.; Liu, G. Defect Detection of Small Cotter Pins in Electric Power Transmission System from UAV Images Using Deep Learning Techniques. Electr. Eng. 2023, 105, 1251–1266. [Google Scholar] [CrossRef]
Zhukovskiy, Y.L.; Suslikov, P.K. Identification and Classification of Electrical Loads in Mining Enterprises Based on Signal Decomposition Methods. J. Min. Inst. 2025, 275, 5–17. Available online: https://pmi.spmi.ru/pmi/article/view/16670 (accessed on 3 November 2025).
Liu, M.; Li, Z.; Li, Y.; Liu, Y. A Fast and Accurate Method of Power Line Intelligent Inspection Based on Edge Computing. IEEE Trans. Instrum. Meas. 2022, 71, 1–12. [Google Scholar] [CrossRef]
Wang, G.; Chen, Y.; An, P.; Hong, H.; Hu, J.; Huang, T. UAV-YOLOv8: A Small-Object-Detection Model Based on Improved YOLOv8 for UAV Aerial Photography Scenarios. Sensors 2023, 23, 7190. [Google Scholar] [CrossRef]
Wu, K.; Chen, Y.; Lu, Y.; Yang, Z.; Yuan, J.; Zheng, E. SOD-YOLO: A High-Precision Detection of Small Targets on High-Voltage Transmission Lines. Electronics 2024, 13, 1371. [Google Scholar] [CrossRef]
Bhola, R.; Krishna, N.H.; Ramesh, K.N.; Senthilnath, J.; Anand, G. Detection of the Power Lines in UAV Remote Sensed Images Using Spectral-Spatial Methods. J. Environ. Manag. 2018, 206, 1233–1242. [Google Scholar] [CrossRef]
Chen, G.; Hao, K.; Wang, B.; Li, Z.; Zhao, X. A Power Line Segmentation Model in Aerial Images Based on an Efficient Multibranch Concatenation Network. Expert Syst. Appl. 2023, 228, 120359. [Google Scholar] [CrossRef]
Damodaran, S.; Shanmugam, L.; Swaroopan, N.M.J. Overhead Power Line Detection from Aerial Images Using Segmentation Approaches. Automatika 2024, 65, 261–288. [Google Scholar] [CrossRef]
Yan, J.; Zhang, X.; Shen, S.; He, X.; Xia, X.; Li, N.; Wang, S.; Yang, Y.; Ding, N. A Real-Time Strand Breakage Detection Method for Power Line Inspection with UAVs. Drones 2023, 7, 574. [Google Scholar] [CrossRef]
Li, C.; Zhu, F.; Guo, B.; Wang, Z.; Jiang, X.; Wang, J.; Liao, X. Power Line Extraction and Obstacle Inspection of Unmanned Aerial Vehicle Oblique Images Constrained by the Vertical Plane. Photogramm. Rec. 2022, 37, 306–332. [Google Scholar] [CrossRef]
Shuang, F.; Chen, X.; Li, Y.; Wang, Y.; Miao, N.; Zhou, Z. PLE: Power Line Extraction Algorithm for UAV-Based Power Inspection. IEEE Sens. J. 2022, 22, 19941–19952. [Google Scholar] [CrossRef]
Song, J.; Qian, J.; Li, Y.; Liu, Z.; Chen, Y.; Chen, J. Automatic Extraction of Power Lines from Aerial Images of Unmanned Aerial Vehicles. Sensors 2022, 22, 6431. [Google Scholar] [CrossRef]
Qin, X.; Wu, G.; Lei, J.; Fan, F.; Ye, X. Detecting Inspection Objects of Power Line from Cable Inspection Robot LiDAR Data. Sensors 2018, 18, 1284. [Google Scholar] [CrossRef]
Yang, L.; Fan, J.; Xu, S.; Li, E.; Liu, Y. Vision-Based Power Line Segmentation with an Attention Fusion Network. IEEE Sens. J. 2022, 22, 8196–8205. [Google Scholar] [CrossRef]
Yang, L.; Kong, S.; Deng, J.; Li, H.; Liu, Y. DRA-Net: A Dual-Branch Residual Attention Network for Pixelwise Power Line Detection. IEEE Trans. Instrum. Meas. 2023, 72, 1–13. [Google Scholar] [CrossRef]
Zhao, W.; Dong, Q.; Zuo, Z. A Method Combining Line Detection and Semantic Segmentation for Power Line Extraction from Unmanned Aerial Vehicle Images. Remote Sens. 2022, 14, 1367. [Google Scholar] [CrossRef]
Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and Efficient Object Detection. arXiv 2020, arXiv:1911.09070. [Google Scholar] [CrossRef]
Zou, K.; Jiang, Z. Power Line Extraction Framework Based on Edge Structure and Scene Constraints. Remote Sens. 2022, 14, 4575. [Google Scholar] [CrossRef]
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv 2016, arXiv:1506.01497. [Google Scholar] [CrossRef] [PubMed]
Liu, X.; Shuang, F.; Li, Y.; Zhang, L.; Huang, X.; Qin, J. SS-IPLE: Semantic Segmentation of Electric Power Corridor Scene and Individual Power Line Extraction from UAV-Based LiDAR Point Cloud. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 38–50. [Google Scholar] [CrossRef]
Lu, Y.; Zheng, E.; Chen, Y.; Wu, K.; Yang, Z.; Yuan, J.; Xie, M. Stower-13: A Multi-View Inspection Image Dataset for the Automatic Classification and Naming of Tension Towers. Electronics 2024, 13, 1858. [Google Scholar] [CrossRef]
Shen, J.; Ye, H.; Tang, C.; Zhang, G.; He, Y.; Xie, M. TTower-345: A Multi-Categories Multi-Perspectives Benchmark for Automatic Naming of Transmission Line Inspection Photos. Int. J. Adv. Mechatron. Syst. 2024, 11, 11–25. [Google Scholar] [CrossRef]
Shuang, F.; Han, S.; Li, Y.; Lu, T. RSIn-Dataset: An UAV-Based Insulator Detection Aerial Images Dataset and Benchmark. Drones 2023, 7, 125. [Google Scholar] [CrossRef]
Hütten, N.; Alves Gomes, M.; Hölken, F.; Andricevic, K.; Meyes, R.; Meisen, T. Deep Learning for Automated Visual Inspection in Manufacturing and Maintenance: A Survey of Open- Access Papers. Appl. Syst. Innov. 2024, 7, 11. [Google Scholar] [CrossRef]
Martirosyan, A.V.; Romashin, D.V. Investigation of the Control Strategies for Enhancing the Efficiency of Natural Gas Separation and Purification Processes. Processes 2026, 14, 700. [Google Scholar] [CrossRef]
Abdelfattah, R.; Wang, X.; Wang, S. TTPLA: An Aerial-Image Dataset for Detection and Segmentation of Transmission Towers and Power Lines. arXiv 2020, arXiv:2010.10032. [Google Scholar] [CrossRef]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
Perepelkin, A.; Sharifov, A.; Titov, D.; Shandrygolov, Z.; Derkach, D.; Islamov, S. Approaches to Proxy Modeling of Gas Reservoirs. Energies 2025, 18, 3881. [Google Scholar] [CrossRef]
Raupov, I.; Rogachev, M.; Shevaldin, E. Review of Formation Mechanisms, Localization Methods, and Enhanced Oil Recovery Technologies for Residual Oil in Terrigenous Reservoirs. Energies 2025, 18, 5649. [Google Scholar] [CrossRef]
Mustafin, M.G.; Slobodkin, S.M. Methodology of Geodetic Observations for Forecasting of Potentially Hazardous Zones of Operating Main Pipelines. Int. J. Eng. 2026, 39, 1753–1761. [Google Scholar] [CrossRef]
Liu, S.; Zhang, X.; Xiao, H.; Li, Z.; Zhang, H. Double-Layer Mobile Edge Computing-Enabled Power Line Inspection in Smart Grid Networks. Information 2022, 13, 167. [Google Scholar] [CrossRef]
McEnroe, P.; Wang, S.; Liyanage, M. A Survey on the Convergence of Edge Computing and AI for UAVs: Opportunities and Challenges. IEEE Internet Things J. 2022, 9, 15435–15459. [Google Scholar] [CrossRef]
Shen, H.; Jiang, Y.; Deng, F.; Shan, Y. Task Unloading Strategy of Multi UAV for Transmission Line Inspection Based on Deep Reinforcement Learning. Electronics 2022, 11, 2188. [Google Scholar] [CrossRef]
Zhuang, W.; Xing, F.; Lu, Y. Task Offloading Strategy for Unmanned Aerial Vehicle Power Inspection Based on Deep Reinforcement Learning. Sensors 2024, 24, 2070. [Google Scholar] [CrossRef] [PubMed]
Mendu, B.; Mbuli, N. State-of-the-Art Review on the Application of Unmanned Aerial Vehicles (UAVs) in Power Line Inspections: Current Innovations, Trends, and Future Prospects. Drones 2025, 9, 265. [Google Scholar] [CrossRef]
Botyan, E.Y.; Nikolaichuk, L.; Martemyanova, A.N.; Stepuk, E.I.; Pushkarev, A.E. Improving the Approach to the Organization of Technical Repairs of Dump Trucks Using Remote Monitoring Systems of Their Nodes. Min. Inform. Anal. Bull. 2025, 12, 137–152. Available online: https://www.giab-online.ru/en/catalog/sovershenstvovanie-podhoda-k-organizacii-tehnicheskih-remontov-k (accessed on 12 March 2026).
Dai, Y.; Tan, J.; Wang, M.; Jiang, C.; Li, M. A Convolutional Neural Network Image Compression Algorithm for UAVs. J. Circuits Syst. Comput. 2024, 33, 2450211. [Google Scholar] [CrossRef]
Koteleva, N.; Valnev, V.V.; Simakov, A.S.; Shirazi, M.M. Digital Transformation of Industrial Machinery Repair and Maintenance to Build an Industrial Metaverse. J. Min. Inst. 2025, 275, 30–41. Available online: https://pmi.spmi.ru/pmi/article/view/16679 (accessed on 3 November 2025).
Li, Y.; Dong, X.; Ding, Q.; Xiong, Y.; Liao, H.; Wang, T. Improved A-STAR Algorithm for Power Line Inspection UAV Path Planning. Energies 2024, 17, 5364. [Google Scholar] [CrossRef]
Liu, X.; Jin, Z.; Jiang, H.; Miao, X.; Chen, J.; Lin, Z. Quality Assessment for Inspection Images of Power Lines Based on Spatial and Sharpness Evaluation. IET Image Process. 2022, 16, 356–364. [Google Scholar] [CrossRef]
Azevedo, F.; Dias, A.; Almeida, J.; Oliveira, A.; Ferreira, A.; Santos, T.; Martins, A.; Silva, E. LiDAR-Based Real-Time Detection and Modeling of Power Lines for Unmanned Aerial Vehicles. Sensors 2019, 19, 1812. [Google Scholar] [CrossRef]
Asadulagi, M.-A.; Pershin, I.M.; Tsapleva, V.V. Research on Hydrolithospheric Processes Using the Results of Groundwater Inflow Testing. Water 2024, 16, 487. [Google Scholar] [CrossRef]
Kongar-Syuryun, C.B.; Babyr, N.; Klyuev, R.V.; Khayrutdinov, M.; Zaalishvili, V.; Agafonov, V. Model for Assessing Efficiency of Processing Geo-Resources, Providing Full Cycle for Development—Case Study in Russia. Resources 2025, 14, 51. [Google Scholar] [CrossRef]
Babyr, N.V.; Gabov, V.V.; Nosov, A.A.; Nikiforov, A.V. Features of Design and Work Method of Mining Module at Coal Deposits in the Russian Arctic. Min. Inform. Anal. Bull. 2024, 6, 5–16. [Google Scholar] [CrossRef]
Golovina, E.; Khloponina, V.; Tsiglianu, P.; Zhu, R. Organizational, Economic and Regulatory Aspects of Groundwater Resources Extraction by Individuals (Case of the Russian Federation). Resources 2023, 12, 89. [Google Scholar] [CrossRef]
Xing, J.; Cioffi, G.; Hidalgo-Carrió, J.; Scaramuzza, D. Autonomous Power Line Inspection with Drones via Perception-Aware MPC. arXiv 2023, arXiv:2304.00959. [Google Scholar] [CrossRef]
Golovina, E.; Shchelkonogova, O. Possibilities of Using the Unitization Model in the Development of Transboundary Groundwater Deposits. Water 2023, 15, 298. [Google Scholar] [CrossRef]
Linh, N.K.; Tien, N.T.; Luan, D.C.; Dinh, D.V.; Thang, N.V. Enhancing Efficiency of Steel Prop Recovery Processes in Unused Mining Excavation. Int. J. Eng. Trans. B Appl. 2025, 38, 400–407. [Google Scholar] [CrossRef]
Korobov, G.Y.; Parfenov, D.V.; Nguyen, N.V. Long-Term Inhibition of Paraffin Deposits Using Porous Ceramic Proppant Containing Solid Ethylene-Vinyl Acetate. Int. J. Eng. Trans. B Appl. 2025, 38, 1887–1897. [Google Scholar] [CrossRef]
Bykowa, E.; Voronetskaya, V. A Compensation Strategy for the Negative Impacts of Infrastructure Facilities on Land Use. Sci 2025, 7, 95. [Google Scholar] [CrossRef]
Aleksandrova, T.N.; Lyublyanova, V.A. Enhancing Copper Recovery from Copper Minerals via Sulfuric Acid Leaching. Obogashchenie Rud 2025, 6, 16–21. [Google Scholar] [CrossRef]
Yang, S.; Hu, B.; Zhou, B.; Liu, F.; Wu, X.; Zhang, X.; Zhou, J. Power Line Aerial Image Restoration under Diverse Weather: Datasets and Baselines. arXiv 2024, arXiv:2409.04812. [Google Scholar] [CrossRef]
Bryn, M.Y.; Mustafin, M.G.; Eduardovna, D.R.; Vasilev, B.Y. Investigation of the Accuracy of Constructing Digital Elevation Models of Technogenic Massifs Based on Satellite Coordinate Determinations. J. Min. Inst. 2025, 271, 95–107. Available online: https://pmi.spmi.ru/pmi/article/view/16310 (accessed on 3 November 2025).
Jalil, B.; Leone, G.R.; Martinelli, M.; Moroni, D.; Pascali, M.A.; Berton, A. Fault Detection in Power Equipment via an Unmanned Aerial System Using Multi Modal Data. Sensors 2019, 19, 3014. [Google Scholar] [CrossRef]
Jeong, S.; Kim, M.-G.; Kim, J.-H.; Oh, K.-Y. Thermal Monitoring of Live-Line Power Transmission Lines with an Infrared Camera Mounted on an Unmanned Aerial Vehicle. Struct. Health Monit. 2023, 22, 3707–3722. [Google Scholar] [CrossRef]
Pastucha, E.; Puniach, E.; Ścisłowicz, A.; Ćwiąkała, P.; Niewiem, W.; Wiącek, P. 3D Reconstruction of Power Lines Using UAV Images to Monitor Corridor Clearance. Remote Sens. 2020, 12, 3698. [Google Scholar] [CrossRef]
Santos, T.; Cunha, T.; Dias, A.; Moreira, A.P.; Almeida, J. UAV Visual and Thermographic Power Line Detection Using Deep Learning. Sensors 2024, 24, 5678. [Google Scholar] [CrossRef]
Xi, S.; Zhang, Z.; Niu, Y.; Li, H.; Zhang, Q. Power Line Extraction and Tree Risk Detection Based on Airborne LiDAR. Sensors 2023, 23, 8233. [Google Scholar] [CrossRef] [PubMed]
Zhu, S.; Li, Q.; Zhao, J.; Zhang, C.; Zhao, G.; Li, L.; Chen, Z.; Chen, Y. A Deep-Learning-Based Method for Extracting an Arbitrary Number of Individual Power Lines from UAV-Mounted Laser Scanning Point Clouds. Remote Sens. 2024, 16, 393. [Google Scholar] [CrossRef]
Marinina, O.A. Methodological Approach to Economic Assessment of Losses of Balance Coal Reserves. Min. Inform. Anal. Bull. 2025, 11-1, 183–197. Available online: https://www.giab-online.ru/en/catalog/metodicheskiy-podhod-k-ekonomicheskoy-ocenke-poter-balansovyh-za (accessed on 12 March 2026).
Materova, E.S.; Aksenova Zh, A.; Marinina, O.A.; Sharafullina, R.R.; Rakhmatullina Yu, A.; Zhironkin, S.A.; Ayupov, A.A. Investment attractiveness of securities of Russian coal companies: Assessment and forecasting. Ugol 2025, 11, 64–70. Available online: https://www.ugolinfo.ru/index.php?article=202511064 (accessed on 12 March 2026).
Boukabou, I.; Kaabouch, N. Electric and Magnetic Fields Analysis of the Safety Distance for UAV Inspection around Extra-High Voltage Transmission Lines. Drones 2024, 8, 47. [Google Scholar] [CrossRef]
Sidorenko, A.A.; Sidorenko, S.A. A Comprehensive Strategy for Safe and Efficient Mining of Thick, Spontaneous Combustion-Prone Coal Seams under Geodynamic Hazard Conditions. Int. J. Eng. Trans. B Appl. 2026, 39, 818–827. [Google Scholar] [CrossRef]
Chen, D.-Q.; Guo, X.-H.; Huang, P.; Li, F.-H. Safety Distance Analysis of 500 kV Transmission Line Tower UAV Patrol Inspection. IEEE Lett. Electromagn. Compat. Pract. Appl. 2020, 2, 124–128. [Google Scholar] [CrossRef]
Sidorenko, A.A.; Kriukov, A.; Meshkov, A.; Sidorenko, S.A. A Comprehensive Technical and Economic Analysis of Rubber-Tyred Transport Implementation in Longwall Mining: A Case Study on the V.D. Yalevsky Coal Mine. Mining 2025, 5, 65. [Google Scholar] [CrossRef]
Pervukhin, D.A.; Lisha, T. A Multi-Objective ε-Constraint Optimization of Coal Supply Chain Performance Considering Customer Satisfaction in Multi-Layer Logistics Networks. Int. J. Eng. Trans. B Appl. 2026, 39, 1716–1729. [Google Scholar] [CrossRef]
Pervukhin, D.A.; Neyrus, S. Optimization of Bunkering Logistics at Sea, Taking into Account Cost, Time and Technical Constraints. Eng 2025, 6, 364. [Google Scholar] [CrossRef]
Galevskiy, S.; Qian, H. Developing and Validating Comprehensive Indicators to Evaluate the Economic Efficiency of Hydrogen Energy Investments. Oper. Res. Eng. Sci. Theory Appl. 2024, 7, 188–207. [Google Scholar] [CrossRef]
Öztürk, M.; Karan, M.A. Impact of Near-Fault Seismic Inputs on Building Performance: A Case Study Informed by the 2023 Maras Earthquakes. Appl. Sci. 2025, 15, 10142. [Google Scholar] [CrossRef]
Mollamahmutoglu, C.; Ozturk, M.; Yilmaz, M.O. Shear Capacity of Masonry Walls Externally Strengthened via Reinforced Khorasan Jacketing. Buildings 2025, 15, 2177. [Google Scholar] [CrossRef]
Li, Y.; Ding, Q.; Li, K.; Valtchev, S.; Li, S.; Yin, L. A Survey of Electromagnetic Influence on UAVs from an EHV Power Converter Stations and Possible Countermeasures. Electronics 2021, 10, 701. [Google Scholar] [CrossRef]
Huang, B.; Zhao, T.; Yue, M.; Zheng, X.; Wang, J. Quantum Policy Learning for Energy Storage Arbitrage. IEEE Trans. Smart Grid 2026. [Google Scholar] [CrossRef]
Huang, B.; Wang, J.; Huang, X. Aligning Quantum Kernels for Detecting False Data Injection Attacks in Power Systems. Appl. Energy 2025, 378, 127332. [Google Scholar] [CrossRef]
Chen, X.; Yang, Y.; Liu, Y.; Wu, L. Feature-Driven Economic Improvement for Network-Constrained Unit Commitment: A Closed-Loop Predict-and-Optimize Framework. IEEE Trans. Power Syst. 2022, 37, 3104–3118. [Google Scholar] [CrossRef]

Figure 1. Conceptual architecture of the proposed UAV-based visual inspection system.

Figure 2. Overall architecture of the proposed deep learning-based visual inspection system for overhead power lines.

Figure 3. Example of YOLOv8 detecting transmission towers and power lines in defect-free UAV imagery. The bounding boxes indicate the structural elements identified by the model under varying viewing angles and background conditions.

Figure 4. Example of YOLOv8 detecting missing insulator caps in UAV inspection imagery. The model identifies localized structural defects on suspension insulators and highlights the detected anomaly using bounding boxes.

Figure 5. Example of YOLOv8 detecting corrosion on metallic fittings and conductor joints in aerial inspection images. The highlighted regions correspond to rust-affected components identified by the trained detection model.

Figure 6. Comparative visualization of model performance metrics (mAP@0.5, precision, recall, and F1-score) for YOLOv8, EfficientDet-D2 and Faster R-CNN.

Figure 7. Heatmap visualization of the main performance metrics (mAP@0.5, precision, recall, and F1-score) for the evaluated models.

Figure 8. Comparative inspection efficiency of manual, helicopter-based, and AI-assisted UAV methods.

Table 1. Detection performance of YOLOv8, EfficientDet-D2, and Faster R-CNN on the overhead power line (OPL) defect test set.

Model	mAP@0.5	mAP@0.5:0.95	Precision	Recall	F1-Score	FPS (GPU)	FPS (Jetson)
YOLOv8 (small)	88.5%	53.2%	92.1%	87.5%	89.7%	52	~10
EfficientDet-D2	85.4%	50.8%	94.3%	82.0%	87.7%	18	~4
Faster R-CNN (R50)	83.7%	48.5%	85.1%	90.2%	87.5%	5	N/A (CPU fallback)
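
The F1-score column in Table 1 can be cross-checked against the precision and recall columns. The sketch below assumes the conventional definition of F1 as the harmonic mean of precision and recall. The names `table1`, `f1_score`, and `recomputed` are illustrative and do not come from the authors' code.

```python
# Precision, recall, and reported F1 (all in percent), copied from Table 1.
table1 = {
    "YOLOv8 (small)": (92.1, 87.5, 89.7),
    "EfficientDet-D2": (94.3, 82.0, 87.7),
    "Faster R-CNN (R50)": (85.1, 90.2, 87.5),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every model from its published precision and recall.
recomputed = {name: f1_score(p, r) for name, (p, r, _) in table1.items()}

for name, (p, r, reported) in table1.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.2f}%, reported = {reported}%")
```

Under this assumption, the recomputed values agree with the reported F1-scores to within about 0.1 percentage points. The small residual differences are presumably due to rounding of the published precision and recall.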
