Article

MRA-YOLOv8: A Transmission Line Fault Detection Algorithm Integrating Multi-Scale Feature Fusion

1 College of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an 710054, China
2 Shaanxi Aerospace Workers’ University, Xi’an 710018, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(24), 7508; https://doi.org/10.3390/s25247508
Submission received: 1 October 2025 / Revised: 21 November 2025 / Accepted: 25 November 2025 / Published: 10 December 2025
(This article belongs to the Section Sensing and Imaging)

Abstract

To address complex background interference and partial occlusion of fault targets during UAV-based transmission line inspection, this paper proposes MRA-YOLOv8, a fault detection method for transmission line components. First, the YOLOv8 network is adopted as the baseline framework, and a self-attention mechanism is incorporated into its backbone network to enhance detection accuracy for occluded targets. Second, a Multi-Scale Attention Aggregation (MSAA) module is introduced into the neck network to improve feature extraction for fault targets against complex backgrounds. Third, the bounding box loss function is optimized to mitigate class imbalance, further boosting the model’s fault detection performance. Finally, the proposed algorithm is validated on inspection data collected over the past three years by an electric power inspection department. Experimental results demonstrate that the proposed method achieves an average detection precision of 92.5% and a recall of 90.9%.

1. Introduction

In recent years, the continuous expansion of power systems has posed unprecedented challenges to the operation and maintenance of transmission lines, which serve as the “lifelines” of electrical grids. Their extensive coverage and complex operating environments render traditional manual inspection, hindered by difficult terrain and harsh conditions, increasingly inefficient, costly, and risky, so it can no longer meet the safety and stability demands of modern power networks [1,2]. Against this backdrop, UAV-based inspection technology represents a significant advancement in the field. Furthermore, by integrating emerging technologies such as artificial intelligence and big data, UAVs can achieve automated defect detection and intelligent fault diagnosis of transmission lines, significantly improving the overall intelligence and efficiency of grid operation and maintenance [3].
UAV-powered inspection technologies have become a vital component of modern transmission network maintenance strategies [4]. Vision-based fault detection methods for transmission lines can be broadly categorized into two groups: traditional machine learning approaches [5,6] and deep learning-based methods [7,8].
Object detection algorithms based on convolutional neural networks can be broadly categorized into two groups: anchor-based and anchor-free methods. Anchor-based approaches can be further subdivided into one-stage and two-stage detectors [9]. Two-stage algorithms divide the detection process into two phases: generating region proposals containing potential objects, followed by object classification and bounding box regression. Representative algorithms include R-CNN [10], Fast R-CNN [11], and Faster R-CNN [12]. However, the sequential nature of proposal generation and subsequent refinement necessitates multi-step processing, resulting in slower training and inference speeds. This increased computational time makes them less suitable for applications requiring high real-time performance.
In contrast, one-stage algorithms directly predict both object categories and locations in a single forward pass, eliminating the need for a separate region proposal stage. This significantly reduces computational complexity and improves detection speed. Classical examples include SSD (Single Shot MultiBox Detector) [13] and the YOLO series (You Only Look Once) [14,15,16,17,18]. For instance, Zhang et al. proposed an improved SSD algorithm that integrates multi-level features via a channel attention mechanism and dilated convolutions, thereby expanding the receptive field of low-level features [19]. Xu et al. applied YOLOv5 to the automatic annotation of insulator images, where the network was first fine-tuned on a small set of manually labeled examples and then used to label larger datasets automatically [20]. Ma et al. incorporated an SPPCSPC module into the backbone network to aggregate multi-scale features within a spatial pyramid, enhancing the model’s capability to detect objects at various scales [21]. Despite numerous advances in fault detection for transmission lines, most existing studies focus on improving single-stage detection accuracy or on lightweight model optimization [22,23,24], while challenges such as partial occlusion, small fault scale, and complex background interference remain unresolved. Recent studies [25,26,27] have attempted to enhance YOLO-based architectures but often overlook feature correlation modeling and multi-scale aggregation, limiting their ability to handle fine-grained or occluded faults. To address these gaps, this study proposes MRA-YOLOv8, which integrates self-attention mechanisms and multi-scale attention aggregation, coupled with an adaptive loss design, to achieve robust fault detection under challenging UAV inspection scenarios. The main contributions of this work are as follows:
(1)
To address the issue of weakened features for small targets against complex backgrounds, a Multi-Scale Attention Aggregation (MSAA) module is designed. In the spatial refinement path, multi-scale fusion helps retain fine-grained features of small objects, while the channel aggregation path enhances the model’s focus on channel information that is most relevant to the target. Together, these two mechanisms improve background suppression and effectively reduce interference from cluttered environments.
(2)
To address the performance degradation caused by partial object occlusion, a C2f-Restormer module is introduced. This module captures inter-feature relationships across different subspaces, computes attention weights to assess element importance for adaptive feature aggregation, and models long-range dependencies. Consequently, it significantly enhances the overall feature representation capacity.
(3)
This study adopts the Adaptive Threshold Focal Loss (ATFL), a loss function that adaptively adjusts training weights according to sample difficulty. It alleviates class imbalance and increases the model’s focus on hard-to-classify examples, thereby improving fault detection performance and convergence.

2. Principle of YOLOv8 Detection Algorithm

Advances in deep learning have driven the continuous development of the YOLO series toward higher detection accuracy and more efficient feature representation. Building on the foundation of YOLOv5, YOLOv8 [28] further enhances the network architecture by incorporating the SPPF (Spatial Pyramid Pooling-Fast) module and the PAN (Path Aggregation Network) module, thereby improving feature extraction, multi-scale representation, and overall detection performance.
(1)
Input: Input images of varying sizes are initially resized to a uniform resolution of 640 × 640 to meet the model’s input requirements. Data diversity is enhanced through augmentation techniques such as random cropping, rotation, and flipping, thereby improving dataset quality. Predefined anchor boxes are utilized to generate candidate regions, preparing the data for subsequent feature extraction and object detection.
(2)
Backbone: Serving as the core component of YOLOv8, the backbone network extracts multi-level features from the input images. By incorporating a Cross-Stage Partial (CSP) structure, it effectively reduces computational overhead while maintaining strong feature representation capability. The use of depthwise separable convolution further decreases the number of parameters and computational complexity. Hierarchical convolutional operations capture features at different scales, enabling robust detection of multi-scale objects.
(3)
Neck: YOLOv8 employs a Path Aggregation Network (PANet) as its neck module. This architecture facilitates bidirectional (top-down and bottom-up) multi-scale feature fusion, enriching feature expressiveness. By integrating contextual information and combining features from different levels, the model gains the ability to detect objects across a wide range of scales, particularly enhancing its performance on small targets.
(4)
Head: The detection head constitutes the final component of YOLOv8, responsible for generating the final detection results. Its primary functions include classification and regression, multi-task learning, and non-maximum suppression (NMS). Classification and regression involve predicting object categories and bounding box coordinates for each candidate region. Through multi-task training, end-to-end object detection is achieved. Finally, the NMS algorithm is applied to eliminate redundant detection boxes and retain the most probable predictions; a minimal sketch of this step follows this list.
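To make the NMS post-processing step concrete, the following is a minimal sketch of greedy IoU-based suppression. It is illustrative only: the function name, the [x1, y1, x2, y2] box layout, and the 0.5 IoU threshold are our assumptions, not details taken from the YOLOv8 implementation.

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.5) -> list:
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]  # highest-scoring remaining box is always kept
        keep.append(int(i))
        # Intersection of box i with every other remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
               (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + rest - inter)
        # Discard boxes that overlap the kept box above the threshold
        order = order[1:][iou <= iou_thresh]
    return keep
```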

3. MRA-YOLOv8 Detection Network

The MRA-YOLOv8 transmission line fault target detection network framework proposed in this paper is shown in Figure 1.

3.1. Backbone Network Reconfiguration

During the detection of faulty components in transmission lines, some objects are notably small and exhibit high similarity to their surrounding environment, making accurate differentiation challenging. To address this issue, we integrate the C2f and Restormer modules to form a unified C2f-Restormer block. Additionally, a Multi-Scale Attention Aggregation (MSAA) module is incorporated to mitigate the adverse impact of complex backgrounds on detection accuracy. Furthermore, to enhance the model’s focus on defect categories, especially those that are challenging to detect under difficult conditions, the Adaptive Threshold Focal Loss (ATFL) is introduced, thereby improving overall recognition performance.

3.1.1. Multi-Scale Attention Aggregation Module (MSAA)

In complex environments, the detection of small targets is often hindered by weakened feature representation and significant background interference, which collectively degrade detection accuracy. The MSAA module [29] tackles these issues through multi-scale fusion and channel-wise aggregation pathways, effectively improving the detection efficacy of small objects. By leveraging multi-dimensional fusion techniques, the module captures feature information across different scales, enabling precise identification of small targets under varying background conditions. Simultaneously, the integration of information from different feature channels enhances the expressive capacity of small target features, thereby increasing robustness in cluttered scenes. The MSAA module not only strengthens the recognition capability for small targets but also significantly suppresses background interference in complex environments, offering an effective solution for small object detection.
The features extracted from the backbone network undergo a refinement process through dual-path operations on both spatial and channel dimensions, enhancing the informational content in each domain and ultimately improving the quality of the output feature map. In the spatial refinement path, the number of channels is first reduced to C/2 via a 1 × 1 convolution. Multi-scale features are then aggregated by concatenating outputs from convolutional layers with varying receptive fields (3 × 3, 5 × 5, 7 × 7). These features are subsequently combined using both average and max pooling operations. A spatial attention map is generated by applying a sigmoid activation following a 7 × 7 convolution, and this map is multiplied element-wise with the original features to emphasize salient regions. Simultaneously, in the channel aggregation path, global average pooling reduces the spatial dimensions to 1 × 1, producing a compact channel-wise descriptor. This descriptor is then processed by a 1 × 1 convolution followed by a ReLU activation to generate channel attention weights. The resulting channel attention is fused with the spatially refined features through multiplicative integration. By incorporating the MSAA module, the model achieves improved recognition accuracy for small targets and enhanced overall detection efficiency, see Figure 2.
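To clarify how the two paths interact, the following PyTorch sketch assembles an MSAA-style module from the textual description above. It is a reconstruction under stated assumptions (channel reduction to C/2, padding choices, and layer ordering as read from the text), not the authors' released code.

```python
import torch
import torch.nn as nn

class MSAASketch(nn.Module):
    """Sketch of the MSAA module as described in Section 3.1.1."""

    def __init__(self, channels: int):
        super().__init__()
        mid = channels // 2
        self.reduce = nn.Conv2d(channels, mid, kernel_size=1)
        # Parallel branches with growing receptive fields (3x3, 5x5, 7x7)
        self.branch3 = nn.Conv2d(mid, mid, 3, padding=1)
        self.branch5 = nn.Conv2d(mid, mid, 5, padding=2)
        self.branch7 = nn.Conv2d(mid, mid, 7, padding=3)
        # Spatial attention map: 7x7 conv over stacked avg/max-pooled maps
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        # Channel aggregation path: GAP -> 1x1 conv -> ReLU
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Spatial refinement: multi-scale fusion, then avg/max pooling
        y = self.reduce(x)
        multi = torch.cat([self.branch3(y), self.branch5(y), self.branch7(y)], dim=1)
        avg_map = multi.mean(dim=1, keepdim=True)
        max_map = multi.amax(dim=1, keepdim=True)
        att = torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], dim=1)))
        refined = x * att  # emphasize salient regions of the original features
        # Channel aggregation fused multiplicatively with the refined features
        return refined * self.channel(refined)
```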

3.1.2. Self-Attention Mechanism Restormer

Leveraging the self-attention mechanism [30], the module establishes inter-dependencies among spatial features, computes attention weights to integrate information, and captures long-range contextual relationships. The incorporation of local encoding techniques provides positional information to the model, facilitating the recovery of structural details within the image.
The module comprises two principal components: the Multi-Dconv Head Transposed Attention (MDTA) and the Gated-Dconv Feed-forward Network (GDFN), which collectively form a hierarchical feature extraction architecture. Within this framework, internal features are processed through multiple parallel pathways. Each pathway independently computes feature representations, which are subsequently merged and transformed to enhance representational capacity. The GDFN further refines the output features from the MDTA through a gating mechanism and bilinear transformation. This enables controlled feature flow and adaptive fusion, allowing selective enhancement or suppression of different feature responses to optimize the final output, see Figure 3.
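As an illustration of the transposed self-attention at the heart of MDTA, the sketch below computes attention across channels rather than pixels, which keeps the cost linear in image resolution while still modeling long-range dependencies. The head count and learnable temperature follow the general Restormer design; treat this as a simplified reference, not the C2f-Restormer implementation used in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSelfAttentionSketch(nn.Module):
    """Restormer-style transposed (channel-wise) self-attention sketch.
    Assumes dim is divisible by heads."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.temperature = nn.Parameter(torch.ones(heads, 1, 1))
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1)
        # Depthwise 3x3 conv injects local positional information
        self.dwconv = nn.Conv2d(dim * 3, dim * 3, 3, padding=1, groups=dim * 3)
        self.project = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.dwconv(self.qkv(x)).chunk(3, dim=1)
        # Reshape to (batch, heads, channels_per_head, pixels)
        q = q.reshape(b, self.heads, c // self.heads, h * w)
        k = k.reshape(b, self.heads, c // self.heads, h * w)
        v = v.reshape(b, self.heads, c // self.heads, h * w)
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        # Channel-by-channel attention map: content-dependent global mixing
        attn = (q @ k.transpose(-2, -1)) * self.temperature
        out = attn.softmax(dim=-1) @ v
        return self.project(out.reshape(b, c, h, w))
```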

3.1.3. Improved Bounding Box Loss Function

The Adaptive Threshold Focal Loss (ATFL) [31] is a loss function designed to dynamically adjust loss weights during training. Its primary objective is to reduce the model’s reliance on easily classifiable examples while increasing its focus on hard-to-classify instances, thereby improving detection performance in class-imbalanced scenarios.
ATFL automatically modulates loss weights based on each example’s distinctive characteristics and the model’s current predictions. By leveraging prior knowledge and self-learning capabilities from the data, it determines optimal loss weights through individualized training for each sample. The model dynamically adjusts a classification threshold based on the discrepancy between predicted outputs and ground-truth labels. This mechanism allows it to allocate greater attention to challenging examples during the training phase, ultimately enhancing the model’s robustness and fault detection capability. When $p_t \le 0.5$, the formula is as follows:
$$\mathrm{ATFL} = (\lambda - p_t)\,\ln(p_t)\,\log(p_t)$$
When $p_t > 0.5$, the formula is as follows:
$$\mathrm{ATFL} = (1 - p_t)\,\ln(\hat{p}_c)\,\log(p_t)$$
Here, $p_t$ denotes the current average predicted probability, $\hat{p}_c$ denotes the predicted value for the next round, and $\lambda$ is a tunable parameter.
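The piecewise definition translates directly into a few lines of PyTorch. This sketch mirrors the two branches exactly as written above; both logarithms are taken as natural logs, and the value of lam is an assumption, since the paper does not state it in this excerpt.

```python
import torch

def atfl_sketch(p_t: torch.Tensor, p_hat_c: torch.Tensor, lam: float = 2.0) -> torch.Tensor:
    """Illustrative ATFL, mirroring the two branches above.
    `lam` is the tunable lambda; its default here is an assumption."""
    eps = 1e-7  # avoid log(0)
    p_t = p_t.clamp(eps, 1.0 - eps)
    p_hat_c = p_hat_c.clamp(eps, 1.0 - eps)
    hard = (lam - p_t) * torch.log(p_t) * torch.log(p_t)      # branch for p_t <= 0.5
    easy = (1.0 - p_t) * torch.log(p_hat_c) * torch.log(p_t)  # branch for p_t > 0.5
    return torch.where(p_t > 0.5, easy, hard)
```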

4. Results and Analysis

The hardware configuration used in the experiments is listed in Table 1.

4.1. Dataset and Model Training

The dataset used in this study was collected from a power station over the past three years, covering various weather conditions, seasons, time periods, and lighting angles. The images were captured by drones flying at different heights and from different perspectives, and the camera focal length was adjusted to present targets at different scales. A total of 2001 images were used for training and 482 for testing, and the model was trained from scratch. Representative examples are shown in Figure 4.
The fault categories in the dataset and the number of instances in each are listed in Table 2.
Figure 5 illustrates representative examples of the 12 fault categories listed in Table 2.
During model training, the batch size was set to 16, the number of epochs to 200, and the initial learning rate to 0.001.
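For reference, these hyperparameters map directly onto the Ultralytics YOLOv8 training API. The snippet below is a hedged example: the dataset configuration file name is a placeholder, and the custom MRA modules (C2f-Restormer, MSAA, ATFL) are not part of the stock package, so as written this trains a plain YOLOv8 baseline.

```python
from ultralytics import YOLO

# Build YOLOv8 from its architecture config (no pretrained weights,
# matching the training-from-scratch setup described above).
model = YOLO("yolov8n.yaml")

# "transmission_faults.yaml" is a hypothetical dataset definition listing
# the train/test image paths and the 12 fault classes from Table 2.
model.train(
    data="transmission_faults.yaml",
    epochs=200,   # Section 4.1
    batch=16,     # Section 4.1
    lr0=0.001,    # initial learning rate, Section 4.1
    imgsz=640,    # input resolution described in Section 2
)
```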

4.2. Test Results and Analysis

To comprehensively assess the performance of the proposed algorithm, precision, recall, and mean average precision (mAP) are adopted as evaluation metrics. The corresponding formulas are given below [32,33]:
$$\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N} AP_i$$
$$\mathrm{Recall} = \frac{N_{TP}}{N_{TP} + N_{FN}}$$
$$\mathrm{Precision} = \frac{N_{TP}}{N_{TP} + N_{FP}}$$
In the above formulas, $N_{TP}$, $N_{FP}$, and $N_{FN}$ denote the numbers of correct detections, false positives, and false negatives, respectively; $AP_i$ is the area under the precision-recall (P-R) curve for category $i$; and $N$ denotes the number of detection categories, with $i \in (0, N]$.
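These definitions translate directly into code. The helper functions below are an illustrative sketch; the function names and the example counts are ours, not taken from the paper's experiments.

```python
import numpy as np

def precision_recall(n_tp: int, n_fp: int, n_fn: int) -> tuple:
    """Precision and recall exactly as defined above."""
    precision = n_tp / (n_tp + n_fp)
    recall = n_tp / (n_tp + n_fn)
    return precision, recall

def mean_average_precision(ap_per_class) -> float:
    """mAP as the mean of per-class APs, where each AP is the area under
    that class's precision-recall (P-R) curve."""
    return float(np.mean(ap_per_class))

# Illustrative counts only (not from the paper's experiments):
p, r = precision_recall(n_tp=90, n_fp=7, n_fn=9)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.928, recall=0.909
```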
The precision, recall, and mAP50-95 curves of the proposed algorithm and the original YOLOv8 are compared in Figure 6, Figure 7 and Figure 8 below.
The proposed algorithm incorporates a multi-scale attention aggregation module, a self-attention mechanism module, and an improved bounding box loss function. To better evaluate the effectiveness of each component module, ablation experiments were conducted, and the results are presented in Table 3.
The performance metrics include precision, recall, mAP50, and mAP50-95. Precision measures the proportion of correctly predicted positive samples among all predicted positives, while recall measures the proportion of actual positives that are correctly detected. mAP50 denotes the mean average precision at an IoU threshold of 0.5, and mAP50-95 averages the AP over IoU thresholds from 0.5 to 0.95, providing a comprehensive evaluation under different levels of detection strictness. For all of these metrics, higher values indicate better performance. Integrating the MSAA and C2f-Restormer modules enhances feature extraction, especially for small-scale targets and complex backgrounds, thereby improving precision, recall, and the mAP metrics.
As can be observed from Table 3, the detection precision of the baseline YOLOv8 network for fault targets is 0.902. Given the diversity of transmission line fault types and detection scenarios, specific enhancements were designed to address challenges such as complex backgrounds, partial occlusions, and small-scale targets: the C2f-Restormer module, the MSAA module, and an improved bounding box loss function. Incorporating only one or two modules may bias feature extraction toward certain fault characteristics, overlooking other critical features and slightly reducing performance. Furthermore, for fault samples that are difficult to identify due to challenging conditions such as varying illumination or occlusions, the ATFL function enables the model to focus more effectively on learning discriminative features from these hard examples, thereby improving detection accuracy.
To further validate the advantages of the proposed MRA-YOLOv8 algorithm, evaluations were conducted under three distinct challenging scenarios: partially occluded fault targets, small-scale targets, and targets in complex backgrounds. The detection results are illustrated in Figure 9, with detailed outcomes presented for the three experimental groups below.
In the first group of experiments, the faulty component was partially occluded. Comparative results indicate that both YOLOv5 and SSD exhibit relatively low accuracy in detecting partially obscured objects. The proposed algorithm addresses this limitation with a self-attention mechanism, which captures dependencies among spatial features, computes attention weights to evaluate the importance of each element during information integration, and models long-range contextual relationships. As a result, the method is more robust and accurate when detecting fault targets under partial occlusion.
In the second group of experiments, the faulty component was small in scale. In UAV-based inspection, images are often affected by various disturbances, causing small targets to appear blurred and difficult to recognize. The proposed algorithm incorporates the MSAA module to preserve discriminative features of small-scale objects. This module performs multi-dimensional fusion through spatial refinement and enhances the focus on target-relevant characteristics via channel-wise aggregation. Consequently, detection precision for small-scale anomalies is significantly improved.
In the third group of experiments, the target appeared against a complex background, where imprecise feature matching often leads to missed detections or inaccurate localization. Experimental results show that while conventional detectors such as YOLOv5, YOLOv8, SSD, and Faster R-CNN can identify faults in such environments, their detection accuracy remains limited. The proposed method improves feature fusion efficiency through a multi-dimensional combination of spatial and channel-wise mechanisms, thereby enhancing robustness to complex scenes, suppressing background interference, and increasing fault detection accuracy under challenging conditions.
To objectively evaluate the performance of the proposed method, comparative experiments were carried out against several detection networks, including YOLOv5, YOLOv8, CenterNet, SSD, and Faster R-CNN. The corresponding results are summarized in Table 4.
As shown in Table 4, the proposed algorithm achieves a higher mAP50 than the other algorithms, fully validating its superiority. Compared with the baseline YOLOv8 model, MRA-YOLOv8 achieves an inference time of 13.7 ms per image, which is approximately 22% slower than YOLOv8 (11.2 ms) due to the added attention modules.
To further illustrate the enhanced feature focus of MRA-YOLOv8, Figure 10 visualizes attention heatmaps generated from the C2f-Restormer and MSAA modules. As shown, the proposed network concentrates more effectively on occluded or partially visible fault regions compared with the baseline YOLOv8, confirming that the attention mechanisms facilitate robust contextual reasoning.

5. Conclusions

This study presents MRA-YOLOv8, a tailored YOLOv8-based algorithm for fault detection in UAV-assisted transmission line inspection. The proposed approach integrates multiple specialized modules to enhance feature representation, improve detection robustness, and optimize overall performance. The principal contributions and findings are summarized as follows:
(1)
The C2f-Restormer self-attention module enhances global feature extraction and captures long-range dependencies in transmission line images, thereby improving the detection of small and partially occluded fault instances. This enhancement is accompanied by a moderate increase in GPU memory usage, which is considered acceptable given the current focus on accuracy.
(2)
The MSAA module facilitates multi-scale feature fusion by emphasizing informative features across different resolutions, thereby enhancing the model’s ability to detect faults of varying sizes under complex environmental conditions.
(3)
The ATFL dynamically adjusts training weights according to sample difficulty, mitigating class imbalance and improving precision and recall for challenging fault instances.
Experimental results indicate that MRA-YOLOv8 achieves a precision of 92.5%, recall of 90.9%, and mAP50 of 0.93, surpassing state-of-the-art detectors such as YOLOv8 and Faster R-CNN. The model demonstrates an effective balance between detection accuracy and real-time performance, making it highly suitable for practical UAV inspection applications. Despite the improved detection robustness, the inclusion of the C2f-Restormer module increases computational cost. Future work will focus on incorporating lightweight attention mechanisms, cross-domain adaptation, and multi-sensor fusion to further enhance robustness, deployability, and real-time applicability for operational UAV inspection systems.

Author Contributions

Conceptualization, S.H. and J.L.; methodology, J.L.; software, J.L.; validation, S.H. and J.L.; formal analysis, X.M.; investigation, J.L.; resources, S.H.; data curation, S.H. and X.M.; writing—original draft preparation, J.L.; writing—review and editing, J.L.; visualization, J.L.; supervision, S.H., J.L. and X.M.; project administration, S.H.; funding acquisition, S.H. and X.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Natural Science Basic Research Program of Shaanxi under Grant 2024JC-YBMS-490.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Restrictions apply to the availability of these data. The data were obtained from the power grid bureau and are available from the authors with its permission.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, L.; Fan, J.; Liu, Y.; Li, E.; Peng, J.; Liang, Z. A Review on State-of-the-Art Power Line Inspection Techniques. IEEE Trans. Instrum. Meas. 2020, 69, 9350–9365. [Google Scholar] [CrossRef]
  2. Zhiheng, K.; Chong, T.; Peiyao, T.; Chengbo, H.; Zheng, M. Numerical Recognition Algorithm for Power Equipment Monitoring Based on Light-Resnet Convolutional Neural Network. China Electr. Power 2024, 57, 206–213. [Google Scholar] [CrossRef]
  3. Zhang, S.; Guo, T.; Zheng, L. Data Aggregation Model for UAV Inspection of Transmission Lines under Digital Twin Technology. Electron. Des. Eng. 2024, 32, 102–106. Available online: http://m.qikan.cqvip.com/Article/ArticleDetail?id=7112752962 (accessed on 24 November 2025).
  4. Wu, L.; Bi, J.; Chang, W.; Yang, Y.; Gong, Y. Integrated UAV Inspection Technology for Distribution Network Overhead Transmission Lines. China Electr. Power 2018, 51, 97–101, 138. [Google Scholar] [CrossRef]
  5. Nguyen, H. Augmented Reality and Machine Learning Incorporation Using YOLOv3 and ARKit. Appl. Sci. 2021, 11, 6006. [Google Scholar] [CrossRef]
  6. Özüpak, Y. Machine learning-based fault detection in transmission lines: A comparative study with random search optimization. Bull. Pol. Acad. Sci. Tech. Sci. 2025, 73, e153229. [Google Scholar] [CrossRef]
  7. Faisal, A.A.; Mecheter, I.; Qiblawey, Y.; Fernandez, J.H.; Chowdhury, M.E.; Kiranyaz, S. Deep Learning in Automated Power Line Inspection: A Review. Appl. Energy 2025, 385, 125507. [Google Scholar] [CrossRef]
  8. Siddiqui, Z.A.; Park, U. A Drone Based Transmission Line Components Inspection System with Deep Learning Technique. Energies 2020, 13, 3348. [Google Scholar] [CrossRef]
  9. Liu, K.; Li, B.; Qin, L.; Li, L.; Zhao, F.; Wang, Q.; Lin, Z.; Yu, J. A Review of Deep Learning Object Detection Algorithms for Insulator Fault Detection in Overhead Transmission Lines. High Volt. Eng. 2023, 49, 3584–3595. [Google Scholar] [CrossRef]
  10. Hao, S.; Zhang, X.; Ma, X. A Method for Detecting Small-Target Faults in Transmission Lines Based on PKAMNet. High Volt. Technol. 2023, 49, 3385–3394. [Google Scholar]
  11. Chen, Y.; Wang, H.; Shen, J.; Zhang, X.; Gao, X. Application of Data-Driven Iterative Learning Algorithm in Transmission Line Defect Detection. Sci. Program. 2021, 2021, 9976209. [Google Scholar] [CrossRef]
  12. Liang, H.; Zuo, C.; Wei, W. Detection and Evaluation Method of Transmission Line Defects Based on Deep Learning. IEEE Access 2020, 8, 38448–38458. [Google Scholar] [CrossRef]
  13. Su, T.; Liu, D. Transmission line defect detection based on feature enhancement. Multimed. Tools Appl. 2024, 83, 36419–36431. [Google Scholar] [CrossRef]
  14. Tan, G. FALE-YOLO: Real-time intelligent detection of multiple defects in transmission lines based on lightweight improvement and multi-scale feature fusion. Insight 2025, 67, 362–369. [Google Scholar] [CrossRef]
  15. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  16. Wang, X.; Cao, Q.; Jin, S.; Chen, C.; Feng, S. Research on Detection Method of Transmission Line Strand Breakage Based on Improved YOLOv8 Network Model. IEEE Access 2024, 12, 168197–168212. [Google Scholar] [CrossRef]
  17. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  18. Jocher, G.; Stoken, A.; Borovec, J. ultralytics/yolov5: v3.0 [Computer software]; Zenodo: Geneva, Switzerland, 2020. [Google Scholar] [CrossRef]
  19. Zhang, H.; Zhang, M. SSD Object Detection Algorithm Incorporating Channel Attention Mechanism. Comput. Eng. 2020, 46, 264–270. [Google Scholar]
  20. Xu, W.; Gao, Y.; Chen, C. Automatic Annotation of Insulator Images Based on YOLOv5. Sci. Technol. Innov. 2021, 15–17. [Google Scholar]
  21. Ma, X.; Wang, R.; Deng, J. Transmission Line Fault Detection Using CSD-YOLOv8. J. Xi’an Univ. Sci. Technol. 2025, 10–18. [Google Scholar]
  22. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar] [CrossRef]
  23. Xu, X.; Jiang, Y.; Chen, W.; Huang, Y.; Zhang, Y.; Sun, X. DAMO-YOLO: A Report on Real-Time Object Detection Design. arXiv 2022, arXiv:2211.15444. [Google Scholar]
  24. Mittal, P. A comprehensive survey of deep learning-based lightweight object detection models for edge devices. Artif. Intell. Rev. 2024, 57, 242. [Google Scholar] [CrossRef]
  25. Liu, C.; Wu, Y.; Liu, J.; Sun, Z.; Xu, H. Insulator faults detection in aerial images from high-voltage transmission lines based on deep learning model. Appl. Sci. 2021, 11, 4647. [Google Scholar] [CrossRef]
  26. Ji, Y.; Ma, T.; Shen, H.; Feng, H.; Zhang, Z.; Li, D.; He, Y. Transmission line defect detection algorithm based on improved YOLOv12. Electronics 2025, 14, 2432. [Google Scholar] [CrossRef]
  27. Cai, Z.; Wang, T.; Han, W.; Ding, A. PGE-YOLO: A multi-fault-detection method for transmission lines based on cross-scale feature fusion. Electronics 2024, 13, 2738. [Google Scholar] [CrossRef]
  28. Yaseen, M. What is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector. arXiv 2024, arXiv:2408.15857. [Google Scholar]
  29. Liu, M.; Dan, J.; Lu, Z.; Yu, Y.; Li, Y.; Li, X. CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation. arXiv 2024, arXiv:2405.10530. [Google Scholar]
  30. Li, H.; Su, D.; Cai, Q.; Zhang, Y. BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion. Proc. AAAI Conf. Artif. Intell. 2025, 39, 4725–4733. [Google Scholar] [CrossRef]
  31. Yang, B.; Zhang, X.; Zhang, J.; Luo, J.; Zhou, M.; Pi, Y. EFLNet: Enhancing Feature Learning Network for Infrared Small Target Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5906511. [Google Scholar] [CrossRef]
  32. Hao, S.; Ren, K.; Li, J.; Ma, X. Transmission Line Defect Target-Detection Method Based on GR-YOLOv8. Sensors 2024, 24, 6838. [Google Scholar] [CrossRef]
  33. Wang, D.; Zhang, S.; Yuan, B. Research on lightweight glass insulator spontaneous breakage defect detection based on an improved YOLOv5. High Volt. Technol. 2023, 49, 4382–4390. [Google Scholar]
Figure 1. MRA-YOLOv8 transmission line fault target detection network framework.
Figure 2. Structure of the Multi-Scale Attention Aggregation (MSAA) module.
Figure 3. Restormer module with self-attention mechanism.
Figure 4. Examples of faulty components collected by the UAV at different heights.
Figure 5. Example images of different fault types in the transmission line dataset.
Figure 6. Precision comparison curve.
Figure 7. Recall comparison curve.
Figure 8. mAP50-95 comparison curve.
Figure 9. Comparison of detection results. (a) Ground Truth, (b) SSD, (c) Faster R-CNN, (d) CenterNet, (e) YOLOv5, (f) YOLOv8, (g) YOLOv8n, (h) MRA-YOLOv8.
Figure 10. Visualization of attention heatmaps for occluded fault targets. (a) Original image, (b) CBAM, (c) SE, (d) MSAA.
Table 1. The test environment and related parameters.

Configuration | Version Information
CPU | Intel Core i5-13490F
GPU | NVIDIA RTX 4060 Ti
Operating System | Windows 11
Deep learning framework | PyTorch 2.0.1, CUDA 11.8
Table 2. Type and number of datasets.

Fault Type | Custom Name | Number of Faults
Damage to parallel gap | FDSJ | 152
Short circuit in parallel gaps | FDD | 174
Excessive parallel gaps | FDG | 230
Spacer rod dislodged | JBG | 284
Pin dislodged | XT | 252
Screw dislodged | LST | 267
Insulator dislodged | JYT | 162
Power lines obstructed by debris | ZW | 178
Power lines obstructed by bird nests | NC | 145
Nut missing washer | DP | 263
Nut loose | LSS | 260
Bird deterrent spikes damaged | NS | 116
Table 3. Ablation test results.

C2f-Restormer | MSAA | ATFL | Precision | Recall | mAP50 | mAP50-95
- | - | - | 0.902 | 0.889 | 0.910 | 0.627
√ | - | - | 0.918 | 0.932 | 0.935 | 0.677
- | √ | - | 0.941 | 0.881 | 0.923 | 0.641
- | - | √ | 0.911 | 0.887 | 0.924 | 0.653
√ | √ | - | 0.933 | 0.902 | 0.920 | 0.681
√ | - | √ | 0.924 | 0.923 | 0.937 | 0.670
- | √ | √ | 0.934 | 0.914 | 0.935 | 0.671
√ | √ | √ | 0.925 | 0.909 | 0.930 | 0.917

“√” indicates that the module is included, while “-” indicates that it is not.
Table 4. Training results of the different algorithms.

Algorithm | Precision | Recall | mAP50 | mAP50-95 | Time (ms)
YOLOv5 | 0.903 | 0.874 | 0.907 | 0.638 | 14.5
Faster R-CNN | 0.88 | 0.843 | 0.78 | 0.552 | 28.7
SSD | 0.874 | 0.825 | 0.756 | 0.54 | 13.9
YOLOv8 | 0.918 | 0.892 | 0.92 | 0.655 | 11.2
CenterNet | 0.895 | 0.87 | 0.81 | 0.575 | 16.1
YOLOv8n | 0.866 | 0.81 | 0.746 | 0.52 | 10.1
MRA-YOLOv8 | 0.925 | 0.909 | 0.93 | 0.917 | 13.7