ADS-LI: A Drone Image-Based Segmentation Model for Sustainable Maintenance of Lightning Rods and Insulators in Steel Plant Power Infrastructure

Kim, Hyeong-Rok; Choi, So-Won; Lee, Eul-Bum; Kim, Geon-Woo

doi:10.3390/su172411151


¹ Graduate Institute of Ferrous and Eco Materials Technology, Pohang University of Science and Technology (POSTECH), Pohang 37673, Republic of Korea

² Hot Rolling and Plate Section, Facilities Engineering and Investment Department, Pohang Iron and Steel Company (POSCO), Gwangyang 57807, Republic of Korea

³ Department of Industrial and Management Engineering, Pohang University of Science and Technology (POSTECH), Pohang 37673, Republic of Korea

* Author to whom correspondence should be addressed.

Sustainability 2025, 17(24), 11151; https://doi.org/10.3390/su172411151

Submission received: 28 October 2025 / Revised: 5 December 2025 / Accepted: 9 December 2025 / Published: 12 December 2025

(This article belongs to the Special Issue Digital Transformation and Green Technology for Sustainable Building and Construction Management)


Abstract

Detecting anomalies in electrical equipment and improving maintenance efficiency are critical for ensuring operational safety, reliability, and sustainability. To address the structural limitations of conventional manual and visual inspection methods, this study developed an object-recognition-based automated damage diagnosis system for lightning rods and insulators (ADS-LI), which enabled non-contact and fully automated diagnosis of lightning rods and insulators. ADS-LI employs a dual-module architecture. The first module precisely detects lightning rods and insulators using the PointRend algorithm applied to drone-acquired aerial imagery. The second module is a formula-based diagnostic model that quantitatively determines structural anomalies using the geometric attributes of the detected objects. Specifically, anomalies in lightning rods are identified by analyzing variations in inclination derived from center-coordinate shifts (Δx), while insulator anomalies are evaluated based on the mask area conservation ratio (r). The performance of ADS-LI was validated using 90 independent test datasets, achieving a 0.89 F1-score and 99% overall accuracy. These results demonstrate that ADS-LI effectively automates labor-intensive diagnostic tasks that previously relied on skilled experts. Furthermore, by quantifying anomaly detection criteria, it ensures consistency and reproducibility for diagnostic outcomes. This study is also expected to contribute, in the long term, to the transition of elevated electrical installations toward a sustainable maintenance regime.

Keywords:

predictive maintenance; sustainable maintenance; power infrastructure; steel plant; lightning rods; electrical insulator; automated inspection; drone-based inspection; instance segmentation; PointRend; Automated Diagnosis System for Lightning Rods and Insulators (ADS-LI)

1. Introduction

1.1. Background of Study

Power infrastructure is an essential foundation for industrial production and daily life, and the stability and reliability of electrical equipment are key elements for maintaining power quality and ensuring supply sustainability [1]. Electricity demand is accelerating, driven by electrification, cooling, and data-center expansion. Global electricity demand is projected to grow by approximately 4% per year in 2025–2027, yielding approximately 3500 TWh of additional consumption over the period [2]. This demand growth raises requirements for power-system reliability and resilience. The frontline of reliability lies in asset-health management for transmission and distribution equipment. In practice, most customer-interruption durations and frequencies originate in distribution networks, and on the transmission side, failures of individual devices, such as protective relays and transformers, can trigger wide-area blackouts [3]. Lightning protection systems and transmission-tower insulators are key components that block fault propagation by suppressing overvoltage and maintaining insulation, and degradation of their performance can lead to flashover and system outages [4]. In particular, devices that are installed on top of chimneys and buildings, such as lightning rods and transmission tower insulators, can accumulate physical damage (e.g., contamination, wear, and cracks) due to constant exposure to the external environment [5]. Beyond simple performance degradation, these anomalies may cause serious accidents, including power loss, electric shock, power outages, and fires.

Diagnosing anomalies in equipment in advance and performing timely maintenance are recognized as essential measures for ensuring the continuity of power supply and the social safety net. Most power equipment inspections, however, have relied on working at height and visual diagnosis by skilled workers, which involves structural limitations, such as high labor costs, deviations in diagnostic accuracy, and risks of safety accidents [6]. In practice, falls during work at height and missed inspections have been reported domestically and internationally. According to South Korea’s Ministry of Employment and Labor, falls account for approximately 10% of industrial accidents by type [7]. These recurring incidents underscore the increasing need for automated diagnostic systems. In particular, there is a strong demand for advanced quantitative diagnostic technologies capable of automatically detecting anomalies in outdoor electrical equipment by analyzing data-driven geometric changes in their structural components. Technological advances in drone-based image collection and object detection algorithms have rapidly increased the possibility of image-based diagnosis; however, diagnosis algorithms suitable for the actual power equipment structure are still insufficient. Zhai et al. proposed a convolutional neural network-gated recurrent unit-federated learning (CNN–GRU–FL)-based anomaly detection model for distributed power infrastructure in a smart grid environment [8]. However, the model focused on detecting network security threats and therefore had limitations in quantitatively diagnosing the structural damage or geometric changes in power facilities. In the case of optional equipment with complex geometry installed at height, such as lightning rods and insulators, few technologies automatically determine anomalies based on quantitative characteristics, such as geometric changes, wear, and detachment.

Li and Shang proposed an algorithm that utilizes spectral similarity variations to detect abnormal areas in high-dimensional spectral images [9]. However, this method specializes in detecting anomalies in static surface structures and is sensitive to changes in the geometry and lighting of structures at height. Research on artificial intelligence (AI)-based diagnostic technologies that automate quantitative anomaly detection for elevated structures remains limited. Accordingly, there is a need to develop quantitative diagnostic models applicable to equipment such as lightning rods and insulators.

1.2. Problem Statement and Research Objectives

Steel mill A of Company P (POSCO, Pohang, Korea), a major steel manufacturer in South Korea, receives several thousand MW of electricity annually and generates a similar level of power through its own power generation equipment. The power transmission infrastructure supplying the steel mill comprises 38 steel transmission towers in the 154-kV class and spans tens of kilometers. The insulators installed on these towers and the lightning rods installed on factory rooftops, chimneys, and other structures at height were previously inspected visually by workers who climbed the structures. However, because working at height involves the risk of accidents such as falls, a non-contact diagnosis method that uses drones has been in use since the first half of 2022.

The number of lightning rods installed in the steel mill, however, exceeds 1000, and some areas have structural conditions that make drone access difficult. In some areas, professional diagnostic teams still conduct visual inspections. Insulators are subject to the same restrictions because they are installed at height, and they are currently diagnosed periodically using drone photography. The installation locations of steel tower insulators and lightning rods are presented in Figure 1 and Figure 2 as examples.

Under these field conditions, the use of drones improved the safety and efficiency of equipment inspections to a certain extent; however, the entire diagnosis process still relied on professionals. Two personnel qualified for drone control performed imaging at the site, and the captured images were manually analyzed by a skilled diagnostic expert with over ten years of experience. This workflow required an average of two days per transmission tower and approximately three hours per lightning rod, which severely limited the ability to inspect equipment rapidly after natural disasters, such as typhoons, in addition to regular inspections. Relative to the vast volume of equipment requiring inspection, personnel and available time were severely insufficient. These structural constraints continued to hinder efforts to shorten diagnostic cycles or perform emergency inspections when necessary.
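The scale of this bottleneck can be illustrated with a rough back-of-the-envelope calculation. The sketch below only multiplies the figures quoted in this section (38 towers at about two days each, more than 1000 lightning rods at about three hours each). The eight-hour working day used to convert hours into days is an assumption made here for illustration and is not taken from the study.

```python
# Figures quoted in the text above.
TOWERS = 38              # 154-kV transmission towers
DAYS_PER_TOWER = 2       # average diagnosis time per tower
LIGHTNING_RODS = 1000    # lower bound ("exceeds 1000")
HOURS_PER_ROD = 3        # approximate diagnosis time per rod

# Assumption for illustration only: one working day is 8 hours.
HOURS_PER_WORKDAY = 8


def inspection_workload(towers=TOWERS, rods=LIGHTNING_RODS):
    """Estimate the effort of one full manual inspection cycle."""
    tower_days = towers * DAYS_PER_TOWER
    rod_hours = rods * HOURS_PER_ROD
    rod_days = rod_hours / HOURS_PER_WORKDAY
    return {
        "tower_days": tower_days,
        "rod_hours": rod_hours,
        "rod_days": rod_days,
        "total_days": tower_days + rod_days,
    }


if __name__ == "__main__":
    for key, value in inspection_workload().items():
        print(f"{key}: {value}")
```

Under these assumptions, a single complete cycle occupies one expert for well over a working year, which is consistent with the difficulty of shortening diagnostic cycles described above.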

To overcome these technical constraints, efforts were made to utilize commercially available automatic imaging equipment and image recognition-based products for the diagnosis of power equipment. The AI-based diagnosis technology verified at actual industrial sites, however, was judged to be in the initial development stage [6]. Moreover, the captured images of equipment at height contain sensitive security information, raising concerns over information leakage: power-equipment inspection imagery requires the removal of location metadata and additional anonymization before external sharing. Adopting commercial technology also created practical operational limitations, because such products were difficult to maintain flexibly in response to changes in equipment structure or the environment. In a commercial platform field deployment, even after reducing the number of photos per tower from 90 to 18, the time from upload to result delivery was approximately 85 min [10].

Advancements in deep learning-based object detection technologies, such as you only look once (YOLO), mask region-based convolutional neural network (Mask R-CNN), and PointRend, have enabled the automatic recognition of equipment structures, securing a certain level of performance in technology development using image data [11]. Despite this, systems that detect anomalies by extending such recognition results to quantitative diagnostic indicators remain insufficient [12]. Several previous studies illustrate this gap. Khan et al. comprehensively analyzed energy management and equipment monitoring strategies using numerical data-based sensor networks and optimization techniques for traditional power infrastructure; however, they did not address unstructured data-based diagnostics, such as image data-based structural damage detection and geometric change analysis for equipment at height [13]. Lim et al. similarly utilized drones and infrared images to detect anomalies in nuclear facilities; however, their approach to numerically linking automatic recognition results to the judgment of structural damage was also limited [14]. In particular, for complex-shaped equipment at height, such as insulators with repeated disc structures and lightning rods with unclear backgrounds and boundaries, few cases implemented anomaly detection through detailed instance segmentation and geometric analysis beyond simple bounding box-based detection. Additionally, existing diagnosis methods still depend on the qualitative experience of experts, which limits the consistency and reproducibility of diagnostic criteria.

To address these technological gaps, this study developed a system aimed at enhancing maintenance safety and resource efficiency for power installations. This system, termed the automated damage diagnosis system for lightning rods and insulators (ADS-LI), potentially enables the diagnosis of damage in these components without physical contact. This study is expected to lay the foundation for a sustainable diagnostic infrastructure that improves the operational reliability of elevated power installations and reduces maintenance costs. In the long term, this infrastructure is anticipated to contribute to the transition toward a sustainable maintenance regime. To this end, detailed objectives were set as follows:

A PointRend-based instance segmentation model is constructed to accurately recognize lightning rods and insulators from the images captured using a drone.
A diagnosis model is designed to quantitatively determine anomalies based on the results.
The aforementioned two models are integrated to implement ADS-LI, which automatically detects anomalies when the user uploads drone images.

This study differs from previous research in that its models were trained on image data captured at industrial sites and anomaly detection was quantified after object detection. It introduces a novel diagnostic methodology by developing a system that enables anyone to perform repeated and objective diagnoses using AI-based image recognition technology. The system thereby overcomes an existing limitation, namely that diagnosing power equipment at height was previously confined to a specialized area requiring highly skilled workers. Although the diagnosis is confined to limited image data and fixed equipment, the developed system was implemented as an automatic diagnosis system that can be tested by the person who performs the diagnosis, and it could contribute to securing the consistency of equipment diagnostic criteria and the reproducibility of results. Furthermore, future advances are expected to expand the system to various structures at height, including corrosion and aging diagnosis for fence (protective fence) structures and crack detection in stack (chimney) structures. This scalability has the potential to develop into a foundation technology for automating and quantifying professional diagnosis at various industrial sites (e.g., plants, steel mills, and petrochemicals) beyond power equipment. Over the long term, industrial and economic effects can also be expected, including compensation for the loss of expertise caused by the retirement of skilled workers, labor cost reduction, compliance with the diagnostic cycle, and disaster prevention.

The term anomaly is herein restricted to visible surface defects that can be identified by human inspectors in high-resolution drone imagery. Specifically, this includes visual anomalies such as excessive tilting and structural deformation of lightning rods, breakage or loss of insulator discs, and severe surface contamination. High-voltage insulators are a key diagnostic target in which image-based and visual information are commonly used to assess contamination and aging. Previous studies have reported that such visual indicators are often combined with electrical measurements—such as leakage current and partial discharge—to evaluate contamination severity and flashover risk [15]. By contrast, integrity assessments for phenomena that cannot be observed in images, such as internal insulation degradation, fine cracks, or purely electrical faults, must rely on conventional electrical and mechanical tests [16]. Therefore, the proposed ADS-LI is positioned not as a complete replacement for these tests but as an image-based screening tool for rapidly triaging equipment with visually suspicious anomalies. From a maintenance strategy perspective, drone- and image-based inspections should be regarded not as a standalone diagnostic modality but as a component of a multi-tier maintenance system in which multiple diagnostic stages are integrated [17]. ADS-LI does not replace traditional diagnostic procedures such as on-site measurements or laboratory testing; rather, it serves a complementary role by improving the prioritization of diagnostic targets and optimizing the timing of detailed inspections.

1.3. Research Process

The workflow of this study is organized as follows. In Section 3.1, data collection and preprocessing for model development are described. Data collection used drone-captured imagery, and only images suitable for training were selected. Preprocessing included data standardization and labeling. In Section 3.2, the automated damage diagnosis system for lightning rods and insulators (ADS-LI) model for non-contact diagnosis is developed by employing the PointRend_R101 algorithm, which is suited to object-boundary recognition. Section 3.3 and Section 3.4 describe the modules that detect insulator and lightning-rod objects extracted from image files, as well as the formula-based modules that automatically determine anomaly states based on changes in lightning-rod tilt and insulator area. Section 3.5 presents the evaluation metrics of the developed models and the experimental environment configured for the experiments. In Section 4, the models are evaluated on field images and performance is assessed using the F1 score. The overall model-development process described above is illustrated in Figure 3.
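For readers unfamiliar with the F1 score mentioned above, the following minimal sketch shows the standard definition as the harmonic mean of precision and recall. The function name and the counts in the usage example are hypothetical and chosen only for illustration; they are not results from this study.

```python
def f1_score(true_positive, false_positive, false_negative):
    """Standard F1 score: harmonic mean of precision and recall."""
    if true_positive == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(round(f1_score(true_positive=9, false_positive=3, false_negative=1), 4))
```

Because the F1 score ignores true negatives, it can be considerably lower than overall accuracy when anomalous samples are rare, which is why both metrics are usually reported together.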

2. Literature Review

For this study, recent advancements in equipment maintenance strategies were first examined. Second, case studies on predictive maintenance (PdM) incorporating machine learning (ML) techniques were analyzed. Third, prior research on AI-based diagnostic technologies for electrical equipment was reviewed. Finally, object-recognition-based maintenance and PdM applications were investigated to evaluate their practical applicability in real-world industrial environments.

2.1. Advances in Equipment Maintenance Strategies

Rojek et al. reported that conventional preventive maintenance, which relies on estimation-based scheduling, can increase labor and material costs [18]. Chen et al. emphasized that digital-twin (DT)-based PdM can improve the accuracy of equipment condition awareness and enable proactive failure prediction, thereby contributing to the optimization of maintenance scheduling and the minimization of downtime [19]. Werbińska-Wojciechowska et al. argued that DT technologies enhance the accuracy and efficiency of equipment condition monitoring and diagnostics, and support data-driven maintenance decision making [20]. Ma et al. proposed a requirement-based roadmap for implementing standardized DT-based PdM automation [21]. Mikołajewska et al. emphasized that integrating AI with DT is critical for real-time decision making and strategy optimization in fault diagnosis and PdM [22]. Garcia et al. conducted a large language model (LLM)-assisted systematic literature review (SLR) from the perspectives of ML, deep learning (DL), and explainable AI (XAI) [23]. The review aimed to identify future standardization and practical implementation challenges in condition monitoring and predictive maintenance (PdM) for industrial equipment.

Maintenance systems are transitioning stepwise from corrective maintenance through preventive maintenance to AI- and DT-based PdM. The core of this transition is establishing an operational model that proactively manages risk and cost through data-driven decision making. Future priorities include standardized performance metrics, security, and field deployability, and equipment-specific maintenance strategies are becoming decisive determinants of operational performance.

2.2. PdM Using Machine Learning Technology

Equipment diagnostic technologies have evolved from rule- and threshold-based as well as signal-processing- and model-based approaches to data-driven diagnostics that leverage ML and DL [24]. This evolution has led to substantial changes in maintenance strategies across industrial sectors. In particular, PdM has enabled advanced analysis of equipment condition by exploiting sensor-based data and AI techniques. To enhance predictive maintenance for rotating machinery in the maritime industry, Apeiranthitis et al. proposed a CNN-based approach that automatically classifies bearing health states and fault types using raw vibration data from motor ball bearings as input [25]. Bunyan et al. developed an extreme gradient boosting (XGBoost)-based classification model that uses thermal load data to distinguish between healthy and faulty states of gas turbines, thereby enabling early anomaly detection and the implementation of predictive maintenance in power plants [26]. Li and Li proposed a sensor-data-driven hybrid model that combines CNNs and long short-term memory (LSTM) networks to implement predictive maintenance strategies in industrial manufacturing systems [27]. Saleem et al. proposed a deep learning framework that combines empirical wavelet transform (EWT)-based adaptive preprocessing of acoustic emission (AE) signals with a DenseNet architecture for early leak detection in process pipelines [28]. Aminzadeh et al. proposed an integrated framework for implementing PdM in industrial air compressors that performs real-time inference using a linear regression model and issues threshold-based alarms [29]. Roy et al. proposed a deep learning model that integrates CNN, VGG16, U-Net, and Swin Transformer architectures to automate structural health monitoring (SHM) of civil structures by classifying and segmenting cracks [30].

2.3. AI-Based Diagnostics for Electrical Equipment

In recent years, AI-based diagnostic techniques for electrical equipment have been extensively studied. Specifically, a wide range of models and algorithms have been developed to improve accuracy and efficiency in anomaly detection and condition assessment. He et al. proposed a YOLOv8s-SwinT method that combines a Swin Transformer with a CNN to detect subtle defects in images of transmission-line insulators [31]. Wang et al. proposed an ML-YOLOv5 algorithm for defect detection in transmission insulators by incorporating depthwise separable convolution and knowledge distillation in a YOLOv5-based architecture [32]. Ngwenyama and Gitau evaluated decision tree, support vector machine (SVM), k-nearest neighbors (KNN), and ensemble classifier models for diagnosing incipient faults in oil-immersed transformers using dissolved gas analysis (DGA) data, thereby demonstrating the feasibility of ML-based DGA diagnostics [33]. Dalila and Turkben proposed a hybrid detection system for automatic partial discharge (PD) identification in high-voltage insulation systems, integrating a ResNet and a KNN classifier to extract features from spectrograms and phase-resolved PD (PRPD) patterns [34]. Eang and Lee developed a sensor-data-driven CNN–RNN framework for early fault detection and the implementation of PdM in direct current (DC) motor drives used in manufacturing robotics systems [35]. Pohakar et al. proposed an ML approach for the diagnosis of concurrent faults in three-phase induction motors in industrial drive trains, using voltage-, current-, and speed-based features together with random forest (RF), KNN, gradient boosting machine (GBM), SVM, and combinations of XGBoost with fuzzy inference systems (FIS) [36]. Tamakloe et al. presented a multimodal data fusion-based dynamic multiscale attention (DMSA) CNN–LSTM model that enhances PdM for distribution transformers by improving transformer fault-detection performance [37].

These studies indicate that research on electrical equipment diagnostics has predominantly advanced around image- and signal-based methods, with intensive efforts devoted to insulator defect recognition, DGA-based transformer diagnosis, and PdM for motor drive systems. By contrast, automated diagnostic approaches for lightning protection systems have remained underexplored.

2.4. Object Detection-Based PdM

AI-based visual diagnostic techniques using multiple unmanned aerial vehicles (UAVs) have played a critical role in accurately analyzing electrical equipment and detecting anomalies. By enabling non-contact observation and data acquisition in high-risk environments, these techniques complement labor-intensive, field-based manual inspections and support the transition toward PdM. Lu et al. proposed the Insulator Defect Detection YOLO (IDD-YOLO) model for real-time UAV-based detection of insulator defects in power systems [38]. Santos et al. introduced a YOLOv8-based deep learning approach that detects power lines from visible and thermal images to support UAV-based maintenance inspections of transmission assets [39]. Yang et al. developed a model that combines a Vision Transformer with image filtering and thresholding algorithms on UAV imagery to enhance maintenance of gravel runways at remote airports [40]. Davis et al. compared YOLO and Mask R-CNN to advance automated inspection of wind turbine blades (WTBs) [41]. Rodriguez-Vazquez et al. proposed a keypoint-based object detection method for real-time UAV inspections of photovoltaic plants, which detects the vertices of solar panels on board the UAV and provides higher spatial granularity than conventional bounding box or segmentation approaches [42]. Barraz et al. developed an automatic geo-labeling method that combines photogrammetric data with DL-based segmentation and image processing techniques to optimize large-scale UAV inspections of photovoltaic power plants [43]. Lim et al. developed a remote and regional thermal anomaly detection system that integrates drone imagery using deep learning for remote thermal monitoring of nuclear power facilities [44].

Collectively, these studies empirically demonstrate the effectiveness of UAV image-based object detection for maintenance and predictive maintenance systems. However, most of them primarily focus on improving the accuracy of object recognition; few studies have formulated explicit anomaly decision criteria or developed quantitative diagnostic schemes based on geometric indicators.

2.5. Limitation of Previous Research

PdM for electrical equipment has mainly evolved as anomaly detection and failure prediction technologies that use structured sensor data for assets such as transformers, high-voltage cables, and switchgear. In general, numerical measurements, such as voltage, current, temperature, and DGA, are acquired in real time and analyzed, and this approach has been established as a framework that can secure a certain level of diagnostic accuracy and stability.

However, these studies have mostly focused on fixed installations that can be monitored in real time using numerical signal data, while diagnostic techniques based on unstructured data, such as real-world RGB images, remain relatively limited. Specifically, studies that perform fully automated diagnosis using drone-captured images for assets such as insulators and lightning rods installed at height, where worker access is difficult, are extremely rare, and publicly documented academic cases are difficult to identify. Insulators with repetitive circular structures and lightning rods with elongated vertical geometries and visually ambiguous boundaries with surrounding structures are structurally challenging targets for image-based recognition. Nevertheless, few studies have explicitly defined these components as diagnostic targets and applied automated decision making to them, and this is where the scope of the present study differs from that of most existing research.

Furthermore, existing studies have largely relied on qualitative judgment after object detection or restricted themselves to simple bounding box-based detections, and thereby fail to provide the quantitative criteria required for anomaly diagnosis in complex structural configurations. Hence, it is difficult to ensure the consistency and scalability of diagnostics required in real industrial environments. Additionally, automated diagnostic systems that are ready for deployment in real-time operational settings are still insufficient.

To address these limitations in prior research, in this study, the problem was defined and the research framework was designed in the following directions:

An object detection model that can automatically identify insulators and lightning rods from drone images was developed. The model is configured to robustly detect repetitive shapes and linear structures under diverse environmental conditions, and it simultaneously performs object classification and localization.
Quantitative indicators are designed to numerically encode traditional qualitative decision rules. For lightning rods, anomalies are determined based on deviations in the slope between centroid coordinates. For insulators, anomalies are determined based on the effective area preservation ratio.
An automated diagnostic system that integrates object detection and anomaly decision functions was implemented. Users can simply upload images and automatically obtain equipment recognition and anomaly diagnosis results.
Conventional visual inspection methods suffer from work safety risks and long diagnostic times. By applying drone and AI technologies, the proposed approach enables non-contact diagnosis of elevated equipment and improves diagnostic efficiency and worker safety.
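The two quantitative indicators named above can be sketched in a few lines. The exact formulas are given later in the paper and are not reproduced in this section, so the code below is only one plausible reading under stated assumptions: the lightning-rod tilt is estimated from the horizontal shift Δx between the centroids of the upper and lower halves of a binary mask, and the insulator ratio r is taken as the observed mask area divided by a reference mask area. The function names, the half-split of the mask, and the use of a reference mask are all hypothetical choices made for illustration.

```python
import math
from statistics import fmean


def mask_pixels(mask):
    """Return (x, y) coordinates of nonzero pixels in a 2-D 0/1 mask."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def rod_tilt(mask):
    """Return (dx, tilt_deg) for a rod mask.

    dx is the horizontal shift between the centroids of the upper and
    lower halves of the mask; tilt_deg is the angle from vertical.
    Assumption: this half-split is an illustrative choice, not
    necessarily the method used in the paper.
    """
    pts = mask_pixels(mask)
    if not pts:
        raise ValueError("empty mask")
    y_min = min(y for _, y in pts)
    y_max = max(y for _, y in pts)
    if y_min == y_max:
        raise ValueError("mask must span at least two rows")
    y_mid = (y_min + y_max) / 2
    upper = [(x, y) for x, y in pts if y <= y_mid]
    lower = [(x, y) for x, y in pts if y > y_mid]
    x_top, y_top = fmean(x for x, _ in upper), fmean(y for _, y in upper)
    x_bot, y_bot = fmean(x for x, _ in lower), fmean(y for _, y in lower)
    dx = x_top - x_bot
    dy = y_bot - y_top  # positive, because image y grows downward
    return dx, math.degrees(math.atan2(abs(dx), dy))


def area_ratio(observed_mask, reference_mask):
    """Return r = observed mask area / reference mask area."""
    reference_area = len(mask_pixels(reference_mask))
    if reference_area == 0:
        raise ValueError("reference mask is empty")
    return len(mask_pixels(observed_mask)) / reference_area


if __name__ == "__main__":
    # Synthetic 10 x 10 masks, used only to exercise the functions.
    diagonal = [[1 if x == y else 0 for x in range(10)] for y in range(10)]
    print(rod_tilt(diagonal))
```

Turning either indicator into an anomaly decision additionally requires a threshold, which is not stated in this section and is therefore left out of the sketch.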

The ADS-LI system proposed in this study represents an early-stage investigation that relies on limited image data collected from a single steel mill and targets only specific types of equipment. Consequently, issues such as generalizability to diverse climate and background conditions and to various types of transmission facilities, statistical validation of reliability using long-term operational data, and quantitative linkage with maintenance decision making remain research gaps to be addressed in future work. From this perspective, the present study can be regarded as a first step toward broader field deployment and performance evaluation from a sustainability perspective.

3. Materials and Methods

3.1. ADS-LI Dataset Construction

3.1.1. Data Acquisition

This study developed automated diagnostic models for lightning rods and insulators by utilizing image data of these assets. Image acquisition targeted lightning rods installed on elevated structures within the steel mill, such as rooftops and chimneys, and insulators located on transmission towers. The equipment used for drone photography was the Intel® Falcon™ 8+ (Intel Corporation, Santa Clara, CA, USA) octocopter model, which can capture high-resolution images of up to 24 megapixels (MP) and supports a flight time of approximately 16–26 min [45]. It is suitable for diagnosing equipment at height because it is equipped with a high-precision global positioning system (GPS)-based automatic flight function. The drone’s exterior photograph and its major specifications are presented in Figure 4. Drone imaging was conducted at Company P’s steel mill site, and examples of field imagery together with site location information were provided.

The images were captured at a resolution of 4000 × 3000 pixels. The drone was operated only under clear weather and low wind speed conditions, and data collection was performed only on days that met the weather criteria set in advance to minimize the quality difference depending on the photography environment. To prevent malfunctions or flight instability of the drone, photography was performed at locations that ensured a safe distance with no risk of interference from metal structures or the outer walls of buildings.

Within the steel mill, 1100 lightning rods and 228 insulators were installed as diagnostic targets. Image acquisition was conducted from May 2022 to December 2023, spanning approximately 20 months. Among the collected data, 424 images of lightning rods and 121 images of insulators were obtained, corresponding to approximately 2830 MB and 800 MB of storage, respectively. A total of 545 images were collected with an overall size of 3630 MB. Although we attempted to acquire as many diagnostic images as possible after the introduction of drones, the number of images that could be collected was inherently limited by several constraints, including weather conditions and work schedules, because all flight operations had to be performed manually by human operators. In particular, for insulators, the inspection interval is longer, the number of installed units is smaller, and access to towers is more difficult, resulting in a smaller number of available images compared with lightning rods. Consequently, concerns may arise that the number of images used for training is relatively limited. However, the dataset used in this study comprises images captured periodically from fixed viewpoints with similar camera angles, which reduces environmental variability. Additionally, prior research in object detection and anomaly detection has reported that data diversity and annotation quality exert a greater influence on model performance when compared to the sheer number of training samples.

3.1.2. Data Cleaning and Standardization

Of the 545 images collected via drone imagery, only those appropriate for model development were retained. The selection criteria required that the target object be clearly identifiable, exhibit minimal perspective distortion, and maintain sufficient structural sharpness, without visibility issues such as overexposure, shadows, or occlusion by surrounding objects. Duplicate images captured from similar viewpoints at the same location were removed by retaining only one representative frame. Additionally, images were discarded if the object (lightning rod or insulator) was not fully captured within the frame or occupied less than 5% of the total image area. After applying these criteria, 330 images were excluded, leaving 215 usable images for model training. Specifically, 110 of the 424 lightning-rod images and 105 of the 121 insulator images were selected, corresponding to approximately 388 MB and 367 MB of data, respectively. In total, the final training dataset consisted of 755 MB of imagery, which served as the foundation for enhancing object-recognition accuracy and developing the anomaly diagnosis model.

The 110 lightning rod images and 105 insulator images used for training are significantly smaller when compared with the datasets typically employed in general-purpose deep learning research. This reflects the structural constraints of industrial visual inspection environments, in which defect cases occur infrequently in real steel mill operations and pixel-level annotation of high-resolution drone imagery requires substantial labor, time, and cost [46]. This type of small-scale, label-scarce data problem has also been repeatedly reported in surface defect inspection tasks in manufacturing domains such as steel plates and bars. Accordingly, the ADS-LI can be positioned as a feasibility study that, under similar data constraints, investigates the field applicability of automatic damage diagnosis for lightning rods and insulators in steel mills using a limited number of defect images.

The raw images acquired via drones were 4000 × 3000 pixels with a 4:3 aspect ratio. This resolution provided sufficient quality to identify structural features of electrical equipment, such as insulators and lightning rods. However, given that high-resolution images can induce computational latency and memory overhead during AI training, normalization was performed by resizing them to a model-appropriate input size [47]. In this study, the longer side of each image (width or height) was reduced to 1280 pixels to suit the input format of the object detection model, and the shorter side was scaled proportionally so that the aspect ratio was maintained [48]. Resizing was conducted using the cv2.resize() function from the OpenCV 4.5.5 library in Python 3.8.18 [49].
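The ratio-based resizing described above can be sketched as follows. This is a minimal illustration, not the platform's actual code: the helper name `target_size` and the rounding rule are assumptions, and the interpolation mode in the commented OpenCV call is likewise an assumption because the text does not state which mode was used.

```python
def target_size(width, height, long_side=1280):
    """Return (new_width, new_height) with the longer side scaled to long_side."""
    scale = long_side / max(width, height)
    # Round to the nearest pixel; the aspect ratio is preserved up to rounding.
    return round(width * scale), round(height * scale)


new_w, new_h = target_size(4000, 3000)
print(new_w, new_h)  # 1280 960

# With OpenCV installed, the resize itself could then be performed as
# (interpolation mode is an assumption, not taken from the study):
#   resized = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_AREA)
```

For the 4000 × 3000 source images, this yields 1280 × 960, which keeps the 4:3 aspect ratio.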

All images were stored in JPEG (jpg) format. The original data, which amounted to approximately 3630 MB, was reduced to approximately 755 MB, which reduced the required storage space and improved the training speed. The normalized images provided consistent input conditions for the labeling and model training processes and served as a basis for ensuring the reproducibility of the experiment and the reliability of result verification. Figure 5 illustrates the process in which 545 images are preprocessed through the POSAI VISION platform and converted into a total of 215 standardized images.

3.1.3. Data Labeling

The generated training data was constructed in JavaScript Object Notation (JSON) format [50]. This JSON-based labeling structure follows the standard format adopted by the Common Objects in Context (COCO) dataset and includes class information (class ID), boundary coordinates (bounding box or polygon), and segmentation information (segmentation mask) for each object [51,52]. This formalized structure ensures both labeling consistency and machine readability, thereby contributing to the automatic processing of large-scale image data and an improvement in learning efficiency. Each object was designed to include class information and polygonal coordinate values. Class 1 was defined as “Lightning Rod” and class 2 as “Insulator.” This class information was used for setting the branching conditions of the diagnosis algorithm, as well as for instance segmentation in the learning process.
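A COCO-style record for one labeled object might look like the sketch below. The field names (`images`, `categories`, `annotations`, `category_id`, `segmentation`, `bbox`, `iscrowd`) follow the public COCO annotation format; the file name, image size, and polygon coordinates are made up for illustration, and the exact fields exported by POSAI VISION are not specified in the text.

```python
import json


def polygon_to_bbox(polygon):
    """COCO-style [x, y, width, height] box enclosing a flat [x1, y1, x2, y2, ...] polygon."""
    xs, ys = polygon[0::2], polygon[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# Hypothetical polygon for one lightning rod (pixel coordinates, illustration only).
polygon = [600, 100, 640, 100, 640, 900, 600, 900]

dataset = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 1280, "height": 960}],
    "categories": [{"id": 1, "name": "Lightning Rod"}, {"id": 2, "name": "Insulator"}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 1,            # class 1 = Lightning Rod, class 2 = Insulator
        "segmentation": [polygon],   # one or more polygons per object
        "bbox": polygon_to_bbox(polygon),
        "iscrowd": 0,
    }],
}

print(json.dumps(dataset["annotations"][0]))
```

The `category_id` field is what a downstream diagnosis step would read to choose between the lightning-rod and insulator branches.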

In this study, the location information and boundary structure for two types of objects (i.e., lightning rods and insulators) were manually designated through labeling, and they were used as reference values that affect instance segmentation and recognition accuracy at the AI learning stage. Labeling was consistently performed using the instance segmentation method. Segmental labeling that reflects the structural characteristics and shape of the object was applied to lightning rods and insulators, which provided higher boundary consistency and recognition accuracy than the simple bounding box-based method [53].

Labeling examples for lightning rods and insulators are shown in Figure 6. Instance segmentation provides pixel-level annotations, and the red boxes correspond to labels for object detection.

The entire process of image standardization and labeling was conducted using POSAI VISION, an AI model-development platform from Company P [54]. The platform was developed so that users can automatically perform the data standardization and preprocessing required for training without any programming. When the user uploads images, they are automatically resized and converted into a format suitable for training at the standardization stage. At the labeling stage, the training targets can be annotated directly with a mouse, and image-processing AI models are offered for selection once labeling is complete. The selected model is then trained, and its performance is evaluated. If necessary, another model is selected, and the process is repeated. POSAI VISION was developed because commercial platforms from other companies incur annual licensing costs or usage-based costs and raise concerns over security vulnerabilities caused by data transmission through external networks. Because it operates in a closed network and on internal servers, it also offers advantages in terms of technology internalization and security.

3.2. ADS-LI Architecture

In this section, a formula-based diagnosis model is developed to automatically detect anomalies in lightning rods and insulators using the coordinates and area information extracted after object detection. The entire diagnosis process comprises class analysis and object classification of drone images, formula-based anomaly detection according to equipment type, and the output and delivery of diagnosis results. First, the AI model determines whether each object in the captured image is a lightning rod or an insulator (class determination), and the diagnosis path of each object is then branched. For lightning rods, the degree of tilt is calculated after extracting the coordinates of the top, middle, and bottom sections, and anomalies are determined on this basis. For insulators, anomalies are determined by calculating the ratio of the segmented insulator area to the reference bounding-box area from the instance segmentation results. For both types of equipment, the results obtained after anomaly detection are sent to the transmission module to be presented as the final output. Each model has its own threshold for anomaly detection, which is optimized through actual testing. The described process is illustrated in Figure 7a.

3.3. Instance Segmentation Module

3.3.1. Model Training

Insulators often have repeated disc structures, making their boundaries with the background difficult to discern in many instances. In the case of lightning rods, visually overlapping cases with the surrounding structure frequently occur. In particular, for lightning rods, the connection between the top salient pole and the bottom support has a thin and straight structure. It was difficult to recognize due to the similar shape and color of the background. The bounding box-based object detection was initially applied to the object; however, the boundary accuracy of the top salient pole and the bottom structure was low, and uncertainty occurred in the judgment of anomalies based on the center coordinates and area. Accordingly, the instance segmentation method was also applied to lightning rods in the same manner as insulators, which contributed to the improvement of diagnostic accuracy through detailed boundary labeling [55].

The object detection model was developed based on the POSAI VISION platform to automatically recognize lightning rods and insulators from the captured images. POSAI VISION is equipped with eight deep learning models, which are divided into three methods. Instance segmentation precisely extracts boundaries at the pixel level and includes models based on Mask R-CNN and PointRend. Semantic segmentation segregates the pixels of the same class across the entire area, and the PointRend_R101_SEM model belongs to it. Object detection uses a YOLOv7-based algorithm based on the rectangular bounding box.

In this study, boundary fidelity was prioritized as the primary design criterion because the dataset included objects in which boundary error dominated, such as lightning-rod tips and insulator edges. Accordingly, the PointRend_R101 model, which refined masks by iteratively sampling only high-uncertainty boundary points at high resolution, was adopted [55]. Boundary fidelity refers to the model’s ability to accurately delineate the contours of fine structures, such as insulator discs and lightning rod tips, at the pixel level, thereby ensuring precise segmentation [56]. It was found that PointRend_R101, which exhibited the highest structural compatibility with the diagnostic system’s boundary-based anomaly decision pipeline, was the most appropriate option in functional and quantitative terms. Insulators have an atypical structure in which individual discs are continuously connected. Thus, simple bounding box-based object detection had limitations in accurate boundary extraction. Given that it is difficult to identify lightning rods when they overlap with surrounding structures (e.g., steel towers and chimneys) or when they are in environments with complex backgrounds, the application of a segmentation method capable of detailed boundary detection was required [57].

The entire dataset is managed within POSAI VISION as individually accessible resources, where the raw image data and corresponding labeling results are stored such that rework can be performed when necessary. Additionally, image upload and automatic resizing are supported via a web-based interface with a consistent processing pipeline. By leveraging this platform, we ensured the reliability and reproducibility of the labeling and preprocessing procedures. Uniform criteria, such as preserving object shape and maintaining spatial alignment, were applied across all samples, thereby establishing a basis for improving the accuracy of model training. No change in the number of samples occurred during preprocessing, and the final dataset used for model training comprised 215 images.

The PointRend_R101 instance-segmentation model was trained on the preprocessed training data. All 215 images were used for training, and no separate validation set was created. In numerical prediction tasks, it is widely reported that failing to hold out test data precludes objective estimation of generalization performance and increases the risk of overfitting [58]. However, for image segmentation, performance is governed by accurate learning of object boundaries and high-quality annotations even with small sample sizes, and the effectiveness of boundary-centric training has been reported [59]. Accordingly, the strategy prioritized accurate learning of object contours over broad generalization, and this choice was grounded in prior analyses indicating that training-data quality is pivotal for performance improvement.

The 110 lightning rod images and 105 insulator images used for training are relatively small in number when compared with the large-scale datasets that are typically employed in general-purpose deep learning research. This reflects the characteristics of industrial visual inspection environments, in which defect images are inherently rare in operating steel mills, and pixel-level annotation of high-resolution drone imagery is costly in terms of labor, time, and budget. In the domains of industrial surface defect and visual inspection, segmentation-based deep learning models and few-shot defect detection methods under similar small-scale data constraints have been reported [60].

3.3.2. Fine-Tuning

Training was set to 300 epochs. An epoch denotes one complete pass of the entire dataset through the model [61]. Setting a relatively high number of epochs was intended to ensure sufficient incorporation of object-boundary information. The batch size was set to 4, which denotes the number of training samples processed concurrently in a single iteration [62]. The labeling information was organized in JSON format. This format records each object’s class ID and polygon vertex coordinates in a structured manner, enabling the learning algorithm to accurately localize object positions and delineate boundaries [63]. The related contents are shown in Table 1.
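As a rough orientation for relating epochs to iteration counts, the settings above can be converted under two assumptions that the text does not confirm: one iteration processes one batch, and the final partial batch of each epoch is kept. With 215 training images and a batch size of 4:

$$\left\lceil \frac{215}{4} \right\rceil = 54 \ \text{iterations per epoch}, \qquad 54 \times 300 = 16{,}200 \ \text{iterations in total}$$

If the platform drops the last partial batch or counts iterations differently, these figures change accordingly.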

This study is positioned as a feasibility study with a limited number of labeled images acquired in an actual steel mill, and the data constraints are comparable to those commonly reported in industrial visual inspection research [64]. Under these constraints, we did not construct a separate validation set in the conventional sense to maximize the number of samples available for training the segmentation model while preserving an independent test set for evaluating the end-to-end performance of ADS-LI. Instead, excessive overfitting was practically controlled by monitoring the convergence behavior of the training loss and repeatedly inspecting the qualitative quality of the predicted masks on randomly sampled training images. The maximum number of epochs, 300, was set as an upper bound at which the loss had essentially converged, and extending training beyond this point did not yield any noticeable improvement in boundary reproduction or shape reconstruction in the predicted masks.

To examine the convergence behavior of the PointRend_R101 model during training, we analyzed the variation in the loss function with respect to epochs using the training logs generated by the POSAI VISION platform. Figure 8 illustrates how the box loss and mask loss change during training along the iteration axis. The two loss values started in the range of approximately 0.5–0.7 during the initial tens of iterations and then decreased rapidly as training progressed. After around 2000 iterations, they gradually decreased within a range below 0.05 and exhibited stable convergence behavior. This trend indicates that the errors in bounding box and mask predictions gradually decrease through iterative training. Under the specified training and fine-tuning conditions, the proposed segmentation module is confirmed to converge in a stable manner. This study focuses on the development of an application-oriented anomaly diagnosis system and presents the training curves of key loss components to demonstrate that the model converges in a stable manner.

3.4. Anomaly Detection Module

This section describes the anomaly detection module of ADS-LI. This module is implemented as a rule-based diagnostic model. It is formulated using the lightning-rod tilt ($\Delta x$) and the insulator area ratio ($r$), which are extracted from the instance segmentation results, together with their corresponding thresholds $\theta_{LR}$ and $\theta_{I}$. This implies that, once the formulas and thresholds have been specified, the model has a deterministic structure that returns identical diagnostic outcomes for identical geometries [65]. Therefore, this model is better interpreted as a rule-based anomaly detection scheme that relies on explicitly defined indicators and preset thresholds rather than on the statistical distribution of the training data.

3.4.1. Lightning Rod Anomaly Detection

To quantitatively determine structural anomalies in lightning rods, a formula-based diagnosis model was designed based on the center coordinates of the top, middle, and bottom points extracted from the object recognition results. First, the top, middle, and bottom sections of the lightning rod were automatically distinguished through the AI-based segmentation model, and the center coordinates of each section on the two-dimensional image coordinate system were defined using Equation (1) as follows [66]:

$$P_{t} = (x_{t}, y_{t}), \quad P_{m} = (x_{m}, y_{m}), \quad P_{b} = (x_{b}, y_{b}) \in \mathbb{R}^{2}$$

(1)

where $P_{t}$, $P_{m}$, and $P_{b}$ are the top, middle, and bottom center points, respectively. Focus was placed on the x-axis components of the center coordinates to assess horizontal alignment of the structure. By defining coordinates on the image plane in a Cartesian (x–y) system, pixel-level horizontal deviation can be quantified [67]. A digital image is represented as a two-dimensional grid with integer pixel coordinates $(x, y)$, and each pixel is assumed to have either a scalar grayscale value or an RGB color vector [66]. The coordinate system follows the convention commonly used in digital image processing, with the upper-left corner considered as the origin, the x-axis increasing to the right, and the y-axis increasing downward. For each lightning rod object, three vertical cross-sections are defined on the basis of the segmentation result, and the x-coordinates of the centroids of these cross-sections are denoted by $x_{t}$, $x_{m}$, and $x_{b}$, respectively. If the structure is perfectly vertical, the three x-values are nearly identical ($x_{t} \approx x_{m} \approx x_{b}$). When the top or bottom tilts, discrepancies arise among the x-coordinates. Axis-based quantification of tilt and alignment has been reported in prior studies [68]. Accordingly, $\Delta x$ was defined as the sum of the absolute differences between the upper and middle x-coordinates and between the lower and middle x-coordinates, as in Equation (2).

$$\Delta x = |x_{t} - x_{m}| + |x_{b} - x_{m}|$$

(2)

where $\Delta x$ is the sum of absolute deviations (unit: px). A larger $\Delta x$ was interpreted as indicating greater asymmetry with respect to the x-axis. This type of indicator has been used in recent research on manufacturing-system scheduling and operations optimization to quantify solution robustness or model distributional uncertainty, and absolute deviation-based statistics, such as the median absolute deviation, are likewise employed as outlier-robust metrics in imaging and sensor domains [69,70]. Moreover, given that absolute-error-based losses are standard in localization, selecting a deviation metric from the same family is appropriate [71]. Additionally, $\Delta x$ is a key diagnostic indicator that quantitatively represents the degree of horizontal deviation of the lightning rod. If its value exceeds a certain threshold, the structure is considered to be in an abnormal state. This method excludes subjectivity that may occur in conventional manual diagnosis and provides quantitative diagnosis results via AI-based image analysis.
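The computation of the section centroids and of Equation (2) can be sketched in plain Python. This is an illustrative sketch under stated assumptions: the mask is a list of rows of 0/1 values, the three sections are obtained by splitting the object's row extent into equal thirds (the text does not specify how the sections are delimited), and every third is assumed to contain at least one foreground pixel. The function names are hypothetical.

```python
def section_centroids_x(mask):
    """Mean x-coordinate of foreground pixels in the top, middle, and bottom thirds.

    mask is a list of rows (y increases downward), each a list of 0/1 values.
    Equal thirds of the object's row extent are an assumption of this sketch.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    y0, y1 = rows[0], rows[-1] + 1
    height = y1 - y0
    bounds = [y0, y0 + height // 3, y0 + 2 * height // 3, y1]
    centroids = []
    for start, stop in zip(bounds, bounds[1:]):
        xs = [x for y in range(start, stop) for x, v in enumerate(mask[y]) if v]
        centroids.append(sum(xs) / len(xs))
    return tuple(centroids)  # (x_t, x_m, x_b)


def delta_x(x_t, x_m, x_b):
    """Equation (2): sum of absolute x-deviations from the middle section."""
    return abs(x_t - x_m) + abs(x_b - x_m)


# Toy 6-row mask: the rod sits at x = 3 in the upper four rows and at x = 5 below.
mask = [[1 if x == col else 0 for x in range(8)] for col in (3, 3, 3, 3, 5, 5)]
x_t, x_m, x_b = section_centroids_x(mask)
print(x_t, x_m, x_b)            # 3.0 3.0 5.0
print(delta_x(x_t, x_m, x_b))   # 2.0
```

In the toy mask, only the bottom section is shifted, so the whole deviation comes from the second term of Equation (2).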

The three points lie closer to a single straight vertical line as $\Delta x$ decreases, whereas they are more dispersed along the horizontal axis as $\Delta x$ increases, making it possible to determine structural anomalies in the lightning rod. Furthermore, $\Delta x$ is used as a key indicator because the normal structure of a lightning rod generally forms a straight line perpendicular to the horizontal axis. Geometrically, the degree of linear alignment of an object is a crucial criterion for determining its deformation and damage, and $\Delta x$ is an effective measure for quantitatively expressing the change in linearity.

For lightning rod anomaly determination, a threshold $\theta_{LR}$ (unit: px) was set and compared against $\Delta x$. The decision criterion classified the state as normal when $\Delta x$ was $\theta_{LR}$ or less and as abnormal when it exceeded $\theta_{LR}$. The threshold $\theta_{LR}$ was optimized based on various photography environments and actual field data, making it possible for the model to derive consistent and reliable diagnosis results under various conditions. The formula-based diagnosis model enabled more objective and consistent evaluation than conventional methods that relied on visual inspection, and it was used as a key element of automated diagnosis in conjunction with an AI-based object detection model. The overall flow of the diagnosis process was composed of image data input, extraction of the center coordinates of three points, calculation of $\Delta x$, and anomaly detection through a comparison between $\Delta x$ and the threshold. This was systematically expressed in a flowchart, as illustrated in Figure 7b. The flowchart facilitated an intuitive understanding of the model’s process.

The lightning rod diagnosis algorithm extracted the center coordinates of the top ($P_{t}$), middle ($P_{m}$), and bottom ($P_{b}$) points from the object detection results, and anomalies were determined based on the sum of the absolute differences in horizontal coordinates ($\Delta x$) between them, as defined in Equation (2). If $\Delta x$ exceeded the set threshold ($\theta_{LR}$), the slope was judged abnormal, so that structural deformation could be quantitatively evaluated. In this study, the experiment was performed in four stages by setting the $\theta_{LR}$ value to 5, 10, 15, and 20 px. A total of ten lightning rod image datasets were used in the experiment. All test images contained lightning rod objects, and normality was assessed for each image. The threshold value was varied to examine the normal detection of lightning rods. The analysis results revealed that all lightning rod objects were accurately distinguished at $\theta_{LR} = 10$ px and $\theta_{LR} = 15$ px, and the normal detection accuracy was recorded at 100%. This indicates that these threshold values can clearly distinguish between the presence and absence of structural deformation. From the standpoint of diagnostic sensitivity, however, a lower threshold value has the benefit of detecting even minor slope deformation early. Therefore, $\theta_{LR} = 10$ px was selected as the optimal reference value in this study. The details are shown in Table 2.
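A threshold sweep of this kind can be sketched as follows. The decision rule matches the criterion stated above (abnormal when $\Delta x$ exceeds $\theta_{LR}$), but the sample values and labels are invented purely for illustration and do not reproduce the study's test images or Table 2; the function names are hypothetical.

```python
def is_abnormal(delta_x, theta_lr):
    """Normal when delta_x <= theta_lr, abnormal when it exceeds theta_lr."""
    return delta_x > theta_lr


def sweep(samples, thresholds):
    """Count (false positives, false negatives) for each candidate threshold.

    samples is a list of (delta_x_in_px, truly_abnormal) pairs.
    """
    results = {}
    for theta in thresholds:
        fp = sum(1 for dx, bad in samples if is_abnormal(dx, theta) and not bad)
        fn = sum(1 for dx, bad in samples if not is_abnormal(dx, theta) and bad)
        results[theta] = (fp, fn)
    return results


# Made-up delta_x values (px) and labels, for illustration only.
samples = [(2.0, False), (4.5, False), (7.0, False), (9.0, False),
           (18.0, True), (26.0, True)]

for theta, (fp, fn) in sweep(samples, [5, 10, 15, 20]).items():
    print(theta, fp, fn)
```

When several thresholds give zero errors, choosing the lowest of them favors sensitivity to small tilts, which is the rationale the text gives for selecting 10 px.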

3.4.2. Insulator Anomaly Detection

To quantitatively identify anomalies in insulators, a formula-based diagnosis model was developed, utilizing the area information derived from AI-based instance segmentation results. Insulators have a structure in which multiple discs are continuously connected, and an area above a certain level is maintained in the image under normal conditions. Therefore, when the actual area extracted from the instance segmentation results decreased below a certain ratio compared to the reference area, the occurrence of structural anomalies, such as damage and contamination, was determined.

In this study, anomalies were diagnosed using a univariate condition based on the insulator mask area. Multivariate conditions—including cracks, contamination, changes in object count, contour consistency, and shape asymmetry—were examined; however, extracting such features from field photographs was constrained. Owing to occlusion, illumination and scale variations, and small targets, multivariate feature extraction is difficult, and the uncertainty of the segmentation itself is high, making additional programming of limited practical value [72]. Moreover, for edge deployment, compute and memory constraints make it reasonable to operate a univariate threshold on simple, interpretable metrics, such as the segmentation-mask area ratio, rather than on complex indicators [73].

The reference area $A_{box}$ was defined as the area of the bounding box automatically generated from the object-detection result [74]. The centroid of an object is defined as the arithmetic mean of the coordinates of its foreground pixels. For example, the centroid x-coordinates $x_{t}$, $x_{m}$, and $x_{b}$ of the upper, middle, and lower cross-sections of a lightning rod are computed as the mean x-coordinate of all foreground pixels belonging to each cross-section. These definitions of bounding boxes and centroids correspond to basic region descriptors that are widely used to characterize the geometric properties of image regions [66]. This study adopted the minimum enclosing rectangular bounding box around the object, which was automatically computed from the boundary coordinates of the segmentation mask. The bounding box is defined by the minimum and maximum coordinates spanned by the object and is expressed as in Equation (3).

$$A_{box} = (x_{max} - x_{min}) \times (y_{max} - y_{min})$$

(3)

where $A_{box}$ is the bounding box area of the insulator (unit: px²); $x_{min}$, $x_{max}$, $y_{min}$, and $y_{max}$ define the axis-aligned bounding box (unit: px). The insulator area was computed from the pixel count corresponding to the insulator in the instance-segmentation mask. This measure is defined via pixel-level contour extraction and area computation and is expressed in Equation (4) [66].

$$A_{seg} = \sum_{i=1}^{n} M_{i},$$

(4)

where $A_{seg}$ is the insulator segmented mask area (unit: px²); $M_{i}$ is the mask-weighted pixel area element (unit: px²). Here, $M_{i} = 1$ denotes a foreground pixel belonging to a lightning rod or an insulator, and $M_{i} = 0$ denotes a background pixel. The symbol $n$ represents the total number of pixels in the image, or in the region of interest. To automatically determine insulator anomalies, the authors designed an area-ratio-based diagnostic formula grounded in the instance-segmentation results [75]. Insulators comprise a chain of disc-shaped elements and, under normal conditions, maintain a relatively consistent area in images. Therefore, if the measured insulator area in the image falls at or below a defined fraction of the reference area, the condition can be classified as an anomaly, such as physical damage or contamination.
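Equations (3) and (4) can be evaluated directly on a binary mask, as in the minimal sketch below. One assumption is made explicit: $x_{max}$ and $y_{max}$ are taken as the outer pixel edges (largest foreground index plus one), so that a completely filled rectangle has equal box and mask areas. The text does not state which convention was used, and the function name is hypothetical.

```python
def mask_areas(mask):
    """Return (A_seg, A_box) in px^2 for a binary mask given as rows of 0/1 values."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    a_seg = len(coords)  # Equation (4): number of foreground pixels
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    # Assumption: max edges are exclusive (index + 1), so widths count pixels.
    x_min, x_max = min(xs), max(xs) + 1
    y_min, y_max = min(ys), max(ys) + 1
    a_box = (x_max - x_min) * (y_max - y_min)  # Equation (3)
    return a_seg, a_box


# Toy mask: a 4 x 3 pixel box with one missing pixel inside.
mask = [
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(mask_areas(mask))  # (11, 12)
```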

The diagnostic procedure comprises the following steps. First, the reference area $A_{box}$ is defined as the area of the automatically generated bounding box from the object-detection result. Next, the true segmented area of the insulator, $A_{seg}$, is extracted within the box. Subsequently, the ratio of the two areas is expressed as in Equation (5).

$$r = \frac{A_{seg}}{A_{box}} \times 100 \ (\%)$$

(5)

where $r$ is the area ratio (%). For insulator anomaly determination, a threshold $\theta_{I}$ (unit: %) was set and compared with $r$. Threshold $\theta_{I}$ was set via empirical testing to select a value that yielded high detection accuracy. This optimum was derived during the model-testing phase. The diagnostic algorithm classifies a case as abnormal when $r$ is less than the preset threshold and as normal when it is equal to or greater than the threshold. The image coordinate system adopted in this study, the definition of foreground regions using a binary mask, and the computation of bounding boxes and region areas all follow standard concepts in the field of digital image processing [66]. Building on these basic concepts, the present study introduces application-specific metrics, such as $\Delta x$, which represents the inclination of a lightning rod, and $r$, which denotes the area preservation ratio of insulators.

$\theta_{I}$ is not a fixed constant, and it may vary depending on the physical characteristics of the diagnosis target and on-site conditions. Based on this, the sensitivity and reliability of the diagnosis system can be adjusted in a balanced manner [76]. For instance, in an environment where external contamination or shadows reduce the recognition area, the $\theta_{I}$ value can be lowered to minimize false detections. Conversely, when enhancing the sensitivity of defect detection is crucial, the $\theta_{I}$ value can be increased to detect even minor damage. This flexibility established a basis for the quantitative automated diagnosis model to effectively respond to various operating conditions.

This diagnostic structure contributed to the implementation of a repeatable and consistent evaluation system by quantifying conventional qualitative decision criteria. In particular, the area ratio $r$ functioned as a quantitative indicator that can ensure the reliability of the diagnosis results by numerically expressing the structural preservation status of insulators. The overall structure and application flow of this model are shown in Figure 7c.

The insulator diagnosis model determined anomalies based on the ratio $r$ between the actual area ($A_{seg}$) derived from the instance segmentation results and the bounding box area ($A_{box}$). As for the diagnostic criterion, the Pass decision was made when the ratio $r$ was equal to or higher than the threshold $\theta_{I}$, and the Fail decision when it was less than the threshold.
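The Pass/Fail rule built on Equation (5) can be sketched in a few lines. The function names and the example areas are hypothetical; the default threshold of 55% is the value reported as the final criterion in this study.

```python
def area_ratio(a_seg, a_box):
    """Equation (5): segmented mask area as a percentage of the bounding-box area."""
    return 100.0 * a_seg / a_box


def diagnose_insulator(a_seg, a_box, theta_i=55.0):
    """Pass when r >= theta_i, Fail when r < theta_i."""
    return "Pass" if area_ratio(a_seg, a_box) >= theta_i else "Fail"


# Hypothetical areas in px^2, for illustration only.
print(diagnose_insulator(6200, 10000))  # Pass (r = 62.0)
print(diagnose_insulator(4800, 10000))  # Fail (r = 48.0)
```

A ratio exactly equal to the threshold is treated as Pass, in line with the "equal to or higher" wording of the criterion.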

Threshold $\theta_{I}$ was varied at 5% intervals from 30% to 70% to examine the normal detection of insulators. A total of ten insulator image datasets were used in the experiment. In the analysis results, all of the insulators with anomalies were accurately detected at $\theta_{I} = 55\%$, and the normal insulators were evaluated as normal with no cases of misjudgment. On the other hand, when the threshold was increased to 60% or higher, errors occurred in which insulators with anomalies were evaluated as normal. In summary, the optimal threshold for insulator diagnosis was judged to be $\theta_{I} = 55\%$, which was applied as the final criterion in this study. The results are summarized in Table 3.

Table 2 and Table 3 list a series of candidate diagnostic thresholds and the corresponding FP/FN behaviors, which can be regarded not as the result of tuning a single fixed threshold at one operating point, but rather as a coarse sensitivity analysis of the system behavior with respect to $\theta_{LR}$ and $\theta_{I}$ within the available dataset [77].

In this study, the diagnostic thresholds $\theta_{LR}$ and $\theta_{I}$ were determined not through complex statistical modeling, but rather via a simple grid search over candidate values combined with engineering judgment, due to the limited number of labeled anomaly cases. Instead of artificially optimizing performance on the test set, we prioritized minimizing false negatives from a structural safety perspective, thereby allowing field operators to conservatively detect potential anomalies during real-world operation. Similar approaches, in which preliminary thresholds are defined using small datasets and later refined as more data become available, have been reported in prior studies [78]. Accordingly, these thresholds will require recalibration once a larger dataset is obtained.

3.5. Experimental Setting

3.5.1. Metrics for Instance Segmentation Evaluation

Instance segmentation tests were conducted using the PointRend_R101 model. To quantitatively evaluate the segmentation results, we adopted evaluation metrics that are widely used in object detection and segmentation research, namely precision, recall, and mean average precision (mAP) [79]. Equation (6) defines precision as the proportion of predicted positives that are truly positive.

$$\mathrm{Precision} = \frac{TP}{TP + FP}.$$

(6)

Equation (7) defines recall as the proportion of actual positives that the model correctly predicts as positive.

$$\mathrm{Recall} = \frac{TP}{TP + FN},$$

(7)

where TP is the number of true positive samples that are correctly predicted as positive, FP is the number of false positive samples that are incorrectly predicted as positive despite being negative, and FN is the number of false negative samples that are incorrectly predicted as negative despite being positive. Average precision (AP) is defined as a weighted average of precision over different levels of recall, which corresponds to the area under the precision–recall curve, as shown in Equation (8).

AP = \int_{0}^{1} P(R) \, dR

(8)

where P is precision and R is recall. In a multi-class setting, the mean average precision (mAP) is the arithmetic mean of the AP values computed for each class; Equation (9) writes this metric as a further average over the set of IoU thresholds used for evaluation. It is used as an indicator for evaluating the overall segmentation performance of the model.

\mathrm{mAP}_{T} = \frac{1}{|T|} \sum_{τ \in T} AP(τ)

(9)

where $T$ is the set of intersection over union (IoU) thresholds used for evaluation (e.g., $T = \{0.50\}$ for $\mathrm{mAP}_{50}$ and $T = \{0.75\}$ for $\mathrm{mAP}_{75}$ in this study), $|T|$ is the cardinality of $T$, $τ$ is an IoU threshold element in $T$, and $AP(τ)$ denotes the average precision computed at the IoU threshold $τ$.
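Equations (8) and (9) can be illustrated with a short sketch. It assumes all-point interpolation of the precision-recall curve, which is one common convention; the evaluation code used in the study is not shown in the article and may sample the curve differently. The function names and the detection list are hypothetical.

```python
from typing import Dict, List, Tuple


def average_precision(detections: List[Tuple[float, bool]], num_gt: int) -> float:
    """Approximate Equation (8): area under the precision-recall curve.

    detections: (confidence, matched) pairs, where matched is True when the
    detection matches a ground-truth instance at the chosen IoU threshold.
    num_gt: number of ground-truth instances.
    """
    if num_gt == 0:
        return 0.0
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, matched in ordered:
        if matched:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Precision envelope: use the best precision reachable at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_ap(ap_by_threshold: Dict[float, float]) -> float:
    """Equation (9): arithmetic mean of AP(tau) over the IoU thresholds in T."""
    if not ap_by_threshold:
        raise ValueError("T must contain at least one IoU threshold")
    return sum(ap_by_threshold.values()) / len(ap_by_threshold)


# Hypothetical detections at IoU 0.50: (confidence, matched ground truth?).
dets = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
ap50 = average_precision(dets, num_gt=4)
map50 = mean_ap({0.50: ap50})  # T = {0.50}, so mAP_50 equals AP(0.50)
print(ap50, map50)
```

When $T$ contains a single threshold, as for $\mathrm{mAP}_{50}$ and $\mathrm{mAP}_{75}$, the mean in Equation (9) reduces to the AP at that threshold.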

3.5.2. Metrics for Anomaly Detection Evaluation

For the anomaly detection tests, the diagnostic criteria and thresholds were defined for each equipment type as follows.

Lightning-rod diagnostic criterion: center-coordinate tilt $Δx$; threshold $θ_{LR} = 10$ px.
Insulator diagnostic criterion: mask area ratio $r$; threshold $θ_{I} = 55\%$.

Center coordinates or area values were extracted from the segmentation results and applied to the formula-based diagnosis models. Each diagnosis result was then assigned to one of the following four categories, and performance was assessed from the resulting counts.

True Positive (TP): Abnormal objects are accurately classified as abnormal objects.
True Negative (TN): Normal objects are accurately classified as normal objects.
False Positive (FP): Normal objects are misclassified as abnormal objects.
False Negative (FN): Abnormal objects are misclassified as normal objects.

We computed accuracy, precision, recall, and F1 score using TP, TN, FP, and FN [80]. Equation (10) defines accuracy as the proportion of all samples that the model correctly classifies as positive (abnormal) or negative (normal).

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(10)

As described in Section 3.5.1, precision is computed using Equation (6), and recall is computed using Equation (7). Equation (11) provides the F1-score, defined as the harmonic mean of precision and recall.

\mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(11)
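The four outcome categories and Equations (6), (7), (10), and (11) can be combined in a small helper. This is an illustrative sketch only: the function names and the label lists are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here rather than something stated in the article.

```python
from typing import Dict, List


def confusion_counts(actual: List[bool], predicted: List[bool]) -> Dict[str, int]:
    """Count TP/TN/FP/FN, where True means 'abnormal' (the positive class)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["TP"] += 1
        elif not a and not p:
            counts["TN"] += 1
        elif not a and p:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


def classification_metrics(c: Dict[str, int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score from confusion counts."""
    tp, tn, fp, fn = c["TP"], c["TN"], c["FP"], c["FN"]
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / total, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical expert labels and system decisions (True = abnormal).
actual = [True, True, False, False, False, False]
predicted = [True, False, True, False, False, False]
counts = confusion_counts(actual, predicted)
scores = classification_metrics(counts)
print(counts, scores)
```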

3.5.3. Experimental Environment

The ADS-LI diagnostic system was implemented in Python 3.8.18, with object detection and instance segmentation built on the PointRend_R101 algorithm in the Detectron2 v0.6 (Meta Platforms, Inc., Menlo Park, CA, USA) framework [81]. All training and inference were executed under CUDA 11.3 (NVIDIA Corporation, Santa Clara, CA, USA), and the system was configured to ensure computational efficiency and stability for repeated diagnosis on real imagery [82]. PyTorch 1.10.2 was used as the deep learning framework [83]. Image preprocessing and area computation were performed with OpenCV 4.5.5 [49]. The operating system was Ubuntu 20.04 LTS (Canonical Ltd., London, UK) [84]. The graphics processing unit (GPU) was an NVIDIA RTX A5000 with 24 GB of memory (NVIDIA Corporation, Santa Clara, CA, USA) [85]. The software versions and hardware configuration used for system implementation and execution are summarized in Table 4.

With this configuration, the per-image inference time was approximately 0.8–1.2 s, providing near-real-time processing performance suitable for repeated diagnostics in industrial settings. In addition, specifying all configuration parameters and version information provides a basis for reliably reproducing the same diagnostic system across diverse environments.

The entire diagnosis system was built on the PointRend instance segmentation model of the Detectron2 framework and applies a class-specific diagnosis algorithm after object detection. Once object detection is complete, the results are separated into predicted masks, predicted boxes, and predicted classes, and the diagnostic function for each object is executed: center-coordinate-based slope analysis for lightning rod objects and segmented-area-based preservation rate analysis for insulator objects.

The bending() function for lightning rods calculates the x-coordinates of the top ($P_{t}$), middle ($P_{m}$), and bottom ($P_{b}$) centers of the lightning rod structure from the object contour coordinates extracted from the segmentation mask. The function identifies the contour points closest to each reference y-coordinate (maximum, average, and minimum) and extracts their x-coordinates, which serve as the input values for the $Δx$ calculation used to analyze the slope of the central axis. Internally, the findNearNum() function is called to select the points closest to the reference y values, and the results are returned as ($P_{t}$, $P_{m}$, $P_{b}$). Listing 1 below shows the implementation of the bending() function.

Listing 1. Python code for the lightning-rod bending function.

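import numpy as np

def bending(cnt):
    # cnt is indexed as cnt[i][0] = (x, y), i.e., an OpenCV-style contour
    x_ = [p[0][0] for p in cnt]
    y_ = [p[0][1] for p in cnt]
    # findNearNum() is a helper whose definition is not shown in the article;
    # its first return element is used as the index of the nearest y value
    max_x = x_[findNearNum(y_, np.max(y_))[0]]   # x at the maximum y
    mid_x = x_[findNearNum(y_, np.mean(y_))[0]]  # x at the mean y
    min_x = x_[findNearNum(y_, np.min(y_))[0]]   # x at the minimum y
    return (max_x, mid_x, min_x)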

After segmentation, diagnostic functions such as the one in Listing 1 are executed automatically and output the diagnosis results. The diagnosis status (Pass/Failed), the values used for the decision, and the configured reference values are stored together with the annotated result images. The overall structure applies the appropriate algorithm to each diagnosis target, keeps result handling consistent, and is designed for repeated diagnoses with automatic recording.
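Because findNearNum() is not reproduced in the article, the following self-contained sketch shows one way the logic around Listing 1 could be completed. It is an assumption-laden illustration, not the authors' implementation: `find_nearest_index` is a hypothetical stand-in for findNearNum(), the contours are synthetic points chosen to reproduce the offsets of the two lightning-rod cases reported in Section 4.2, and the mapping of "Pass" to normal and "Failed" to abnormal is assumed. The sum of absolute offsets for $Δx$ follows the worked examples in Section 4.2.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

THETA_LR_PX = 10.0  # lightning-rod threshold reported in the paper

Contour = List[List[List[float]]]  # OpenCV-like layout: contour[i][0] == [x, y]


def find_nearest_index(values: Sequence[float], target: float) -> int:
    """Hypothetical stand-in for findNearNum(): index of the value nearest target."""
    return min(range(len(values)), key=lambda i: abs(values[i] - target))


def center_xs(contour: Contour) -> Tuple[float, float, float]:
    """x-coordinates of the contour points nearest the min, mean, and max y.

    In image coordinates y grows downward, so the minimum y is the top.
    """
    xs = [p[0][0] for p in contour]
    ys = [p[0][1] for p in contour]
    x_top = xs[find_nearest_index(ys, min(ys))]
    x_mid = xs[find_nearest_index(ys, mean(ys))]
    x_bot = xs[find_nearest_index(ys, max(ys))]
    return x_top, x_mid, x_bot


def diagnose_rod(contour: Contour, theta: float = THETA_LR_PX) -> Dict[str, object]:
    """Compute delta_x = |x_t - x_m| + |x_b - x_m| and compare it with theta."""
    x_top, x_mid, x_bot = center_xs(contour)
    delta_x = abs(x_top - x_mid) + abs(x_bot - x_mid)
    status = "Pass" if delta_x <= theta else "Failed"  # assumed label mapping
    return {"delta_x": delta_x, "threshold": theta, "status": status}


# Synthetic three-point contours (not study data).
case_a = [[[100, 0]], [[103, 50]], [[108, 100]]]  # offsets of 3 px and 5 px
case_b = [[[100, 0]], [[110, 50]], [[128, 100]]]  # offsets of 10 px and 18 px
result_a = diagnose_rod(case_a)
result_b = diagnose_rod(case_b)
print(result_a)
print(result_b)
```

Because $Δx$ adds two absolute offsets from the middle point, swapping the top and bottom points leaves the result unchanged, so the return order of Listing 1 does not affect the decision.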

4. Experimental Results and Analysis

In this section, the integrated diagnostic performance of the proposed automated diagnosis system for lightning rods and insulators (ADS-LI) was quantitatively verified. ADS-LI automatically segments lightning rod and insulator objects in drone-captured images through the PointRend_R101 model and automatically detects anomalies by applying geometric formula-based models. For this validation, accuracy and detection performance indicators were calculated by comparing the final results of the entire system with the manual detection results.

The ADS-LI test data consist of 90 lightning-rod and insulator images captured after January 2024. They are a completely new dataset that was not used in previous learning processes. As for anomalies in the images, the manual diagnosis results of experts were set as a reference and used for model performance assessment. Valid segmentation outputs were obtained for all samples, and no cases of object-recognition failure were observed. Additional test images could have been acquired, but in this test, the model and thresholds were evaluated only on a predefined independent set to control overfitting risk. Unlike numerical prediction tasks, the amount of training data was limited; however, training with high-quality annotations and data yielded a 100% object-recognition rate within the test scope.

4.1. Instance Segmentation

When the PointRend_R101 model that completed training was applied to the test images, the reliability of detection results for the lightning rod class was more than 90% on average. All 90 images used in the test were composed of the non-training image data captured after January 2024. The model recognized them as lightning rod and insulator classes without object omission. This result was utilized as an indicator that empirically supports the applicability of the developed model to real environments.

In this study, the PointRend_R101 segmentation model was trained exclusively on the training subset, and the term “test images” refers to an independent test subset that was not used during training. A separate validation subset was not constructed, and the limitations associated with this design choice are discussed in Section 6.1.

In this study, we applied the softmax function to the output of the classification head and used its maximum value as the model’s detection confidence. The softmax function normalizes logits into a class-wise probability distribution and is standard for mutually exclusive classification [86]. However, raw softmax probabilities have been reported to exhibit overconfidence and miscalibration. Hence, validation-based threshold tuning and calibration procedures were additionally considered [87]. The reliability threshold was set to 0.7 in this study for the model to maintain detection results only with a confidence of 70% or higher. This enabled effective removal of false detections with low reliability and improved the quality of the final detection results.
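The confidence rule described above can be sketched in a few lines. This is a simplified illustration with hypothetical logits and function names; it does not reproduce the Detectron2 classification head, and only the 0.7 threshold is taken from the article.

```python
import math
from typing import List, Tuple

CONF_THRESHOLD = 0.7  # confidence threshold reported in the paper


def softmax(logits: List[float]) -> List[float]:
    """Normalize logits into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detection_confidence(logits: List[float]) -> Tuple[int, float]:
    """Return (class index, maximum softmax probability)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


def keep_confident(all_logits: List[List[float]],
                   threshold: float = CONF_THRESHOLD) -> List[Tuple[int, float]]:
    """Keep only detections whose confidence reaches the threshold."""
    kept = []
    for logits in all_logits:
        cls, conf = detection_confidence(logits)
        if conf >= threshold:
            kept.append((cls, conf))
    return kept


# Hypothetical logits for two candidate detections over three classes.
candidates = [[4.0, 0.5, 0.1], [1.0, 0.8, 0.5]]
kept = keep_confident(candidates)
print(kept)
```

The first candidate has one dominant logit and is kept; the second spreads its probability across classes, falls below 0.7, and is discarded.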

Example visual comparisons of the instance segmentation results obtained by Mask R-CNN_R101 and PointRend_R101 for lightning rods and insulators are presented in Figure 9. The PointRend model employs a rendering-based architecture that refines high-resolution features in the neighborhoods of selected points through an expanded fully connected (FC) dimension, thereby providing sharper boundaries and finer shapes for input resolutions of 256 × 256 pixels or higher [88]. By directly comparing the instance segmentation results of Mask R-CNN_R101 and PointRend_R101, which share the same backbone, this study aimed to demonstrate the rationale for adopting PointRend as the instance segmentation model. Figure 9a presents the instance segmentation result for a lightning rod. The segmented region is represented by the mask and bounding box, and the corresponding confidence score is 99%. Figure 9b presents the instance segmentation result for an insulator in a drone image. The segmented region is visualized using a mask and a bounding box, demonstrating accurate detection of the object area. The results confirmed that the AI-based image diagnostic system developed in this study could effectively separate and identify lightning rods and insulators in real drone-captured images. Mask R-CNN_R101 often produced staircase-like and blurred boundaries near the lightning rod tip and insulator disc edges, with the segmented regions partially blending into surrounding structures. In contrast, PointRend smoothly and continuously reconstructed the true geometric contours of the lightning rod shaft and insulator discs. This observation indicated that PointRend represented object shapes more precisely than the widely used Mask R-CNN model. This study employed PointRend to implement structural defect assessment, leveraging its higher segmentation accuracy. This difference in boundary representation reduced the accumulation of errors in the subsequent computation of the slope-change metric (

Δ x

) and the mask area preservation ratio (

r

), thereby improving the anomaly diagnosis accuracy achieved by ADS-LI.

The performance of the two instance segmentation models, Mask R-CNN_R101 and PointRend_R101, is compared in Table 5. Compared with Mask R-CNN, the $\mathrm{mAP}_{50}$ of PointRend increased from 48.2% to 51.8%, an improvement of 3.6 percentage points, while $\mathrm{mAP}_{75}$ increased from 32.1% to 35.1%, i.e., by 3.0 percentage points. Although the absolute values cannot be regarded as very high, PointRend nonetheless exhibited a consistent performance advantage over Mask R-CNN under the same drone imaging conditions and dataset constraints.

To visually verify which regions of the drone images the model focused on when determining anomaly status, class activation maps based on gradient-weighted class activation mapping (Grad-CAM) were analyzed. Grad-CAM is a visualization technique that uses gradient information to highlight which spatial locations and channels in the final convolutional layer of a CNN predominantly contribute to the prediction score of a given class [89]. Figure 10 presents the results obtained by overlaying Grad-CAM-based activation heatmaps on the original drone images for one normal and one abnormal example each of insulators and lightning rods. Figure 10a,b correspond to the lightning rod and insulator examples, respectively. For the lightning rod, strong activations were predominantly concentrated vertically along the shaft and tip. For the insulator, across all examples, high-activation regions were mainly distributed along the portions where the discs are continuously arranged. These Grad-CAM results visually corroborated that the model extracted features primarily from the shape and boundary information of the insulators and lightning rods under diagnosis, rather than from background structures or surrounding noise. The presence of a sufficient level of color intensity within the regions corresponding to each class indicated that the model had recognized the lightning rods and insulators according to their shapes. This observation suggested that appropriate segmentation had been achieved. These findings confirmed that the model was referring to physically meaningful regions.

4.2. Anomaly Detection

A total of 90 test images were used, and the automated diagnosis results for anomalies in lightning rods and insulators were quantitatively analyzed. For each object, the actual presence or absence of anomalies was compared with the system decision to calculate the major performance indicators: diagnostic accuracy, precision, recall, and F1-score. These metrics were obtained under the experimental conditions and the limited number of abnormal samples considered in this study, and they should be interpreted as preliminary performance indicators rather than as statistically generalizable guarantees of performance.

In the lightning rod diagnosis results, all objects were accurately classified from 45 test samples (43 normal samples and two abnormal samples). There were two TP cases and 43 TN cases with no FP and FN. Therefore, the accuracy, precision, recall, and F1-score were recorded as 1.00. These results could be interpreted as demonstrating the conceptual validity and practical implementation of the diagnostic logic for lightning rods.

In the case of insulator diagnosis, two out of two abnormal cases were detected. However, one of the 43 normal cases was misclassified as abnormal (false positive) in the evaluation of 45 total cases. With the threshold $θ_{I}$ set to 55% in this test set, the reported accuracy, precision, and F1-score were 0.98, 0.67, and 0.80, respectively. The recall reached 1.00, indicating that all anomalous objects were detected. These results demonstrate that the system exhibits strong performance in diagnosing insulator anomalies, although occasional false positives may occur for certain normal objects. In the false-positive case, the segmentation model failed to fully capture the insulator boundaries because portions of the outer edges visually blended with the background. Hence, the diagnostic algorithm underestimated the mask area, yielding a smaller area ratio $r$ and misclassifying the object as abnormal. This finding highlights the need for improvements in the image preprocessing stage. Techniques to enhance object–background contrast are currently being considered to mitigate such errors.

In summary, when ADS-LI was evaluated on 90 independent test cases constructed in this study (86 normal and 4 abnormal in total), the reported overall accuracy and F1-score were 0.99 and 0.89, respectively. These performance metrics could be interpreted as preliminary evidence that the proposed ADS-LI operated as intended at a conceptual level. They also suggested that the system was able to identify abnormal installations in real field images, even under a limited test data configuration. The test results are summarized in Table 6.
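As a check, the overall figures can be recomputed from the counts reported above. Adding the lightning-rod results (TP = 2, TN = 43) to the insulator results (TP = 2, TN = 42, FP = 1) gives TP = 4, TN = 85, FP = 1, and FN = 0, and Equations (6), (7), (10), and (11) then yield:

```latex
\begin{aligned}
\mathrm{Accuracy} &= \frac{4 + 85}{4 + 85 + 1 + 0} = \frac{89}{90} \approx 0.99, \\
\mathrm{Precision} &= \frac{4}{4 + 1} = 0.80, \qquad
\mathrm{Recall} = \frac{4}{4 + 0} = 1.00, \\
\mathrm{F1\text{-}score} &= 2 \times \frac{0.80 \times 1.00}{0.80 + 1.00} \approx 0.89.
\end{aligned}
```

The single false positive lowers the overall precision to 0.80, which is what pulls the F1-score below the accuracy.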

The automatic diagnostic results of the proposed ADS-LI system for lightning rods and insulators are shown in Figure 11 and Figure 12. In Figure 11, the lightning-rod case (a) has $|x_{t} - x_{m}| = 3$ px and $|x_{b} - x_{m}| = 5$ px, yielding $Δx = 8\ \mathrm{px} \leq θ_{LR}$. Because this does not exceed the threshold $θ_{LR} = 10$ px, the case is classified as normal, and the structure is interpreted as maintaining vertical alignment. By contrast, case (b) has $|x_{t} - x_{m}| = 10$ px and $|x_{b} - x_{m}| = 18$ px, giving $Δx = 28\ \mathrm{px} > θ_{LR}$, which exceeds the criterion and is classified as abnormal. This indicates a structural tilt anomaly.

In Figure 12, the insulator case (a) is classified as normal: the segmented area $A_{seg} = 22{,}483.5\ \mathrm{px}^{2}$ and bounding-box area $A_{box} = 36{,}600.5\ \mathrm{px}^{2}$ yield an area ratio $r = 61.0\%$. Given that this exceeds the threshold $θ_{I} = 55\%$, the structure is interpreted as intact. By contrast, the abnormal case has $A_{seg} = 5364.0\ \mathrm{px}^{2}$ and $A_{box} = 13{,}102.16\ \mathrm{px}^{2}$, yielding $r = 41.0\%$. Because this falls below $θ_{I}$, the case was interpreted as an anomaly due to structural damage or a segmentation miss. These results show that each diagnosis rests on an explicit mathematical decision rule and quantitative evidence.
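The insulator rule can be reproduced from the reported areas with a few lines of code. This is a minimal sketch: the function names and the "Pass"/"Failed" mapping are assumptions, as is treating a ratio exactly equal to the threshold as normal, which the article does not specify.

```python
from typing import Dict

THETA_I = 0.55  # insulator threshold reported in the paper (55%)


def area_ratio(seg_area_px2: float, box_area_px2: float) -> float:
    """Mask area preservation ratio r = A_seg / A_box."""
    if box_area_px2 <= 0:
        raise ValueError("bounding-box area must be positive")
    return seg_area_px2 / box_area_px2


def diagnose_insulator(seg_area_px2: float, box_area_px2: float,
                       theta: float = THETA_I) -> Dict[str, object]:
    """Flag the insulator as abnormal when r falls below theta."""
    r = area_ratio(seg_area_px2, box_area_px2)
    status = "Pass" if r >= theta else "Failed"  # equality handling is assumed
    return {"r": r, "threshold": theta, "status": status}


# Areas reported for the two Figure 12 examples.
normal_case = diagnose_insulator(22483.5, 36600.5)
abnormal_case = diagnose_insulator(5364.0, 13102.16)
print(f"normal:   r = {normal_case['r']:.3f} -> {normal_case['status']}")
print(f"abnormal: r = {abnormal_case['r']:.3f} -> {abnormal_case['status']}")
```

The computed ratios are about 0.614 and 0.409, which agree with the reported 61% and 41% after rounding to the nearest percent and lie on opposite sides of the 55% threshold.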

5. Discussion

In diagnostic experiments conducted on 90 tower images, the ADS-LI integrated model achieved an overall accuracy of 99%. For lightning rod diagnosis, all objects were classified correctly, whereas in insulator diagnosis, a single normal object was misclassified as abnormal, resulting in one FP. These results indicate that the system maintains a generally high level of diagnostic reliability. However, the 99% overall accuracy and the 100% accuracy achieved for lightning-rod diagnosis are based on 90 test cases collected from a single steel-plant site. In the industrial surface-defect and visual-inspection literature, it is standard practice to first demonstrate feasibility using a limited, site-specific test set, and then expand evaluation to multi-site and large-scale datasets in subsequent studies [90]. In the same vein, the results of this study should be interpreted as preliminary performance indicators for Site A of the steel mill.

Precision is the proportion of instances labeled anomalous by the system that are truly anomalous, and it measures the ability to minimize false alarms. For lightning rods, precision was 1.00; for insulators, precision was 0.67 owing to a single FP. Recall is the proportion of actual anomalous instances that the system detects and is directly tied to the sensitivity of the diagnostic algorithm. In this experiment, both classes achieved a recall of 1.00 with no FN, indicating that all anomalous instances were detected. The F1 score is the harmonic mean of precision and recall and serves as a composite indicator of their balance. The overall F1 score was 0.89, indicating that the system provided balanced diagnostic performance without being overly sensitive or overly conservative.

In the test set constructed for this study, only two abnormal (defective) cases are included for each of the lightning rod and insulator classes. Therefore, the performance metrics reported in this section, such as recall and F1-score, should be interpreted not as statistically stable estimates but as preliminary results that verify whether the proposed indicators and thresholds operate in the intended manner for this limited number of anomaly cases. Because these estimates are derived from so few cases, they have limited ability to support stable inferences about performance at the population level. For example, all four abnormal cases in the test set were detected, and the recall was therefore reported as 1.0; this value reflects performance on these four cases only, and it may decline if additional abnormal cases are collected under diverse field conditions in the future. Within these limits, the results confirm that the proposed ADS-LI architecture and the formula-based indicators ($Δx$, $r$) operated in the intended manner on real field data.
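The caution about small samples can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, a standard method for binomial proportions that is not part of the analysis in this study; it is included only to illustrate how wide the uncertainty is when recall is estimated from four abnormal cases. The function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Recall in this study: 4 of 4 abnormal cases detected.
low, high = wilson_interval(successes=4, trials=4)
print(f"recall = 1.00, approximate 95% interval: [{low:.2f}, {high:.2f}]")
```

Under this illustration, a recall of 4/4 is compatible with a true recall as low as roughly 0.51, which is why a larger set of abnormal samples is needed before the reported recall can be treated as stable.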

To mitigate the limitation arising from the small number of abnormal samples, this study initially examined the use of data augmentation techniques to expand the abnormal sample set. However, because the rule-based anomaly detection logic proposed in this study depended on the object shape and boundary geometry, it was sensitive to augmentation procedures that involved artificial geometric distortions [91]. Improperly designed synthetic anomalies can introduce unrealistic artifacts that are inconsistent with the shapes and texture distributions of real defects, which may bias the model toward synthetic anomalies during training or distort performance evaluation [92]. For these reasons, no additional data augmentation techniques were applied at any stage of training or testing in this study, in order to preserve the characteristics of the data and maintain the stability of the evaluation. Instead, a conservative performance assessment was conducted based on the small number of abnormal cases observed under real field conditions.

As described in Section 3.4, the anomaly detection module of ADS-LI was implemented as a rule-based model built on formula-based indicators and thus differs in nature from statistical learning-based classifiers. The empirical data available in this study were limited to lightning-rod tilt patterns and insulator area-loss patterns observed at Steel mill A of Company P, for which the proposed rules were found to be valid. Cases that fall outside this scope (e.g., new damage types or structural configurations) require recalibration of the proposed rules. Therefore, at the current stage, the performance metrics reported in this study should be interpreted within this limited scope rather than as evidence of generalizable performance. This study collects geometric attributes, such as centroid coordinates and areas, after object segmentation and employs the analytically defined indicators $Δx$ and $r$ to numerically determine anomaly status, thereby enhancing the quantitative rigor and reproducibility of the diagnosis and establishing a consistent quantitative pipeline that extends from object recognition to structural diagnosis. However, the test set used in this study was still limited in size, and the performance evaluation corresponded to a practice-oriented validation procedure. In the future, under the assumption of label-scarce industrial environments, rigorous statistical validation will be required on large-scale, systematically constructed independent validation and test sets.

In this study, the single FP case occurred in an image acquired from a slightly oblique viewpoint, where the flight altitude and camera pitch were still within the predefined operating range. The entire dataset was constructed under relatively tightly controlled viewpoint variance by constraining the flight path, shooting distance, and camera pose, and within this range, the variation of the $Δx$ and $r$ indicators depended more on the quality of the segmentation masks than on viewpoint changes themselves. Meanwhile, in UAV-based inspection scenarios that allow more diverse viewpoints and flight conditions, viewpoint planning and viewpoint variance have been reported to exert a significant influence on diagnostic accuracy [93]. The viewpoint sensitivity of the $Δx$ and $r$ indicators, therefore, should be quantitatively evaluated in future studies using dedicated datasets and experimental designs.

Building on these results, this study yields the following academic and industrial implications. First, by introducing a quantitative evaluation procedure in the form of structure-preserving numerical computations into the segmentation-based visual diagnostic algorithm, we strengthened the formalization and theoretical foundation of image-based predictive diagnosis models. Second, by handling two types of equipment with distinct structural characteristics—lightning rods and insulators—within a single integrated algorithm, we demonstrated the generality and extensibility of the proposed diagnostic approach. Third, the entire diagnostic process is integrated into a pipeline consisting of object detection, numerical decision making, and visualization output, thereby providing an enabling technology for an automated diagnostic framework that can be applied to various types of power equipment.

6. Conclusions

6.1. Summary and Contributions

This study developed an object recognition-based ADS-LI and evaluated its performance to overcome the structural limitations of conventional power equipment maintenance practices that have relied on manual and visual inspection by highly skilled personnel. ADS-LI is implemented as a hybrid architecture that integrates a module applying the PointRend algorithm to aerial images acquired by drones for precise segmentation-based detection of key equipment objects with a mathematical model that quantitatively determines anomaly status based on the geometric properties of the detected objects.

Based on the performance evaluation, ADS-LI achieved an F1-score of 89% and an accuracy of 99% on an independent test dataset comprising 90 cases, demonstrating high diagnostic performance. In particular, by using the centroid-based inclination change $Δx$ for lightning rods and the mask area preservation ratio $r$ for insulators as quantitative indicators, this system enhances the consistency and reproducibility of diagnostic outcomes. This represents a significant academic and industrial contribution. Moreover, the flexible architecture that combines a learning-based model with a non-learning analytical model offers high practicality, as it can adapt to diverse field environments primarily through threshold adjustment without requiring complex retraining procedures. Ultimately, ADS-LI provides a practical technological alternative that can aid in alleviating chronic industrial challenges in power equipment maintenance, such as shortages of expert personnel and rising labor costs.

This approach is significant from an academic perspective in that it goes beyond simple automation: by converting instance segmentation-based detection results into physical diagnostic indicators, it enables an "interpretable quantitative evaluation" that has been overlooked in previous studies on image diagnosis. In particular, this study introduced numerical decision criteria, namely the distance between the center coordinates of the object ($Δx$) and the object area preservation rate ($r$), to perform anomaly detection based on mathematically defined thresholds rather than the intuition of diagnostic personnel. Whereas previous studies mostly focused on the accuracy of binary classification using deep learning models, this study differs in that it adds quantitative analysis stages after the model output, which ensures both the transparency of the diagnostic basis and the potential for structural analysis.

Furthermore, the quantitative diagnostic framework proposed in this study can reduce the unnecessary on-site work required for inspecting elevated power installations. By reducing the labor and equipment required during maintenance activities, this contributes to the establishment of a sustainable maintenance regime that simultaneously enhances resource efficiency and occupational safety. In addition, by enabling preventive maintenance of lightning rods and insulators, the framework can extend the service life of the installations and optimize maintenance planning. From a life-cycle perspective, it can contribute to reducing energy consumption, resource waste, and waste material generation associated with the operation of electrical installations.

However, the dataset used for training and evaluation in this study was limited in size and diversity. In particular, the number of labeled anomaly images that could be collected from an operating steel mill was insufficient to ensure strong statistical significance. Consequently, instead of adopting a three-way train–validation–test split, we constructed only a training set and an independent test set. This choice is consistent with the data constraints commonly reported in the industrial surface defect detection and automated visual inspection literature. Additionally, the diagnostic targets in this study were restricted to lightning rods and insulators, indicating the need to broaden the scope for greater generality, and the absence of a validation set represents a limitation in terms of quantitatively assessing the model’s generalization performance.

6.2. Future Studies and Recommendations

To facilitate the commercialization and further performance enhancement of the ADS-LI system, the following directions for future studies are proposed.

First, to improve system reliability and stability, it is necessary to construct a real-world image dataset comprising more than 500 cases that cover diverse weather and environmental conditions. Additionally, a standardized validation protocol should be introduced to quantitatively assess and optimize the model’s generalization performance. In particular, given that the current test set contains only two abnormal cases per class, the recall and other classification metrics reported in this study are not sufficient to support strong claims of statistical significance; therefore, the stability of the performance should be re-examined through additional tests that include a larger number of anomaly samples.

Second, drone operation should be improved in terms of repeatability and operational flexibility. To this end, real-time kinematic (RTK) based automatic position correction should be applied, and intelligent flight path planning grounded in on-site equipment information should be employed to enhance the consistency of image acquisition.

Third, the diagnostic targets should be expanded to at least three major types of electrical equipment, such as switches and transformers, to increase the generality of the system. Additionally, the comprehensive diagnostic capability of ADS-LI should be further enhanced by quantitatively integrating equipment-specific anomaly thresholds $θ$ using AI-based or statistical methodologies.

Fourth, the ADS-LI results reported in this study constitute a first-step case study based on data collected from a single steelworks site. Therefore, future studies should construct datasets that reflect diverse field conditions, on the basis of which more rigorous statistical validation and cross-validation will be required.

The findings of this study indicate that the proposed ADS-LI system provides a structural framework for image-based automation of power equipment diagnosis. With subsequent research that extends its application to more diverse field conditions, the system is expected to evolve into practical technology that contributes to the sustainability of power infrastructure by extending equipment service life and enhancing resource and energy efficiency.

Author Contributions

Conceptualization, H.-R.K., S.-W.C. and E.-B.L.; methodology, H.-R.K. and S.-W.C.; validation, H.-R.K. and S.-W.C.; formal analysis, H.-R.K. and S.-W.C.; investigation, H.-R.K., S.-W.C. and G.-W.K.; resources, H.-R.K.; data curation, H.-R.K., S.-W.C. and G.-W.K.; writing—original draft preparation, H.-R.K.; writing—review and editing, S.-W.C. and G.-W.K.; visualization, S.-W.C. and G.-W.K.; supervision, S.-W.C. and E.-B.L.; project administration, E.-B.L.; funding acquisition, E.-B.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was sponsored by the Korea Ministry of Trade, Industry and Energy (MOTIE) and the Korea Evaluation Institute of Industrial Technology (KEIT) through the Technology Innovation Program funding for the “Development of optimization technology for pipe-cable auto-routing design linked to carbon reduction model” project (Grant No. RS-2022-00143619).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The views expressed in this paper are solely those of the authors and do not represent those of any official organization or research sponsor.

Conflicts of Interest

Author Hyeong-Rok Kim was employed by the company Pohang Iron and Steel Company (POSCO). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Abbreviations

The following abbreviations and parameters are used in this paper:

AI	Artificial Intelligence
ANN	Artificial Neural Network
AP	Average Precision
CBM	Condition-Based Maintenance
CNN	Convolutional Neural Network
COCO	Common Objects in Context
CV	Computer Vision
DGA	Dissolved Gas Analysis
FCN	Fully Convolutional Network
FPS	Frames Per Second
GPU	Graphics Processing Unit
IoU	Intersection over Union
JSON	JavaScript Object Notation
LSTM	Long Short-Term Memory
ML	Machine Learning
MLP	Multi-Layer Perceptron
PdM	Predictive Maintenance
R-CNN	Region-based Convolutional Neural Network
RoI	Region of Interest
RTK	Real-Time Kinematic
SHM	Structural Health Monitoring
TN	True Negative
TP	True Positive
FN	False Negative
FP	False Positive
UAV	Unmanned Aerial Vehicle
YOLO	You Only Look Once
$∆x$	Absolute Deviation Sum (unit: px)
$r$	Area Ratio (unit: %)
$A_{box}$	Bounding Box Area of Insulator (unit: px²)
$A_{seg}$	Segmentation Area of Insulator (unit: px²)
$P_{t}$	Point Top
$P_{m}$	Point Middle
$P_{b}$	Point Bottom
$θ_{LR}$	Threshold for Lightning Rod (unit: px)
$θ_{I}$	Threshold for Insulator (unit: %)


Figure 1. Lightning rods. (a) Lightning rod installed on a steel-mill chimney (yellow box); (b) close-up view of the lightning rod.

Figure 2. Electrical insulator. (a) Insulator mounted on a transmission tower (yellow box); (b) close-up view of the insulator.

Figure 3. The research process.

Figure 4. Industrial inspection for data acquisition at Company P’s steel mill site.

Figure 5. Detailed workflow of the POSAI VISION automated image preprocessing pipeline.

Figure 6. Examples of labeling for lightning rods and insulators. (a) Lightning rods installed on structures (e.g., exhaust stacks and cylindrical tanks); (b) suspension insulator strings on overhead transmission lines (attached to transmission towers).

Figure 7. Architecture of the proposed ADS-LI. (a) Overall architecture; (b) process flow of lightning rod anomaly decision; (c) process flow of insulator anomaly decision.

Figure 8. Box loss and segmentation loss over iterations.

Figure 9. Instance segmentation results for lightning rods and insulators obtained by Mask R-CNN_R101 and PointRend_R101. (a) Lightning rods; (b) insulators.

Figure 10. Examples of Grad-CAM-based class activation maps for insulators and lightning rods. (a) Lightning rods; (b) insulators.

Figure 11. Examples of anomaly detection results for lightning rods through ADS-LI. (a) Normal; (b) abnormal.

Figure 12. Examples of anomaly detection results for insulators through ADS-LI. (a) Normal; (b) abnormal.

Table 1. Learning hyperparameters.

Item	Set Value
Number of epochs	300
Batch size	4

Table 2. Detection results of lightning rod diagnosis according to threshold θ_LR.

Test	$θ_{LR}$ (px)	Normal Detection for Lightning Rods	Model Selection
1st	5	9
2nd	10	10	*✓
3rd	15	10
4th	20	10

*✓ indicates the selected model.

Table 3. Detection results of insulator diagnosis according to threshold θ_I.

Test	$θ_{I}$ (%)	Normal Detection for Insulators	Model Selection
1st	30	10
2nd	35	10
3rd	40	10
4th	45	10
5th	50	10
6th	55	10	*✓
7th	60	9
8th	65	8
9th	70	8

*✓ indicates the selected model.
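The threshold-based decisions summarized in Tables 2 and 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the exact formulas are defined in the methods section and are not reproduced here. It assumes that $∆x$ is the sum of absolute horizontal offsets between the top, middle, and bottom points ($P_{t}$, $P_{m}$, $P_{b}$), that $r$ is the segmentation area divided by the bounding-box area in percent, and that an object is flagged as abnormal when $∆x$ exceeds $θ_{LR}$ or when $r$ falls below $θ_{I}$. All function names and the sample coordinates are hypothetical; only the selected thresholds (10 px and 55%) come from the tables.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in pixels

THETA_LR_PX = 10.0  # selected lightning rod threshold (Table 2)
THETA_I_PCT = 55.0  # selected insulator threshold (Table 3)


def deviation_sum(p_top: Point, p_mid: Point, p_bottom: Point) -> float:
    """Assumed form of the absolute deviation sum: horizontal shifts between points."""
    return abs(p_top[0] - p_mid[0]) + abs(p_mid[0] - p_bottom[0])


def rod_is_abnormal(p_top: Point, p_mid: Point, p_bottom: Point,
                    theta_lr: float = THETA_LR_PX) -> bool:
    """Flag a lightning rod as abnormal when the deviation exceeds the threshold."""
    return deviation_sum(p_top, p_mid, p_bottom) > theta_lr


def area_ratio(seg_area_px: float, box_area_px: float) -> float:
    """Segmentation area as a percentage of the bounding-box area."""
    if box_area_px <= 0:
        raise ValueError("bounding-box area must be positive")
    return 100.0 * seg_area_px / box_area_px


def insulator_is_abnormal(seg_area_px: float, box_area_px: float,
                          theta_i: float = THETA_I_PCT) -> bool:
    """Flag an insulator as abnormal when the area ratio falls below the threshold."""
    return area_ratio(seg_area_px, box_area_px) < theta_i


# Hypothetical inputs for illustration only.
print(rod_is_abnormal((100, 20), (103, 120), (104, 220)))   # deviation 4 px
print(rod_is_abnormal((100, 20), (112, 120), (125, 220)))   # deviation 25 px
print(insulator_is_abnormal(6200, 10000))                   # ratio 62%
print(insulator_is_abnormal(4100, 10000))                   # ratio 41%
```

The direction of each comparison is inferred from the tables: a stricter $θ_{LR}$ of 5 px and a higher $θ_{I}$ of 60% or more both reduce the number of normal samples recognized as normal.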

Table 4. Model execution environment.

Item	Version
Programming language	Python 3.8.18
Key framework	Detectron2 v0.6
Instance segmentation model	PointRend_R101
Deep learning framework	PyTorch 1.10.2
CUDA version	CUDA 11.3
Image processing library	OpenCV 4.5.5
Operating system	Ubuntu 20.04 LTS
GPU specification	NVIDIA RTX A5000 (24 GB)

Table 5. Comparison of instance segmentation performance between Mask R-CNN_R101 and PointRend_R101.

Model	Backbone	Size (Pixel)	${mAP}_{50}$ (%)	${mAP}_{75}$ (%)
Mask R-CNN	R101-FPN	1280	48.2	32.1
PointRend	R101-FPN	1280	51.8	35.1

Table 6. Diagnostic performance of the ADS-LI system.

Category	TP	TN	FP	Accuracy	Precision	Recall	F1-score
Lightning rods	2	43	0	1.00	1.00	1.00	1.00
Insulators	2	42	1	0.98	0.67	1.00	0.80
Total	4	85	1	0.99	0.80	1.00	0.89
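The values in Table 6 follow from the standard confusion-matrix definitions, as the short sketch below illustrates. The table does not list FN; the sketch assumes FN = 0 for every row, which is what a recall of 1 with TP > 0 implies. The function name `diagnostic_metrics` is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, tn: int, fp: int, fn: int = 0) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# (TP, TN, FP) counts taken from Table 6; FN is assumed to be 0.
rows = {
    "Lightning rods": (2, 43, 0),
    "Insulators": (2, 42, 1),
    "Total": (4, 85, 1),
}

for name, (tp, tn, fp) in rows.items():
    m = diagnostic_metrics(tp, tn, fp)
    print(name, {key: round(value, 2) for key, value in m.items()})
```

Rounded to two decimals, the computed values match the table. Note that a single false positive lowers insulator precision to 0.67 because only two true positives are available.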

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kim, H.-R.; Choi, S.-W.; Lee, E.-B.; Kim, G.-W. ADS-LI: A Drone Image-Based Segmentation Model for Sustainable Maintenance of Lightning Rods and Insulators in Steel Plant Power Infrastructure. Sustainability 2025, 17, 11151. https://doi.org/10.3390/su172411151

