TPKE: Automated Keypoint Extraction for Multi-Type Transmission Pylons from LiDAR Point Clouds

Wu, Gufen; Gao, Yuan; Liu, Haibo; Zhang, Su; Yang, Zhou; Wang, Pu; Zhou, Yibing; Cheng, Sijin; Nie, Sheng; Wang, Cheng; Wang, Haoyu

doi:10.3390/rs18030429


by Gufen Wu ¹,², Yuan Gao ², Haibo Liu ³, Su Zhang ³, Zhou Yang ², Pu Wang ²,⁴, Yibing Zhou ⁵, Sijin Cheng ⁵, Sheng Nie ², Cheng Wang ¹,² and Haoyu Wang ¹,²,*

¹ College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541004, China
² Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
³ State Grid Economic and Technological Research Institute Co., Ltd., Beijing 102209, China
⁴ Zhengzhou Institute for Advanced Research, Henan Polytechnic University, Zhengzhou 451464, China
⁵ State Grid Jilin Electric Power Co., Ltd. Construction Company, Changchun 130021, China
* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(3), 429; https://doi.org/10.3390/rs18030429

Submission received: 24 November 2025 / Revised: 25 January 2026 / Accepted: 26 January 2026 / Published: 29 January 2026


Highlights

What are the main findings?

The TPKE method achieves centimeter-level precision (MAE of 0.0747 m for insulators, 0.0696 m for grounding wires) across 1427 pylons spanning 14 tower types, demonstrating strong generalization and robustness to data deficiency.
A novel “concavity” morphological metric enables accurate identification and segmentation of V-shaped insulator strings, while the localization-verification-compensation strategy effectively addresses endpoint data deficiency issues prevalent in real-world LiDAR acquisitions.

What are the implications of the main findings?

The automated keypoint extraction framework significantly reduces manual labeling workload (average processing time of 3.03 s per tower), supporting efficient autonomous navigation and accurate target positioning for drones in large-scale transmission line inspection.
The method is robust to point cloud sparsity, density variation, and structural diversity across tower types, and provides a practical solution for intelligent power grid operation, maintenance, and inspection automation.

Abstract

Automated positioning of transmission tower keypoints is crucial for drone-based intelligent inspection systems. This paper proposes TPKE (Transmission Pylons Keypoint Extraction), a novel framework designed to extract multiple transmission tower keypoints from LiDAR point clouds. The method targets two core components: insulator string endpoints and ground wire hanging points. For insulator positioning, TPKE introduces adaptive density clustering, a morphological “concavity” index (η) for V-shaped insulators, and a “positioning-verification-compensation” strategy for handling missing data. For ground wire positioning, it combines local geometric feature analysis with spatial orthogonal projection. Using semantic segmentation for preprocessing, the framework reliably identifies components from complex transmission corridor point clouds. Validated on 1427 towers across 14 types, TPKE achieves an MAE of 0.0747 m for insulators and 0.0696 m for ground wires. It maintains centimeter-level accuracy even under challenging conditions like sparse point clouds. With an average processing time of 3.03 s per tower, the method demonstrates high efficiency, significantly reducing manual annotation workload while supporting autonomous navigation for transmission line maintenance.

Keywords:

transmission pylon inspection; pylon keypoint extraction; target point localization; LiDAR; point cloud processing

1. Introduction

In recent years, unmanned aerial vehicle (UAV) technology has been widely used in transmission line inspection, providing an effective solution to overcome the limitations of traditional inspection methods [1,2]. The traditional manual inspection method is inefficient, prone to omissions, labor-intensive, and difficult to make quantitative evaluations, so it cannot meet the urgent needs of intelligent management of modern power grids [3]. With the rapid development of remote sensing and artificial intelligence technology, the unmanned aerial vehicle inspection system equipped with a LiDAR sensor can quickly obtain high-precision and high-density three-dimensional point cloud data from transmission line corridors [4]. With their excellent data acquisition capabilities and reliability, these systems provide key technical support for the refined monitoring of power transmission facilities [5,6,7].

However, although onboard LiDAR technology has shown significant advantages in data collection, there are still many challenges in the intelligent processing of point cloud data. Especially in complex point cloud data, accurately identifying and segmenting key targets such as wires, towers, and insulators, and accurately locating key monitoring points, has become a bottleneck that restricts the development of autonomous UAV inspection technology [8]. At present, the power patrol technology based on drones has gradually developed from the manual control of early visible light photography to the intelligent patrol mode based on the LiDAR point cloud [9,10,11]. However, in practice, inspectors still need to manually screen the target points in a large amount of three-dimensional point cloud data indoors to plan the inspection route. This process is not only time-consuming and labor-intensive but also prone to errors [12,13]. Therefore, the automatic extraction and accurate positioning of keypoints is of great theoretical significance and practical value for further improving the efficiency of inspection, reducing the cost of operation and maintenance, and promoting the intelligent development of autonomous UAV inspection technology [14,15].

In the power transmission system, in addition to path planning, precise positioning of key components such as insulator strings is equally important for fault diagnosis and preventive maintenance. Keypoint positioning has been widely studied in computer vision and robotics. In two-dimensional image processing, deep learning-based target detection and keypoint extraction methods have made remarkable progress in the detection of power facilities, including insulator detection and defect identification [16,17]. However, keypoint extraction from three-dimensional point clouds still faces many challenges, such as high computational complexity and strict data quality requirements [18,19]. Point cloud data are unstructured, unordered, and sparse, which makes it difficult to apply traditional two-dimensional methods directly to three-dimensional point clouds [20,21,22].

As a key component of the transmission system, the insulator string must be monitored to ensure the safe operation of the power grid [23]. Existing insulator defect detection methods mainly rely on visible light or infrared images [24], and deep learning-based defect identification has made remarkable progress. Liu et al. [25] reviewed deep learning-based insulator defect detection methods, pointing out that existing methods still face challenges in detection accuracy and real-time performance in complex environments. These image-based methods are highly susceptible to lighting conditions, camera angle, weather, and other factors; under strong light, shadow occlusion, or smog, detection accuracy drops significantly [26]. In contrast, point cloud-based insulator keypoint extraction can provide more accurate and stable three-dimensional location information [27], but research in this field is still limited [28]. Shen et al. [28] proposed an automatic pylon detection framework based on hierarchical coarse-to-fine segmentation, which improved detection accuracy in complex scenarios through a multi-scale segmentation strategy. However, this method mainly targets the main structure of the tower, and its ability to locate fine components such as insulators is limited. Chen et al. [29] proposed an insulator string extraction method based on multi-scale feature histograms, which achieves automatic identification through point cloud geometric feature analysis. However, this method relies on a priori knowledge of towers and transmission lines, and its adaptability to complex scenarios still needs to be improved. Zhang et al. [30] proposed an automatic tower extraction framework based on UAV LiDAR point clouds, demonstrating the potential of point cloud data for identifying power facilities and laying the foundation for subsequent research.

Whether for insulator strings or other power transmission facilities, efficient three-dimensional point cloud keypoint detection is the core technology. Keypoint detection has been widely studied in three-dimensional point cloud processing. Early methods were mainly based on geometric features and identified feature points through indicators such as curvature and normal vector variation [31,32]. Traditional three-dimensional keypoint detectors, such as intrinsic shape signatures (ISS) [33], identify salient feature points from the eigenvalue distribution of local point cloud neighborhoods and show good robustness in tasks such as rigid body matching. Traditional geometric feature descriptors, such as linearity, planarity, and scattering based on principal component analysis (PCA), have been widely used in the geometric structure analysis of point clouds [34]. These features describe the distribution of the point cloud along its principal directions through the eigenvalue ratios of the covariance matrix. However, PCA features have limitations for V-shaped insulator strings: the V-shaped structure is composed of two approximately linear branches, so the overall PCA may still show high linearity, making it difficult to distinguish V-shaped from single-string linear insulators. These methods, based on handcrafted features, often face challenges such as low computational efficiency and limited feature representation ability when dealing with complex scenarios and large-scale point clouds.

In recent years, deep learning methods have made remarkable progress in the field of three-dimensional keypoint detection. PointNet [35] and its improved version, PointNet++ [36], pioneered the deep learning architecture of direct point cloud processing, realizing permutation invariance through multi-layer perceptrons and symmetric functions, thus laying the foundation for subsequent research. Subsequently, various point cloud processing methods based on deep learning were proposed, including Relation-Shape Convolutional Neural Networks [37] and Point Transformer [38], which enhance feature extraction capabilities through self-attention mechanisms and more complex network architectures. These methods learn the relationship between points and context information, achieve higher accuracy in tasks such as shape classification and semantic segmentation [20], and provide a new technical way for keypoint detection in complex scenarios.

For transmission tower keypoint detection specifically, Wu et al. [39] proposed the UPKD method. It uses unsupervised learning to extract detection points from three-dimensional LiDAR data and identifies the structural characteristics of the tower through clustering algorithms, showing potential for detecting tower keypoints. Li et al. [15] developed a deep learning-based target point positioning algorithm for drone patrol route planning and realized end-to-end detection from point cloud to inspection target point. Although end-to-end deep learning methods have made remarkable progress in two-dimensional image keypoint detection in recent years, many challenges remain in applying them directly to transmission tower keypoint detection in three-dimensional point clouds. First, in terms of data requirements, end-to-end deep learning requires large-scale annotated data for training, and three-dimensional annotation of transmission tower keypoints is expensive and time-consuming, making it difficult to accumulate sufficient training samples quickly [40]. Second, in terms of generalization, different types of transmission towers differ significantly in structure. The performance of deep learning models may fluctuate on previously unseen tower types or under different acquisition conditions (such as different distances and point cloud densities), which requires continuous data accumulation and model retraining. Third, in terms of deployment, the “black box” nature of deep learning models limits their interpretability and debuggability, and their real-time performance needs to be improved, which makes it difficult to meet the online processing needs of large-scale point cloud data. The SKD method proposed by Tinchev et al. [41] uses saliency-based keypoint detection, which alleviates the dependence on annotated data to a certain extent, but its applicability in transmission scenarios has not been fully verified. The weakly supervised learning method (“One Thing One Click”) proposed by Liu et al. [42] reduces the labeling workload through self-training strategies and provides new ideas for reducing labeling costs. However, this method mainly targets semantic segmentation, and its application to keypoint detection still needs further study. The method design in this paper therefore focuses on the following practical needs: (1) no large-scale annotated training set is required, and only a small number of samples are needed for parameter tuning, which significantly reduces the cost of data preparation; (2) the algorithm logic is clear and transparent, which makes targeted adjustment and troubleshooting for a given scenario convenient; and (3) the method generalizes well across tower types and point cloud qualities and can adapt to new scenarios without retraining.

The extraction of keypoints of transmission towers faces the following core challenges: (1) The structure of transmission towers is complex and diverse, and the geometric characteristics of different types of towers are significantly different [43,44]; (2) The quality of point cloud data is affected by factors such as scanning distance and occlusion, which is manifested as uneven density and missing local data [45]; and (3) The positioning accuracy of keypoints needs to be further improved in order to achieve precise target point positioning to meet the needs of autonomous UAV inspection applications [15].

The main contribution of this paper is the TPKE method, which automatically extracts key detection points from multiple types of transmission towers after semantic segmentation. The method is organized as follows: Section 2.1 briefly introduces the semantic segmentation preprocessing module. Section 2.2 introduces the core algorithm of TPKE, which contains two dedicated keypoint extraction modules: (1) Insulator string module: adaptive DBSCAN clustering segments the insulator point cloud, the “concavity” morphological metric (η) identifies V-shaped insulators, and a positioning-verification-compensation strategy, built on oriented bounding box analysis and axial extension search, addresses missing endpoint data. (2) Ground wire module: candidate connection areas are identified through local geometric feature analysis (PCA-based linearity and planarity measures), and precise positioning is achieved through spatial orthogonal projection and weighted centroid estimation.

2. Methods and Materials

This paper proposes a new automated framework called TPKE (Transmission Pylons Keypoint Extraction) for keypoint positioning on multiple types of transmission towers. The method develops dedicated extraction algorithms for two main inspection objects: the insulator string endpoint and the ground wire connection point. For insulator positioning, TPKE introduces (a) adaptive density clustering to handle different point cloud densities; (b) a morphological “concavity” measure for robust identification of V-shaped insulators; and (c) a positioning-verification-compensation strategy for handling missing endpoint data. For ground wire positioning, the method adopts local geometric feature analysis combined with spatial orthogonal projection. To ensure the reliability of keypoint extraction, the preprocessing step combines optimized semantic segmentation, noise-robust processing, and structure-aware convolution modules. This study divides the keypoints of the tower into two categories: the grounding point, indicating the intersection between the ground wire and the tower, such as points 6 and 7 in Figure 1; and the insulator string keypoint, indicating the connection position of the insulator string, such as points 0–5 and points 8–11 in Figure 1.

The point cloud of the power scene has the characteristics of sparseness, disorder, and serious noise interference. In order to obtain reliable tower component identification semantic labels, this paper adopts a preprocessing framework that combines k-nearest neighbor (KNN)-based noise filtering, an ellipsoidal sampling strategy, and focal loss [46] to enhance the segmentation robustness in power scenarios. On the 220 kV transmission corridor data set, the semantic segmentation results achieved an OA of 0.923 and an mIoU of 0.891, providing a clear tower point cloud for subsequent analysis.

The overall technical framework of this paper is shown in Figure 2.

2.1. Power Scene Semantic Segmentation

In order to obtain high-quality tower component identification semantic labels, we adopted the robust segmentation framework proposed by Liu et al. [46]. The framework solves the problems of discrete noise, spatial anisotropy, and category imbalance through the following mechanisms:

(1): KNN-Based Noise-Robust Processing

The specific principle is as follows: when the model designates a point as the center of the convolutional sphere, the KNN algorithm evaluates whether the point constitutes noise. If the point lacks enough neighboring points in its neighborhood, indicating that it is relatively sparse, it is classified as potential noise and sampling of the area is skipped, reducing noise interference during model training. Given a dataset $D = \{x_1, x_2, \dots, x_N\}$, where each $x_i \in \mathbb{R}^d$, for any point $x \in D$, let its k-nearest neighbor set be $N_k(x)$, representing the $k$ points closest to $x$ (excluding $x$ itself). Compute the average distance to the k-nearest neighbors:

$$\bar{d}_k(x) = \frac{1}{k} \sum_{x_j \in N_k(x)} \lVert x - x_j \rVert_2 \tag{1}$$

Then collect the neighbors within the distance threshold $R(x) = \alpha \cdot \bar{d}_k(x)$:

$$N_R(x) = \{x_j \in N_k(x) : \lVert x - x_j \rVert_2 < R(x)\} \tag{2}$$

If $|N_R(x)| < \beta \cdot k$ (where $\alpha = 0.5$, $\beta = 0.3$), then point $x$ is classified as latent noise and bypassed during sampling. This adaptive threshold ensures that points with unusually sparse neighborhoods relative to their local contexts are identified as noise.
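The test in Equations (1) and (2) can be sketched in a few lines. The following is a minimal, brute-force illustration that follows the equations literally; the function name `knn_noise_mask` and the choice of `k` are assumptions made for the example and do not come from the original framework.

```python
import math

def knn_noise_mask(points, k=5, alpha=0.5, beta=0.3):
    """Return one boolean per point: True where the point is flagged as latent noise.

    Hypothetical helper; brute-force neighbor search, suitable only for small inputs.
    """
    flags = []
    for i, x in enumerate(points):
        # Distances from x to every other point, ascending.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        knn = dists[:k]
        mean_d = sum(knn) / len(knn)                  # Equation (1)
        radius = alpha * mean_d                       # R(x)
        n_close = sum(1 for d in knn if d < radius)   # |N_R(x)|, Equation (2)
        flags.append(n_close < beta * k)
    return flags

# Four points along a line plus one distant outlier.
pts = [(0, 0, 0), (1, 0, 0), (8, 0, 0), (9, 0, 0), (1000, 0, 0)]
print(knn_noise_mask(pts, k=3))
```

In this toy input only the distant point is flagged. Because R(x) is a fraction of the mean k-NN distance, the outcome depends on how the k distances are spread around their mean, so the behavior on real data is sensitive to α, β, and k.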

(2): Ellipsoidal Structure-Aware Convolution

In the improved convolution operation, the neighborhood $N_x$ is redefined as an ellipsoid centered at the point $x$, wherein the semi-axis length in the $Z$-axis direction is relatively short, while those in the $XY$-plane directions are relatively long. The core principle resides in expanding the ellipsoid’s extent in the $XY$-plane directions while contracting its semi-axis length in the $Z$-direction, thereby causing the network to prioritize horizontal-direction ($XY$) neighboring points during feature extraction and convolution aggregation, while attenuating interference from vertical-direction neighbors. Consequently, for linear targets such as conductors and grounding wires that exhibit a more pronounced distribution in the $XY$-plane while remaining relatively sparse in the $Z$-direction, the network can acquire more comprehensive and discriminative geometric information. As illustrated in Figure 3, an intuitive comparison between ellipsoidal sampling and traditional spherical sampling is presented. Figure 3a,b shows that ellipsoidal sampling can capture denser and continuous point clouds, thus outlining the structure of the conductor more completely. The bar chart on the right further quantifies the difference in feature coverage of the two sampling methods, indicating that ellipsoidal sampling significantly improves the feature coverage of the conductor.
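As a rough illustration of the ellipsoidal neighborhood, the membership test below selects points inside an axis-aligned ellipsoid that is wide in XY and flat in Z. The semi-axis values `a_xy` and `c_z` and the function name are assumptions chosen for the example; the source does not state the values used.

```python
def ellipsoid_neighbors(center, points, a_xy=2.0, c_z=0.5):
    """Return the points inside an ellipsoid with XY semi-axis a_xy and Z semi-axis c_z.

    Hypothetical helper; semi-axis lengths are illustrative, not the paper's values.
    """
    cx, cy, cz = center
    selected = []
    for (x, y, z) in points:
        # Standard ellipsoid inequality: the point is inside when q <= 1.
        q = ((x - cx) / a_xy) ** 2 + ((y - cy) / a_xy) ** 2 + ((z - cz) / c_z) ** 2
        if q <= 1.0:
            selected.append((x, y, z))
    return selected

candidates = [(1.5, 0.0, 0.0), (0.0, 0.0, 1.5), (0.0, 0.0, 0.4), (3.0, 0.0, 0.0)]
print(ellipsoid_neighbors((0.0, 0.0, 0.0), candidates))
```

A point 1.5 m away horizontally is kept while a point 1.5 m away vertically is rejected, which is the anisotropy the paragraph describes.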

(3): Class Imbalance Loss Function

Focal loss is designed to address imbalanced class distributions. Compared with the traditional cross-entropy loss, it offers a clear advantage in point cloud semantic segmentation. The core of the loss function is to introduce the focusing parameter $\gamma$ and the class balancing factor $\alpha$ to reduce the weight of easily classified samples in the loss calculation and increase the contribution of hard-to-classify samples, guiding the model to focus more effectively on key categories and complex samples. The loss function is given in Equation (3):

$$FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t) \tag{3}$$

where $p_t$ represents the model’s predicted probability for the true class; $\alpha_t$ denotes the class weight used to balance sample quantities across different classes; and $\gamma$ represents the focusing parameter that controls the degree of model attention toward hard-to-classify samples.
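The effect of Equation (3) is easy to see numerically. The sketch below evaluates the loss for a single sample; the default values `alpha_t=0.25` and `gamma=2.0` are common choices in the focal loss literature and are assumptions here, since the source does not report the values used.

```python
import math

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """Focal loss of Equation (3) for one sample with true-class probability p_t."""
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy sample (p_t = 0.9) contributes far less than a hard one (p_t = 0.1).
print(focal_loss(0.9), focal_loss(0.1))
```

Setting `gamma=0` and `alpha_t=1` reduces the expression to the ordinary cross-entropy term, which makes the role of the two parameters explicit.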

2.2. Keypoint Extraction

2.2.1. Insulator String Keypoint Extraction

(1): Adaptive DBSCAN Clustering Algorithm

The traditional DBSCAN algorithm relies on a globally fixed neighborhood radius and minimum number of points (MinPts) required for a core object, so it is difficult to obtain satisfactory clustering performance when processing point cloud data with significant density changes. This study adopts an adaptive DBSCAN clustering strategy based on local density estimation to dynamically determine the neighborhood radius of each point in the point cloud so that the algorithm can better adapt to changes in local point cloud density. The calculation formula is as follows:

$$\varepsilon_i = \varepsilon_{base} + \lambda \left( \frac{d_k(p_i)}{\bar{d}_k} \right) \varepsilon_{base} \tag{4}$$

where $p_i$ represents the three-dimensional coordinates of the $i$-th point in the point cloud dataset $P = \{p_i \mid i = 1, 2, \dots, n\}$, $\varepsilon_{base}$ denotes the base neighborhood radius, $\bar{d}_k$ represents the average $k$-nearest neighbor distance of the dataset, and $d_k(p_i)$ is defined as the Euclidean distance from point $p_i$ to its $k$-th nearest neighbor, serving as a critical metric for assessing local point cloud density around point $p_i$. A smaller $d_k(p_i)$ value indicates higher density in the region where $p_i$ resides. The parameter $k$ represents the number of neighboring points used when computing the k-distance, with the $k$ value defining the scale of local density estimation. The algorithm’s core principle employs the ratio of k-distances for relative density assessment rather than relying on absolute k-distance values; consequently, within a reasonable interval ($k$ between 5 and 10), the algorithm’s final results do not exhibit substantial variation. Additionally, $\lambda$ denotes the density adaptation coefficient, which controls how strongly the neighborhood radius responds to local point cloud density; in this study, $\lambda$ is set to 1.2.
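A minimal sketch of the per-point radius in Equation (4) is given below. It assumes that the dataset-wide average is the mean of the k-th neighbor distances over all points; `eps_base=0.3` and the function name are hypothetical, while `lam=1.2` follows the text.

```python
import math

def adaptive_eps(points, eps_base=0.3, k=5, lam=1.2):
    """Per-point neighborhood radius following Equation (4).

    Hypothetical helper; brute-force k-distance, requires len(points) > k.
    """
    kdist = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kdist.append(d[k - 1])              # distance to the k-th nearest neighbor
    mean_kdist = sum(kdist) / len(kdist)    # assumed definition of the dataset average
    return [eps_base + lam * (dk / mean_kdist) * eps_base for dk in kdist]

pts = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
print(adaptive_eps(pts, k=1))
```

Points in sparse regions, such as the last one in the toy input, receive a larger radius than points in dense regions. How these radii are consumed by the clustering step is not shown here.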

(2): V-Shaped Insulator String Segmentation Based on PCA and K-Means

According to structural characteristics, insulator strings are primarily classified into linear insulator strings and V-shaped insulator strings. During point cloud clustering, V-shaped insulator strings are frequently clustered into a single cluster. To accurately identify V-shaped insulator strings, this paper introduces a morphological parameter, the “concavity” metric $\eta$, for insulator string determination, with the principle illustrated in Figure 4.

This metric is defined as follows:

$$\eta = \frac{d_{min}}{\bar{d}} \tag{5}$$

where $d_{min}$ represents the minimum Euclidean distance from all points in the point cloud cluster to its centroid, and $\bar{d}$ denotes the average Euclidean distance from all points within the cluster to the centroid. For morphologically compact point cloud clusters, the centroid is located within the point cloud interior, with certain points invariably positioned in close proximity to the centroid, yielding an extremely small $d_{min}$ value and consequently causing $\eta$ to approach 0. Conversely, for structures exhibiting extended morphology with “hollow” interiors such as V-shaped configurations, the centroid occupies a vacant position, thereby maintaining $d_{min}$ at a relatively large value and preventing $\eta$ from approaching 0. Therefore, this metric effectively discriminates between these two structural types; when $\eta \geq 0.3$, the point cloud cluster can be classified as a V-shaped insulator string.
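Equation (5) translates directly into code. The sketch below computes η for a cluster given as a list of 3D tuples; the function name and the synthetic test shapes are illustrative assumptions.

```python
import math

def concavity(points):
    """Concavity metric of Equation (5): min centroid distance over mean centroid distance."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    dists = [math.dist(p, centroid) for p in points]
    return min(dists) / (sum(dists) / n)

# A straight string: the centroid coincides with a sampled point, so eta is 0.
line = [(float(i), 0.0, 0.0) for i in range(11)]
# An idealized V: apex at the origin, two arms opening upward.
v_shape = [(0.0, 0.0, 0.0)] + [(s * i, 0.0, float(i)) for i in range(1, 11) for s in (-1.0, 1.0)]

print(concavity(line), concavity(v_shape))
```

For these idealized shapes the straight string falls below the 0.3 threshold and the V-shaped one lies above it, matching the decision rule stated above.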

For an identified V-shaped insulator string $C_v = \{c_j \mid j = 1, 2, \dots, m\}$ (where $m$ represents the point count of the point cloud cluster and $c_j$ denotes the three-dimensional coordinates of the $j$-th point within the cluster), K-Means clustering is employed for segmentation. First, PCA is performed on the XY-plane projection $C_{v,xy} = \{c_{j,xy} \mid j = 1, 2, \dots, m\}$ of the V-shaped point cloud cluster $C_v$. Through the transformation matrix $W \in \mathbb{R}^{2 \times 2}$, the point cloud is projected onto a new two-dimensional principal component space:

$$C_v' = C_{v,xy} \times W \tag{6}$$

Leveraging geometric priors of V-shaped structures, the two extreme points of the projected point set $C_v'$ along the first principal component direction are selected as the initial clustering centers $c_1$ and $c_2$ for the K-Means algorithm (k = 2). Based on the selected initial center points $C = \{c_1, c_2\}$, K-Means clustering is executed on the projected point set $C_v'$ for segmentation, with segmentation results illustrated in Figure 5. Upon completion of segmentation, each sub-cluster $S_j$ is assigned a new category identifier, and its critical attribute parameter set $\Phi_j = \{c_j^{3D}, d_j, \bar{r}_j\}$ is computed, where $c_j^{3D}$ represents the centroid of the sub-cluster in the original three-dimensional space, $d_j$ denotes the principal direction vector (typically obtained by re-applying PCA to the sub-cluster point cloud), and $\bar{r}_j$ represents the average radius of the sub-cluster.
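The splitting step can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses the closed-form principal axis of a 2×2 covariance matrix in place of a general PCA routine, runs a fixed number of K-Means iterations, and omits the attribute set Φ. The function name and iteration count are assumptions.

```python
import math

def split_v_cluster(points, iters=20):
    """Split a V-shaped cluster into two branches; returns a 0/1 label per point."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    xy = [(p[0] - mx, p[1] - my) for p in points]   # centered XY projection
    # Entries of the 2x2 covariance matrix of the projection.
    sxx = sum(x * x for x, _ in xy) / n
    syy = sum(y * y for _, y in xy) / n
    sxy = sum(x * y for x, y in xy) / n
    # Closed-form angle of the first principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate into principal component space (the role of W in Equation (6)).
    proj = [(x * c + y * s, -x * s + y * c) for x, y in xy]
    # Initial centers: extreme points along the first principal component.
    centers = [min(proj), max(proj)]
    labels = [0] * n
    for _ in range(iters):
        labels = [0 if math.dist(q, centers[0]) <= math.dist(q, centers[1]) else 1
                  for q in proj]
        for k in (0, 1):
            members = [q for q, lab in zip(proj, labels) if lab == k]
            if members:
                centers[k] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels

left = [(-float(i), 0.0, float(i)) for i in range(1, 11)]
right = [(float(i), 0.0, float(i)) for i in range(1, 11)]
print(split_v_cluster(left + right))
```

Seeding K-Means with the two extremes along the first principal component places one initial center on each branch, so the iteration does not depend on a random initialization.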

(3): Precise Localization Method for Insulator String Endpoint Connections

The endpoints of insulator strings constitute critical locations for achieving pylon-conductor and pylon-jumper wire connections. This paper designs a specialized method for addressing localization deviations caused by data deficiency, employing a “localization-verification-compensation” workflow to ensure reliable localization results even under conditions of incomplete endpoint data. For each effectively segmented insulator string point cloud cluster, PCA is first applied to determine its primary extension direction in three-dimensional space. Using this principal axis direction as the core orientation reference, an Oriented Bounding Box (OBB) is constructed to tightly enclose the point cloud cluster. The geometric center points of the two most distant end faces along the principal axis direction of this OBB are defined as the initial candidate endpoints of the insulator string, as illustrated in Figure 6.

According to the actual pylon structure, insulator endpoints must establish physical connections with components such as the pylon body, conductors, or fittings. Accordingly, for each initial candidate endpoint, a spherical search region with radius $r$ is constructed around it to check for the presence of pylon body, conductor, or jumper wire point clouds. If sufficient pylon body or conductor point clouds are detected within the neighborhood, indicating complete point cloud data at this location, the position of the initial candidate endpoint is designated as the final keypoint. If no connecting component point clouds are detected within the neighborhood, the point cloud data near this initial candidate endpoint is determined to be incomplete, triggering entry into the axial extension search phase.

As shown in Figure 7, the axial extension search first determines the overall trajectory of the insulator string based on the two initial candidate endpoints and then determines the outward extension search direction of each endpoint. Considering the possible curvature of the insulator string morphology, the algorithm will detect and evaluate the morphology of the insulator string before determining the extension direction.

Curvature of the insulator string is detected by piecewise linear fitting: the initial candidate insulator string is divided into several equal-length segments, the principal direction vector of each segment is calculated, and the direction angles between adjacent segments are compared to determine whether significant curvature is present. If the maximum angle difference between the principal directions of all segments is less than 16.5°, the insulator string is classified as linear; otherwise, it is classified as curved. For linear insulator strings, the global principal direction of the entire string determines the direction of the extension search. For curved insulator strings, the segment-wise principal direction is adopted: when an extension search is carried out at a specific endpoint, the local principal direction of the string segment nearest that endpoint is used as the search direction vector.
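The linear-versus-curved decision can be illustrated once the per-segment direction vectors are available. The sketch below assumes those vectors have already been computed (for example by PCA on each segment) and compares adjacent segments only, which is one reading of the text; the function name is hypothetical.

```python
import math

def classify_string(segment_dirs, max_angle_deg=16.5):
    """Classify a string as 'linear' or 'curved' from adjacent segment directions.

    Returns the label and the largest adjacent-segment angle in degrees.
    """
    def angle(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        # abs() treats opposite-pointing axes as the same direction, since the
        # sign of a PCA axis is arbitrary.
        cosang = min(1.0, abs(dot) / (nu * nv))
        return math.degrees(math.acos(cosang))

    pairs = zip(segment_dirs, segment_dirs[1:])
    worst = max((angle(u, v) for u, v in pairs), default=0.0)
    return ("linear" if worst < max_angle_deg else "curved"), worst

print(classify_string([(0, 0, 1), (0, 0.1, 1), (0, 0.2, 1)]))
print(classify_string([(0, 0, 1), (0, 0.5, 1)]))
```

The first input bends by only a few degrees per segment and is treated as linear; the second bends by more than the 16.5° threshold and is treated as curved.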

Subsequently, commencing from the initial endpoint requiring correction and proceeding along the determined outward direction, a series of discrete test points is generated at specified step intervals within a preset maximum allowable distance.

Centered at each generated discrete sampling point $p_{i,j}$, a search is conducted within a given spherical neighborhood to determine the presence of candidate connection point clouds from known pylon structures or conductors (including jumper wires). If the quantity of such candidate connection points identified within this neighborhood reaches the established minimum threshold, an effective connection region is deemed to have been successfully located at that test point position. Upon confirmation of an effective connection region, the centroid of all qualifying candidate connection points within this region is designated as the corrected precise endpoint position of the insulator string.
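The verification and compensation steps can be sketched together. The code below is a simplified illustration under stated assumptions: the same point-count threshold `min_pts` is used for both verification and extension, the numeric defaults (`radius`, `step`, `max_dist`) are placeholders, and returning `None` when nothing is found is a choice made for the example. None of these come from the source.

```python
import math

def refine_endpoint(endpoint, direction, connectors, radius=0.5,
                    step=0.1, max_dist=2.0, min_pts=3):
    """Verify a candidate endpoint and, if needed, search outward along `direction`.

    `connectors` holds points labeled as pylon body, conductor, or jumper wire.
    """
    def near(center):
        return [q for q in connectors if math.dist(q, center) <= radius]

    # Verification: enough connecting-component points already surround the endpoint.
    if len(near(endpoint)) >= min_pts:
        return tuple(endpoint)
    # Compensation: generate test points at fixed steps along the unit direction.
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    n_steps = int(round(max_dist / step))
    for s in range(1, n_steps + 1):
        test = tuple(e + s * step * u for e, u in zip(endpoint, unit))
        hits = near(test)
        if len(hits) >= min_pts:
            # Corrected endpoint: centroid of the qualifying connection points.
            return tuple(sum(h[i] for h in hits) / len(hits) for i in range(3))
    return None

connectors = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (-0.1, 0.0, 1.0)]
print(refine_endpoint((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), connectors,
                      radius=0.3, step=0.25))
```

In the toy input the candidate endpoint sits one metre short of the connecting structure, and the search returns the centroid of the three connector points rather than the truncated endpoint.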

2.2.2. Precise Grounding Point Localization

Connection points between grounding wires and pylons exhibit distinctive local geometric characteristics, manifesting as directional discontinuities in linear structures and local anomalies in point cloud density. This study employs PCA to quantify these geometric features. For any point

p

on the grounding wire, a local covariance matrix is constructed within its

r

neighborhood, and the principal direction vector is obtained through eigenvalue decomposition. The linearity feature

ρ

and planarity feature

σ

are computed according to Equation (7):

ρ = \frac{λ_{2}}{λ_{1}}, σ = \frac{λ_{3}}{λ_{1}}

(7)

where

λ_{1}

,

λ_{2}

,

λ_{3}

represent eigenvalues of the covariance matrix. When both

ρ

and

σ

are small, strong linear characteristics are indicated; increasing ratios suggest enhanced geometric structural complexity with potential connection points present, as illustrated by eigenvalue ratios in Figure 8.
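As an illustration of Equation (7), the eigenvalue ratios of a local neighborhood can be computed as sketched below. The function names are hypothetical, the neighborhood is assumed to have been gathered already, and the closed-form trigonometric solution for a symmetric 3 × 3 matrix is used here only to keep the example free of external dependencies.

```python
import math

def covariance3(points):
    """3x3 covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    return [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / n
             for b in range(3)] for a in range(3)]

def eigenvalues_sym3(A):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (closed form)."""
    p1 = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted((A[0][0], A[1][1], A[2][2]), reverse=True)
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = sum((A[k][k] - q) ** 2 for k in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    B = [[(A[a][b] - (q if a == b else 0.0)) / p for b in range(3)]
         for a in range(3)]
    det_b = (B[0][0] * (B[1][1] * B[2][2] - B[1][2] * B[2][1])
             - B[0][1] * (B[1][0] * B[2][2] - B[1][2] * B[2][0])
             + B[0][2] * (B[1][0] * B[2][1] - B[1][1] * B[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))
    phi = math.acos(r) / 3.0
    lam1 = q + 2.0 * p * math.cos(phi)
    lam3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    lam2 = 3.0 * q - lam1 - lam3
    return [lam1, lam2, lam3]

def eigen_ratios(points):
    """Return (rho, sigma) = (lambda2 / lambda1, lambda3 / lambda1)."""
    lam1, lam2, lam3 = eigenvalues_sym3(covariance3(points))
    if lam1 <= 0.0:
        return 0.0, 0.0
    # Clamp tiny negative values caused by floating-point round-off.
    return max(lam2, 0.0) / lam1, max(lam3, 0.0) / lam1
```

For a perfectly straight run of points both ratios are close to zero, whereas four points arranged as a cross in one plane give ρ = 1 and σ = 0.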

Based on the spatial proximity relationship between grounding wires and pylons, a

K

-nearest neighbor search algorithm is employed to identify potential connection regions. Uniform sampling is first performed on the grounding wire to generate a representative sampling point set

S

:

S = \{p_{i} ∣ i = j \cdot \frac{N - 1}{M - 1}, j = 0, 1, \dots, M - 1\}

(8)

where

N

denotes the total point count of the grounding wire and

M

represents the number of sampling points. For each sampling point

s \in S

, a neighborhood point set

N_{r} (s)

within radius

r

is searched in the pylon point cloud

P_{tower}

:

N_{r} (s) = \{p \in P_{tower} ∣ \| p - s \|_{2} \leq r\}

(9)

where the distance threshold

r

should reflect the anticipated spatial extent at the grounding wire-pylon connection. When the point quantity

|N_{r} (s)|

in the neighborhood point set exceeds the preset density threshold

ρ_{m i n}

, this region is marked as a potential connection region.
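Equations (8) and (9) can be sketched together as follows. Rounding the sampling index to the nearest integer is an assumption (Equation (8) does not state how non-integer indices are handled), the function names are hypothetical, and a linear scan replaces the nearest-neighbor index for brevity.

```python
import math

def sample_indices(n_points, n_samples):
    """Indices i = j * (N - 1) / (M - 1) for j = 0..M-1, rounded (assumption)."""
    if n_samples < 2:
        return [0]
    return [round(j * (n_points - 1) / (n_samples - 1))
            for j in range(n_samples)]

def candidate_regions(wire, tower, n_samples, radius, rho_min):
    """Return (sampling point, neighborhood) pairs whose pylon-point count
    within `radius` exceeds the density threshold `rho_min`."""
    regions = []
    for i in sample_indices(len(wire), n_samples):
        s = wire[i]
        neighbors = [p for p in tower if math.dist(p, s) <= radius]
        if len(neighbors) > rho_min:
            regions.append((s, neighbors))
    return regions
```

With N = 11 wire points and M = 3 samples, for example, the sampled indices are 0, 5, and 10.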

To eliminate redundant potential connection regions and obtain unique connection points, merging is performed based on distance and directional similarity. For any two potential regions

R_{i}

and

R_{j}

, if their centroid distance is less than the preset distance threshold and the cosine similarity of their corresponding grounding wire local direction vectors exceeds the preset directional similarity threshold

δ_{dir}

, they are considered to indicate the same connection point and are merged into a new connection region

R

. The parameter

δ_{dir}

ensures that merged regions exhibit consistent grounding wire orientation. The centroid and average direction vector of the merged region are obtained through weighted averaging of the centroids and directions of all contained sub-regions. For each merged connection region

R

, the connection point is precisely localized through projection geometry methods with further result optimization, as illustrated in Figure 9.
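The merging rule can be illustrated with a greedy sketch. The text specifies weighted averaging but not the weights, so weighting by the number of points in each region is an assumption here, as are the function name and the single-pass greedy order.

```python
import math

def merge_regions(regions, dist_thresh, delta_dir):
    """Greedily merge candidate regions.

    `regions` is a list of (centroid, unit_direction, weight) tuples, where
    `weight` is assumed to be the number of points in the region."""
    merged = []
    for c, d, w in regions:
        for m in merged:
            cos_sim = sum(a * b for a, b in zip(m["d"], d))
            if math.dist(m["c"], c) < dist_thresh and cos_sim > delta_dir:
                total = m["w"] + w
                m["c"] = [(m["w"] * a + w * b) / total
                          for a, b in zip(m["c"], c)]
                avg = [(m["w"] * a + w * b) / total
                       for a, b in zip(m["d"], d)]
                norm = math.sqrt(sum(x * x for x in avg))
                m["d"] = [x / norm for x in avg]  # keep the direction unit length
                m["w"] = total
                break
        else:
            # No existing region is close and similarly oriented: start a new one.
            merged.append({"c": list(c), "d": list(d), "w": w})
    return merged
```

Two nearby regions with perpendicular wire directions stay separate under this rule, which is the role the text assigns to the directional similarity threshold.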

Let the point set comprising all pylon points within the region be

P_{r}

, with average direction vector

\bar{d}

and region centroid

\bar{c}

. The connection point is defined as the centroid of the projection points of the points in

P_{r}

onto the plane perpendicular to

\bar{d}

. For any point

p

in

P_{r}

, its projection point

p^{'}

on the perpendicular plane is computed according to Equation (10), and the final connection point position

p_{c}

is defined as the centroid of all projection points

p^{'}

:

p^{'} = p - ((p - \bar{c}) \cdot \bar{d}) \bar{d}

(10)

p_{c} = \frac{1}{|P_{r}|} \sum_{p \in P_{r}} p^{'}

(11)
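Equations (10) and (11) translate directly into a few lines of code. The sketch below assumes the region's pylon points, centroid, and average direction are already available; the function name is hypothetical, and the direction is normalized defensively in case it is not a unit vector.

```python
import math

def connection_point(pylon_points, c_bar, d_bar):
    """Project each point onto the plane through c_bar perpendicular to d_bar
    (Equation (10)) and return the centroid of the projections (Equation (11))."""
    norm = math.sqrt(sum(x * x for x in d_bar))
    d = [x / norm for x in d_bar]
    projected = []
    for p in pylon_points:
        t = sum((p[k] - c_bar[k]) * d[k] for k in range(3))  # signed offset along d
        projected.append([p[k] - t * d[k] for k in range(3)])
    n = len(projected)
    return tuple(sum(q[k] for q in projected) / n for k in range(3))
```

For the points (1, 2, 3) and (3, 4, 7) with centroid (0, 0, 5) and a vertical direction, both projections land at height 5 and the connection point is (2, 3, 5).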

2.3. Dataset Description

2.3.1. Semantic Segmentation Dataset

This study employs a 220 kV dataset for experimental validation and analysis of the proposed method. The dataset was acquired by UAVs equipped with high-precision LiDAR systems and primarily targets object recognition in high-voltage transmission corridor scenarios. It contains seven semantic categories: conductors, ground, pylons, vegetation, insulators, jumper wires, and grounding wires. All semantic labels were annotated by professionals to ensure annotation accuracy and reliability. The point cloud density is high, averaging 100 points per square meter, and each point carries X, Y, Z spatial coordinates and intensity information. For model training and performance evaluation, the dataset is divided into training, validation, and test sets in a ratio of 8:1:1.

2.3.2. Keypoint Extraction Dataset

To evaluate the reliability and generalizability of the proposed keypoint extraction algorithm, a large-scale pylon point cloud dataset was constructed for experimental validation. The dataset comprises a total of 1427 pylons, encompassing 14 typical tower configurations (as illustrated in Figure 10), including both pylons from 220 kV transmission corridors and representative pylons from other voltage levels.

The point counts of individual pylons in the dataset range from tens of thousands to millions of points, with corresponding variations in point cloud density. The dataset also includes samples exhibiting relative sparsity or local deficiency resulting from occlusion, extended scanning distances, or acquisition condition constraints. Detailed statistical information for each pylon type is presented in Table 1.

3. Results

3.1. Semantic Segmentation Performance

3.1.1. Experimental Setup

The semantic segmentation experiments were implemented in the PyTorch 1.8.2 framework and trained on an NVIDIA GeForce RTX 3090 GPU (24 GB memory). The improved KPConv network was adopted as the backbone architecture, with a radius of 30 m and a downsampling parameter of 0.4 m. During training, the batch size was set to 6, each epoch comprised 400 steps, and the maximum number of training epochs was set to 600. A stochastic gradient descent (SGD) optimizer was employed with an initial learning rate of 1 × 10⁻² and a momentum of 0.98.

OA (Overall Accuracy) and mIoU (mean Intersection over Union) are used as evaluation metrics. OA is the proportion of correctly classified points in the entire test dataset, while mIoU measures the overlap between predicted and reference labels, averaged over all categories. OA and mIoU are calculated as follows:

OA = \frac{TP + TN}{TP + TN + FP + FN}

(12)

mIoU = \frac{1}{C} \sum_{i = 1}^{C} \frac{TP_{i}}{FN_{i} + FP_{i} + TP_{i}}

(13)

where

C

denotes the number of classes, and

TP

,

TN

,

FP

, and

FN

represent the true positives, true negatives, false positives, and false negatives in the confusion matrix, respectively; the subscript

i

indicates that the counts are taken for class

i

.
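As a worked illustration, both metrics can be computed from a multi-class confusion matrix. The sketch assumes rows index the reference class and columns the predicted class; in the multi-class case OA is taken as the fraction of correctly labeled points, which coincides with Equation (12) for two classes. The function names are hypothetical.

```python
def overall_accuracy(conf):
    """Fraction of points on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / total

def mean_iou(conf):
    """Mean over classes of TP_i / (TP_i + FP_i + FN_i)."""
    n_classes = len(conf)
    ious = []
    for i in range(n_classes):
        tp = conf[i][i]
        fn = sum(conf[i]) - tp                  # class i labeled as another class
        fp = sum(row[i] for row in conf) - tp   # another class labeled as class i
        denom = tp + fp + fn
        ious.append(tp / denom if denom else 0.0)
    return sum(ious) / n_classes
```

For the two-class matrix [[8, 2], [1, 9]], OA is 17/20 = 0.85 and the per-class IoU values are 8/11 and 9/12, so mIoU is about 0.739.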

3.1.2. Segmentation Results on 220 kV Dataset

To validate the effectiveness of the preprocessing stage, we evaluated semantic segmentation performance on the 220 kV dataset. Although preprocessing is not the main contribution of this paper, stable segmentation is crucial for reliable keypoint extraction. Table 2 shows the segmentation results of the enhanced point cloud processing network (including noise filtering and structure-aware convolution modules). The results improve markedly over the baseline KPConv: the OA reached 0.923 (compared with 0.894 for KPConv, an increase of 2.9 percentage points), and the mIoU reached 0.891 (compared with 0.832 for KPConv, an increase of 5.9 percentage points). Notably, the identification of key power infrastructure components is robust: the IoU of conductors reaches 0.972 and the IoU of grounding wires reaches 0.928, which effectively alleviates the breakpoint identification problem common in traditional convolution methods (Figure 11). The improved network also performed well on categories with few samples, with the IoU values of insulators and jumper wires reaching 0.930 and 0.932, respectively. The IoU of the ground category remains relatively low (0.685), but segmentation is stable overall across categories, providing reliable semantic labels for downstream keypoint extraction.

3.2. Keypoint Extraction Accuracy

3.2.1. Evaluation Metrics

Since precise coordinates of true pylon keypoints cannot be directly acquired, this study employs manually annotated keypoints as reference benchmarks. Considering potential errors introduced by manual annotation, the study conducts cross-validation through multiple manual annotations to mitigate the impact of annotation uncertainty on evaluation results. Euclidean distance serves as the evaluation metric, quantitatively computing the spatial deviation

d

between algorithm-extracted keypoints

P_{a} = (x_{a}, y_{a}, z_{a})

and manually annotated keypoints

P_{m} = (x_{m}, y_{m}, z_{m})

:

d = \sqrt{{(x_{a} - x_{m})}^{2} + {(y_{a} - y_{m})}^{2} + {(z_{a} - z_{m})}^{2}}

(14)

For more comprehensive error quantification, MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) are introduced. MAE reflects the average magnitude of prediction errors, while RMSE exhibits greater sensitivity to larger outlier errors:

M A E = \frac{1}{N} \sum_{i = 1}^{N} |d_{i}|

(15)

R M S E = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} d_{i}^{2}}

(16)
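Equations (14) to (16) amount to the following computation, shown here as a small sketch with hypothetical function and argument names.

```python
import math

def keypoint_errors(extracted, annotated):
    """Return (MAE, RMSE) of the Euclidean deviations between paired keypoints."""
    d = [math.dist(a, m) for a, m in zip(extracted, annotated)]
    mae = sum(d) / len(d)
    rmse = math.sqrt(sum(x * x for x in d) / len(d))
    return mae, rmse
```

With deviations of 5 m and 1 m, for example, MAE is 3 m while RMSE is √13 ≈ 3.61 m, which shows how RMSE weights the larger error more heavily.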

3.2.2. Accuracy Evaluation Results

This section quantitatively evaluates the accuracy of the proposed keypoint localization algorithm on multiple real transmission towers. Detailed average error statistics are shown in Table 3. The diversity of transmission tower structures is a challenge that any automatic keypoint localization algorithm must face in actual deployment. Analysis of the test results across typical and complex tower types confirms that the proposed algorithm adapts well to different geometric structures and keypoint distribution patterns. As shown in Table 4, keypoint extraction accuracy did not decline significantly from one tower type to another but remained relatively stable and consistent. To illustrate the performance of the TPKE method, Figure 12 shows the keypoint extraction results for a typical transmission tower and compares them with the actual measurement results.

3.3. V-Shaped Insulator Recognition Performance

To evaluate the discrimination performance and threshold robustness of the “concavity” index η proposed for identifying V-shaped insulator strings, this study uses 845 insulator string point cloud samples obtained from transmission corridor LiDAR data, covering various insulator string forms on different tower structures. The dataset contains 226 V-shaped insulator string samples and 619 linear insulator string samples, with point cloud densities ranging from 30 to 230 points/m³.

3.3.1. Distribution of “Concavity” Metric

Figure 13a shows the histogram of η values for the two insulator string types, while Figure 13b shows the statistical difference through a box plot. The η values of the two sample types follow a clearly bimodal distribution, with peaks at η ≈ 0.05 (linear type) and η ≈ 0.52 (V-shaped type). A clear gap separates the two types in the interval [0.25, 0.35], indicating that this indicator can effectively distinguish the morphology of the insulator string.

3.3.2. Classification Accuracy

Based on the above distribution characteristics, this study selects η = 0.30 as the discrimination threshold, which lies at the midpoint of the interval separating the two sample types. The threshold sensitivity analysis (Figure 14 and Table 5) shows that the classification accuracy remains at 100% throughout the threshold range [0.25, 0.35].

3.4. Computational Efficiency

To verify the feasibility of the TPKE method in practical applications, this study batch-processed 460 tower point clouds (129 million points in total) covering various tower types and quantitatively evaluated the computational efficiency of the algorithm. The test hardware comprised an Intel Core i7 processor and an NVIDIA GeForce RTX 3090 GPU (24 GB graphics memory).

3.4.1. Overall Processing Performance

As shown in Table 6, the TPKE method requires a total processing time of 1394.23 s (approximately 23.2 min) for processing 460 pylons, averaging only 3.03 s per file, with an average processing speed of 108,993 points/s, demonstrating significant advantages compared to traditional manual annotation methods.

The cumulative distribution statistics of processing time (Figure 15) indicate that 90% of samples are completed within 4.61 s, 75% of samples within 2.19 s, and 50% of samples within 0.82 s. Only a minimal fraction of complex scenarios (<1%) exhibit processing times exceeding 30 s, primarily attributable to abnormal insulator point cloud density (>200,000 points) or high structural complexity.

3.4.2. Time Breakdown by Algorithm Steps

To evaluate the TPKE method’s performance and guide subsequent optimization, a detailed statistical analysis of average time consumption for each processing step was conducted (as shown in Figure 16). Results reveal that the insulator clustering step constitutes the primary time-consuming component, averaging 1.421 s and accounting for 46.9% of total processing time; the grounding wire intersection extraction step averages 0.791 s (26.1%); the connection point extraction step averages 0.705 s (23.3%). In contrast, data loading (0.016 s) and point cloud filtering (0.001 s) steps exhibit negligible time consumption.
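The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in this section; the step labels are shorthand, and it assumes the percentages are taken relative to the mean per-pylon time. Note that the five listed step averages sum to 2.934 s, slightly below the 3.03 s mean.

```python
total_time_s = 1394.23
n_pylons = 460
mean_time_s = total_time_s / n_pylons  # about 3.03 s per pylon

step_means_s = {
    "insulator clustering": 1.421,
    "grounding wire intersection": 0.791,
    "connection point extraction": 0.705,
    "data loading": 0.016,
    "point cloud filtering": 0.001,
}

# Share of each step relative to the mean per-pylon processing time.
shares = {name: 100.0 * t / mean_time_s for name, t in step_means_s.items()}
listed_total_s = sum(step_means_s.values())

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```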

Figure 16b presents a detailed processing time decomposition for the first 30 pylon samples. It can be observed that for most conventional pylons, time consumption across steps remains balanced with total time remaining under 1 s; whereas for structurally complex pylons, insulator clustering time increases significantly (approximately 2.3 s), while other steps maintain high efficiency.

3.4.3. Processing Speed Across Tower Types

The box plot in Figure 17 displays the TPKE method’s processing speed distribution across 14 sub-datasets (A–N). Despite differences in tower type, point cloud density, and scanning quality, the median processing speed remains within 60,000–300,000 points/s on all datasets, with no significant performance degradation. Specifically, datasets F and I are processed faster (median > 250,000 points/s), mainly owing to their moderate point cloud density and simple insulator structures; by contrast, datasets B and H are processed more slowly (median < 80,000 points/s). Analysis shows that these two datasets contain a large number of aging towers whose insulators are arranged irregularly and whose point clouds are severely incomplete, which increases the time spent in the clustering and endpoint search steps. Notably, the processing speed of every tower type shows some dispersion, which stems mainly from differences in point cloud density and scanning quality rather than from the tower structure itself.

4. Discussion

4.1. Innovation and Effectiveness of the Concavity Metric η

4.1.1. Robustness of the Concavity Threshold

The identification of V-shaped insulator strings is based on the morphological concavity index η (see Equation (5) for the definition). To verify the robustness of the index and the soundness of the threshold selection, a sensitivity analysis is conducted on the annotated dataset containing 226 V-shaped insulator strings and 619 linear insulator strings.

Figure 18a shows how the classification performance metrics vary with the threshold on η. The results reveal that the metric is highly stable: within η ∈ [0.25, 0.35] (the green shaded area in the figure), accuracy, precision, recall, and F1 score all remain at a perfect 1.000, demonstrating low sensitivity to threshold selection. Outside this stable interval, performance declines predictably and monotonically: when η < 0.25, the criterion is too loose and some linear insulators whose centroid lies close to the edge of the point cloud are misjudged as V-shaped, reducing the accuracy to 0.968; when η > 0.35, the criterion is too strict and some V-shaped insulators with smaller opening angles are missed, reducing the recall to 0.991. Based on this analysis, η = 0.30 is chosen as the discrimination threshold. It lies at the midpoint of the stable interval and provides a safety margin of ±0.05 on either side (a relative tolerance of about 17%).

Figure 18b verifies the classification result at η = 0.30 through the confusion matrix. Under this threshold, all 619 linear insulator strings and 226 V-shaped insulator strings were correctly identified, achieving 100% classification accuracy. The physical meaning of the metric is clear. For a V-shaped structure, the centroid lies in the “hollow” area between the two branches, so all points are far from the centroid; the ratio of the minimum distance min_d to the average distance mean_d is therefore large, giving a high η value, typically 0.45–0.60. For a linear structure, the centroid lies inside the point cloud and some points are necessarily adjacent to it, so min_d is much smaller than mean_d, giving a low η value, typically 0.02–0.15. This clear separation in η between the two structure types (a difference of about 3–5 times) provides a reliable geometric basis for automatic classification, enabling accurate identification without training data.
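The geometric intuition can be made concrete with a small sketch. Equation (5) is not reproduced in this section, so the code assumes, based only on the description above, that η is the ratio of the minimum to the mean point-to-centroid distance; the function names are hypothetical and the actual definition may include further normalization.

```python
import math

def concavity(points):
    """Assumed form of eta: min_d / mean_d of point-to-centroid distances."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    dists = [math.dist(p, centroid) for p in points]
    mean_d = sum(dists) / n
    return min(dists) / mean_d if mean_d > 0.0 else 0.0

def is_v_shaped(points, threshold=0.30):
    """Classify as V-shaped when eta exceeds the threshold from the text."""
    return concavity(points) > threshold

# A straight string: the centroid coincides with one of the points.
line = [(0.0, 0.0, 0.1 * i) for i in range(21)]
# Two branches opening upward from a common apex: the centroid falls in the gap.
vee = [(sign * 0.05 * i, 0.0, 0.1 * i) for sign in (-1.0, 1.0) for i in range(1, 21)]
```

On these synthetic inputs the straight string yields η close to zero, while the two-branch shape yields a value above the 0.30 threshold.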

4.1.2. Relationship with PCA Features

The concavity metric η proposed in this paper is fundamentally different from PCA-based descriptors. While PCA features (linearity ρ, flatness σ) quantify the point distribution along principal directions through eigenvalue ratios, η directly measures whether the centroid lies in a “hollow” or an interior region of the point cloud. For V-shaped insulators, each branch is itself linear, and small branch angles can yield high overall PCA linearity, which makes them difficult to distinguish from linear insulators. In contrast, η captures the defining characteristic of V-shaped structures: the centroid is located in the gap between the branches.

Figure 19 demonstrates this distinction through correlation analysis. The η metric shows a negative correlation with linearity (R² = 0.87, Figure 19a) and a positive correlation with flatness (R² = 0.79, Figure 19b). V-shaped insulators exhibit an average linearity of 0.78 ± 0.04 and flatness of 0.21 ± 0.04, while linear insulators show 0.97 ± 0.02 and 0.03 ± 0.02, respectively. Although η is correlated with the PCA features, it still provides distinct discriminative power. Even if two point clouds have similar linearity and flatness, their centroid positions may be completely different: the centroid of a linear insulator lies inside the point cloud, while the centroid of a V-shaped insulator lies in the gap between the two branches. This difference in centroid position is the core feature captured by η, and it is geometric information that the PCA eigenvalue ratios cannot directly reflect. Therefore, η, as a supplementary feature, achieves 100% classification accuracy within the threshold range [0.25, 0.35] (Table 5), providing a more direct and robust basis for V-shaped insulator identification.

4.2. Parameter Sensitivity and Robustness Analysis

4.2.1. Adaptive DBSCAN Parameters

The core parameters of the adaptive DBSCAN algorithm include the base neighborhood radius

ε_{b a s e}

, the number of nearest neighbors k, and the minimum number of samples min_samples.

Base neighborhood radius

ε_{b a s e}

: Figure 20a shows the sensitivity characteristics of

ε_{b a s e}

within the range of [0.35, 0.65] m. Experimental results show that when

ε_{b a s e}

∈ [0.45, 0.55] m, the algorithm maintains stable clustering performance. Deviation from this stable interval will lead to performance degradation: when

ε_{b a s e}

< 0.45 m, over-segmentation occurs: the number of clusters increases while their average size decreases, resulting in a large number of fragmented small clusters; when

ε_{b a s e}

> 0.55 m, under-segmentation occurs, and adjacent insulator strings are mistakenly merged. Based on the above analysis, this paper selects

ε_{b a s e}

= 0.5 m as the default value. This value lies at the center of the stable interval and corresponds to about 3–4 insulator disc spacings (the typical disc spacing is 0.146 m). It is therefore consistent with the physical scale, leaves sufficient tolerance, and maintains stable performance across multiple tower types and point cloud densities.

K nearest neighbor parameter: Figure 20b shows that the k nearest neighbor parameter is highly robust when k ≥ 5. When k = 4, the clustering result fluctuates abnormally, whereas all configurations in the range k ∈ [5, 10] produce consistent clustering results. This stability comes from the local density estimation in the adaptive ε adjustment mechanism (Equation (4)): when k ≥ 5, the k-nearest-neighbor distance stably reflects the local point cloud characteristics and is not disturbed by individual outliers. Therefore, this paper adopts k = 5, the smallest value that ensures stability, which keeps computation efficient without sacrificing performance.

Minimum number of samples min_samples: As shown in Figure 20c, the min_samples parameter shows extremely low sensitivity in the range [10, 24]: all configurations produce the same clustering results and maintain a 100% effective clustering ratio. This robustness is due to the dynamic compensation of local density changes by the adaptive mechanism: even if the density threshold changes, the adjusted neighborhood radius adapts automatically to maintain a consistent clustering effect. Consequently, min_samples can be selected flexibly within this range. This paper sets min_samples = 15, which effectively filters small noise clusters (usually containing 5–10 points) while retaining complete insulator strings.

Practical adjustment guide: Although the default parameters (

ε_{b a s e}

= 0.5 m, k = 5, min_samples = 15) are stable on the test towers, some extreme data quality conditions may call for moderate adjustment. For sparse point clouds (with obvious gaps between points, usually caused by scanning distances > 100 m or poor acquisition quality), it is recommended to increase

ε_{b a s e}

to improve cluster continuity; for dense point clouds (close-range scanning < 30 m or high-frequency lidar systems), it is recommended to reduce

ε_{b a s e}

to enhance the distinction of adjacent structures; for high-noise point clouds (with a large number of discrete abnormal points), it is recommended to increase min_samples to 18–20 to filter false clusters. It should be noted that thanks to the inherent density compensation ability of the adaptive ε mechanism, the parameters of k and min_samples usually do not need to be adjusted with the change in point cloud density.

4.2.2. Justification of the Curvature Detection Threshold

In order to verify the rationality of the global direction extension strategy proposed in this paper, we analyzed the geometric characteristics of 426 insulator string samples, covering transmission lines with different voltage levels.

First, two key indicators are extracted for each insulator string: linearity (the variance contribution rate of the first principal component) and endpoint deviation (the maximum angle between the local direction at an endpoint and the global principal direction). The statistical results (Table 7) show that the average linearity of the 426 samples is 97.4%. The endpoint deviation is below 10° for 82.6% of the samples and below 17° for 98.6%; only 1.4% of the samples have an endpoint deviation between 17° and 20°, and none exceeds 20°. This distribution shows that even when an insulator string bends slightly in a real scene, its overall form remains very close to a straight line, and the deviation between the endpoint region and the principal direction is very limited.

Based on the above data characteristics, the applicability of the different direction extension strategies is further evaluated. Theoretical analysis shows that when the endpoint deviation is below a certain threshold, extension along the global principal direction ensures high positioning accuracy, whereas when the endpoint deviation is large, the local segment direction may be more appropriate. Simulating strategy performance under different threshold conditions (Figure 21) shows that as the switching threshold increases from 0° to 16.5°, the average consistency score rises steadily from the 83.3% baseline (using only the local direction) to 87.7%, and the global direction usage rate rises from 2% to 93.3%. Beyond 16.5°, both curves flatten, indicating that the marginal benefit of further increasing the threshold is extremely small. Therefore, 16.5° is taken as the optimal switching threshold. Combined with the sample distribution in Table 7, the endpoint deviation of the vast majority of samples (96.5%) is below the optimal threshold of 16.5°, which means that in practical applications a unified global direction extension strategy can meet the needs of almost all situations. For the very small number of samples (1.4%) with large endpoint deviations, the extension accuracy of the global direction strategy may decrease slightly, but the global direction still provides a reliable extension reference because the linearity of these samples remains high (all > 95%).

4.2.3. Grounding Wire Positioning Parameters

The localization algorithm for the connection point between the grounding wire and the pylon (Section 2.2.2) involves two types of parameters: (1) the density threshold ρ_min, which determines potential connection regions based on local point density; and (2) the geometric features

ρ

and

σ

computed via Equation (7), which quantify linearity and planarity to identify structural changes.

Density threshold ρ_min: When the number of pylon points in the neighborhood point set exceeds the density threshold ρ_min, the area is marked as a potential connection region. The threshold is not a fixed constant but is determined adaptively from local statistics. Specifically, the algorithm first counts the neighborhood points

N_{r} (s_{i})

of all grounding wire sampling points, calculates their mean μ and standard deviation σ_std, and then sets ρ_min = μ + 2σ_std. This statistical rule ensures that only areas whose density is significantly above the background (about 95% confidence level) are identified as candidate connection points, which effectively suppresses false detections caused by random density fluctuations.
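The adaptive rule ρ_min = μ + 2σ_std can be sketched as follows. Whether the population or the sample standard deviation is intended is not stated in the text; the population form is used here as an assumption, and the function names are hypothetical.

```python
import statistics

def adaptive_density_threshold(neighbor_counts, k_sigma=2.0):
    """rho_min = mean + k_sigma * standard deviation of the neighbor counts."""
    mu = statistics.mean(neighbor_counts)
    sigma_std = statistics.pstdev(neighbor_counts)  # population form (assumption)
    return mu + k_sigma * sigma_std

def flag_candidates(neighbor_counts, k_sigma=2.0):
    """Indices of sampling points whose neighbor count exceeds rho_min."""
    rho_min = adaptive_density_threshold(neighbor_counts, k_sigma)
    return [i for i, c in enumerate(neighbor_counts) if c > rho_min]
```

For nine sampling points with no pylon neighbors and one with ten, the mean is 1 and the standard deviation is 3, so ρ_min = 7 and only the last sampling point is flagged.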

Geometric features for connection identification: In Equation (7), linearity

ρ

and planarity

σ

are geometric features used to quantify the local geometric complexity. As shown in Figure 8, when

ρ

and

σ

are both small, it indicates that the area exhibits strong linear characteristics, corresponding to standard linear segments. Conversely, an increase in

ρ

and/or

σ

implies increased geometric complexity, possibly a change in direction or a connection with other structures. The algorithm identifies candidate areas from relative change patterns rather than absolute thresholds: when the

ρ

or

σ

value at a sampling point is significantly higher than at the surrounding points (typically a change of > 50%), the area is marked as a potential connection point, and the actual connection is then confirmed through spatial density verification. Experimental observation shows that in the connection area between the grounding wire and the pylon, the

ρ

value usually decreases from 0.85–0.95 in linear segments to 0.65–0.80 at the connection point, while the

σ

value rises from <0.05 to 0.10–0.20 (Figure 8). However, these numerical ranges vary with point cloud density, scanning angle, and material; relying on relative patterns rather than absolute values keeps the algorithm’s performance consistent under diverse conditions.

4.3. Robustness to Data Variations

4.3.1. Performance with Data Quality Variations

In actual point cloud data acquisition, data quality is influenced by multiple factors, frequently manifesting as density heterogeneity or insufficient resolution. Through keypoint extraction from point cloud data of varying quality (particularly under conditions of uneven point cloud density or local point cloud deficiency), as illustrated in Figure 22, the following observations emerge: in data1-1 and data2-1,3, despite evident point cloud deficiency between grounding wires and pylon bodies, keypoints in these sections can still be correctly localized; in data1-2,3,4 and data2-2,4, point clouds between insulator strings and pylon bodies as well as conductors exhibit discontinuity, yet the axial extension strategy enables accurate inference and extraction of keypoint positions. In short, even under adverse conditions such as uneven point cloud density or missing local data, the TPKE method can still maintain satisfactory robustness and effectively identify and locate keypoints.

In addition, the method adapts well to other complex scenarios (Figure 23). As shown in Figure 23a, even when the point cloud of the insulator string is extremely sparse, the method can still extract keypoints effectively; Figure 23b shows an atypical insulator string configuration (interconnected insulator strings), which differs from the discrete arrangement of traditional double-string or multi-string insulators; Figure 23c depicts the spatial layout of multi-string insulators. These test cases cover both degraded point cloud quality and diverse insulator string structures, verifying the generality and robustness of the TPKE method under different data quality conditions and tower forms.

4.3.2. Impact of Semantic Segmentation Quality

The input of the TPKE method depends on the component category labels provided by the semantic segmentation module. Although this study does not include a dedicated error propagation experiment, the impact of semantic segmentation quality is analyzed below based on the method design principles and experimental observations.

Semantic segmentation of transmission line scenes is by now relatively mature. As shown in Table 2, the preprocessing method adopted in this paper achieves an overall accuracy (OA) of 0.923 and a mean intersection over union (mIoU) of 0.891 on the 220 kV dataset, with an IoU of 0.930 for the insulator category and 0.963 for the pylon category. This performance provides reliable input for keypoint extraction.

From the perspective of method design, TPKE is robust to a certain degree of semantic segmentation error, mainly in the following respects: (1) Core area dependence: Keypoint extraction depends mainly on the core area of each component rather than on boundary accuracy. Typical semantic segmentation errors are concentrated at component boundaries, while the interior of a component is usually classified correctly. As long as the main parts of the insulator string and the main body of the pylon are classified correctly, the subsequent clustering and geometric analysis can proceed normally. (2) Error filtering at the cluster level: The adaptive DBSCAN clustering algorithm (Section 2.2.1) naturally excludes isolated misclassified points through density and spatial continuity constraints. Even if a small number of label errors occur in boundary areas, these points usually cannot form a valid cluster and therefore have no substantial impact on keypoint extraction. (3) Locality of geometric analysis: Keypoint localization depends on the geometric analysis of local areas after clustering (such as OBB calculation and concavity measurement) rather than on point-by-point semantic labels, so the boundary accuracy requirements for semantic labels are relatively relaxed. However, if semantic segmentation fails severely (for example, if an entire insulator string or pylon segment is misclassified), keypoint extraction will fail directly. Maintaining the basic reliability of semantic segmentation therefore remains a prerequisite for the effective operation of the TPKE method. Future work could introduce an end-to-end joint optimization framework or design a segmentation quality assessment and adaptive correction mechanism to further improve the overall robustness of the system.

4.4. Time Consumption Bottleneck Analysis

Figure 24a indicates that insulator clustering time exhibits a nonlinear positive correlation with the number of insulator points (R² = 0.87). When the insulator point count exceeds 100,000, clustering time increases sharply, consistent with the O(n log n) time complexity of the DBSCAN algorithm. Figure 24b demonstrates that connection point extraction time is influenced primarily by the number of detected clusters (R² = 0.79) rather than by the total insulator point count. Figure 24c shows a significant linear relationship between grounding wire connection point localization time and the number of ground wire points (R² = 0.91), which is attributed to the complexity of the geometric feature calculation being directly proportional to the point count. Figure 24d presents the overall bottleneck analysis: when the point cloud exceeds 2 million points, processing time is dominated by insulator clustering, whereas for small- and medium-scale point clouds (<1 million points) the time consumption of the individual steps is relatively balanced.
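
The R² values quoted above are coefficients of determination of fitted curves. The sketch below shows one way such a value could be computed and how a linear model can be compared with an n log n model. The timings are synthetic, generated from an n log n law, and are not the measurements behind Figure 24. The O(n log n) behaviour itself is the average case for DBSCAN with a spatial index; without one, the neighbour search is quadratic.

```python
import math


def linear_fit_r2(x, y):
    """Ordinary least squares y ~ a*x + b; returns (a, b, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Hypothetical insulator point counts and clustering times in seconds.
point_counts = [10_000, 25_000, 50_000, 100_000, 150_000, 200_000]
# Synthetic timings that follow an n*log(n) law exactly (illustration only).
times = [2e-6 * n * math.log(n) for n in point_counts]

# Model 1: time is linear in n.
_, _, r2_linear = linear_fit_r2(point_counts, times)
# Model 2: time is linear in n*log(n).
nlogn = [n * math.log(n) for n in point_counts]
_, _, r2_nlogn = linear_fit_r2(nlogn, times)

print(f"R^2 linear: {r2_linear:.6f}, R^2 n log n: {r2_nlogn:.6f}")
```

On real measurements both fits would be noisy; comparing the two R² values is only a rough indication of which growth law describes the data better.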

Some outliers in the scatter plots deviate from the fitted lines, with processing times significantly longer than expected. Analysis shows that these outliers are mainly caused by abnormal insulator point cloud density, complex grounding wire-tower geometry, and a large number of clusters. Despite these special cases, 95% of the files are completed within 8.37 s, indicating that the TPKE method processes routine scenarios stably and efficiently.
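
Statements such as "95% of the files are completed within 8.37 s" are percentiles of the per-file processing time. A minimal sketch using the nearest-rank definition follows. The timing list is hypothetical, and the paper does not state which percentile definition it uses.

```python
import math


def nearest_rank_percentile(values, p):
    """Smallest sample v such that at least p percent of the samples are <= v."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical per-file processing times in seconds (not the paper's data).
times = [1.2, 1.9, 2.4, 2.6, 2.8, 3.0, 3.1, 3.3, 3.6, 4.0,
         4.2, 4.5, 4.6, 5.1, 5.8, 6.4, 7.0, 7.9, 8.3, 21.5]

p90 = nearest_rank_percentile(times, 90)
p95 = nearest_rank_percentile(times, 95)
# Files slower than the 95th percentile are candidates for outlier inspection.
slow_files = [t for t in times if t > p95]

print(p90, p95, slow_files)
```

Percentiles are insensitive to a few very slow files, which is why they describe routine behaviour better than the maximum or the mean.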

4.5. Limitations and Future Directions

One of the main limitations of this research is the lack of a direct comparison experiment with manual labeling. Although the reference keypoints used for accuracy assessment were obtained through manual annotation, we did not conduct controlled experiments to quantitatively compare the time consumption and labor costs of the two approaches, and therefore cannot provide direct evidence of the efficiency improvement.

Nevertheless, several lines of indirect evidence suggest that the proposed method has a significant efficiency advantage. In terms of processing speed, the TPKE method takes an average of 3.03 s per tower, and 90% of the samples are completed within 4.61 s. In contrast, manual labeling in transmission line inspection usually requires the operator to visually check the three-dimensional point cloud, identify the insulator endpoints and the ground wire connection points through interactive software, and record the coordinates. According to project experience, this process takes 5–15 min per tower, depending on the complexity of the structure and the proficiency of the operator. The batch processing experiment processed 460 towers with a total of 129 million points in about 23 min, showing that the method scales to large batches, whereas manually labeling the same dataset at 5–15 min per tower would take roughly 38–115 person-hours, with fatigue during long sessions reducing efficiency further. Automatic extraction also removes inter-operator inconsistency: because of differences in subjective judgment, annotations by different operators may deviate in position by 0.1–0.3 m, whereas the TPKE method produces deterministic results with centimeter-level accuracy (insulator MAE: 0.0747 m; ground wire MAE: 0.0696 m). Based on these performance indicators, the method can reduce the manual annotation workload in practical applications such as drone route planning.
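
The figures in this comparison can be checked with simple arithmetic. The sketch below takes the batch statistics from Table 6 and the 5–15 min per tower range quoted above; the variable names are illustrative. The manual-labeling total is an extrapolation from that quoted range, not a measured result.

```python
# Batch statistics reported in Table 6.
n_pylons = 460
total_points = 128_957_001
total_time_s = 1394.23

# Manual labeling time per tower in minutes (range quoted from project experience).
manual_min_per_tower = (5, 15)

avg_time_s = total_time_s / n_pylons            # about 3.03 s per tower
batch_minutes = total_time_s / 60               # about 23 min for the batch
aggregate_speed = total_points / total_time_s   # points per second over the batch

# Extrapolated manual effort for the same 460 towers, in person-hours.
manual_hours = tuple(m * n_pylons / 60 for m in manual_min_per_tower)
# Ratio of manual to automatic time per tower.
speedup = tuple(m * 60 / avg_time_s for m in manual_min_per_tower)

print(f"average time per tower: {avg_time_s:.2f} s")
print(f"batch duration: {batch_minutes:.1f} min")
print(f"aggregate throughput: {aggregate_speed:,.0f} points/s")
print(f"manual estimate: {manual_hours[0]:.1f} to {manual_hours[1]:.1f} person-hours")
print(f"speed-up factor: {speedup[0]:.0f}x to {speedup[1]:.0f}x")
```

The aggregate throughput computed this way (total points divided by total time) is lower than the 108,993 points/s listed in Table 6, which is presumably an average of per-file speeds rather than a ratio of totals.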

Future work should include a controlled comparative study with multiple manual annotators to quantify the efficiency improvement more rigorously, establish a formal time cost model, and verify the method's practicality within the inspection workflow and its performance in extreme cases that the current dataset does not fully cover.

5. Conclusions

To reduce the manual labeling workload for target points in UAV transmission tower inspection, this paper proposes an automatic keypoint extraction method (TPKE) based on transmission tower point clouds. The method targets two key inspection targets: insulator string endpoints and ground wire connection points. The experimental results show the following:

(1): The proposed TPKE method is practical: it requires very little manual intervention, runs efficiently, and generalizes well across scenes. It introduces a morphological “concavity” measure (η), with which the classification accuracy for V-shaped insulator strings reaches 100% within the threshold interval [0.25, 0.35]. Combined with adaptive DBSCAN clustering and the localization-verification-compensation strategy, the method effectively handles problems common in real LiDAR acquisitions, such as point cloud density changes, the structural diversity of 14 tower types, and missing endpoint data.
(2): The TPKE method has been verified on 1427 transmission towers covering different tower configurations and voltage levels. The MAE of the insulator keypoints is 0.0747 m (RMSE: 0.0859 m) and the MAE of the ground wire keypoints is 0.0696 m (RMSE: 0.0708 m), so the accuracy reaches the centimeter level. The method shows good robustness in large-scale dataset verification. Even under challenging conditions such as sparse point clouds, density changes, and incomplete local data, it maintains stable performance and thus provides reliable keypoint coordinates for autonomous drone inspection.
(3): The TPKE method is computationally efficient, with an average processing time of 3.03 s per tower and an average processing speed of 108,993 points per second. 90% of the samples are processed within 4.61 s, and the processing speed is stable across tower types (60,000 to 300,000 points per second). This significantly reduces the manual labeling workload and supports efficient autonomous navigation and accurate target positioning of UAVs in large-scale transmission line inspection.
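
For readers who want to reproduce accuracy figures of this kind on their own data, the sketch below computes MAE and RMSE from extracted and reference keypoints. It assumes that the per-keypoint error is the 3D Euclidean distance between the two positions, which is an assumption made here for illustration. The coordinates are hypothetical.

```python
import math


def keypoint_errors(predicted, reference):
    """Euclidean distance (m) between each extracted keypoint and its reference."""
    return [math.dist(p, r) for p, r in zip(predicted, reference)]


def mae(errors):
    """Mean absolute error."""
    return sum(errors) / len(errors)


def rmse(errors):
    """Root mean square error; penalises large deviations more than MAE."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical reference and extracted keypoints (x, y, z) in metres.
reference = [(0.0, 0.0, 30.0), (4.0, 0.0, 30.0), (8.0, 0.0, 30.0)]
predicted = [(0.03, 0.04, 30.0), (4.0, 0.0, 30.06), (8.0, 0.12, 30.0)]

errors = keypoint_errors(predicted, reference)  # about 0.05, 0.06, 0.12 m
print(f"MAE = {mae(errors):.4f} m, RMSE = {rmse(errors):.4f} m")
```

RMSE is never smaller than MAE on the same errors, and the gap widens as the errors become more uneven. This matches the pattern in Table 3, where the grounding wire errors have a small standard deviation and nearly equal MAE and RMSE.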

Although the proposed method performs satisfactorily in the experiments, certain limitations remain. Internal consistency checks and limited manual sample validation can assess the plausibility of the extraction results to some extent, but the absence of ground-truth measurement data and high-resolution imagery precludes comprehensive validation against external data sources. Furthermore, the TPKE algorithm in its current design depends on the semantic segmentation of pylons in the transmission corridor; consequently, its final performance is indirectly influenced by the classification accuracy of the input data.

Author Contributions

Conceptualization, G.W. and H.W.; methodology, G.W. and Y.G.; validation, G.W.; formal analysis, G.W.; investigation, Z.Y.; resources, C.W.; data curation, Z.Y. and P.W.; writing—original draft preparation, G.W.; writing—review and editing, G.W., H.W., S.N. and C.W.; visualization, G.W.; supervision, H.L., Z.Y. and S.C.; project administration, S.N., S.Z. and Y.Z.; funding acquisition, H.W. and C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Technical Foundation of State Grid Corporation of China, grant number 52230025000F-191-ZN.

Data Availability Statement

The data in this study are owned by the research group and will not be shared.

Conflicts of Interest

Authors Haibo Liu and Su Zhang were employed by the State Grid Economic and Technological Research Institute Co., Ltd. Authors Yibing Zhou and Sijin Cheng were employed by the State Grid Jilin Electric Power Co., Ltd. Construction Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Schematic diagram of pylon keypoints. (0–5, 8–13: endpoints of insulator strings; 6, 7: connection points of grounding wires and the tower).

Figure 2. Technical framework for point cloud semantic segmentation and transmission pylon keypoint extraction in power system scenes.

Figure 3. Impact of spherical versus ellipsoidal sampling on point cloud feature coverage.

Figure 4. Schematic diagram of V-shaped insulator string identification using the concavity metric η.

Figure 5. Segmentation results of V-shaped insulator strings.

Figure 6. Oriented bounding box.

Figure 7. Axial extension search strategy (a) Endpoint validation; (b) Endpoint compensation; (c) Flowchart of the strategy.

Figure 8. Comparison of eigenvalue ratios.

Figure 9. Detailed diagram of projection geometry localization.

Figure 10. Tower type categories (A) Type A pylon; (B) Type B pylon; (C) Type C pylon; (D) Type D pylon; (E) Type E pylon; (F) Type F pylon; (G) Type G pylon; (H) Type H pylon; (I) Type I pylon; (J) Type J pylon; (K) Type K pylon; (L) Type L pylon; (M) Type M pylon; (N) Type N pylon.

Figure 11. Semantic segmentation results on the 220 kV dataset (a) Data1-KPConv; (b) Data1-Ours; (c) Data2-KPConv; (d) Data2-Ours. Note: Black circles highlight segmentation errors for comparison.

Figure 12. Visualization of pylon keypoint extraction results. (Left: overall view of the pylon; Right: localized views (1–8) showing the extraction details for ground wire peaks, cross-arms, and insulator endpoints).

Figure 13. Distribution of concavity metric η for V-shaped and linear insulator strings: (a) Histogram of η values for V-shaped and linear insulator strings; (b) Box plot comparison of η distributions between two categories.

Figure 14. Threshold sensitivity analysis of the concavity metric η.

Figure 15. Cumulative distribution curve of processing time.

Figure 16. (a) Average processing time by algorithm steps; (b) processing time breakdown by files (sample: first 30).

Figure 17. Processing speed distribution across different tower types. (The letters A–N represent 14 distinct transmission pylon categories used in the study).

Figure 18. Robustness verification of the concavity metric η: (a) the trend of classification performance changes with the threshold; (b) confusion matrix when η = 0.30.

Figure 19. Relationship between concavity metric and point cloud linearity/planarity characteristics (a) η value vs. linearity; (b) η value vs. planarity.

Figure 20. Parameter sensitivity analysis of the adaptive DBSCAN algorithm: (a) ε_base sensitivity; (b) k-nearest-neighbor sensitivity; (c) minimum sample sensitivity.

Figure 21. Strategy performance analysis with different switching thresholds. (The yellow star indicates the maximum average consistency score at 16.5°).

Figure 22. Keypoint detection performance under conditions of point cloud deficiency at insulator endpoints (a) Data1; (b) Data2. The numbers 1–4 represent localized views showing the extraction details at the specific insulator endpoints.

Figure 23. Keypoint detection performance in complex scenarios: (a) sparse insulator string point clouds; (b) compact-type insulator strings; (c) multiple insulator strings.

Figure 24. Correlation analysis between point cloud characteristics and processing time consumption.

Table 1. Information on various pylon types in the dataset.

Tower Type	Quantity	Average Point Count	Average Tower Height/m	Maximum Tower Height/m	Minimum Tower Height/m
A	91	233,035.54	66.23	111.40	19.47
B	120	13,627.36	34.28	77.58	18.53
C	246	57,138.54	61.06	102.94	28.95
D	102	217,226.65	73.52	95.97	55.08
E	127	151,576.80	36.23	71.97	20.32
F	8	622,451.50	37.21	37.35	37.03
G	50	73,203.16	113.25	141.37	65.55
H	45	23,985.93	43.19	62.68	30.50
I	124	252,159.02	45.92	98.27	20.06
J	191	221,703.71	57.94	110.59	28.91
K	31	111,696.16	60.09	90.94	16.79
L	207	57,185.60	51.56	113.37	23.54
M	51	175,017.51	119.32	144.74	62.55
N	34	329,786.41	107.16	133.20	50.84

Table 2. Experimental results on the 220 kV dataset (per-class IoU, OA, and mIoU).

Method	Conductor	Pylon	Vegetation	Insulator	Jumper Wire	Grounding Wire	Ground	OA	mIoU
KPConv	0.931	0.892	0.835	0.872	0.861	0.889	0.685	0.894	0.832
Ours	0.972	0.963	0.902	0.930	0.932	0.928	0.685	0.923	0.891

Table 3. Average error statistics for keypoint localization.

Type	MAE (m)	RMSE (m)	Maximum Error (m)	Minimum Error (m)	Average Standard Deviation (m)
Insulator	0.0747	0.0859	0.1435	0.0107	0.0390
Grounding Wire	0.0696	0.0708	0.0800	0.0572	0.0092

Table 4. Keypoint extraction accuracy for different tower types.

Tower Type	Type	MAE (m)	RMSE (m)	Maximum Error (m)	Minimum Error (m)	Average Standard Deviation (m)
A	Insulator	0.1142	0.1247	0.2005	0.0100	0.0500
A	Grounding Wire	0.0707	0.0723	0.0905	0.0510	0.0149
B	Insulator	0.0927	0.1051	0.1631	0.0173	0.0496
B	Grounding Wire	0.0678	0.0678	0.0700	0.0656	0.0022
C	Insulator	0.0504	0.0682	0.1122	0.0040	0.0459
C	Grounding Wire	0.0687	0.0693	0.0782	0.0591	0.0095
D	Insulator	0.0338	0.0478	0.0735	0.0101	0.0338
D	Grounding Wire	0.0457	0.0477	0.0593	0.0322	0.0135
E	Insulator	0.0612	0.0684	0.1269	0.0141	0.0306
E	Grounding Wire	0.0497	0.0522	0.0608	0.0224	0.0160
F	Insulator	0.0534	0.0619	0.1118	0.0142	0.0312
F	Grounding Wire	0.0586	0.0600	0.0714	0.0459	0.0128
G	Insulator	0.0828	0.0889	0.1425	0.0412	0.0324
G	Grounding Wire	0.0659	0.0663	0.0735	0.0583	0.0076
H	Insulator	0.0501	0.0631	0.0907	0.0200	0.0383
H	Grounding Wire	0.0542	0.0543	0.0583	0.0500	0.0042
I	Insulator	0.0840	0.0933	0.1418	0.0141	0.0407
I	Grounding Wire	0.0542	0.0543	0.0583	0.0500	0.0042
J	Insulator	0.1065	0.1087	0.1393	0.0702	0.0214
J	Grounding Wire	0.0594	0.0600	0.0678	0.0510	0.0084
K	Insulator	0.0945	0.1000	0.1521	0.0332	0.0328
K	Grounding Wire	0.0914	0.0914	0.0916	0.0911	0.0003
L	Insulator	0.0959	0.1004	0.1658	0.0424	0.0296
L	Grounding Wire	0.0710	0.0760	0.1145	0.0412	0.0272
M	Insulator	0.1018	0.1061	0.1536	0.0490	0.0302
M	Grounding Wire	0.0965	0.0987	0.1175	0.0755	0.0210
N	Insulator	0.1139	0.1183	0.1584	0.0316	0.0321
N	Grounding Wire	0.0520	0.0550	0.0749	0.0332	0.0179

Table 5. Classification performance under different thresholds.

Threshold η	Accuracy (%)	V-Shaped Correct Rate (%)	Linear Correct Rate (%)	F1 Score
0.20	99.71	100.00	99.60	0.9971
0.25	100.00	100.00	100.00	1.0000
0.30	100.00	100.00	100.00	1.0000
0.35	100.00	100.00	100.00	1.0000
0.40	99.71	98.91	100.00	0.9971

Table 6. Overall processing performance statistics of the TPKE method.

Metric	Value
Number of Pylons	460
Total Point Count	128,957,001
Total Processing Time	1394.23 s
Average Processing Time per File	3.03 s
Average Processing Speed	108,993 points/s

Table 7. Distribution of endpoint deviation angles.

Deviation Range	Sample Size	Percentage
0–10°	352	82.6%
10–15°	59	13.9%
15–17°	9	2.1%
17–20°	6	1.4%
>20°	0	0%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
