Real-Time Detection and Spatial Localization of Insulators for UAV Inspection Based on Binocular Stereo Vision

Unmanned aerial vehicles (UAVs) have become important tools for power transmission line inspection. Cameras installed on the platforms can efficiently obtain aerial images containing information about power equipment. However, most of the existing inspection systems cannot perform automatic real-time detection of transmission line components. In this paper, an automatic transmission line inspection system incorporating UAV remote sensing with binocular visual perception technology is developed to accurately detect and locate power equipment in real time. The system consists of a UAV module, embedded industrial computer, binocular visual perception module, and control and observation module. Insulators, which are key components in power transmission lines as well as fault-prone components, are selected as the detection targets. Insulator detection and spatial localization in aerial images with cluttered backgrounds are interesting but challenging tasks for an automatic transmission line inspection system. A two-stage strategy is proposed to achieve precise identification of insulators. First, candidate insulator regions are obtained based on RGB-D saliency detection. Then, the skeleton structure of candidate insulator regions is extracted. We implement a structure search to realize the final accurate detection of insulators. On the basis of insulator detection results, we further propose a real-time object spatial localization method that combines binocular stereo vision and a global positioning system (GPS). The longitude, latitude, and height of insulators are obtained through coordinate conversion based on the UAV's real-time flight data and equipment parameters. Experimental results in the actual inspection environment (220 kV power transmission line) show that the presented system meets the requirements of robustness and accuracy for insulator detection and spatial localization in practical engineering.


Introduction
Overhead transmission lines connect power plants, substations and consumers to form power transmission and distribution networks [1,2]. In transmission lines, insulators are widely used essential equipment with dual functions of electrical insulation and mechanical support [3,4]. The failure of insulators directly threatens the stability and safety of transmission lines. Statistically, accidents caused by insulator faults account for the highest proportion of power system faults [5]. Therefore, the condition monitoring of insulators is of great significance to the safety and stability of the power system. Traditional manual inspections have high labor costs and low efficiency. In addition, unfavorable factors such as climate and the geographical environment restrict manual inspections, leading to many hidden dangers that cannot be discovered in time [6][7][8].
Instead of traditional manual patrol, transmission line inspection by UAV, which has emerged in recent years, is much less limited by natural climate and terrain conditions and demonstrates high efficiency, safety and accuracy [9][10][11]. During the inspection, a large number of aerial images and videos of transmission lines can be acquired by the visual sensors mounted on the aerial platform. To realize the automatic inspection and intelligent state diagnosis of insulators, the key step is to accurately detect the insulators from aerial images for the subsequent inspection tasks such as target tracking, defect diagnosis and inspection data management. In addition, insulators are also important visual cues for the automatic navigation of the UAVs. We can also reverse-control the UAV to carry out continuous inspection of the designated area or automatically adjust the inspection route according to the real-time positioning results of the insulators.
Currently, insulator detection and location are mainly performed manually, which is obviously inefficient and costly. In order to overcome the limitation of manual inspection, it is necessary to develop automatic detection technology to assist or replace human decision. However, images captured by UAVs usually contain cluttered backgrounds, including vegetation, rivers, roads, houses, etc. In addition, due to the diversity of insulators, the different illumination conditions, and the filming angle in the actual inspection scene, the status of the insulators in the image is also different. These adverse factors make it difficult to detect insulators in aerial images.
Several scholars have conducted a great deal of work on insulator detection and obtained many meaningful results. Zhang et al. [12] proposed a simple and preliminary method for the detection of tempered glass insulators. The airborne image was binarized with a set threshold value, and then the noise area was filtered by a morphological operation to obtain the insulator detection results. Wu et al. [13] proposed an active contour model based on texture distribution to detect inhomogeneous insulators from aerial images. However, the computational cost of this method is high, and the model cannot be initialized automatically. Zhao et al. [14] used orientation angle detection and binary shape prior knowledge (OAD-BSPK) to detect multiple insulators with different angles in aerial images. This approach is limited because the shape of insulators must be known in advance, and it only works on images with untextured backgrounds. Li et al. [15] applied the H and S components of the HSV color space model and texture information to filter background areas and obtain the outline of the insulator strings. However, this algorithm cannot obtain satisfactory results when the background in the image is diverse and complex or the insulators are not clear. Zhai et al. [16] established color models of both glass and ceramic insulators and then identified the target areas of the insulators according to the color determination combined with the insulator's spatial features. Similarly, the performance of insulator identification can significantly degrade in the presence of a complex background. In contrast to such less robust methods, Liao and An [17] introduced a robust insulator detection algorithm based on local features and spatial orders for aerial images with complex backgrounds. Zhao et al. [18] used an MRF model to find the lattice generated by analyzing the texture structure of insulator strings to detect multiple insulators. Although the method is robust to complex background images, it only works when insulators appear in groups, which limits its application. Wang et al. [19] adopted a support vector machine (SVM) as a classifier to distinguish insulators from the cluttered background based on Gabor features. However, the training process of the SVM is very complex because the classification results are sensitive to the selection of samples. This is a common problem in machine learning algorithms. In addition, the computational load of this algorithm limits the feasibility of real-time applications. Sadykova et al. [20] used a you only look once (YOLO) deep learning neural network model to detect insulators under the conditions of uncluttered background.
These methods are based on two-dimensional images of transmission line scene, which usually rely on a few simplified assumptions, especially the size, shape, and background of insulators. However, in practical applications, due to the different shooting angles and lighting conditions, insulators are extremely diverse in shape, appearance, size, and their backgrounds are also complex and changeable, which has brought great challenges to the research of image recognition.
In this context, an automatic transmission line inspection system based on binocular visual perception technology is developed to accurately detect and locate power equipment in real time. The visual perception module can obtain binocular images of the current inspection scene. We calculate the parallax of left and right images by stereo matching and then obtain depth maps. It is difficult to accurately identify foreground targets in optical images with complex backgrounds, but this problem becomes simple when the depth information is taken into account [21][22][23]. In addition, the saliency detection algorithm has brought new solutions to various computer vision applications and has achieved highly encouraging results in recent years [24][25][26][27][28][29]. Inspired by these principles, we use RGB-D saliency detection to obtain the candidate insulator regions. Then, the skeleton structure of the candidate insulator region is extracted to drive the final accurate identification of the insulator. The important function of a binocular stereo vision system is spatial positioning. Once the insulator has been detected, we further propose a real-time object spatial localization method that combines a binocular stereo vision system and GPS. The main contributions of this paper are as follows:
• An automatic transmission line inspection system integrated with UAV remote sensing and binocular stereo vision technology is developed to accurately detect and locate power equipment in real time. This system would be beneficial for transmission line inspection and similar applications in other fields;
• We propose an insulator detection algorithm based on RGB-D saliency detection and skeleton structure characteristics, which can detect insulators in real time for aerial images with complex backgrounds;
• We propose a real-time object spatial localization method that combines binocular stereo vision and a GPS device. The latitude, longitude and height of insulators are accurately obtained through coordinate conversion during the UAV inspection.
The rest of this paper is organized as follows: Section 2 introduces the dataset and the system framework. In Section 3, we describe our insulator detection and spatial localization methods in detail. The experimental results and a detailed discussion of our work are presented in Sections 4 and 5, respectively. Finally, the study's conclusions and future work plans are discussed in Section 6.

Dataset
To fully verify the effectiveness of the proposed insulator detection and positioning algorithm, we selected three 220 kV transmission lines with a total length of about 3 km and conducted several UAV flight experiments. The basic information of the three selected lines is shown in Table 1. A total of 400 pairs (400 left images and 400 corresponding right images) of aerial binocular images containing insulators, each with a size of 2448 × 2048 pixels, were extracted to form the dataset for the subsequent insulator detection model. As shown in Figure 1, since our research involves insulator positioning, we grouped the images and manually measured the insulator location information according to the inspection routes and tower numbers to facilitate management. Each left image is annotated by manually drawing a border around each insulator using a graphic image annotation tool [30]. Since these images were obtained in the actual inspection environment, they are closer to the engineering application, making the experimental results more convincing. About 70% of the images contain complex backgrounds, including power towers, buildings, vegetation, rivers, etc. The remaining 30% contain relatively simple backgrounds, mainly electric power towers and the sky. In the actual inspection environment, the characteristics of the insulators in the aerial images are not completely consistent due to the influence of factors such as lighting conditions and shooting angles. Therefore, our dataset contains images obtained in different weather conditions, at different times and from different angles, so the brightness of the images and the angle of the insulators are diverse. Another important point is the diversity of insulators: our dataset contains the three most common types of insulators, namely composite insulators, glass insulators, and ceramic insulators.

Automatic Insulator Inspection System
In this paper, an automatic inspection system is developed for insulator detection and spatial location. As shown in Figure 2, the system consists of a UAV module equipped with a GPS device; a binocular visual perception module composed of two visible-light cameras with the same parameters; an embedded industrial computer, on which the detection algorithm is deployed; and the control and observation module.

As shown in Table 2, the hardware devices in this study include a UAV, a visual sensor (industrial CCD camera), an embedded industrial computer and a portable display screen. We use the software development kit (SDK) provided by the camera manufacturer to perform secondary development of the binocular vision system. The embedded industrial computer is used to deploy software algorithms to process captured images in real time and generate inspection reports automatically. The portable display screen is used to display the inspection results to the users.
Compared with the existing transmission line inspection systems [31], the system proposed in this paper has the following improvements in terms of hardware architecture: (1) We use binocular cameras instead of the monocular camera in the conventional system, which can obtain the depth of the scene and provide more supplementary information for the detection of the transmission line equipment; (2) We integrate an embedded industrial computer, which is installed on the UAV and powered directly by the UAV's power supply. It mainly includes the following functions:
• Storing the images acquired by the binocular visual perception system. The traditional UAV storage module can only store the information obtained by a single visual sensor at a time and save the information to a memory card. When recalling the collected information, it is necessary to read the memory card first and then process the data, which is inflexible. In our system, images obtained by the binocular visual perception system are directly and synchronously stored in the embedded industrial computer, which is convenient for the subsequent real-time invocation and processing of the data;
• Processing the image information collected by the binocular vision system in real time. In traditional transmission line inspections, data processing and analysis still rely on manual labor. The main workflow includes copying the data, manual inspection and analysis, manual writing of inspection reports, etc. It is very labor-intensive and time-consuming. In this paper, the images collected by the cameras are processed in real time using the embedded industrial computer with the deployed algorithm. Moreover, the inspection report can be generated automatically;
• Integrating the multichannel image signal into a single-channel signal. The traditional UAV system is equipped with only a single image transmission module, and the pilot's observation screen only displays the image information acquired by a single camera. The embedded industrial computer can integrate multichannel image signals into single-channel image signals and then send them to the observation screen through the UAV's image transmission module. Moreover, combined with the real-time information processing function, the detection results for the current inspection area can also be transmitted to the observation screen.

Methods
In this paper, we propose an insulator detection and spatial localization method for aerial images with complex backgrounds. The sketch map of the proposed method is shown in Figure 3. First, the region segmentation of the left image in the binocular vision system is carried out, and the sparse parallax points are calculated through stereo matching. The depth map is generated by combining the result of regional segmentation with sparse parallax points, which reflects the influence of spatial position on visual saliency. Then, a two-stage strategy is proposed to achieve precise detection of insulators. In the first step, the candidate areas of insulators are determined via the RGB-D saliency detection method fusing color contrast features, texture contrast features and depth features. In the second step, we define the characteristic descriptors of the insulator skeleton structure and conduct structure searches in the candidate insulator region to filter false targets and realize accurate detection of insulators. Finally, the binocular stereo vision and UAV's GPS coordinates are used to obtain the spatial position of insulators.

Depth Information Acquisition
Generally, the human visual system gives priority to the nearest objects and then gradually spreads its attention into the distance. In other words, objects closer to the observer attract more attention. Therefore, spatial depth information is one of the important features in salient object detection [32]. Moreover, depth information can effectively eliminate the interference of complex background texture and facilitate the detection of insulators in aerial images. Most of the existing low-cost depth cameras rely on the time of flight (TOF) principle or structured light. However, these techniques have limitations in resolution and measurement accuracy. Light detection and ranging (LiDAR), as an active remote sensor, provides high-precision depth information. However, its hardware modules are large, heavy, and expensive. LiDAR generates a three-dimensional point cloud based on the distance measurement results of multiple laser beams. This 3D point cloud is discretely distributed, and the distance information is incomplete, which affects later automated processing.
In this context, better depth results can be obtained by stereo vision. Obtaining object and scene depth information based on image analysis methods is a research hotspot in the field of computer vision. Under the perspective projection imaging model, the mapping of a 3D scene to a 2D image is a process of loss of depth information. Therefore, it is necessary to estimate the depth information of the scene based on binocular disparity clues. However, the problem of corresponding point matching is a difficult point. Considering that the speed and accuracy of the existing depth calculation method may be unstable in practical application, a new depth estimation method is proposed. In this method, depth maps are constructed by combining the result of regional segmentation with sparse parallax points.

Image Segmentation
Excellent region segmentation results can improve the precision of the parallax boundary. We apply the image segmentation algorithm proposed by Arbeláez et al. [33] to obtain the region segmentation result of the left image. First, a multiscale contour detector is utilized to derive the globalized probability of boundary (gPb). Then a watershed transformation called the oriented watershed transform (OWT) is introduced to construct initial regions from the contour detector output. Finally, an ultrametric contour map (UCM) is constructed to form these initial regions into a hierarchical region tree. The segmentation result is shown in Figure 4c.

Construction of Depth Map
Depth maps have been widely used as one of the expressions of three-dimensional spatial information. The gray value of each pixel in the depth map can be used to represent the distance of a certain point in the scene from the camera. During the study, we found that the detailed pixel information of the depth map is of little use in target detection and can be ignored. In addition, the excessive subdivision of the depth map will reduce the integrity of subsequent object detection. Therefore, in this article, we assume that the parallax value on the same object is the same. To ensure the accuracy of the depth boundary, we use the region segmentation structure obtained in the above steps to assist in the construction of the depth map, as shown in Figure 4. The detailed process is as follows:
1. Speeded-up robust features (SURF) [34] is a fast and stable algorithm to detect feature points. In our method, SURF is used to extract feature points of binocular images and calculate their 64-dimensional descriptors;
2. According to the Euclidean distance between the feature points in the left image and the right image, the original matching point pairs are selected;
3. There are some mismatched feature point pairs in the initial matching results obtained in the above steps; we use slope consistency to eliminate mismatches. First, we connect the original matching point pairs with lines and calculate their slopes in the image coordinate system. Then the frequency of each slope is calculated, and the slope with the highest frequency is taken as the principal slope. The matching point pairs that have the same slope value as the principal slope are retained and defined as sparse parallax point pairs. Finally, the depth values $Z_{spp}$ of the sparse parallax point pairs are calculated;
4. We count the projection distribution of sparse parallax points in the segmentation regions and take the mean parallax of the matching points in each region as the parallax value of this region. Then the depth map $D_{spp}$ is constructed based on the sparse parallax map.
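Steps 3 and 4 above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the slope quantization `resolution`, the side-by-side layout used to draw the matching lines, and the rectified-stereo relation $Z = fB/d$ (focal length `focal_px` in pixels, baseline `baseline_m` in meters) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean


def principal_slope_filter(left_pts, right_pts, image_width, resolution=0.01):
    """Return one keep-flag per match based on slope consistency.

    Assumption: matches are drawn as lines between the left image and the
    right image placed side by side, so right x-coordinates are shifted by
    image_width. Slopes are quantized to `resolution` before voting.
    """
    bins = []
    for (xl, yl), (xr, yr) in zip(left_pts, right_pts):
        slope = (yr - yl) / (xr + image_width - xl)
        bins.append(round(slope / resolution))
    principal = Counter(bins).most_common(1)[0][0]  # most frequent slope bin
    return [b == principal for b in bins]


def depth_from_disparity(xl, xr, focal_px, baseline_m):
    """Rectified-stereo depth Z = f * B / d, assuming d = xl - xr > 0."""
    return focal_px * baseline_m / (xl - xr)


def region_mean_disparity(labels, left_pts, disparities):
    """Map each region label to the mean disparity of the sparse points in it.

    `labels` is a 2D segmentation result indexed as labels[row][col].
    Regions that contain no sparse point are simply absent from the result.
    """
    per_region = defaultdict(list)
    for (x, y), d in zip(left_pts, disparities):
        per_region[labels[int(round(y))][int(round(x))]].append(d)
    return {region: mean(values) for region, values in per_region.items()}
```

Filling every pixel of a region with its mean disparity, and converting that value with `depth_from_disparity`, would then give a piecewise-constant depth map in the spirit of $D_{spp}$.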

Insulator Detection
In the process of UAV inspection and shooting, insulators are usually close to the cameras, which are salient in space. In addition, it can be seen that although insulators are diverse in aerial images, they are composed of a series of insulator caps arranged regularly, so their skeleton structure is consistent. Therefore, taking the above two ideas into consideration, a two-stage strategy is proposed in this paper. First, candidate insulator regions are obtained based on RGB-D saliency detection. Then, the skeleton structure of the candidate insulator region is extracted. We define the feature descriptor of the skeleton structure and implement structure searching according to the descriptor to realize the final accurate identification of the insulator.

Construction of RGB-D Saliency Map
The significance of an area in an image depends on its difference from the surrounding areas, which is usually reflected in the feature, such as color features, shape features, texture features, etc. In this paper, color contrast features and texture contrast features are fused to obtain 2D salient images. Then, the depth information obtained in the previous procedure is exploited to refine the 2D saliency detection results.

Color contrast
The color feature is the most widely used visual feature in saliency detection. The simple linear iterative clustering (SLIC) [35] algorithm is used to segment the image into $K$ superpixels, and each superpixel is assigned a unique identifier $i$ ($i = 1, 2, \cdots, K$). We extract the color features of each superpixel to form a feature vector $F_{color}$ summarizing the spatially weighted color distance between the current superpixel and all other superpixels, where the weight $W^{P}_{ij} = \exp(-\|p_i - p_j\|_2^2 / \sigma^2)$ is based on the distance between the center positions $p_i$ and $p_j$ of superpixels $i$ and $j$. This weighting has been proven to be effective in saliency detection: when the spatial distance decreases, the correlation between superpixels increases. $\sigma$ is a constant that controls the strength of the weight.
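As a rough illustration of a spatially weighted color contrast, the sketch below sums weighted color distances from each superpixel to all others. The use of a mean color per superpixel, the Euclidean color distance, normalized center coordinates, and the default `sigma` are assumptions for illustration, not details taken from the paper.

```python
import math


def spatial_weight(p_i, p_j, sigma):
    """Weight exp(-||p_i - p_j||^2 / sigma^2) between two superpixel centers."""
    return math.exp(-math.dist(p_i, p_j) ** 2 / sigma ** 2)


def color_contrast(mean_colors, centers, sigma=0.4):
    """Spatially weighted color distance of each superpixel to all others.

    mean_colors: one color vector per superpixel (assumed: its mean color).
    centers: one (x, y) center per superpixel, assumed normalized to [0, 1].
    """
    k = len(mean_colors)
    contrast = []
    for i in range(k):
        total = 0.0
        for j in range(k):
            if j != i:
                weight = spatial_weight(centers[i], centers[j], sigma)
                total += weight * math.dist(mean_colors[i], mean_colors[j])
        contrast.append(total)
    return contrast
```

A superpixel whose color differs from nearby superpixels receives a larger contrast value than one that only differs from distant superpixels, which is the behavior the weight is meant to produce.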

Texture contrast
Texture reflects the repeated local patterns in the image and their organization and arrangement properties. In this paper, we use the differential excitation of the Weber local descriptor (WLD) [36], which is a simple and efficient texture descriptor. Let $hist_i$ and $hist_j$ denote the WLD histograms of superpixels $i$ and $j$. We apply the earth mover's distance (EMD) [37] to calculate the texture distance between the two superpixels. A signature $\{s_m = (v_m, w_{v_m})\}$ is defined to represent a set of feature clusters, where $v_m$ is the central value of bin $m$ of the histogram and $w_{v_m}$ is equal to $hist(m)$. The histograms $hist_i$ and $hist_j$ can thus be viewed as signatures $P = \{(p_1, w_{p_1}), (p_2, w_{p_2}), \cdots, (p_M, w_{p_M})\}$ and $Q = \{(q_1, w_{q_1}), (q_2, w_{q_2}), \cdots, (q_N, w_{q_N})\}$, where $M$ and $N$ represent the number of bins of the histograms $hist_i$ and $hist_j$, respectively. The Euclidean distance between $p_m$ and $q_n$ is defined as $d_{mn}$. The goal is to find a flow $f_{mn}$ that minimizes the overall cost:
$$\min \sum_{m=1}^{M} \sum_{n=1}^{N} f_{mn} d_{mn} \quad (2)$$
After the optimal flow $f_{mn}$ is found, the earth mover's distance of the two superpixels $i$ and $j$ is calculated as the resulting work normalized by the total flow:
$$EMD(hist_i, hist_j) = \frac{\sum_{m=1}^{M} \sum_{n=1}^{N} f_{mn} d_{mn}}{\sum_{m=1}^{M} \sum_{n=1}^{N} f_{mn}} \quad (3)$$
The texture contrast of superpixel $i$ is obtained by summarizing all texture distances of superpixel $i$ to the others, weighted by their spatial distances:
$$Con_{texture}(i) = \sum_{j=1}^{K} W^{P}_{ij} \, EMD(hist_i, hist_j) \quad (4)$$
Feature contrast map fusion
In the above procedures, color and texture feature contrast maps are derived. Both saliency feature maps are linearly normalized to the range [0, 1]. The 2D visual saliency map $Sal_{2D}$ is computed by adaptive weight fusion of the two feature maps:
$$Sal_{2D}(i) = w_c \, Con_{color}(i) + w_t \, Con_{texture}(i) \quad (5)$$
where $w_c$ and $w_t$ are the fusion weights. To relieve the burden of manually balancing the feature components, we adaptively set the values of $w_c$ and $w_t$ according to their data uncertainty. In information theory, the smaller the entropy corresponding to a feature, the smaller the uncertainty of the result. Therefore, the weight assigned to the feature should be greater, and vice versa. The weights are calculated from the entropies of the color and texture features, $H_c = -\sum_{i=1}^{K} Con_{color}(i) \cdot \log_2 Con_{color}(i)$ and $H_t = -\sum_{i=1}^{K} Con_{texture}(i) \cdot \log_2 Con_{texture}(i)$, respectively.
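The texture distance and the entropy-based fusion can be sketched as follows. Two assumptions are made and should not be read as the paper's exact formulas: first, both histograms share the same equally spaced bins and are normalized to unit mass, in which case the EMD reduces to the L1 distance between the cumulative histograms; second, the weights are taken as $w_c = H_t / (H_c + H_t)$ and $w_t = H_c / (H_c + H_t)$, one simple rule under which the lower-entropy feature receives the larger weight.

```python
import math
from itertools import accumulate


def emd_1d(hist_i, hist_j, bin_width=1.0):
    """EMD between two histograms on the same equally spaced bins.

    Both histograms are normalized to unit mass, so the total flow is 1 and
    the EMD equals the L1 distance between the cumulative histograms.
    """
    p = [h / sum(hist_i) for h in hist_i]
    q = [h / sum(hist_j) for h in hist_j]
    return bin_width * sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def normalize01(values):
    """Linearly rescale values to [0, 1]; a constant map becomes all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def entropy(contrast):
    """H = -sum(c * log2(c)) over a contrast map in [0, 1], with 0*log(0) = 0."""
    return -sum(c * math.log2(c) for c in contrast if c > 0)


def fuse(con_color, con_texture):
    """Fuse two contrast maps with entropy-based weights (assumed rule)."""
    c, t = normalize01(con_color), normalize01(con_texture)
    h_c, h_t = entropy(c), entropy(t)
    total = h_c + h_t
    # Assumed rule: the feature with the smaller entropy gets the larger weight.
    w_c, w_t = (h_t / total, h_c / total) if total > 0 else (0.5, 0.5)
    return [w_c * a + w_t * b for a, b in zip(c, t)], w_c, w_t
```

For histograms with unequal bin layouts or a general ground distance, the full transportation problem in Equation (2) would have to be solved instead of using the cumulative shortcut.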

Saliency Optimization Using Depth Information
While most of the salient regions are highlighted in the RGB saliency map computed by fusing the color and texture features, some superpixels that belong to the background are also highlighted (see Figure 5d). If some background superpixels have high contrast with the surrounding superpixels, they may also have high saliency. In order to suppress the background in the saliency map, we exploit the depth map obtained in the previous procedure to refine our 2D saliency detection results. The idea of this method is based on the fact that if regions have the same (or similar) depth value, then their saliency values should also be the same (or similar). In addition, the object closest to the observer attracts more attention and should be assigned a higher saliency value. According to these assumptions, we define a set of thresholds according to the depth value to stratify the image into layers, represented by $\{I_g\}_{g=1}^{G}$, where $G$ denotes the number of layers. The optimized saliency map is then calculated layer by layer, where $depth_g$ is the depth value of the $g$-th image layer in the depth map and $num_g$ denotes the number of superpixels in the image layer $I_g$. $\delta$ controls the weight's sensitivity to spatial distance and is set to 0.2 empirically.
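The two assumptions stated above (equal saliency within a depth layer, higher saliency for nearer layers) can be illustrated with the sketch below. The specific form used here, the layer-averaged 2D saliency multiplied by $\exp(-depth_g / \delta)$ with depth normalized to [0, 1] and equal-width layers, is a hypothetical choice made for illustration and is not the paper's optimization formula.

```python
import math
from statistics import mean


def refine_with_depth(sal_2d, depth, n_layers=4, delta=0.2):
    """Depth-layered saliency refinement (illustrative form only).

    sal_2d: 2D saliency value per superpixel.
    depth:  depth value per superpixel (larger means farther away).
    Each superpixel receives the mean saliency of its depth layer, damped by
    exp(-depth_g / delta), where depth_g is the layer's mean normalized depth.
    """
    lo, hi = min(depth), max(depth)
    span = hi - lo
    norm = [(d - lo) / span if span > 0 else 0.0 for d in depth]
    # Equal-width depth thresholds (an assumption) define the layers.
    layer = [min(int(d * n_layers), n_layers - 1) for d in norm]
    refined = [0.0] * len(sal_2d)
    for g in set(layer):
        members = [i for i, l in enumerate(layer) if l == g]
        depth_g = mean(norm[i] for i in members)
        value = mean(sal_2d[i] for i in members) * math.exp(-depth_g / delta)
        for i in members:
            refined[i] = value
    return refined
```

With `delta = 0.2`, a layer at the far end of the normalized depth range is attenuated by a factor of about $e^{-5}$, so highlighted background superpixels far from the camera are strongly suppressed.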

Skeleton Structure
We binarize and skeletonize the obtained saliency map. These skeletons still retain important information about the shape and structure of the original object, as shown in Figure 6. Although there are many types of insulators in power transmission lines, the insulator string has unique skeleton structure characteristics compared with other pseudo targets, which can be summarized as the following three points:
1. In the skeleton structure diagram, the central axis of the insulator string corresponds to a long straight line, and the insulator caps correspond to several short straight lines;
2. All short straight lines are traversed by the long straight line;
3. The short straight lines are of approximately equal length and are arranged in parallel along the long straight line at equal intervals.
The above three feature descriptors are formulated, respectively, and then used as the basis to perform an insulator structure search in the saliency results to achieve accurate detection of insulators. The structure search steps are as follows:
1. Center axis searching. We use the Hough algorithm [38] to detect straight lines in skeleton images. According to Equation (8), a straight line whose length is greater than 1/3 of the longer side length of the circumscribed rectangle of the connected domain is regarded as the suspected insulator central axis. The detection results are shown in Figure 6c.
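The length criterion of the center axis search can be sketched as a simple filter over detected line segments. The function name and the segment format `(x1, y1, x2, y2)`, such as the endpoints produced by a probabilistic Hough transform, are assumptions for illustration; only the 1/3 threshold comes from the text.

```python
import math


def central_axis_candidates(segments, bbox_width, bbox_height, ratio=1 / 3):
    """Keep segments longer than `ratio` times the longer bounding-box side.

    segments: iterable of (x1, y1, x2, y2) line segments detected in the
    skeleton of one connected domain.
    bbox_width, bbox_height: size of that domain's circumscribed rectangle.
    """
    threshold = ratio * max(bbox_width, bbox_height)
    return [
        s for s in segments
        if math.hypot(s[2] - s[0], s[3] - s[1]) > threshold
    ]
```

For example, in a connected domain with a 90 × 60 bounding rectangle the threshold is 30 pixels, so a 50-pixel segment is kept as a suspected central axis while a 5-pixel segment is discarded.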

The above three characteristics are formulated as feature descriptors and then used as the basis for an insulator structure search in the saliency results, which achieves accurate detection of insulators. The structure search steps are as follows:

1. Center axis searching. We use the Hough algorithm [38] to detect straight lines in the skeleton images. According to Equation (8), a straight line whose length is greater than 1/3 of the longer side of the circumscribed rectangle of the connected domain is regarded as a suspected insulator central axis. The detection results are shown in Figure 6c.

2. Insulator caps searching. We search for lines that are perpendicularly bisected by the candidate central axis and record their lengths and positions. Then, the number of lines $num_l$ is counted. We set a threshold $T_N$: if $num_l \geq T_N$, the candidate target is retained for the third filtering step. $T_N$ is set to 6 in our experiment.

3. Uniform arrangement judgment. The length variance $S^2_{length}$ of the short lines is calculated to represent their length consistency, and the distance variance $S^2_{distance}$ of the short lines is calculated to represent their spacing consistency. If the two parameters satisfy Equation (9), these short straight lines are determined to be the skeleton of insulator caps and are retained. Otherwise, they are judged as false targets and eliminated, so as to achieve precise detection of insulators. In Equation (9), the threshold value $T_s$ is empirically set to 5.
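The three search steps above can be sketched over line segments that have already been extracted from the skeleton image (for example, by a Hough transform). This is a minimal illustration rather than the authors' implementation. The segment representation, the function names, the angle and distance tolerances, the use of population variance, and the reading of Equation (9) as "both variances do not exceed $T_s$" are all assumptions; only the 1/3 length rule, $T_N = 6$ and $T_s = 5$ come from the text.

```python
import math
from statistics import pvariance

T_N = 6  # minimum number of cap lines (value given in the text)
T_S = 5  # variance threshold (value given in the text)

def length(seg):
    """Euclidean length of a segment ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)

def is_axis_candidate(seg, bbox_w, bbox_h):
    """Step 1: keep lines longer than 1/3 of the longer bounding-box side."""
    return length(seg) > max(bbox_w, bbox_h) / 3.0

def find_caps(axis, segments, angle_tol_deg=10.0, dist_tol=2.0):
    """Step 2: collect shorter lines that are roughly perpendicular to the axis
    and whose midpoint lies on it. Tolerances are hypothetical.
    Returns (length, position along the axis) pairs."""
    (ax1, ay1), _ = axis
    alen = length(axis)
    ux = (axis[1][0] - ax1) / alen  # unit vector along the axis
    uy = (axis[1][1] - ay1) / alen
    caps = []
    for seg in segments:
        (x1, y1), (x2, y2) = seg
        slen = length(seg)
        if slen == 0 or slen >= alen:
            continue
        # |cos| of the angle to the axis is close to 0 for a perpendicular line
        cos_a = abs(((x2 - x1) * ux + (y2 - y1) * uy) / slen)
        if cos_a > math.sin(math.radians(angle_tol_deg)):
            continue
        mx, my = (x1 + x2) / 2 - ax1, (y1 + y2) / 2 - ay1
        t = mx * ux + my * uy          # midpoint position along the axis
        off = abs(mx * uy - my * ux)   # midpoint distance from the axis
        if 0 <= t <= alen and off <= dist_tol:
            caps.append((slen, t))
    return caps

def is_insulator(caps):
    """Step 2 count test followed by the step 3 uniformity test.
    Assumption: Equation (9) requires both variances to be <= T_S."""
    if len(caps) < T_N:
        return False
    caps = sorted(caps, key=lambda c: c[1])
    lengths = [c[0] for c in caps]
    gaps = [b[1] - a[1] for a, b in zip(caps, caps[1:])]
    return pvariance(lengths) <= T_S and pvariance(gaps) <= T_S
```

With a horizontal axis and seven evenly spaced caps of equal length, `is_insulator(find_caps(axis, caps))` accepts the candidate, whereas a candidate with too few caps or with irregular spacing is rejected.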

Insulator Localization
During inspection, existing UAV inspection systems can only display the coordinates of the UAV and cannot obtain the location of the target captured in the video. The geolocation of the UAV is usually used to approximate the location of the detected objects [10]. In this section, we introduce a novel insulator spatial localization method combining binocular stereo vision and GPS. The main goal of object location in UAV inspection is to match the pixel coordinates of the target in 2-dimensional images with its coordinates in the real scene, such as GPS coordinates. According to the UAV real-time flight data and equipment parameters, the conversion matrices between the image coordinate system, the world coordinate system and the geographic coordinate system are calculated, and then the longitude, latitude and height of the object are obtained by coordinate conversion.

Binocular Visual-Spatial Location
The first step of the spatial location of an insulator is to obtain the three-dimensional coordinates of the object point in the world coordinate system through binocular vision model analysis using geometrical relationships.
The internal and external parameters of the left and right cameras are obtained by the calibration algorithm [39]. We set the optical center of the left camera as the origin and its optical axis as the $Z_w$-axis to establish a world coordinate system $O_w - X_w Y_w Z_w$, which means that the world coordinate system coincides with the left camera coordinate system $O_c^l - X_c Y_c Z_c$. Assume that the coordinates of the target point $P$ in the world coordinate system are $(X_w, Y_w, Z_w)$ and that its coordinates in the left and right image pixel coordinate systems are $(u_l, v_l)$ and $(u_r, v_r)$. The distance between the two cameras is defined as the baseline distance $b$, and $d = u_l - u_r$ is denoted as the parallax value.
Since the world coordinate system coincides with the left camera coordinate system, $Z_w$ is equal to the vertical distance $Z_c$ between the object and the baseline of the two cameras. As shown in Figure 7, according to the principle of binocular vision and the triangle similarity theorem, the conversion relation between the pixel coordinate system and the world coordinate system can be obtained from the parallax value $d$ and the baseline distance $b$.

Note that this derivation of three-dimensional spatial information from parallax assumes an ideal binocular system. In reality, a binocular stereo vision system with two perfectly coplanar cameras does not exist, and image distortion also needs to be considered. Therefore, we perform distortion correction and stereo rectification on the left and right images before stereo matching and spatial positioning.
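The triangle-similarity relation described above can be illustrated with the textbook model for a rectified stereo pair. This sketch is not taken from the paper: the focal length `f` (in pixels) and the principal point `(u0, v0)` are assumed to come from the calibration step and are not symbols defined in this section, and the function name is hypothetical.

```python
def triangulate(u_l, v_l, u_r, f, b, u0, v0):
    """Recover (X_w, Y_w, Z_w) in the left-camera frame from a rectified stereo match.

    u_l, v_l : pixel coordinates of the point in the left image
    u_r      : column of the same point in the right image
    f        : focal length in pixels (assumed known from calibration)
    b        : baseline distance between the two cameras
    u0, v0   : principal point in pixels (assumed known from calibration)
    """
    d = u_l - u_r  # parallax value, as defined in the text
    if d <= 0:
        raise ValueError("parallax must be positive for a point in front of the cameras")
    z = f * b / d            # depth from similar triangles
    x = (u_l - u0) * z / f   # back-project the pixel offset at that depth
    y = (v_l - v0) * z / f
    return x, y, z
```

Because depth is inversely proportional to the parallax value, a one-pixel matching error matters more for distant insulators than for near ones, which is consistent with the system's focus on insulators close to the camera.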

Geographic Coordinates of Insulators
In this section, the longitude, latitude, and altitude of insulators are calculated by combining the insulator spatial information obtained in the previous section with the UAV flight data. We use the industrial computer to receive the flight data of the UAV in real time and extract the information required for object location, including the UAV's longitude and latitude $(long_1, lat_1)$, altitude $h_{UAV}$, pitch angle $\alpha$, azimuth angle $\beta$, roll angle $\gamma$, and the pitch angle of the cameras $\theta$.

As shown in Figure 8, $O_w - X_w Y_w Z_w$ is the world coordinate system, and $O'_w - X'_w Y'_w Z'_w$ is the left camera coordinate system when the camera's pitch angle is zero. The conversion relationship between these two coordinate systems is given by Equation (11). Let $(x_{in}, y_{in}, z_{in})$ represent the coordinates of the insulator in the world coordinate system, which are obtained in the previous section. The UAV's coordinates $(x'_{UAV}, y'_{UAV}, z'_{UAV})$ in the coordinate system $O'_w - X'_w Y'_w Z'_w$ can be obtained by manual measurement according to the camera's installation position. Then we use Equation (11) to calculate the UAV's coordinates $(x_{UAV}, y_{UAV}, z_{UAV})$ in the world coordinate system.

As shown in Figure 8, the $Y''_w$-axis of the coordinate system $O''_w - X''_w Y''_w Z''_w$ points in the vertical direction, and the $X''_w$-axis and the $Z''_w$-axis lie in the horizontal plane. Rotating the coordinate system $O''_w - X''_w Y''_w Z''_w$ around the $Z''_w$-axis by an angle of $\gamma$, and then around the $X''_w$-axis by an angle of $\alpha + \theta$, makes it coincide with the coordinate system $O_w - X_w Y_w Z_w$. These two rotations define the conversion relationship between the two coordinate systems.
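The two-step rotation described above (first by the roll angle $\gamma$ about the Z-axis, then by $\alpha + \theta$ about the X-axis) can be sketched as a composition of elementary rotation matrices. This is only an illustration under stated assumptions: right-handed axes, counter-clockwise positive angles in radians, a shared origin for the two frames, and hypothetical function names. The sign conventions of the actual system are not given in this section and may differ.

```python
import math

def rot_x(a):
    """Elementary rotation about the X-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(a):
    """Elementary rotation about the Z-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(R, p):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(R[i][k] * p[k] for k in range(3)) for i in range(3))

def camera_to_level(p_w, gamma, alpha, theta):
    """Express a point given in the tilted camera (world) frame in the levelled
    frame whose Y-axis is vertical. The levelled frame is rotated about its Z-axis
    by gamma and then about its new X-axis by alpha + theta to reach the camera
    frame, so the camera frame's orientation is R = Rz(gamma) @ Rx(alpha + theta)."""
    R = matmul(rot_z(gamma), rot_x(alpha + theta))
    return apply(R, p_w)
```

With all angles set to zero the function returns the input point unchanged, and for any angles it preserves the distance of the point from the origin, as expected of a pure rotation.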
$(x''_{in}, y''_{in}, z''_{in})$ and $(x''_{UAV}, y''_{UAV}, z''_{UAV})$ represent the coordinates of the insulator and the UAV in the coordinate system $O''_w - X''_w Y''_w Z''_w$, respectively. The height difference $\Delta h$ between the UAV and the insulator is calculated from their coordinates along the vertical $Y''_w$-axis, and the insulator height is $h_{in} = h_{UAV} + \Delta h$. As shown in Figure 9, we calculate the insulator's longitude and latitude $(long_2, lat_2)$ from the UAV's longitude and latitude and the relative position of the insulator with respect to the UAV. The latitude is

$$lat_2 = lat_1 + \frac{d_{hr} \cos \beta'}{R \cdot 2\pi / 360} \quad (15)$$

where $d_{hr} = \sqrt{(x''_{UAV} - x''_{in})^2 + (z''_{UAV} - z''_{in})^2}$ denotes the horizontal distance between the UAV and the insulator, $\beta' = \beta + \Delta\beta = \beta + \arctan\left(\frac{x''_{UAV} - x''_{in}}{z''_{UAV} - z''_{in}}\right)$ is the azimuth angle of the line connecting the UAV and the insulator, and $R$ is the radius of the earth.
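The latitude update in Equation (15), together with the definitions of the horizontal distance and the corrected azimuth, can be sketched as follows. The latitude line follows the text. The longitude line is an assumption: the corresponding equation is not legible in this section, so the usual small-offset form with a $\cos(lat_1)$ correction is used instead. The mean earth radius of 6,371,000 m and the function name are also assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean earth radius in metres

def insulator_lat_lon(lat1, lon1, beta_deg, uav_xz, ins_xz, radius=EARTH_RADIUS_M):
    """Estimate the insulator latitude and longitude in degrees.

    lat1, lon1 : UAV latitude and longitude in degrees
    beta_deg   : UAV azimuth angle in degrees
    uav_xz     : (x, z) of the UAV in the levelled frame, in metres
    ins_xz     : (x, z) of the insulator in the same frame, in metres
    """
    dx = uav_xz[0] - ins_xz[0]
    dz = uav_xz[1] - ins_xz[1]
    if dz == 0:
        raise ValueError("arctan(dx/dz) is undefined when dz is zero")
    d_hr = math.hypot(dx, dz)                 # horizontal distance
    beta_p = math.radians(beta_deg) + math.atan(dx / dz)  # corrected azimuth
    metres_per_degree = radius * 2 * math.pi / 360
    lat2 = lat1 + d_hr * math.cos(beta_p) / metres_per_degree  # Equation (15)
    # Assumption: longitude offset scaled by cos(latitude); not taken from the text.
    lon2 = lon1 + d_hr * math.sin(beta_p) / (
        metres_per_degree * math.cos(math.radians(lat1)))
    return lat2, lon2
```

Since one degree of latitude corresponds to roughly 111 km, a horizontal offset of 10 m changes the latitude by about $9 \times 10^{-5}$ degrees, which gives a sense of scale for coordinate errors expressed in degrees.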

Results
In this section, experiments are conducted to verify the feasibility of the proposed UAV inspection system together with the insulator detection and spatial positioning algorithm. All the experiments are carried out on the Windows 10 operating system with an Intel Core i7-8500 U @ 1.8 GHz 4-core CPU and 16 GB RAM, and the software platform of the inspection system is Microsoft Visual Studio 2010.

Performance of the Insulator Detection Algorithm
In this section, we use aerial images acquired by the UAV inspection system in a real inspection scene to evaluate the effectiveness and performance of the proposed solution. A transmission line image captured by the UAV may contain more than one insulator, and the image background varies with the inspection area and shooting angle. We randomly select aerial images with different backgrounds and different numbers of insulators as samples to evaluate the performance of the proposed algorithm.
The experimental results under simple and complex backgrounds are shown in Figures 10 and 11, respectively. The results show that the proposed solution not only accurately detects insulators in images with simple backgrounds but also performs well on images with complex backgrounds. Figure 12 shows the detection results of a single insulator, and Figure 13 shows the detection results of multiple insulators. The proposed algorithm can accurately detect a single insulator. Because depth information is introduced into our algorithm as an important visual cue to eliminate background interference, for images containing two or more insulators the proposed algorithm detects the insulators closer to the camera, while distant insulators are filtered out as background. This behavior suits the inspection system, which is designed to process insulators that are close and clearly photographed: even if a distant, unclear insulator were identified, the subsequent equipment status detection could not be performed accurately. The main function of the RGB-D saliency detection algorithm is thus to filter out unclear long-distance insulators and background areas, so as to reduce the waste of computing resources and improve the accuracy of insulator state diagnosis.
To further verify the proposed method, we analyze the real-time performance and detection success rate of the detection algorithm on 400 transmission line inspection images collected by the UAV. According to the complexity of the image background, the images in the dataset are divided into two parts: complex background samples account for 70%, and the remaining 30% are simple background samples. The efficiency of the proposed algorithm is evaluated by testing the average processing time and detection success rate. The results, together with a comparison with the solutions proposed in [14,16,40], are shown in Table 3.
The average time consumption refers to the average time the algorithm needs to process one image with a size of 2448 × 2048 pixels. It can be observed that the proposed solution is superior to the one in [14]: it significantly reduces the average time consumed for insulator detection and improves the detection success rate. The methods in [16,40] meet the real-time requirement in terms of detection time, but their performance is not stable enough. Moreover, their detection success rate decreases significantly when processing samples with complicated backgrounds, which limits the practical application of these algorithms. The detection success rates of the proposed solution are 92.8% and 91.9% for simple background images and complex background images, respectively, and the detection time for an aerial image is approximately 0.615 s. The experimental results show that our method achieves real-time detection of insulators with high accuracy for images with both simple and complex backgrounds.

Performance of the UAV Automatic Inspection System
As shown in Figure 14, we selected a typical 220 kV power transmission line with a length of about 1.82 km to test the proposed system. The line has 9 towers, numbered from 78 to 86, and 216 insulators. The background includes vegetation, roads, buildings and rivers. Before the test, we measured and recorded the longitude, latitude and height of the insulators with a hand-held instrument with the help of professionals. To assess the performance of our system more comprehensively, six flights were conducted around the power transmission line. The duration of each flight was 20-25 min, the flight altitude was 15 m to 35 m, and the maximum flight distance was about 4.5 km.

During the inspection, the real-time results processed by the embedded industrial computer are displayed in the pilot control and observation module, as shown in Figure 15a. In addition, the observation screen shows the UAV's flight status, weather information and the geolocation of insulators in real time. The inspection report can be exported after the inspection, as shown in Figure 15b, which improves the informatization and automation of UAV inspection.
Figure 15. Results of transmission line inspection system: (a) interface of inspection information; (b) example of automatically generated patrol report.

The performance of the automatic target spatial positioning is evaluated by comparing the detected geolocations of the insulators with the geolocations recorded previously in the manual inspection. Figure 16 shows the mean errors of longitude, latitude and altitude, respectively. The detailed error analysis is given in Table 4: the mean errors are $-9.7741 \times 10^{-6}$, $7.1167 \times 10^{-6}$ and 0.025 m, and the standard deviations are $3.9125 \times 10^{-5}$, $5.2592 \times 10^{-5}$ and 0.2919 m. We conclude that the automatic target spatial positioning method proposed in this study meets the requirements of actual inspection applications.

Discussion
In this paper, we construct a novel UAV inspection system that realizes real-time insulator detection, positioning and automatic generation of inspection reports. Different from the existing work [41][42][43], our system determines the true geographic coordinates of every insulator by binocular vision and performs real-time processing on the UAV's onboard terminal with an embedded industrial computer. This not only makes the detection of transmission line insulators more convenient but is also crucial for the development of UAV inspection.
First, accurate three-dimensional geographical coordinates are very important for the inspection of transmission lines. Insulators play a vital role in power transmission lines. If an insulator fails, it seriously threatens the stable operation of the power grid and can cause unpredictable losses. Accurate spatial locations are therefore necessary for insulator fault detection and maintenance. In addition, real-time insulator detection and positioning in aerial images can provide valuable information for the automatic navigation of UAVs, because insulators are also important landmarks along the transmission line. In detail, the coordinates of insulators can help UAVs to determine the position of the tower and plan the inspection route.
Second, the UAV's onboard terminal processing interface set up in our system provides a good hardware platform for realizing more functions, especially tasks with high real-time requirements. The proposed system can be easily adapted to other transmission line inspection tasks or to similar applications in other fields. Moreover, our system has reserved multiple interfaces for power supply and information transmission, so the system hardware can easily be expanded to carry other sensors, such as infrared sensors and airborne lidar, according to requirements. As shown in Figure 17, we have made a preliminary attempt to add an infrared camera to the UAV system and successfully obtained the temperature of power equipment in real time, which demonstrates the scalability of the system.
Figure 17. Multi-source sensor UAV inspection system: (a) hardware framework of the system; (b) interface of multi-source sensor information collaborative processing.

In conclusion, the study described in this paper provides a more flexible and robust approach to real-time insulator detection and spatial location for UAV inspection applications, and it can further serve UAV navigation, target tracking, defect diagnosis and inspection data management in operation.

Conclusions
In this paper, an automatic transmission line inspection system incorporating UAV remote sensing with binocular vision perception technology is developed to accurately detect and locate power equipment in real time. The system consists of a UAV module, an embedded industrial computer, a binocular visual perception module, and a control and observation module. Taking insulators as the detection targets, we propose a novel insulator detection approach based on RGB-D saliency detection and structural feature searching for aerial images captured by a UAV power transmission line inspection system. First, candidate insulator regions are obtained based on RGB-D saliency detection. Then, according to the consistency of the insulator structure, we implement a structure search to realize the final accurate detection of the insulator. On the basis of the insulator detection results, we further propose a real-time object spatial localization method that combines binocular stereo vision and GPS. The proposed approach and inspection system have been tested in an actual inspection environment (a 220 kV power transmission line). Experimental results show that our system meets the robustness and accuracy requirements of insulator detection and spatial localization in practical engineering.
Further research will focus on the following three aspects to improve our inspection system: (i) implementing insulator defect detection on our system, which is very useful in power transmission line inspection; (ii) expanding the system's hardware by adding an infrared camera and a telephoto camera to form a new vision system that can detect abnormal heating and surface defects of electric equipment; (iii) implementing visual servo control of the UAV. The current transmission line inspection requires manual control of UAVs, so we will further study autonomous flight based on visual servo control to realize automatic inspection of transmission lines.

Data Availability Statement: The data are not publicly available due to the confidentiality of the research projects.