A Method of Insulator Faults Detection in Aerial Images for High-Voltage Transmission Lines Inspection

Abstract: Insulator fault detection is an important task in high-voltage transmission line inspection. However, current methods often suffer from a lack of accuracy and robustness. Moreover, these methods can detect only a single fault in an insulator string and cannot detect multiple faults. In this paper, a novel method is proposed for detecting both single faults and multi-faults of insulators in UAV-based aerial images, whose backgrounds usually contain complex interference. The shapes of the insulators also vary considerably with changes in filming angle and distance. To reduce the impact of complex interference on insulator fault detection, we use a deep neural network to distinguish between insulators and background interference. First, a large number of insulator aerial images with manually labelled ground truth are collected to construct a standard insulator detection dataset, 'InST_detection'. Second, a new convolutional network is proposed to obtain accurate insulator string positions in the aerial image. Finally, a novel fault detection method is proposed that can detect both single faults and multi-faults in aerial images. Experimental results on a large number of aerial images show that the proposed method is more effective and efficient than state-of-the-art insulator fault detection methods.


Introduction
The status detection of electric power equipment is an essential technique in high-voltage transmission line inspection, in which a wide variety of sensors are used [1]. Over the past few decades, vision sensor-based methods have developed rapidly, and many topics have been examined, such as insulator detection [2], power line detection [3,4], and power tower detection [5,6]. Early studies show that the insulator string is one of the most important pieces of equipment on a high-voltage transmission line, as it provides both mechanical support and electrical insulation.
In traditional manual inspection, inspectors have to walk along paths near high-voltage transmission lines and check the status of each insulator using various types of instruments, such as acoustic sensors [7], infrared imagers [8][9][10], cameras [11], and ultraviolet imagers [12]. However, the traditional manual method is inefficient and often infeasible in practice, as high-voltage transmission lines are usually built in complex surroundings containing forests and lakes. Recently, with the development of unmanned aerial vehicle (UAV) control and image processing techniques, insulator status inspection has shifted towards the analysis of aerial images captured by UAVs [13]. The existing methods can be grouped into three main categories: (1) man-made feature-based methods, (2) machine learning-based methods, and (3) deep learning-based methods.
In the man-made feature-based methods, multiple features including color [11,14], shape [15][16][17], edge [18,19], gradient [20], texture [21], key-points [22][23][24][25], and their fusions [26,27] have been explored. Meanwhile, some mathematical models have also been applied, such as the snake model [28], the Hough transform [29], the active contour model [30], fuzzy c-means [31], and the receptive field model [32]. In the work of Wang [14], a threshold filtering scheme based on the Lab color space is proposed to locate the insulators in aerial images. Then, the coordinates of each insulator are obtained through a dedicated mathematical model. Finally, the insulator fault is determined by the ratio of the insulator area to the area of its bounding rectangle. However, this method can be significantly affected by complex backgrounds containing objects that are similar in color to the insulators. Moreover, its performance degrades when insulators overlap. To solve this problem, Zhai et al. [20] propose a two-step strategy to obtain more accurate insulator locations. First, the insulators are roughly located by a saliency detection method. After that, the insulators are finely segmented from the background through a series of rules. Finally, an adaptive morphology method is proposed to detect the insulator fault. However, this method cannot perform well in complex scenes containing various types of background interference, which are usually more salient than the insulators. Moreover, this method can detect only a single fault in an insulator string and cannot detect multi-faults. In [17,28], the Otsu algorithm is applied to obtain the insulator regions.
Subsequently, the insulator contours are obtained by the wavelet modulus maximum method or the snake model.
Finally, each insulator contour is fitted to an ellipse by the Hough transform or the least squares method, and the insulator fault can be judged by counting the number of insulators. However, both [17,28] can only determine whether there are insulator faults in the aerial image and cannot give the positions of the faults. Wu et al. [30] observe that the insulators in aerial images often exhibit texture inhomogeneities. To solve this problem, the texture features of the insulators are extracted by a semi-local operator under the Beltrami framework. Then, a new active contour is proposed to extract insulators from an aerial image. However, this method is time-consuming and far from practical application. In the work of [22,24,25], the key-point features of the insulator are analyzed. Subsequently, SURF or DoG (Difference of Gaussians) key-point features are used to locate the insulators in an aerial image. Finally, an elliptical spatial descriptor is applied to check for insulator faults. Although these methods can achieve excellent performance in some cases, they are easily affected by the filming angle and distance in practical applications. To enhance the robustness of insulator fault detection, Jiang et al. [27] adopt multiple insulator features, including color, shape, and texture, to extract the insulators from the complex background. After that, they develop a strategy based on the piece-to-piece distance of the insulators to detect the insulator fault. However, due to the high computational complexity of the feature extraction strategy, this method is also far from real-time application. Moreover, it is applicable only when adjacent insulators in the aerial image do not overlap. Overall, man-made feature-based methods are quite sensitive to background interference, and their performance is usually degraded by variations in filming angle and distance.
In the machine learning-based methods, AdaBoost [33,34], a sparse representation-based classifier [35], SVM [36], a cascade classifier [37], and KNN [38] are applied to locate the insulators and detect the insulator faults. Shang and Li [34] extract seven invariant moment features from samples taken from 300 insulator aerial images to train an AdaBoost classifier. Then, they use the trained AdaBoost classifier to locate the insulators in the aerial image. Finally, they design a strategy similar to that of [27] to detect insulator faults. Experimental results show that their method can be used to detect multi-faults in an insulator string. However, because the filming angle and distance change constantly during UAV inspection, the insulators in aerial images usually overlap, which makes it quite hard to obtain the spatial information of each insulator. Consequently, the method proposed in [34] performs well only when insulators are isolated. In [35], an insulator dataset is constructed and the HOG (Histogram of Oriented Gradient) feature of each image is calculated. After that, PCA (Principal Component Analysis) is applied to create an over-complete dictionary for each insulator sample. Finally, a sparse representation-based classifier is trained to obtain the positions of insulator faults. However, a classifier trained only on an over-complete dictionary of HOG features cannot perform well against backgrounds with complex texture interference. To compensate for this shortcoming, Yan et al. [36] apply fusion features composed of HOG and LBP (Local Binary Pattern) features to train an SVM (Support Vector Machine) classifier. Experimental results show that this method can locate insulators at multiple angles in complex scenes. In the work of [37], Haar-like features, integral graph features, and directional gradient histogram features are combined to train a cascade classifier and an SVM classifier.
Then, the two classifier models are applied to locate the insulators. Finally, the fault location can be determined by an incremental contour value-based strategy. In [38], a KNN classifier is trained to distinguish between insulator caps and background clutter. Then, an automatic insulator fault detector is developed to analyze each cap for faults based on an elliptical descriptor. Figure 1 in [38] shows that their proposed method has the potential to detect insulator multi-missing-fault. However, since there is currently no publicly-available dataset for insulator missing faults detection, their dataset contains only 10 images with insulator missing faults. Although the machine learning-based methods have increased the accuracy of insulator location and faults detection, all of them have a common limitation in that they are time-consuming as they have to adopt the slide-window strategy to check the whole aerial image.
In the past five years, object detection has achieved great breakthroughs with the development of hardware equipment and deep learning theory. Many representative deep convolutional networks, such as RCNN [39], Fast-RCNN [40], Faster-RCNN [41], YOLOv1 [42], YOLOv2 [43], YOLOv3 [44], and their variants [45], have been proposed and validated on public datasets [46]. These methods describe objects by learning high-dimensional semantic features. Specifically, SPP-Net, RCNN, Fast-RCNN, and Faster-RCNN are two-stage networks, while YOLOv1, YOLOv2, and YOLOv3 are one-stage networks. The two-stage methods share two shortcomings: they are time-consuming and hard to train. In contrast, the one-stage methods can run in real time at a moderate cost in accuracy compared with the two-stage methods [47][48][49]. Therefore, one-stage methods are more feasible for deployment on embedded devices. Motivated by this pioneering research, it is worth investigating how to use deep learning models to locate insulators and detect faults in aerial images [1]. Although related work is scarce, it is summarized and analyzed as follows. In the work of [40,41,48,49], Fast-RCNN and Faster-RCNN are adopted to locate the insulators. However, the training process of Fast-RCNN and Faster-RCNN is complicated, and the networks are difficult to deploy. Moreover, they cannot locate insulators in aerial images in real time. In [50], Faster-RCNN uses rectangular bounding boxes to label the insulator positions in the aerial image. After that, U-net is used to segment the fault contour within the rectangular bounding boxes. The performance of this cascade framework degrades when insulators overlap seriously. Moreover, there is no public dataset for insulator fault detection. Accordingly, it is hard to train an end-to-end network with good performance for insulator fault detection. To address this challenge, Tao et al. [51] segment the insulator string that contains an insulator fault from an aerial image. Subsequently, they paste the segmented insulator string onto another aerial image that contains only background to augment their insulator fault dataset. However, the insulator fault in a simulated aerial image is similar to that in the original aerial image; this limitation affects the experimental results, which in turn affects the generalization ability of their proposed network.
In general, most of the man-made feature-based methods and machine learning-based methods are quite sensitive to complex background interference. Moreover, their performance is usually degraded by variations in filming angle and distance. Furthermore, most of these methods are time-consuming and far from real-time application. As for the existing deep learning methods, since no insulator aerial image datasets are publicly available, such methods have not yet been developed significantly. Most importantly, none of the existing man-made feature-based, machine learning, or deep learning methods systematically analyzes or solves the problem of insulator multi-fault detection. Therefore, it is meaningful to propose a method that addresses these problems.
In this paper, we propose a novel two-step method for insulator fault detection that is based on the CNN features of UAV aerial images while considering the distinctive color and area features of insulator faults. The main idea of the proposed method can be summarized as follows. First, a large number of insulator aerial images are collected and labelled with ground truth to construct a new dataset. Subsequently, a new deep convolutional network is trained and used to obtain accurate insulator positions. Finally, the obtained insulator location is set as a RoI (Region of Interest), and a novel method is proposed to detect insulator faults in the RoI. The main contributions of this paper are summarized as follows.
(1) To compensate for the lack of a dataset, we construct a large UAV-based insulator dataset with a large number of images filmed in various aerial scenes, and we label the ground truth for each image. This dataset is suitable for training and testing deep convolutional networks. It can also be used to validate the performance of traditional insulator detection methods.
(2) We propose an effective network to obtain accurate insulator positions in aerial images with complex background interference. Experimental results show that the performance of our proposed network is superior to that of YOLOv2 and close to that of YOLOv3, both of which are considered state-of-the-art object detectors. Most importantly, the memory usage of our proposed network is 14.5% less than that of YOLOv3 and 21.5% less than that of YOLOv2, which makes it better suited to deployment on embedded devices.
(3) We develop a new way to create insulator fault images to compensate for the lack of an insulator fault dataset. We then propose a novel method for insulator fault detection in the RoIs obtained by our proposed network. Experimental results show that our proposed method is more effective and efficient than two state-of-the-art insulator fault detection methods. Most importantly, compared with previous works, our proposed method can accurately detect not only single faults but also multi-faults.
The remainder of this paper is organized as follows. Existing methods for insulator detection and insulator fault detection are reviewed in Section 1. A detailed description of our proposed method is presented in Section 2. Experimental results and discussion are presented in Section 3. Finally, the conclusion and future work are given in Section 4. The experimental results are shown as images; please zoom in for a better view.

Insulator Detection
Insulator detection faces two well-known challenges. First, the background of UAV aerial images is usually complex and varied, so a detector can easily mistake background interference for insulators. Second, because filming angles and distances differ, the appearance of the insulators varies greatly from image to image. Consequently, it is necessary to design an effective and robust model.

Model Structure
Deep convolutional networks have been highly successful in image recognition. To better extract high-dimensional semantic features of objects, several representative backbone networks, such as AlexNet [52], VGG [53], and ResNet [54], have been developed and validated on public datasets. A comparison of AlexNet, VGG, and ResNet on the ImageNet dataset [46] is shown in Table 1. Specifically, ResNet outperforms AlexNet and VGG, and its Top-5 accuracy improves as the network depth increases. Although the Top-5 accuracy of ResNet50 is only 0.8% lower than that of ResNet101, its computation time and memory usage are almost half those of ResNet101. Considering these advantages, we adopt ResNet50 as the backbone of our proposed network. Moreover, we set the number of channels of the last convolutional layer in ResNet50 to 1024. Since the filming angles and distances vary between images, the insulators can be divided into three scales: small, middle, and large. To ensure that insulators of each scale can be detected effectively, we refer to the work of [44,56] and develop a three-branch structure in the proposed model to detect insulators at different scales. Figure 2 shows the whole architecture of the proposed network. Moreover, in the work of [55], ResNet50 is combined with the header layer of YOLOv2 to design a new deep convolutional model, which is referred to as ResnetV2 in this paper. However, the experimental results on our insulator dataset show that it performs poorly (see the experimental results), which means that the features obtained by ResNet50 alone cannot effectively represent the characteristics of the insulators. Thus, it is necessary to design a deeper network structure to learn more effective insulator features from shallow layers.
To address this challenge, we develop a cascade convolutional structure in each branch to extract high-dimensional semantic features of insulators at different scales, as shown in Formula (1), where L, M, and N indicate the channel numbers of the kernels. According to Formula (1), the cascade convolutional structure consists of three convolutional kernels. Specifically, the first convolutional kernel (3 × 3) is applied to extract finer insulator feature maps in the 8-neighbor region. Then, the second convolutional kernel (1 × 1) is adopted to change the channel number, increasing the non-linearity without changing the receptive field of the convolutional layers. Finally, the third convolutional kernel (1 × 1) is used to achieve cross-channel interaction and feature integration.
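As a rough illustration of the cascade structure described above, the following Python sketch tracks the feature-map shape, weight count, and receptive field through a 3 × 3 convolution followed by two 1 × 1 convolutions. The assignment of L, M, and N to the three kernels in order, the stride of 1 with 'same' padding, the absence of bias terms, and the example sizes are all assumptions made for illustration; they are not taken from Formula (1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvSpec:
    kernel: int        # square kernel size (3 for 3x3, 1 for 1x1)
    out_channels: int  # number of output channels


def cascade_specs(l, m, n):
    """Hypothetical 3x3 -> 1x1 -> 1x1 cascade with L, M, N output channels."""
    return [ConvSpec(3, l), ConvSpec(1, m), ConvSpec(1, n)]


def trace(specs, height, width, in_channels):
    """Return (output shape, weight count, receptive field) of the stack.

    Assumes stride 1 and 'same' padding, so height and width are preserved.
    """
    weights = 0
    receptive_field = 1
    channels = in_channels
    for spec in specs:
        # Each output channel owns one kernel x kernel filter per input channel.
        weights += spec.kernel * spec.kernel * channels * spec.out_channels
        # With stride 1, a k x k kernel widens the receptive field by k - 1.
        receptive_field += spec.kernel - 1
        channels = spec.out_channels
    return (height, width, channels), weights, receptive_field


# Example sizes are illustrative only.
shape, weights, rf = trace(cascade_specs(512, 256, 512), 13, 13, 1024)
print(shape, weights, rf)
```

In this sketch only the 3 × 3 kernel enlarges the receptive field (to a pixel and its 8 neighbours); the two 1 × 1 kernels change the channel count and mix channels at far lower cost in weights.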
To make the proposed network an effective one-stage model that is easy to train and can detect insulators in real time, it is necessary to connect the branches and share features among them. However, as the network becomes deeper, the insulators can no longer be detected effectively, because the few remaining features cannot represent the characteristics of the insulators. Therefore, it is unreasonable to use only the feature maps of a layer in the previous branch as the input of the current branch. To overcome this problem, we exploit the fact that the shallow convolutional layers provide low-level features of the insulators (color, texture, shape, etc.). Accordingly, conv4_6 of the backbone network is routed to conv11 in the 'Large' branch to create fusion feature maps. These fusion feature maps are then used as the input of the 'Middle' branch. Similarly, the conv3_4 feature maps are combined with conv22 as the input of the 'Small' branch.

Training Preparation
To obtain accurate locations of insulators at different scales in an image, the k-means clustering algorithm is applied to find good bounding box priors automatically instead of choosing them by hand, and the result is shown in Figure 3. Figure 3 shows that k = 12 is a good compromise between IoU (Intersection over Union) and model complexity. Therefore, we choose 12 clusters, corresponding to IoU = 64.73, and then divide the 12 clusters into three categories, one for each detector.
The proposed method adopts the loss function proposed in [44], which is given in Formula (2). In Formula (2), x_i is the x coordinate of the prediction, while the x with the symbol '∧' is the corresponding coordinate of the ground truth. The other symbols in Formula (2) are defined similarly. The first and second lines of Formula (2) represent the coordinate loss and the distance loss between the prediction and the ground truth. The third line represents the confidence loss for predicted bounding boxes that contain insulators, while the fourth line gives the confidence loss for predicted bounding boxes that do not contain insulators. The fifth line denotes the category prediction loss. Unlike the work of [44], k is set to 4 in the proposed network.

Insulator Faults Detection
As previously mentioned, most of the existing methods give good results only when detecting a single fault in an insulator string, and they cannot detect multi-faults. To address this limitation, we refer to the work of [11,20], which are considered the state-of-the-art single-fault detection methods for insulators. A novel solution with a systematic analysis is then proposed for insulator multi-fault detection in this section. Figure 4 shows the flowchart of the proposed method, and detailed explanations are given as follows.
First, based on observations of a large number of insulator aerial images, three important features of insulators were found:
(1) Although the color of the insulators is similar to that of the background, the two are still distinguishable.
(2) The positions of insulator faults are random rather than fixed.
(3) The contour sizes of different insulator faults in an image are very similar.
Subsequently, the proposed insulator fault detection method is divided into the following three steps.
Step 1: The Grab-cut algorithm [57] is adopted to manually segment the insulator regions from 300 insulator aerial images in the training set of the 'InST_detection' dataset. From these regions, a more suitable color distribution of the insulators in RGB color space is derived, which is summarized in Formula (3).
Then, each insulator string detected by the proposed network is marked by a rectangular bounding box, which is set as the RoI. Finally, the color model in Formula (3) is applied to segment the insulators from the RoI.
Step 2: The adaptive morphology method [20] is used to fill the gaps between adjacent insulator pieces, so that the insulator string becomes a single connected component. Then, the coordinates of the connected component's minimum bounding rectangle are calculated, and an 'XOR' operation is performed between the minimum bounding rectangle and the connected component. As a result, the contours of the insulator fault candidates are highlighted. It is worth noting that these candidates include not only the contours of real insulator faults but also interference contours. Observation of the candidate contours shows that some of the interference contours are elongated rectangles. Therefore, part of the interference can be removed by the following rule: if the aspect ratio of a contour is greater than 1:5, it is retained in the candidate sequence; otherwise, it is regarded as interference.
Step 3: In practice, two situations should be considered: (1) the number of fault candidates is larger than the maximum number of real faults per aerial image in the 'InST_detection' dataset; and (2) the number of fault candidates does not exceed this maximum. In most cases, the number of fault candidates is larger than the number of real faults in an image. Moreover, the area of an insulator fault contour is usually much larger than that of an interference contour. Therefore, in the first situation, the K-means algorithm is applied to cluster the candidate areas into two categories: real fault contours and interference contours. The category with the larger average area is considered to contain the real faults. In the second situation, the bubble sort algorithm is used to sort the candidate areas in descending order, and the sorted array is then traversed starting from the smallest contour. If a contour area is larger than 0.6 times the maximum contour area, the contour is considered to be a real insulator fault. Compared with the state-of-the-art methods, the proposed fault detection method is more effective and can detect not only single faults but also multi-faults. More details can be seen in Section 4.
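The candidate filtering rules in Steps 2 and 3 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: each candidate is assumed to be a (width, height, area) tuple taken from its bounding rectangle, the aspect ratio is taken as the shorter side divided by the longer side, the maximum of four real faults follows the dataset description, the two-cluster K-means is a simple one-dimensional version initialised at the smallest and largest areas, and Python's built-in sort stands in for bubble sort.

```python
MAX_REAL_FAULTS = 4   # assumed from the dataset: at most four faults per string
MIN_ASPECT = 1 / 5    # contours more elongated than 1:5 are treated as interference
AREA_FRACTION = 0.6   # keep contours larger than 0.6 x the largest area


def drop_elongated(candidates):
    """Step 2 rule: retain contours whose short/long side ratio exceeds 1:5."""
    return [c for c in candidates
            if min(c[0], c[1]) / max(c[0], c[1]) > MIN_ASPECT]


def larger_cluster(areas, iterations=50):
    """1-D K-means with k = 2; returns indices of the cluster with larger mean."""
    low, high = min(areas), max(areas)   # initial centroids (an assumption)
    members = list(range(len(areas)))
    for _ in range(iterations):
        # Ties go to the larger cluster, so it is never empty.
        members = [i for i, a in enumerate(areas)
                   if abs(a - high) <= abs(a - low)]
        rest = [a for i, a in enumerate(areas) if i not in members]
        new_high = sum(areas[i] for i in members) / len(members)
        new_low = sum(rest) / len(rest) if rest else low
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return members


def select_faults(candidates):
    """Step 3: choose real faults among (width, height, area) candidates."""
    kept = drop_elongated(candidates)
    if not kept:
        return []
    areas = [c[2] for c in kept]
    if len(kept) > MAX_REAL_FAULTS:
        # More candidates than possible faults: keep the larger-area cluster.
        return [kept[i] for i in larger_cluster(areas)]
    # Otherwise compare each area with the largest one.
    largest = max(areas)
    ranked = sorted(kept, key=lambda c: c[2], reverse=True)
    return [c for c in ranked if c[2] > AREA_FRACTION * largest]


# Hypothetical candidates: three large gaps, four small blobs, one thin strip.
many = [(20, 20, 400), (21, 20, 420), (13, 30, 390), (6, 5, 30),
        (9, 5, 45), (10, 5, 50), (5, 5, 25), (50, 4, 200)]
few = [(20, 20, 400), (18, 20, 360), (10, 10, 100)]
print(select_faults(many))
print(select_faults(few))
```

Because both branches rely only on relative areas within one RoI, no absolute area threshold has to be tuned for the filming distance, which fits the observation that fault contours in the same image have similar sizes.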

Data Collection
Since no publicly available dataset exists for insulator detection in UAV aerial images, the 'InST_detection' dataset is constructed to validate the performance of the proposed network, as shown in Table 2. The 'InST_detection' dataset consists of 4031 images that cover different surroundings, and the filming distances and angles vary from image to image. These aerial images were captured by an on-UAV camera and collected from the power company's database. First, all the images in the 'InST_detection' dataset are normalized to the same size of 416 × 416 pixels. Then, we label the ground truth for each insulator string using the LabelImg tool [58]. Finally, we obtain 9609 insulator strings with their ground truth in total.
To verify the proposed insulator faults detection method, it is necessary to construct an insulator faults dataset. However, although insulator faults detection is an important task in high-voltage transmission line inspection, it is well known that the main factor limiting the development of insulator fault detection methods is that few aerial images of insulator faults can be collected. We share this limitation with previous works: we only obtained 42 images with one fault or multi-fault at the beginning. To address this problem, a novel data augmentation method is proposed in this work. First, based on the analysis of insulator fault aerial images, it is found that although the backgrounds of different regions in an image are different, the background patches within a local region still have similar features. Second, it is conceivable that the background behind a normal insulator should be similar to the pixels around it. Third, we learn that if there are four insulator faults in an insulator string, it should be repaired within 24 h. Otherwise, the electrical performance of the insulator string will be greatly reduced, which will further affect the stability of the power grid operation. Based on the above facts, Photoshop software [59] is used to erase normal insulator regions and replace them with their nearby pixels. Finally, a dataset containing 120 insulator fault images was created, with a total of 228 insulator faults (details are shown in Table 3). Specifically, 60 images contain insulator multi-fault, while the other 60 images contain only insulator one fault. The number of insulator faults in each insulator string is at most four in our dataset.
Our data augmentation method is simple but effective: it reduces the time and cost of collecting insulator fault images, which is meaningful for future deep-learning-based methods that need plenty of insulator fault aerial images for training.

Experimental Results and Discussion
We adopt the ResNet50 model pre-trained on the ImageNet dataset [46] as the backbone of the proposed network. The weights of the remaining layers in the proposed network are randomly initialized. During training, the maximum number of iterations for the proposed network and the four compared networks is set to 35,000, and the learning rates of the five networks are initialized to 0.001. After 20,000 and 28,000 iterations, the learning rates of the five networks are reduced to 0.0001 and 0.00001, respectively, to achieve finer convergence. Inspired by the work of [43], we apply random hue, saturation, and exposure shifts to realize data augmentation during the training of the five networks. Specifically, hue = 0.1, saturation = 1.5, and exposure = 1.5. Hue = 0.1 means that a 10% random shift is made in the hue space of the images that participate in the training. The saturation and exposure shifts are similar to the hue shift.
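The step-wise learning-rate schedule described above can be written as a short Python sketch. The function name and the choice to apply each reduction from the stated iteration onward (rather than one iteration later) are assumptions; the training framework's own scheduler may treat the boundary differently.

```python
BASE_LR = 0.001            # initial learning rate
STEPS = (20_000, 28_000)   # iterations at which the rate is reduced
SCALE = 0.1                # each reduction divides the rate by 10
MAX_ITERATIONS = 35_000    # training stops after this many iterations


def learning_rate(iteration: int) -> float:
    """Return the learning rate used at a given training iteration."""
    if not 0 <= iteration < MAX_ITERATIONS:
        raise ValueError("iteration outside the training range")
    lr = BASE_LR
    for step in STEPS:
        if iteration >= step:   # assumed boundary: reduce at the step itself
            lr *= SCALE
    return lr


if __name__ == "__main__":
    for it in (0, 19_999, 20_000, 28_000, 34_999):
        print(it, learning_rate(it))
```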

Analysis of the Proposed Network
The experiments are conducted on a PC with an Intel quad-core i7-7700 CPU at 3.6 GHz, 32 GB of RAM, and an NVIDIA GeForce TITAN XP (12 GB). The proposed network is trained on the Darknet framework [55], and it takes 28 h to obtain its final model. After that, the final model and the source code of the proposed insulator faults detection method are evaluated in Visual Studio.
To evaluate the effectiveness of the proposed network, the 'InST_detection' dataset is divided into a training set and a testing set. The training set consists of 2675 images and the testing set is composed of 1356 images, a ratio of approximately 2:1. We compare the proposed network with four existing networks: YOLOv3, YOLOv2, YOLOv3-tiny, and ResnetV2. Specifically, YOLOv3 and YOLOv2 are considered to be state-of-the-art one-stage object detectors that achieve good performance and run in real time. YOLOv3-tiny is the abbreviated version of YOLOv3 and runs faster than YOLOv3. ResnetV2 takes ResNet50 as the backbone and adopts the header layer of YOLOv2 as the detection layer.
For a fair comparison, both the compared networks and the proposed network are trained and tested on the 'InST_detection' dataset. Moreover, three measurements (AP, running time, and memory usage) are introduced to validate the effectiveness of the proposed network quantitatively.
Specifically, AP is the area under the Precision-Recall curve (i.e., the PR-curve); the better the network, the higher the AP value. The definitions of Precision and Recall are given in Formula (5):

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \qquad (5)$$

Here, True Positive (TP) is the number of insulators that have been correctly detected, False Positive (FP) is the number of background regions that are marked as insulators, and False Negative (FN) is the number of insulators that are incorrectly identified as background. We calculate the AP values of the different networks on the testing set of the 'InST_detection' dataset, as shown in Figure 5, where the horizontal axis shows the recall values and the vertical axis gives the corresponding precision values. The running times and memory usages of the different networks are listed in Table 4. Figure 5 shows that the AP value of the proposed network (89.96%) is higher than those of YOLOv2 (89.83%), ResnetV2 (85.92%), and YOLOv3-tiny (52.78%), and is only slightly lower than that of YOLOv3 (90.05%). This means that the performance of the proposed network is almost consistent with that of the state-of-the-art object detection networks (i.e., YOLOv3 and YOLOv2) and is superior to that of ResnetV2 and YOLOv3-tiny. In terms of running time and memory usage, the proposed network, YOLOv3, and YOLOv2 can all run in real time, while the memory usage of the proposed network is 14.5% and 21.5% less than that of YOLOv3 and YOLOv2, respectively. Therefore, the proposed network is more advantageous when deployed on embedded devices.
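As an illustration of these measurements, the following Python sketch computes Precision and Recall from TP, FP, and FN counts, and approximates AP as the area under the PR curve. The input format (a list of confidence scores with a true-positive flag per detection) and the non-interpolated step-wise summation are assumptions; the interpolation scheme used for Figure 5 is not stated in the text and may differ.

```python
from typing import List, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(detections: List[Tuple[float, bool]],
                      num_ground_truth: int) -> float:
    """Area under the PR curve, summed step-wise over ranked detections.

    detections: (confidence, is_true_positive) per detected box
                (hypothetical format chosen for this sketch).
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision, recall = precision_recall(tp, fp, num_ground_truth - tp)
        ap += (recall - prev_recall) * precision  # rectangle under the curve
        prev_recall = recall
    return ap


if __name__ == "__main__":
    dets = [(0.9, True), (0.8, False), (0.7, True)]
    print(average_precision(dets, num_ground_truth=2))
```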
To validate the accuracy and robustness of the proposed network in complex aerial scenes, we select some images with complex background interference. It is worth noting that these images are also taken at different filming angles and filming distances. We then compare the proposed network with two state-of-the-art insulator fault detection methods, since their first steps are also to locate the insulator positions in aerial images; some results are shown in Figure 6. The first to the third columns of Figure 6 depict the results of the proposed method, method [20], and method [11], respectively. Figure 6 shows that all three methods achieve good results when dealing with a pure background (i.e., clean sky, the first row of Figure 6), while method [11] and method [20] are extremely sensitive to different types of background interference. In contrast, owing to suitable training on a large number of insulator samples, the proposed network can detect the insulator positions more accurately. To further measure the performance of the proposed network on aerial videos, two UAV-based insulator aerial videos filmed in China are selected to test the proposed network, as shown in Figure 7. Figure 7 shows that the proposed network can obtain accurate insulator positions continuously, which lays a good foundation for subsequent insulator faults detection.
Figure 6. Insulator detection results in different aerial scenes. The first to the third columns depict the results of the proposed method, method [20], and method [11], respectively. The labels in the results of the proposed method are marked with the word 'Insulator'; please zoom in for a better view. It can be seen that the proposed method can accurately detect the insulators in different aerial scenes.

Analysis of the Proposed Insulator Multi-Fault Detection Method
To verify the importance of Step 2 in the proposed insulator faults detection method (i.e., Section 2.2), an ablation experiment is conducted, and the results are shown in Figure 8. The average precision rate and the average recall rate are used to report all the experimental results in Section 3.2, and we follow the rounding principle to keep only one digit after the decimal point.

$$\overline{\mathrm{Precision}} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Precision}(i), \qquad \overline{\mathrm{Recall}} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Recall}(i) \qquad (6)$$

In Formula (6), n indicates the total number of images in a testing set. Precision(i) and Recall(i) are the precision rate and the recall rate when detecting the ith image, respectively. $\overline{\mathrm{Precision}}$ and $\overline{\mathrm{Recall}}$ indicate the average precision rate and the average recall rate over a testing set, respectively. Figure 8 shows that by setting a threshold on the aspect ratio of the contour for interference contour filtering, the precision rate can be increased by 14% and the recall rate by 10.8%. To verify the rationality of the choice of 1:5, different ratios are selected to perform experiments on our dataset. The results are shown in Table 5. It can be seen that 1:5 is the most appropriate ratio, achieving a precision rate of 96.3% and a recall rate of 93.3%.
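The averaging described for Formula (6) can be sketched in Python as follows. The function name and the input format (one precision and recall pair per image, as fractions) are assumptions, and Python's built-in `round` is used as a stand-in for the rounding principle mentioned above; it may differ from the authors' rounding in exact tie cases.

```python
from typing import List, Tuple


def average_rates(per_image: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Average per-image (precision, recall) pairs over a testing set.

    Returns both averages as percentages with one digit after the
    decimal point.
    """
    n = len(per_image)
    if n == 0:
        raise ValueError("the testing set must contain at least one image")
    avg_precision = sum(p for p, _ in per_image) / n
    avg_recall = sum(r for _, r in per_image) / n
    return round(100 * avg_precision, 1), round(100 * avg_recall, 1)


if __name__ == "__main__":
    # Three hypothetical images with their per-image precision and recall.
    print(average_rates([(1.0, 1.0), (0.5, 1.0), (1.0, 0.5)]))
```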

To verify the performance of the proposed insulator fault detection method, we compare it with methods [11] and [20], which are considered to be state-of-the-art methods. For a fair comparison, all the insulator fault images are normalized to the same size of 800 × 530, as in the works of [11,20]. For insulator one fault detection, some typical images are selected and tested, as shown in Figure 9. The first to the third columns depict the results of the proposed method, method [20], and method [11], respectively. The detected insulator fault positions are marked with red bounding boxes. It can be observed that the bounding boxes detected by the proposed method are much closer to the real insulator faults, which means that the proposed method yields more accurate results. In contrast, methods [11,20] are quite sensitive to background interference, which results in erroneous detections. In the second row of Figure 9, method [20] does not detect the insulator fault, which leads to the wrong judgment that the working state of the insulators is normal. In the third row of Figure 9, the performance of method [11] is affected by complex background interference, and part of the power tower is regarded as an insulator fault.
To verify the accuracy of the proposed method in the case of insulator multi-fault detection, we choose some typical insulator multi-fault images from our dataset; some examples are shown in Figure 10. As Figure 10 shows, methods [11,20] can only detect one insulator fault, while the proposed method can detect not only one insulator fault but also insulator multi-fault.
Figure 9. Experimental results in aerial images with insulator one fault. In each row, the first column to the third column depicts the performance of the proposed method, method [20], and method [11], respectively.
Figure 10. Experimental results in aerial images with insulator multi-fault. In each row, the first column to the third column depicts the performance of the proposed method, method [20], and method [11], respectively.
To further quantitatively evaluate the performance of the above methods, we test them on two sub-datasets, one of which contains only insulator one fault images while the other contains only insulator multi-fault images. Specifically, each sub-dataset contains 60 samples, and the experimental results are shown in Figure 11a,b. As shown in Figure 11a, the proposed method achieves a precision rate of 94.2%, which is much higher than those of method [11] (precision rate: 65%) and method [20] (precision rate: 50%). For multi-fault detection, the proposed method, method [11], and method [20] achieve precision rates of 98.3%, 88.3%, and 80.8%, respectively. In addition, the average running times of the three methods are analyzed over all 120 aerial insulator fault images, as shown in Table 6. The proposed method takes less running time than the two compared methods. In general, it can be concluded that in both insulator one fault detection and insulator multi-fault detection, the proposed method achieves higher precision rates and lower running times than method [11] and method [20]. Most importantly, the proposed method not only achieves high precision rates, but also maintains satisfactory recall rates.
Figure 11. The detection performances of the proposed method and two compared methods on 60 insulator one fault images and 60 multi-fault images: (a) insulator one fault detection; (b) insulator multi-fault detection. The proposed method not only achieves high precision rates, but also maintains satisfactory recall rates.

Table 6. The running times of the different methods.

Methods                Running Times (s per image)
Method [11]            0.677
Method [20]            0.525
The proposed method    0.127

To validate the robustness of the proposed method in different aerial scenes, 90 images are selected and divided into three sub-datasets: (A) different backgrounds, (B) different filming angles, and (C) different filming distances. Specifically, each category contains 30 images, half of which contain insulator one fault and the other half of which contain insulator multi-fault. Then, the proposed method and the two compared methods are tested on each sub-dataset. The results are shown in Table 7, which shows that the proposed method achieves better results on the three sub-datasets than methods [11,20]. Based on the above experimental results, it can be concluded that the proposed method is more effective and efficient than the two compared methods. The possible reasons for this good performance are as follows. First, a large number of insulator aerial images are collected to create an unprecedented dataset, 'InST_detection', for training the proposed network. Second, the proposed network detects accurate insulator positions and then removes the complex background interference, which potentially improves the accuracy of the subsequent insulator faults detection. Finally, we systematically analyze the features of the insulator faults and propose an effective method to filter the interference contours, which further increases the accuracy of insulator faults detection. Figure 12 shows more detection results of the proposed method in different aerial scenes.

Conclusions and Future Works
In this paper, an accurate and robust method is proposed for insulator faults detection in UAV-based aerial images. The proposed method consists of two steps: (1) a novel neural network is developed to obtain accurate insulator positions; (2) an RoI-based method is designed to highlight the insulator fault locations. Experimental results on various insulator aerial images validate that the proposed method achieves higher average precision rates and lower running times than two state-of-the-art methods. Most importantly, in both insulator one fault detection and multi-fault detection, the proposed method obtains not only high average precision rates, but also high average recall rates. Since the insulator fault is a common accident that damages the operation of the power grid, the proposed method has high prospects for implementation in high-voltage transmission line inspection applications for unmanned aerial vehicles.
Although the proposed method improves the quality of insulator faults detection in most aerial images, it is still not a real-time solution. Considering the practical problem that current insulator fault images are not sufficient to directly train a deep network for insulator faults detection, more simulated insulator fault images should be created with our proposed data augmentation method, and future work can develop a deep learning framework that detects insulator positions and fault positions simultaneously. In addition, with the development of UAV flight control technology, there is an important and meaningful need for future work to explore insulator faults detection in bad weather conditions, such as foggy days.