OTL-Classifier: Towards Imaging Processing for Future Unmanned Overhead Transmission Line Maintenance

Abstract: The global demand for electric power has been increasing greatly because of industrial development and changes in people's daily lives. Many overhead transmission lines have been installed to provide reliable power across long distances. Therefore, research on overhead transmission line inspection is very important for preventing sudden wide-area outages. In this paper, we propose an Overhead Transmission Line Classifier (OTL-Classifier) based on deep learning techniques to classify images returned by future unmanned maintenance drones or robots. In the proposed model, a binary classifier based on the Inception architecture is combined with an auxiliary marker algorithm based on ResNet and Faster-RCNN (Faster Regions with Convolutional Neural Networks features). The binary classifier assigns images with foreign objects such as balloons and kites to the abnormal class, regardless of the type, size, and number of the foreign objects in a single image. The auxiliary marker algorithm marks foreign objects in abnormal images to help locate hidden foreign objects quickly. Our OTL-Classifier model achieves a recall rate of 95% and an error rate of 10.7% in the normal mode, and a recall rate of 100% and an error rate of 35.9% in the Warning–Review mode.


Introduction
Nowadays, people's daily life and industrial facilities are highly dependent on electric power. Therefore, research on the inspection and maintenance of electric power facilities is very important for ensuring a stable power supply. Many overhead transmission lines have been installed around the world to distribute energy across long distances. It is therefore important to prevent sudden wide-area outages caused by foreign objects suspended on uninsulated overhead transmission lines.
At present, foreign objects can be detected by foot patrol, piloted helicopter patrol, drone inspection, and transmission line robot inspection. Foot patrol is risky and cannot pass through complex areas such as highways, rivers, and mountains. Helicopter inspection is expensive and also limited by the shortage of pilots. Unmanned drones and specialized robots are not yet used in practice because of limitations such as path planning, laws, and regulations. However, the electric power field still regards them as strong candidates for the future maintenance of smart grids.
The main challenges of using UAVs (Unmanned Aerial Vehicles) for maintenance are automatic piloting, flight time, and communication bandwidth. The authors in [1] aim at solving automatic transmission line tracking problems. The authors in [2,3] address path planning and routing challenges when UAVs fly along power transmission lines. The authors in [4,5] study wireless charging techniques for increasing drone flight time. The authors in [6] focus on UAV communication toward 5G, which supports high-speed camera data transmission.
A PTL (Power Transmission Line) maintenance robot equipped with cameras can travel along a transmission line for inspection. Such robots may make it possible to perform inspection and maintenance work at a low cost in the future. Recently, transmission lines have been built with bundled conductors because of the increasing power demand. However, conventional robots can only inspect a line while traveling along it [7]. Thus, most research has focused on developing new robot architectures for smart navigation over bundled transmission lines [8][9][10].
With the rapid hardware development of smart drones and PTL maintenance robots, the demand for automatic data processing in transmission line inspection will increase quickly. A number of research works have been carried out to extract transmission lines, insulators, and foreign objects from aerial images automatically. The authors in [11] use robot LiDAR data for cable inspection, and [12] extracts power lines based on Markov Random Field theory. Foreign objects are detected with a morphology-based approach in [13] and a motion-compensation-based method in [14]. All of these works use traditional algorithms. As reviewed in [15], the potential of deep learning in power line inspection is promising. For example, the automated inspection of insulators [16], transmission towers [17], and transmission lines [18] based on deep learning has already been carried out. The detection of foreign objects on transmission lines based on Faster-RCNN and YOLO (You Only Look Once) was also studied in [19,20], respectively. However, the images used in those experiments all contain foreign objects by default, so the algorithms do not provide a classification function. In addition, the amount of data used in those experiments is very small; our dataset contains more than 10 times as many images. There are also detection algorithms for insulators, but detecting foreign objects without a fixed shape is more challenging than detecting insulators with regular shapes.
Inspired by image classification and object detection architectures based on deep learning (e.g., VGG [21], ResNet [22], Inception [23,24], Faster RCNN [25], and SSD (Single Shot MultiBox Detector) [26]), we propose a two-stage approach for automated image processing, which detects and marks foreign objects in the image. The model is trained, fine-tuned, and tested with images collected by electric maintenance departments. The remainder of the article is organized as follows: Section 1 reviews related work. Section 2 presents the methodology of the proposed model. Section 3 describes the preparation of the data set. In Section 4, the experiment is analyzed and discussed. Finally, conclusions and contributions of this work are drawn in Section 5.

Problem Statement
Detecting foreign objects on overhead transmission lines is a very important task in power system maintenance. Overhead transmission lines are a primary method for transmitting high-voltage power across long distances. The high energy of transmission lines would require very thick insulation to prevent the insulating material itself from catching fire. Such insulation would make the lines too costly and too heavy to suspend in the air. Thus, unlike low-voltage cables, overhead transmission lines have no insulating cover; they are insulated by air. During high-wind events, foreign objects such as plastic greenhouses, kites, and balloons are blown onto overhead transmission lines, which are then prone to short circuits or electrical sparks, causing power trips during humid seasons or wildfires during dry seasons.
In this study, we collected and sorted the images that were retained during the manual cleaning of foreign objects from transmission lines, as shown in Figure 1. In addition, in classification and marking, balloons, kites, and plastic greenhouses are all uniformly treated as a single foreign object class.

Warning-Review Strategy
In the first part, the algorithm workflow that constitutes the whole 'foreign object image classification-warning-personnel review' process is introduced. We also introduce the framework used in the image classification algorithm and the target detection algorithm. In the second part, the preparation and division of the data used in the experiment are described.
All of the test images in references [19,20] are images with foreign objects, which is equivalent to manually removing the interfering images without foreign objects before the foreign object detection algorithm runs.
However, the images collected by current intelligent inspection equipment contain a large number of images without foreign objects. In order to get closer to the real inspection situation, the images used for training and testing the algorithms in this paper include both images with foreign objects and images without foreign objects. The whole process is shown in Figure 1, and this is the biggest difference between this paper and previous research. After the mixed inspection images pass through the classification algorithm of the first stage, an image with a foreign object may be marked by the classification algorithm, which alerts the power grid staff and prompts them to review the flagged image.
In the first stage, this paper focuses on algorithms that reach a 100% recall rate, and trains and tests SVM (Support Vector Machine), InceptionV3-retrain, InceptionV3-fine-tuning, and InceptionV4-fine-tuning. In the second stage, each inspection image marked as containing a foreign object is sent to the target detection algorithm, which locates the region of the foreign object in the image and marks it with a rectangular frame. The significance of this step is to help the staff quickly determine the type of foreign object and locate the position of the suspended foreign object. For this stage, this paper has trained and tested SSD (with VGG16), Faster-RCNN (with VGG16), Faster-RCNN (with ResNet50), and Faster-RCNN (with ResNet101). All the algorithm structures are collected in Figures 2 and 3. Please note that the rectangular blocks in the network structure are only indicative and do not represent the true size of a layer in the actual network. When we test the algorithm, all of the 753 images in the testing set are first input into the classifier of the first stage, and then, according to the set classification threshold, a part of the images in the testing set are marked as "images with foreign objects" by the classifier. Finally, only the images marked by the classifier are sent to the foreign object indicator of the second stage for foreign object detection. The entire algorithm flow is shown in Figure 3.
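The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the experiments: `classify` and `detect` are hypothetical callables standing in for the trained first-stage classifier and second-stage detector, and the default thresholds (0.102 and 0.5) are the values reported later in the evaluation.

```python
from typing import Callable, List, Sequence, Tuple

# A detection is (x_min, y_min, x_max, y_max, confidence).
Box = Tuple[float, float, float, float, float]


def two_stage_inspection(
    images: Sequence[str],
    classify: Callable[[str], float],     # hypothetical: returns a foreign-object score
    detect: Callable[[str], List[Box]],   # hypothetical: returns candidate boxes
    class_threshold: float = 0.102,       # classification threshold reported for 100% recall
    display_threshold: float = 0.5,       # display threshold for bounding boxes
) -> List[Tuple[str, float, List[Box]]]:
    """Stage 1 flags suspicious images; stage 2 marks boxes only on flagged ones."""
    flagged = []
    for image in images:
        score = classify(image)
        if score < class_threshold:
            continue  # treated as "no foreign object"; never reaches the detector
        # Keep only boxes whose confidence is higher than the display threshold.
        boxes = [box for box in detect(image) if box[4] > display_threshold]
        flagged.append((image, score, boxes))
    return flagged
```

Only the images returned by this function would be shown to the staff for review, each with its marked boxes.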

SVM
The SVM algorithm flow can be represented as the first part of Figure 4. SVM is a generalized linear classifier that classifies data by supervised learning. Its decision boundary is the maximum-margin hyperplane solved for the learning samples. The purpose of the SVM is to find a hyperplane that divides the samples into two categories with the largest margin. The ω obtained by the algorithm represents the coefficients of the hyperplane that the algorithm needs to find. In mathematical terms, it can be described as Equation (1),

$$\max_{\omega, b} \frac{1}{\lVert \omega \rVert} \quad \text{s.t.} \quad y_i(\omega^T x_i + b) \geq 1, \qquad (1)$$

where $y_i \in \{-1, 1\}$. The larger the score $y_i(\omega^T x_i + b)$, the greater the confidence in the predicted category. Each image input into the SVM is compressed into a matrix of [1 × 3072].
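As a small illustration of the quantities above, the following sketch flattens a 32 × 32 × 3 image into a 3072-element vector and evaluates the linear score ω^T x + b. It assumes the binary formulation with labels in {−1, 1}; the function names, the weight vector, and the bias are hypothetical placeholders, not trained values from the experiments.

```python
from typing import List, Sequence


def flatten_image(image: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Flatten a nested [32][32][3] image into a flat vector of 3072 values."""
    flat = [float(c) for row in image for pixel in row for c in pixel]
    if len(flat) != 32 * 32 * 3:
        raise ValueError("expected a 32 x 32 x 3 image")
    return flat


def svm_score(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Linear decision value w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def functional_margin(x: Sequence[float], y: int, w: Sequence[float], b: float) -> float:
    """y * (w^T x + b); positive when the sample lies on the correct side."""
    return y * svm_score(x, w, b)


def predict(x: Sequence[float], w: Sequence[float], b: float) -> int:
    """Return 1 (foreign object) or -1 (no foreign object)."""
    return 1 if svm_score(x, w, b) >= 0 else -1
```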
There are two categories in the power line image classification, so the size of the matrix ω is determined by the 3072-dimensional input and the two output categories.
In the InceptionV3-retrain model, except for the last layer, the parameters of the other layers are all frozen and cannot be updated, so training is faster and less time-consuming.

Inception Fine-Tuning
The InceptionV3/V4-fine-tuning algorithm flow can be expressed as the third part of Figure 4, and the foreign object classification model is fine-tuned from the InceptionV3 and InceptionV4 models provided by Google. Fine-tuning uses a CKPT (checkpoint) file derived from an InceptionV3 or InceptionV4 model trained on ImageNet images. During the training process, the parameters of the entire network can be modified accordingly, not only those of the replaced softmax layer. The fine-tuning for InceptionV3/V4 is done by loading the pre-trained model without loading the parameters of the Logits layer and AuxLogits layer, and then fixing the parameters of all preceding layers. The foreign object training data set only trains the newly created Logits layer and AuxLogits layer.
When fine-tuning the model, restoring checkpoint weights requires attention. In particular, when a new task has a different number of output labels than the ImageNet task, the final classification layer cannot be restored. Therefore, this paper uses the checkpoint_exclude_scopes flag, which prevents certain variables from being loaded. For example, if the ImageNet-trained model is fine-tuned on the foreign object classification dataset, the pre-trained logits layer has dimensions [2048 × 1001], but the new logits layer has dimensions [2048 × 2]. Therefore, the flag tells TF-Slim to avoid loading these weights from the checkpoint.
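The effect of excluding scopes can be illustrated without TensorFlow. The sketch below mimics, in plain Python, the idea of skipping every variable whose name starts with an excluded scope prefix; it is not TF-Slim code. The function name is hypothetical, and the variable names and shapes in the dictionary are illustrative placeholders, not values read from a real checkpoint.

```python
from typing import Dict, Tuple


def variables_to_restore(
    model_variables: Dict[str, Tuple[int, ...]],
    exclude_scopes: str,
) -> Dict[str, Tuple[int, ...]]:
    """Keep only variables whose name does not start with an excluded scope."""
    prefixes = [s.strip() for s in exclude_scopes.split(",") if s.strip()]
    return {
        name: shape
        for name, shape in model_variables.items()
        if not any(name.startswith(prefix) for prefix in prefixes)
    }


# Illustrative variable names and shapes (assumed, for demonstration only).
model_variables = {
    "InceptionV3/Conv2d_1a_3x3/weights": (3, 3, 3, 32),
    "InceptionV3/Logits/Conv2d_1c_1x1/weights": (1, 1, 2048, 2),
    "InceptionV3/Logits/Conv2d_1c_1x1/biases": (2,),
    "InceptionV3/AuxLogits/Conv2d_2b_1x1/weights": (1, 1, 768, 2),
}

restored = variables_to_restore(
    model_variables, "InceptionV3/Logits,InceptionV3/AuxLogits"
)
```

With these inputs, only the convolutional variable remains in `restored`; the two-class Logits and AuxLogits variables are left at their fresh initial values, which avoids the [2048 × 1001] versus [2048 × 2] shape mismatch.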

SSD with VGG16
The SSD (with VGG16) algorithm flow can be represented as the first part of Figure 5.
When training the target detection algorithm, the training data used are manually labeled foreign object images. The VGG-16 (Visual Geometry Group Network 16) model has thirteen convolutional layers, five pooling layers, and three fully connected layers connected to a softmax layer. Conv4, Conv7, Conv8, Conv9, Conv10, and Conv11 are extracted separately in SSD as the feature layers for classification and box regression. In the SSD model experiment, each image is scaled to a size of 300 × 300, and the number of predicted classifications (ClassesNum) of the six feature layers according to Formula (2) is 2,

$$\mathrm{ClassesNum} = \mathrm{ObjectNum} + 1, \qquad (2)$$

where ObjectNum represents the number of manually labeled categories, and one represents an additional background classification.

Faster-RCNN with VGG16
The Faster-RCNN (with VGG16) algorithm flow can be represented as the second part of Figure 5. In the framework of the Faster-RCNN algorithm, features are extracted from the input image by the convolutional network of the feature extraction layer, and the feature map output by the specified convolution layer is used as the input of the RPN (Region Proposal Network). The feature maps obtained under different convolutional structures have different characterization capabilities for input images. In this experiment, the Faster-RCNN structure using VGG16 as the feature extraction network is tested first. This paper standardizes the image size to 1000 × 600 as input, which is consistent with the original author's parameter settings in [18]. The feature map output by the convolution layer Conv5 is used as the input of the RPN, and nine anchor boxes of different sizes are generated at each anchor point according to the predefined rules. All bounding boxes with high confidence among the anchor boxes are selected as region proposals and sent to the fully connected layers through ROI (Region of Interest) pooling to obtain the category confidence and regression box of the detected image.
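The nine anchor boxes per anchor point can be sketched as the combination of three scales and three aspect ratios. The scale and ratio values below are the defaults of the original Faster-RCNN design and are an assumption here, since the text does not list the settings used in this experiment; the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def anchors_at(
    cx: float,
    cy: float,
    scales: Sequence[float] = (128.0, 256.0, 512.0),  # assumed: side of the square anchor
    ratios: Sequence[float] = (0.5, 1.0, 2.0),        # assumed: height / width
) -> List[Rect]:
    """Return len(scales) * len(ratios) anchors centred on (cx, cy)."""
    boxes = []
    for s in scales:
        for r in ratios:
            # Keep the area equal to s * s while changing the aspect ratio.
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes
```

Calling `anchors_at` once for every position of the Conv5 feature map, mapped back to image coordinates, would give the full anchor set scored by the RPN.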

ResNet
The Faster-RCNN (with ResNet50/ResNet101) algorithm flow can be represented as the third part of Figure 5. The overall algorithm flow of Faster-RCNN (with ResNet50/ResNet101) is basically the same as the previous one. The original VGG16 network is replaced by the ResNet network in the feature extraction layer. At the same time, the input image size is adjusted to suit the performance of the experimental server. The input image size is 500 × 300, which allows the experimental server to load a large network such as ResNet completely despite its relatively modest hardware configuration.

Data Set Preparation
When we divide the data set, the ratio of positive to negative samples in the training set is about 1:1.6. This is a roughly balanced split, which helps the algorithm learn more key features of each class during the learning phase. However, the ratio of positive to negative samples in the test set is about 1:6, which simulates the real situation in which a foreign object accident is a low-frequency, high-risk electric accident. Most real inspection images are images without foreign objects.
As shown in Table 1, in the training of the classification algorithm, 305 images with foreign objects are used as positive samples, and 500 images without foreign objects are used as negative samples. Because we prepare the training and testing data for SVM in the cifar-10 data format, only the amount of data used by SVM has been trimmed. There are 300 images with foreign objects and 500 images without foreign objects in the SVM training set, and 100 images with foreign objects and 500 images without foreign objects in the SVM testing set. The images are processed according to these rules into a dictionary format that Python can read quickly, as shown in Figure 6.
For the training of the target detection algorithm in the foreign object indicator, this paper only used the 305 foreign object images, which were manually labeled. This is because the target detection algorithm uses anchors for training. During training, positive samples are the proposal regions whose IoU (Intersection over Union) with a manually labeled ground truth box is the largest or exceeds the set threshold. Negative samples are the proposal regions whose IoU with the ground truth is less than the set threshold. Therefore, no separate negative sample images need to be input during training.
The experiments above were run on one server with the following configuration. The TensorFlow experiments were run under Linux and Windows 10. The software environment is Python 3.6, CUDA 9.0, cuDNN 7.0, and TensorFlow 1.8; the hardware is a Lenovo ZHENGJIUZHE REN7000 desktop PC produced in China, equipped with an Intel Core i7-8700 CPU, 8 GB of memory, and an NVIDIA GeForce GTX 1060 GPU.
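The IoU-based assignment of positive and negative samples described above can be sketched as follows. The thresholds 0.7 and 0.3 are the values used by the original Faster-RCNN region proposal network and are an assumption here, as the text only speaks of "the set threshold"; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Rect, b: Rect) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchors(
    anchors: Sequence[Rect],
    gt_boxes: Sequence[Rect],
    pos_thr: float = 0.7,  # assumed positive threshold
    neg_thr: float = 0.3,  # assumed negative threshold
) -> List[int]:
    """Return 1 (positive), 0 (negative) or -1 (ignored) for each anchor."""
    best = [max(iou(a, g) for g in gt_boxes) for a in anchors]
    labels = [1 if v >= pos_thr else 0 if v < neg_thr else -1 for v in best]
    # The anchor with the largest IoU for each ground truth box is also positive.
    for g in gt_boxes:
        overlaps = [iou(a, g) for a in anchors]
        k = max(range(len(anchors)), key=overlaps.__getitem__)
        if overlaps[k] > 0:
            labels[k] = 1
    return labels
```

Because negatives come from low-overlap regions of the labeled images themselves, images without foreign objects are not required to train the detector.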

Calculation Formula for Classifier Evaluation
After the above test, the paper analyzes the classification results of the first stage and draws the receiver operating characteristic (ROC) curves of the SVM, InceptionV3-retrain, InceptionV3-fine-tuning, and InceptionV4-fine-tuning models, as shown in Figure 7. The abscissa of the ROC curve is the false positive rate (FPR), the ordinate is the true positive rate (TPR), and the area under curve (AUC) is defined as the area under the ROC curve. When a positive sample and a negative sample are randomly selected, the AUC value is the probability that the classification algorithm ranks the positive sample ahead of the negative sample based on the calculated score. Therefore, the larger the AUC, the more likely the classifier is to rank positive samples ahead of negative samples, which enables better classification. The above values can be calculated using Equations (3), (4) and (5), respectively, where TP: true positive; TN: true negative; FP: false positive; FN: false negative.

$$\mathrm{FPR} = \frac{FP}{FP + TN} \qquad (3)$$

$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad (4)$$

$$\mathrm{AUC} = \int_0^1 \mathrm{TPR} \; d(\mathrm{FPR}) \qquad (5)$$

At the same time, the point selected according to the Youden index is plotted in Figure 7. The ROC curve is often used as an evaluation curve for medical diagnosis. When a comprehensive evaluation of diagnostic results is required, sensitivity and specificity can be given the same weight, as in the medical field, which treats the missed diagnosis rate and the misdiagnosis rate of the research object as equally significant. The larger the Youden index, the better the screening ability. See Equation (6) for the calculation method. The x-axis of the ROC curve is (1 − specificity), so the final formula can be simplified to Equation (7):

$$\text{Youden index} = \text{sensitivity} + \text{specificity} - 1 \qquad (6)$$

$$\text{Youden index} = \mathrm{TPR} - \mathrm{FPR} \qquad (7)$$

Calculation Formula for Evaluation of Foreign Object Indicator
The performance indices of the foreign object indicator based on the target detection algorithm are the recall rate and the precision rate, where TP + FN = 126. Average Precision (AP) is the integral of the PR curve, that is, the area under the curve, as given in Equations (8)-(10):

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (8)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (9)$$

$$\text{Average Precision} = \int_0^1 P(R) \, dR \qquad (10)$$
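The classifier metrics defined above (FPR, TPR, AUC, and the Youden index) can be computed from raw scores with a short script. This is a dependency-free sketch with hypothetical function names; it assumes labels of 1 for images with foreign objects and 0 for images without, and it uses the trapezoidal rule for the area, which may differ from the integration used to produce Figure 7.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (threshold, FPR, TPR)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """One ROC point per distinct score, ordered from the highest threshold down."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Trapezoidal area under the ROC curve, starting from (0, 0)."""
    xs = [0.0] + [p[1] for p in points]
    ys = [0.0] + [p[2] for p in points]
    return sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2 for i in range(1, len(xs)))


def youden_point(points: Sequence[Point]) -> Point:
    """Point that maximises TPR - FPR."""
    return max(points, key=lambda p: p[2] - p[1])


def full_recall_point(points: Sequence[Point]) -> Point:
    """Highest threshold at which every positive sample is still detected."""
    return next(p for p in points if p[2] == 1.0)
```

`youden_point` corresponds to the balanced operating point, while `full_recall_point` corresponds to the operating point that keeps recall at 100% at the cost of a higher error rate.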

Classification Performance Evaluation
In Figure 7, the blue dashed line is referred to as the "random chance" line, which means that a sample is classified as positive or negative at random. In the ROC coordinate system, the classification threshold at the (0, 0) point is the largest, and the classification threshold at the (1, 1) point is the smallest. According to the ROC curve, it can be clearly observed that when the classification recall rate is 100% (corresponding to an ordinate value of 1), the InceptionV3-retrain model has the lowest error rate. At the same time, the AUC value of the InceptionV3-retrain model is 0.4% lower than that of the InceptionV3-fine-tuning model, but its error rate is 8% lower.
In the foreign object image detection task, the focus should be on finding all the foreign objects in the images. InceptionV3-retrain, as the first-stage classifier, maintains the lowest error rate among the four classification algorithms under the premise of 100% recall. Although the error rate of InceptionV3-retrain is reduced by 25% at the optimal threshold classification point, the algorithm then cannot identify all the foreign object images, which is a security risk in the foreign object inspection of transmission lines.

Automate Marking Performance Evaluation
In this paper, the InceptionV3-retrain algorithm is selected as the classifier when the recall rate is 100%, and the corresponding classification threshold is 0.102 (Table 2). The remaining 233 images are all misclassified images, corresponding to an error rate of 36%. In addition, the 334 images flagged by the classifier are used as the input of the second-stage target detection algorithm. The target detection algorithm PR curve is shown in Figure 8. When 0.5 is used as the display threshold, only the bounding boxes whose confidence is higher than this threshold are displayed on the test picture. The specific values are shown in Table 3. The total box number indicates the total number of bounding boxes that the algorithm ultimately presents to the power grid staff, and the TP number indicates the number of targets that are correctly found. The missed target number indicates the number of targets that the algorithm missed. Target detection precision and target detection recall are calculated according to Equations (8) and (9), respectively.
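The detector metrics can be sketched in the same spirit. The code below assumes that each detection has already been matched against the ground truth and is given as a (confidence, is_true_positive) pair; the function names are hypothetical, and the matching rule and the exact integration scheme behind the reported AP values are not specified in the text, so the un-interpolated sum used here is only one reasonable choice. The number of targets, 126, is the TP + FN value stated earlier.

```python
from typing import List, Sequence, Tuple

Detection = Tuple[float, bool]  # (confidence, is_true_positive)


def displayed(detections: Sequence[Detection], threshold: float = 0.5) -> List[Detection]:
    """Boxes shown to the staff: confidence higher than the display threshold."""
    return [d for d in detections if d[0] > threshold]


def precision_recall(detections: Sequence[Detection], num_targets: int = 126) -> Tuple[float, float]:
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    tp = sum(1 for _, is_tp in detections if is_tp)
    precision = tp / len(detections) if detections else 0.0
    return precision, tp / num_targets


def average_precision(detections: Sequence[Detection], num_targets: int = 126) -> float:
    """Area under the PR curve, accumulated at each correctly found target."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = 0
    ap = 0.0
    prev_recall = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            tp += 1
            recall = tp / num_targets
            ap += (tp / rank) * (recall - prev_recall)
            prev_recall = recall
    return ap
```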

Conclusions
In this paper, we introduce the OTL-Classifier, a binary classifier with an auxiliary automated marker module. Compared to recent research, our method is much more application oriented. There are three main differences:

• Our OTL-Classifier module can classify images with and without foreign objects. Recent research, in contrast, only processes images with foreign objects and focuses on detecting the type and location of the foreign objects in abnormal images. However, aerial images returned by drone and robot inspections include many more normal images than abnormal images. Searching for abnormal images manually is not only time-consuming but also imprecise because of the limits of human attention. Therefore, it is much more important to design a module that can automatically extract abnormal images directly from the original images returned by unmanned vehicles.

• During the evaluation phase, we consider the recall rate more important than precision in our application. A sudden wide-area outage caused by even one undetected foreign object will seriously affect people's lives and industrial production and may lead to large economic losses. Therefore, we think it is critical to have a recall rate of 100%, so that no abnormal images are missed during classification.

• Most recent research evaluated detection speed. For example, the RCNN4SPL module spends 230 ms per frame, the YOLOv3-based module takes 46 ms on average, the morphology-based module takes 95.8 ms on average, and the motion-compensation-based module takes 64 ms. We did not test execution time because it is highly dependent on the hardware. In addition, our application does not have a very strict timing requirement, unlike path planning for autonomous driving.

In this article, we have evaluated the classification performance of SVM and three Inception variants, and the marking performance of SSD and of Faster-RCNN with VGG and ResNet. Experiments show that our module based on InceptionV3-retrain and Faster-RCNN with ResNet101 achieves the best performance on the data set we collected from electric maintenance departments.
We summarize our contributions as follows:

• We proposed an OTL-Classifier module that can classify images with and without foreign objects. It can work in either Warning-Review mode or Normal mode.

• In the normal mode, the OTL-Classifier works in the same way as most common classification tasks: the module uses the optimal parameters that balance recall rate and error rate. It can achieve a recall rate of 95% and an error rate of 10.7%.

• In the Warning-Review mode, the OTL-Classifier achieves a recall rate of 100% and an error rate of 35.9%. It has a two-stage workflow. In the first stage, the binary classifier module provides the warning. In the second stage, the automated marker module helps electric workers review the image quickly. This strategy can prevent outages caused by foreign objects and save more than half of the time spent on image checking.

Our future work will focus on decreasing the error rate while keeping a recall rate of 100%.

Figure 1. Sample images in the data set. (a) Balloons, kites, and agricultural plastic, which are the main foreign objects hanging on overhead transmission lines. (b) Images without foreign objects, which have been used as negative samples of the classification task. These negative sample images also contain high-voltage towers and transmission lines, as well as daily images collected by inspection equipment. (c) Foreign objects of different sizes. Some tiny foreign objects are not easy to detect. (d) Data we collected included not only colorful balloons and kites, but also translucent plastic and black agricultural greenhouses. (e) The contrast between the translucent plastic and the sky is not so obvious, and the black plastic shed is easily confused with the trees.

Figure 3. Two-stage foreign object detection flow chart.

Figure 5. Foreign object indication based on a target detection algorithm.

Table 1. Foreign

Figure 6. Cifar-10 data storage rules. The data stored in the [32 × 32 × 3] matrix are the compressed data of the original image. Labels store image tags: an image with foreign objects is 1, and an image without foreign objects is 0. File names are stored in filenames. The batch label records whether the images belong to the training or the test set.
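The storage rules in Figure 6 can be illustrated with a small sketch that builds such a dictionary and serializes it with `pickle`. The key names and the channel-major row layout (all red values, then green, then blue) follow the public CIFAR-10 Python format and are an assumption about how the authors stored their data; the function name is hypothetical, and plain lists are used instead of arrays to keep the example dependency-free.

```python
import pickle
from typing import Dict, List, Sequence


def make_batch(
    images: Sequence[Sequence[Sequence[Sequence[int]]]],  # each image is [32][32][3]
    labels: Sequence[int],          # 1 = with foreign objects, 0 = without
    filenames: Sequence[str],
    batch_label: str,               # e.g. "training batch" or "testing batch"
) -> Dict[str, object]:
    """Pack images into a CIFAR-10 style dictionary with one 3072-value row each."""
    rows: List[List[int]] = []
    for img in images:
        # Channel-major order: 1024 red values, then 1024 green, then 1024 blue.
        row = [img[y][x][c] for c in range(3) for y in range(32) for x in range(32)]
        rows.append(row)
    return {
        "batch_label": batch_label,
        "labels": list(labels),
        "data": rows,
        "filenames": list(filenames),
    }


def save_batch(batch: Dict[str, object], path: str) -> None:
    """Write the dictionary to disk so that Python can reload it quickly."""
    with open(path, "wb") as handle:
        pickle.dump(batch, handle)
```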

Figure 7. ROC of the classifier. (1) In the case of a 100% recall, the threshold points of the classifier are marked with triangles, indicating that all images with foreign objects can be screened when the images are classified by the threshold corresponding to the points. The abscissa value corresponding to the threshold point is the misclassification rate, which means that a triangle mark closer to the left side corresponds to a lower classification error rate and a better-performing algorithm. (2) We calculate the optimal classification threshold by using the Youden index. The optimal threshold point of the classifier is marked by a pentagon, which means that only a part of the foreign object images can be selected when the images are classified by the threshold value of the marked point, and the classification error rate is also decreased. (3) InceptionV3-fine-tuning has AUC = 0.977 as the maximum value and SVM has AUC = 0.843 as the minimum value.

Table 2. Classification algorithm data.

Figure 8. Target detection algorithm PR (Precision-Recall) curve. (1) In the figure, each target detection algorithm draws two PR curves. One is the PR curve obtained when all 753 images are input directly to the algorithm without going through the classification process. The other is the PR curve drawn with 334 images as the algorithm input. (2) After using the two-stage framework, each algorithm shows a different degree of increase in the AP value, because the first-stage classifier filters out 60% of the negative samples. (3) The SSD has the highest AP value, 0.7485, while the Faster-RCNN (with VGG16) AP, 0.64, is the lowest of all experiments.

Table 3. Target detection algorithm data.