Design and Implementation of UAVs for Bird’s Nest Inspection on Transmission Lines Based on Deep Learning

Abstract: In recent years, unmanned aerial vehicles (UAVs) have been increasingly used in power line inspections. Birds often nest on transmission line towers, which threatens safe power line operation. Existing research on bird’s nest inspection using UAVs is mostly limited to post-flight image detection, which has poor real-time performance and cannot deliver timely bird’s nest detection results. To address these shortcomings, we designed a power inspection UAV system based on deep learning technology for autonomous flight, positioning and photography, real-time bird’s nest detection, and result export. In this research, 2000 bird’s nest images were captured in an actual power inspection environment to create the dataset. Parameter optimization and comparative testing for bird’s nest detection were carried out on three target detection models: YOLOv3, YOLOv5-s, and YOLOX-s. A YOLOv5-s model optimized for real-time bird’s nest detection is proposed and deployed to the onboard computer for real-time detection and verification during flight. The DJI M300 RTK UAV was used to conduct a test flight in a natural power inspection environment. The test results show that the mAP of the UAV system designed in this paper for bird’s nest detection is 92.1%, and the real-time detection frame rate is 33.9 FPS. Compared with previous research results, this paper proposes a new practice of using drones for bird’s nest detection, dramatically improving the real-time accuracy of bird’s nest detection. The UAV system can efficiently complete the task of bird’s nest detection during electric power inspection, which can significantly reduce the manpower required for power inspection.


Introduction
Power line inspection effectively ensures safe transmission network operation, and transmission line tower inspection is an important part of the power line inspection work. In recent years, birds have frequently nested on transmission line towers, which seriously threatens safe power line operation. At present, the area of high-voltage overhead transmission lines is constantly expanding, and many lines are located in mountainous areas with complex terrain. In this environment, birds often build their nests on transmission line towers or insulators [1].
In the past, inspecting transmission lines was mainly performed manually. In environments with complex terrain, the efficiency of manual inspection was low. With the rapid development of UAV technology, the application of UAV power inspection technology can greatly improve transmission line inspection work efficiency. Traditional UAV power inspection mainly relies on manual UAV flight control, and the transmission line towers are photographed in the form of aerial photography. A large number of generated photos need to be manually checked and recorded after the flight. With the development of artificial intelligence technology, an increasing number of scholars are paying attention to UAV autonomous power inspection technology [2][3][4][5][6].
The core of realizing autonomous UAV power inspection is to perform automatic target detection on images. In recent years, target detection based on deep learning has become a popular research direction in the computer vision field. Many new target detection algorithms have been proposed and applied in the autonomous UAV power inspection field. From 2014 to the present, starting with the R-CNN algorithm proposed in [7], these algorithms have used deep learning technology to automatically extract hidden features in input images to classify and predict samples with a higher accuracy. With the continuous breakthroughs in deep learning and computer vision, many deep learning-based image target detection algorithms have emerged after R-CNN, such as Fast R-CNN [8], Faster R-CNN [9], and YOLO.
Deep learning is a class of multilayer neural network algorithms that can automatically learn the internal structure hidden in the training data through supervised, semi-supervised, or unsupervised training methods. According to whether there is an explicit region proposal, target detection algorithms can be divided into two-stage and one-stage algorithms. The two-stage target detection algorithm is also called the target detection algorithm based on the region proposal or the target detection algorithm based on the region of interest. This type of algorithm uses explicit region proposals to transform the detection problem into a classification problem on the image content inside each proposed region. Representative two-stage object detection algorithms include R-CNN and Fast R-CNN. The one-stage target detection algorithm is also called the regression-based target detection algorithm. This type of algorithm does not directly generate the region of interest but regards the target detection task as a regression task for the entire image. Representative one-stage target detection algorithms include YOLO and SSD. Figure 1 shows the general framework of these two classes of object detection algorithms.
At present, the application of deep learning in UAV power inspection is mainly for detecting insulators and U-bolt groups, and there are few studies on the automatic detection of bird's nests on transmission lines. Based on an enhanced Faster R-CNN for aerial images, Reference [10] proposed a detection example of a tower insulator and U-bolt group, which proved the detection effectiveness of deep learning on aerial images of overhead lines. Based on the YOLOv2 deep learning model, Reference [11] proposed an insulator detection and evaluation method. Using aerial images captured by UAVs, it can detect insulators in clean backgrounds, with different object resolutions and lighting conditions. Experiments showed that the method can accurately locate the insulators in real-time UAV-based image data. The detected insulator images were then successfully evaluated for the insulator surface condition using different classifiers to assess the presence of ice, snow, and water. Reference [12] proposed a detection method for UAV electrical components based on the YOLOv3 algorithm. On this basis, the super-resolution convolutional neural network (SRCNN) was used to realize the super-resolution reconstruction of the blurred image, and the dataset was created. The experiment proved that the technology improves the recognition robustness of UAV power inspection systems.
Drones 2022, 6, 252
At present, there are few related studies on the use of drones to detect bird's nests on transmission lines. Reference [13] proposed a deep learning-based method for bird's nest detection on transmission lines using drone images. In its automatic detection framework, the prior dimensions of the anchors are obtained by K-means clustering to improve the accuracy of bounding box generation. The automatic bird's nest detection framework proposed in that work achieves high detection accuracy.
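The anchor clustering idea mentioned for Reference [13] can be illustrated with a short sketch. The source only states that K-means clustering is used to obtain the prior anchor dimensions; the distance d = 1 - IoU used below is the convention popularized by YOLOv2 and is an assumption here, as are the function names `iou_wh` and `kmeans_anchors` and the toy box sizes.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given only as (width, height), aligned at one corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return k anchors sorted by area.

    Hypothetical helper for illustration, not the implementation of Reference [13].
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(b[0] for b in members) / len(members)
                mean_h = sum(b[1] for b in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Toy labelled box sizes in pixels (made up for the example).
    sizes = [(10, 10), (12, 12), (11, 11), (100, 100), (102, 98), (98, 102)]
    print(kmeans_anchors(sizes, k=2))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is the usual motivation for this choice.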
The above works study deep learning-based detection applied, after the flight, to pictures taken during UAV power inspection. However, many problems remain in performing real-time detection on airborne images while the UAV is in flight. In addition, how to superimpose the detection results onto the real-time video stream captured by the drone, so that staff can view and process them in time, still needs to be further studied and solved.

Materials and Methods
To improve the efficiency of UAV bird's nest inspection of transmission lines and to consider future practical applications, a complete workflow is formulated according to the functions to be realized by this system. The workflow of the system is shown in Figure 2.
Before the UAV performs an inspection mission, it is necessary to set the waypoints and photo task points of the inspection line. This process relies on a previously generated laser point cloud model of the power line or on manually set waypoints. According to the experience of power inspection personnel and the actual situation, bird's nests are usually found on the tower body and the power cable part of the transmission line tower. Therefore, when setting the shooting angle of the drone, the method shown in Figure 3 is usually used. The waypoint planning software can simulate and debug the expected photographing effect.
After the UAV and the onboard computer are turned on, the inspectors first preset the waypoint information of the line to be inspected and upload it to the UAV, where the waypoint information includes the photo tasks that the UAV needs to perform at each waypoint. According to the task and camera angle, the drone takes off autonomously, follows the waypoint information of the uploaded inspection route, and automatically takes pictures after arriving at each waypoint.
During this process, the onboard computer pulls the video stream of the drone's camera gimbal in real time, obtains the photos taken, and performs bird's nest detection and identification on the video stream and the photos synchronously. The results are displayed on the map interface of the recognition software. After the drone completes a single inspection mission according to the waypoints and lands, the bird's nest detection software automatically writes a KML location record file and the photos of the detected bird's nests to the USB flash drive inserted into the onboard computer, which can then be viewed in map software on a PC.
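The KML export step can be sketched as follows. This is a minimal illustration under stated assumptions, not the software described in the paper: the function name `build_kml`, the tuple layout of a detection, and the sample coordinates are hypothetical. Only the KML 2.2 namespace and the longitude-before-latitude coordinate order come from the KML format itself.

```python
from xml.sax.saxutils import escape


def build_kml(detections):
    """Return a KML document string with one Placemark per detected nest.

    detections: iterable of (name, latitude, longitude, photo_filename) tuples
    (a hypothetical layout chosen for this sketch).
    """
    placemarks = []
    for name, lat, lon, photo in detections:
        placemarks.append(
            "    <Placemark>\n"
            f"      <name>{escape(name)}</name>\n"
            f"      <description>{escape(photo)}</description>\n"
            # KML orders coordinates as longitude,latitude[,altitude].
            f"      <Point><coordinates>{lon:.7f},{lat:.7f}</coordinates></Point>\n"
            "    </Placemark>\n"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n"
        + "".join(placemarks)
        + "  </Document>\n"
        "</kml>\n"
    )


if __name__ == "__main__":
    # Made-up detection record for demonstration only.
    print(build_kml([("nest_001", 39.9, 116.39, "nest_001.jpg")]))
```

Writing the returned string to a `.kml` file on the removable drive would let standard map software display each nest as a point with the photo file name in its description.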

Hardware Design
This system uses the DJI M300 RTK UAV as the development and flight test platform and the Nvidia Jetson Xavier NX as the onboard computer for developing and testing deep learning algorithms. The M300 RTK UAV provides the Onboard SDK for development, and using the SDK greatly improves the software development efficiency. To ensure the good operation of the entire hardware system, a 1080P video transmission module is also configured, which transmits the display of the software running on the onboard computer back to the ground for real-time monitoring.
The hardware schematic of the UAV system we designed for bird's nest detection is shown in Figure 4.
The DJI M300 RTK UAV integrates a flight control system, binocular vision, and an FPV camera, with functions such as six-direction positioning, obstacle avoidance, and precise reshooting. It not only ensures flight safety but also provides the functions needed for power inspection applications [14]. The H20T was selected as the airborne gimbal during the development and implementation of this project. The functional parameters of the DJI M300 RTK UAV and the H20T RGB camera are shown in Table 1.
In the design process of the UAV system, we also fully considered the influence of the magnetic field around the overhead transmission line on the navigation of the UAV.
According to the literature and the relevant national standards [15][16][17], for operating 500 kV and 750 kV AC transmission lines, the safety distance between the UAV and the transmission line should generally be no less than 5 m. In our implementation, to ensure the safety of the experiment, we increased the safety distance to 20 m. Using the H20T optical zoom camera ensures the quality of the photos at this distance while keeping the drone clear of magnetic field interference.
To ensure that the deep learning algorithm can be applied on the airborne end of the UAV, the Jetson Xavier NX artificial intelligence development kit is used as the hardware terminal for image recognition computation, and the bird's nest positioning and recognition software is developed on this basis. Nvidia Jetson devices are embedded AI computing platforms that provide high-performance, low-power computing support for deep learning and computer vision [18][19][20]. The specifications of the onboard computer are shown in Table 2.

Software Design
The software of this system is implemented in the Nvidia Jetpack development environment, uses Qt as the application development framework, and integrates the DJI Onboard SDK and the PaddlePaddle deep learning framework.
The DJI Onboard SDK is an open-source software library that allows a computer to communicate directly with the DJI M300 RTK drone through the serial port. The Onboard SDK provides access to aircraft telemetry, flight control, and other drone functions, and developers can use it to connect external controllers to the drone and control the flight. The SDK contains an open-source C++ library that can be used to control the M300 RTK drone through the serial interface and supports Linux, ARM, and STM32. It also provides a drone simulator and visualization tools that can be used for real-time simulation tests and program debugging. This makes it convenient for developers to get started quickly and to carry out secondary development [22,23].
During the implementation of this project, the DJI Onboard SDK was used to obtain the real-time UAV flight position, flight waypoint information, video stream, and captured photos. With this information as the software input, the software processing architecture is shown in Figure 5.

Navigation and Localization Module
The autonomous inspection process of this system relies on the accurate reshooting function of the UAV. Since the M300 RTK UAV has an RTK GPS module, it can achieve high-precision positioning.
Before the task starts, the operator plans the route based on the point cloud of the line to be inspected or the information of past inspection waypoints; the specific position targets can be set in flight and saved as a route task, which is imported into the UAV. Once the route task has been set up, the drone conducts a fully autonomous waypoint flight along the route. For each mission on the same line, the UAV can automatically capture the latest image of the same position, which is used as the image input source of the bird's nest recognition module.
The bird's nest recognition software module takes the above images as input and performs real-time bird's nest detection on them during the flight of the drone. If a bird's nest is found, the module extracts the photo point location data from the EXIF information of the image and marks the detected bird's nest on the software interface. The pictures and location information of the bird's nest are written to the output folder.
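The location extraction step can be illustrated with a short sketch. EXIF stores latitude and longitude as degrees, minutes, and seconds together with a hemisphere reference, so the values must be converted to signed decimal degrees before they can be plotted on a map. The dictionary layout and the helper names `dms_to_decimal` and `gps_from_exif` below are assumptions for illustration; the source does not describe how the software reads the tags.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to decimal degrees."""
    degrees, minutes, seconds = (float(v) for v in dms)
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value


def gps_from_exif(gps_tags):
    """Return (latitude, longitude) from already decoded EXIF GPS tags.

    gps_tags is assumed to be a dict keyed by the standard EXIF tag names
    GPSLatitude, GPSLatitudeRef, GPSLongitude, and GPSLongitudeRef.
    """
    lat = dms_to_decimal(gps_tags["GPSLatitude"], gps_tags["GPSLatitudeRef"])
    lon = dms_to_decimal(gps_tags["GPSLongitude"], gps_tags["GPSLongitudeRef"])
    return lat, lon


if __name__ == "__main__":
    # Made-up tag values for demonstration only.
    tags = {
        "GPSLatitude": (30, 30, 0),
        "GPSLatitudeRef": "N",
        "GPSLongitude": (120, 15, 36),
        "GPSLongitudeRef": "E",
    }
    print(gps_from_exif(tags))
```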

Bird's Nest Detection Module
The bird's nest detection software module proposed in this paper is implemented with three different YOLO detection algorithms: the YOLOv3 algorithm improved with MobileNetV3-Large, the YOLOv5 algorithm, and the YOLOX algorithm. The experimental results of the three algorithms are compared.

Improved YOLOv3 detection algorithm based on MobileNetv3-Large
YOLOv3 is an improved version of the YOLO series algorithm proposed by Redmon et al. in 2018 [24]. YOLOv3 uses a deep residual network to extract image features and achieves multiscale prediction, striking a better balance between detection accuracy and speed. In terms of network architecture, the YOLOv3 algorithm is divided into two parts: the backbone network and the prediction network. The backbone network of the traditional YOLOv3 algorithm is Darknet-53, which can be divided into 5 stages according to the size of the feature map. The output feature maps of the third, fourth, and fifth stages are fed into the prediction network.
The prediction network fuses feature maps of multiple scales for multiscale prediction. Assuming that the input image size is 608 × 608 pixels, the feature sizes output by the prediction network are 19 × 19, 38 × 38, and 76 × 76.
This paper improves the YOLOv3 network structure. As shown in Figure 6, the lightweight network MobileNetV3-Large [25] is used to replace Darknet-53 as the backbone network of YOLOv3. MobileNetV3-Large is the third version of MobileNet. Based on MobileNetV2, it adds neural architecture search, the squeeze-and-excitation (SE) module, and the hard-swish and hard-sigmoid activation functions.
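The multiscale output sizes quoted above for a 608 × 608 input follow from the downsampling factors of the three prediction branches. The sketch below assumes the standard YOLOv3 strides of 32, 16, and 8, which the text does not state explicitly; the function name `yolo_grid_sizes` is hypothetical.

```python
def yolo_grid_sizes(input_size, strides=(32, 16, 8)):
    """Return the side length of each prediction grid for a square input image."""
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
    # Each branch predicts on a grid that is the input downsampled by its stride.
    return [input_size // stride for stride in strides]


if __name__ == "__main__":
    print(yolo_grid_sizes(608))
```

The coarsest grid (stride 32) is responsible for large targets, while the finest grid (stride 8) helps with small targets such as distant nests.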

The MobileNetV3-Large network structure is shown in Table 3. The table lists the layers in the order in which they process the input data, and each row gives the specification of one layer in the model network. The first row in the table corresponds to the first network layer. Data flow through the network from top to bottom; that is, the feature map output by one layer is used as the input to the next layer. Conv2d is the normal convolution layer, Pool is the pooling layer, and bneck is the basic building block. SE indicates whether a squeeze-and-excitation (SE) module is added to the basic building block, and √ indicates that an SE module is added to the basic building block of this layer. NL represents the nonlinear activation function in the network, including HS and RE, where HS represents the hard-swish function and RE represents the ReLU function. NBN indicates that no batch normalization layer is added after the convolutional layer. As seen in the table, the number of filters of a convolutional layer is the same as the number of channels of the input feature map of the next layer. When the stride is 2, the size of the output feature map is halved. It is worth noting that the MobileNetV3-Large model in Table 3 is used for the k-class image classification task and replaces the fully connected layer with a pooling layer and two 1 × 1 convolutional layers in the last 3 layers. The final layer outputs a feature map of size 1 × 1 × k, which represents the 1 × k classification result vector.

YOLOv5-s detection algorithm
Structurally, YOLOv5 can be divided into three parts. The backbone feature extraction network uses the focus + CSPDarknet structure, the neck is the enhanced feature pyramid network PAN, and the final prediction is made by the YOLO head. Currently, there are four versions of YOLOv5, namely, YOLOv5-s, YOLOv5-m, YOLOv5-l, and YOLOv5-x. The weight size, width, and depth of the model increase in that order across the four versions. In this paper, YOLOv5-s is selected. Although its AP accuracy is lower than those of the other three models, its depth is the smallest, and it is more suitable for lightweight deployment on UAV airborne AI computing platforms [26,27].
Compared with YOLOv3, the convolution block structure DarkNetConv of YOLOv5 has changed, and its activation function uses SiLU improved from Sigmoid and ReLU, which is better than ReLU on the deep neural network model; in addition, YOLOv5 will also replace the YOLOv3 backbone network The first three layers are replaced by the focus network, as shown in Figure 8.The network structure performs a slicing operation similar to downsampling on the input image, takes a value of every other pixel in the image to obtain four independent feature layers, and then stacks these four independent feature layers to concentrate the width and height information into the channel.The input channel is expanded four times, and the newly spliced feature layer is equivalent to the original 3channel RGB and becomes 12 channels.After using the focus network, the number of network parameters and computations can be reduced, thereby improving the training speed [28].Structurally, YOLOv5 can be divided into three parts.The focus + CSPDarknet structure is used on the backbone feature extraction network, the neck is the enhanced feature pyramid network PAN, and the final prediction is the YOLO head.Currently, there are four versions of YOLOv5, namely, YOLOv5-s, YOLOv5-m, YOLOv5-l, and YOLOv5-x.The weights of the four versions and the width and depth of the model are sequentially increased.In this paper, YOLOv5-s is selected.Although its AP accuracy is lower than those of the other three models, its depth is the smallest, and it is more suitable for the lightweight deployment of UAV airborne AI computing platforms [26,27].
Compared with YOLOv3, the convolution block structure DarkNetConv of YOLOv5 has changed: its activation function is SiLU, an improvement on Sigmoid and ReLU that performs better than ReLU on deep neural network models. In addition, YOLOv5 replaces the first three layers of the YOLOv3 backbone network with the focus network, as shown in Figure 8. The focus structure performs a slicing operation similar to downsampling on the input image: it takes a value at every other pixel to obtain four independent feature layers and then stacks these four layers, concentrating the width and height information into the channel dimension. The number of input channels is thereby expanded four times, so the original 3-channel RGB image becomes a 12-channel spliced feature layer. Using the focus network reduces the number of network parameters and computations, thereby improving the training speed [28].
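The slicing operation described above can be illustrated with a minimal sketch in plain Python. The function name `focus_slice`, the nested-list image layout, and the order in which the four sub-images are stacked are assumptions made for illustration only; the actual focus layer operates on tensors and is followed by a convolution.

```python
def focus_slice(image):
    """Space-to-depth slicing: (H, W, C) -> (H/2, W/2, 4C).

    `image` is a list of rows; each row is a list of per-pixel channel tuples.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    sliced = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            # Stack the four pixels of each 2x2 neighbourhood along the
            # channel axis (stacking order is an assumption of this sketch).
            row.append(
                image[y][x] + image[y + 1][x] + image[y][x + 1] + image[y + 1][x + 1]
            )
        sliced.append(row)
    return sliced


# Hypothetical 4x4 RGB image whose pixels encode their own (row, column).
rgb = [[(y, x, 0) for x in range(4)] for y in range(4)]
out = focus_slice(rgb)
print(len(out), len(out[0]), len(out[0][0]))
```

For a 4 × 4 input with 3 channels, the result has half the width and height and 12 channels per position, which is the 3-to-12 channel expansion described in the text.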

YOLOX-s detection algorithm
YOLOX was released by Megvii Technology in July 2021. When choosing the benchmark model for YOLOX, the authors did not choose the YOLOv4 or YOLOv5 series; instead, they selected YOLOv3-SPP, a version of the anchor-based YOLOv3 algorithm that is currently widely used in the industrial field. YOLOv3-SPP differs from YOLOv3 in that it adds SPP components behind the YOLOv3 backbone network [29,30].
To facilitate analysis, the YOLOX network structure can also be divided into three modules: the backbone, neck, and prediction layers.
The input stage uses two powerful data augmentation techniques: mixup and mosaic. Mosaic augmentation, which can effectively improve the detection of small targets, is also widely used in the YOLOv4 and YOLOv5 algorithms, and mixup is applied in addition to mosaic. The backbone is consistent with the original YOLOv3 backbone and adopts the DarkNet53 network. The neck also uses the FPN structure for feature fusion. The prediction layer consists of the following parts: the decoupled head, the anchor-free detector, the label assignment strategy, and the loss computation. The authors made the network structure configurable: according to the width and depth of the network, it is divided into YOLOX-s, YOLOX-m, YOLOX-l, YOLOX-x, and other versions. This paper uses the standard YOLOX-s model [31,32], whose network structure is shown in Figure 9.
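As a rough illustration of the mixup augmentation mentioned above, the sketch below blends two images pixel by pixel and keeps the bounding boxes of both. The function name `mixup`, the flat-list image representation, and the blending weight `lam` are assumptions made for this example and are not taken from the paper or from the YOLOX code base.

```python
def mixup(image_a, image_b, boxes_a, boxes_b, lam=0.5):
    """Blend two equally sized images and keep the boxes of both.

    Images are flat lists of pixel intensities; boxes are (x1, y1, x2, y2).
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of pixels")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Weighted average of the two images.
    blended = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    # For detection, the targets of both source images remain valid.
    return blended, list(boxes_a) + list(boxes_b)


blended, boxes = mixup(
    [0.0, 100.0], [200.0, 100.0], [(0, 0, 4, 4)], [(2, 2, 6, 6)], lam=0.25
)
print(blended, boxes)
```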

Results
In the development and implementation of this project, testing is divided into two stages. First, an aerial photo dataset of power towers containing bird's nest targets is collected, and bird's nest annotation, training, and testing are performed based on the detection model proposed in this paper. Then, after the software and onboard computer are deployed to the UAV, field flight tests are carried out to evaluate the bird's nest detection and positioning performance of the whole system in the actual operating environment.
During the labelling of the bird's nest detection dataset, a total of 600 images containing bird's nest targets were collected, including aerial pylon images taken by the drone and bird's nest images from the internet. These images were annotated with bird's nest labels for training and testing. In addition, by randomly cropping, flipping, and stretching the training images, a total of approximately 2000 images were obtained for training. The resulting dataset was randomly divided into 10 nonoverlapping subdatasets; 9 subdatasets were used as the training set, and 1 subdataset was used as the validation set.
Figure 10 shows a schematic diagram of image annotation for the training set. Figure 10a is the original image, and the rest of the images are based on the expanded image data obtained after image processing.
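The random division into 10 nonoverlapping subdatasets, with 9 used for training and 1 for validation, can be sketched as follows. The file names, the fixed random seed, and the helper name `split_into_folds` are hypothetical and serve only to make the example reproducible.

```python
import random


def split_into_folds(items, n_folds=10, seed=0):
    """Shuffle the items and deal them into n_folds disjoint subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Taking every n-th element yields subsets whose sizes differ by at most one.
    return [shuffled[i::n_folds] for i in range(n_folds)]


# Hypothetical file names standing in for the roughly 2000 augmented images.
image_names = [f"nest_{i:04d}.jpg" for i in range(2000)]
folds = split_into_folds(image_names)

validation_set = folds[0]
training_set = [name for fold in folds[1:] for name in fold]
print(len(training_set), len(validation_set))
```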

Bird's Nest Detection Module Test
For target detection algorithms, it is usually necessary to use certain evaluation criteria to measure model performance, and for deep learning algorithms, a variety of evaluation indicators need to be considered. In this experiment, during model training, we visualized the change in the loss curve, adjusted the training parameters accordingly, and obtained multiple trained models. The precision rate (Precision), the recall rate (Recall), the average precision (AP), and the mean average precision (mAP) were calculated as the indicators to measure the models. In addition, since this project needs to realize real-time bird's nest target detection on UAVs, meeting the real-time detection requirement is very important, so the detection speed also needs to be considered.
In order to calculate the above evaluation indicators, Table 4 defines positive and negative samples.

| Statistical Classification | Definition |
| --- | --- |
| True Positive (TP) | A test result that correctly indicates the presence of a condition or characteristic |
| True Negative (TN) | A test result that correctly indicates the absence of a condition or characteristic |
| False Positive (FP) | A test result that incorrectly indicates that a particular condition or attribute is present |
| False Negative (FN) | A test result that incorrectly indicates that a particular condition or attribute is absent |

Loss Function
In statistics, loss functions are often used for parameter estimation; they express the difference between the estimated and true values of a data instance. Deep learning relies on statistical theory, so in deep learning the loss function is used to estimate the degree of inconsistency between the model prediction f(x) and the true value Y, and it is usually written as L(Y, f(x)). It is generally believed that the smaller the loss function, the better the robustness of the model.
The loss function of the YOLO algorithm consists of three parts: coordinate error, IOU error, and classification error. The model parameters are optimized by calculating the sum of squared errors between the S × S × (B × 5 + C)-dimensional network output vector and the S × S × (B × 5 + C)-dimensional vector of the real target. Its loss function is given by Formula 1.
The first two parts of Formula 1 calculate the coordinate error, the middle two parts calculate the IOU error, and the last part calculates the classification error. Here, $1_{i}^{obj}$ is 1 if the target appears in cell i and 0 otherwise, and $1_{ij}^{obj}$ and $1_{ij}^{noobj}$ indicate that the target appears or does not appear, respectively, in the j-th bounding box of the i-th cell [33].
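For reference, the description above matches, term by term, the sum-of-squares loss of the original YOLO formulation. A reconstruction under that assumption is sketched below; the weights $\lambda_{coord}$ and $\lambda_{noobj}$, the box parameters $(x, y, w, h)$, the confidence $C$, the class probability $p(c)$, and the hats marking the real target values are notation assumed here rather than taken from the text.

```latex
\begin{aligned}
Loss ={} & \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} 1_{ij}^{obj}
           \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
         & + \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} 1_{ij}^{obj}
           \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
                + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
         & + \sum_{i=0}^{S^2} \sum_{j=0}^{B} 1_{ij}^{obj} \left(C_i - \hat{C}_i\right)^2 \\
         & + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} 1_{ij}^{noobj} \left(C_i - \hat{C}_i\right)^2 \\
         & + \sum_{i=0}^{S^2} 1_{i}^{obj} \sum_{c \in classes} \left(p_i(c) - \hat{p}_i(c)\right)^2
\end{aligned}
```

In this form, the first two lines are the coordinate error, the third and fourth lines are the IOU (confidence) error for boxes with and without a target, and the last line is the classification error.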

Precision
In the field of object detection, precision is the ratio of the number of correctly detected objects to the total number of detected objects, and it measures how precise the detection model is [34]. Using the definitions in Table 4, the formula for calculating precision is:

$$Precision = \frac{TP}{TP + FP}$$

Recall
Recall refers to the ratio of the number of correctly detected objects to the total number of true objects annotated manually, and it measures how completely the detection model finds the targets [35]. Using the definitions in Table 4, the formula for calculating recall is:

$$Recall = \frac{TP}{TP + FN}$$
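To make the two ratios concrete, the sketch below counts TP, FP, and FN for one image by greedily matching predicted boxes to annotated boxes and then computes precision and recall. The IoU threshold of 0.5, the greedy matching rule, and all function names and box coordinates are assumptions for illustration; the paper does not state its matching criterion.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_matches(predictions, ground_truths, iou_threshold=0.5):
    """Greedily match predictions to annotations; return (TP, FP, FN)."""
    unmatched = list(ground_truths)
    tp = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each annotation can be matched only once
            tp += 1
    return tp, len(predictions) - tp, len(unmatched)


def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical boxes: two predictions hit an annotated nest, one does not,
# and one annotated nest is missed.
predictions = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
ground_truths = [(1, 1, 10, 10), (20, 20, 30, 30), (80, 80, 90, 90)]
tp, fp, fn = count_matches(predictions, ground_truths)
precision, recall = precision_recall(tp, fp, fn)
print(tp, fp, fn, precision, recall)
```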

Mean Average Precision
The precision-recall (PR) curve is drawn from the corresponding precision and recall values. The AP is the area under the PR curve, which can be obtained by integrating the precision as a function of the recall. Average precision refers to averaging the correct objects detected over multiple test sets.
The average precision (AP) calculation formula is as follows, where P(R) denotes the precision at recall R:

$$AP = \int_{0}^{1} P(R)\,dR$$

It can be seen from the above that AP is the average precision of a single target category. To evaluate the overall model performance, it is necessary to average the APs of all target categories to obtain the mAP.
The mAP calculation formula is shown below, where N is the number of target categories to be detected and $AP_i$ is the AP of the i-th category:

$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i$$
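The area under the PR curve can be approximated numerically from a ranked list of detections. The sketch below uses the common monotone-envelope summation, which is an assumption here, since the paper does not specify its interpolation scheme; the function names and the example scores are hypothetical.

```python
def average_precision(scored_hits, n_ground_truth):
    """Area under the PR curve for one category.

    scored_hits: list of (confidence, is_true_positive) pairs.
    n_ground_truth: number of annotated objects of this category.
    """
    if n_ground_truth <= 0:
        raise ValueError("n_ground_truth must be positive")
    ranked = sorted(scored_hits, key=lambda hit: hit[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [0.0], [1.0]
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    return sum(
        (recalls[i] - recalls[i - 1]) * precisions[i] for i in range(1, len(recalls))
    )


def mean_average_precision(ap_per_category):
    return sum(ap_per_category) / len(ap_per_category)


# Hypothetical detections for a single category with two annotated objects.
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], n_ground_truth=2)
m_ap = mean_average_precision([ap])
print(ap, m_ap)
```

With a single target category, as in bird's nest detection, N = 1 and the mAP equals the AP of that category.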

Detection Speed
The task of bird's nest target detection requires the bird's nest targets to be detected in real time. Therefore, the detection speed is a particularly important indicator for the detection model. In this experiment, the detection speed of the model is calculated by recording the time used for detection on the test set and is expressed in frames per second (FPS).
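A minimal sketch of such a timing measurement is shown below. The `detect` callable stands in for the deployed model and the frames are dummy data; both are placeholders for illustration, not the interface used on the onboard computer.

```python
import time


def measure_fps(detect, frames):
    """Run `detect` on every frame and return the average frames per second."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


# Placeholder detector and frames; a real test would pass the model's
# inference function and the test-set images.
dummy_frames = [[1, 2, 3]] * 100
fps = measure_fps(lambda frame: sum(frame), dummy_frames)
print(f"{fps:.1f} FPS")
```

Timing the whole test set and dividing by the number of frames averages out per-frame fluctuations, which is why the measurement wraps the full loop rather than a single call.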

Bird Nest Detection Module Test Results
This experiment builds the experimental models of the three algorithms based on the PaddlePaddle framework [36], and the training configuration parameters are shown in Table 5. Figure 11 shows the relationship between loss and epoch obtained from training the three models. All the loss curves show a downward trend. This is due to backpropagation in the deep neural network: as training is repeated, the error continues to decrease, so the loss value continues to decrease. For the three models, YOLOv3, YOLOv5-s, and YOLOX-s, the figure shows that the loss value converges well when the epoch reaches approximately 450, 160, and 280, respectively.
The statistical results of the three models obtained from training are shown in Table 6. Among them, the mAP of YOLOv5-s is 92.1%, which is better than that of YOLOX-s and YOLOv3. The three models were then deployed to the Jetson Xavier NX to test the frame rate of bird's nest detection, and the experimental results are shown in Table 7. YOLOv5-s has the highest detection frame rate, exceeding those of YOLOv3 and YOLOX-s. Figure 12 shows the bird's nest detection results after the three detection models were deployed to the Jetson Xavier NX. For the same test images, YOLOv5-s has the highest detection accuracy.
According to the experimental results, to ensure both the real-time detection frame rate and the accuracy of bird's nest detection on the airborne computer, YOLOv5-s was finally selected as the bird's nest detection model deployed on the airborne computer, and the test flight in the real field environment was then carried out.

Flight Test
To test the performance of the system in a real power inspection scene, the onboard computer was mounted on the UAV and a field flight test was carried out. The test selected a continuous transmission line with 6 towers and set up a total of 18 aerial photo waypoints. According to naked-eye observation, there are bird nests on three of the tower structures.
The following is the field test data. The UAV flight speed is set to 5 m/s. After the UAV takes off autonomously, it takes photos at the fixed photo waypoints one by one. The onboard computer processes the photos in real time. The entire inspection flight process consumes time. The flight status display and processing results during the flight are shown in Figure 13. Figure 14 shows the result of exporting the KML file generated by the onboard computer, which contains the bird's nest pictures and position coordinates, after the drone lands.
In addition, in the actual application of the UAV system, targets whose detection accuracy rate exceeds 80% can be set, as required, to trigger a prompt and be exported, and the power inspection staff can view the detection results during the flight. After the drone flight mission is over, the photos and location information of the bird's nests can be checked through the software, so that the detection results can be rechecked to further ensure their accuracy.
According to the field test results, the system has a certain application value in the actual environment. Its operational robustness and the accuracy of bird's nest detection still need to be optimized through a large number of flight tests in the real environment. After the dataset is further expanded, the accuracy of bird's nest detection is expected to improve.
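The threshold-and-export step described above can be sketched with the Python standard library. The dictionary fields, photo names, coordinates, and the interpretation of the 80% figure as a per-detection confidence score are all assumptions for illustration; the actual export format of the onboard software is not given in the text.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def detections_to_kml(detections, min_confidence=0.8):
    """Build a KML document with one Placemark per confident detection."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for det in detections:
        if det["confidence"] <= min_confidence:
            continue  # below the reporting threshold
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = det["photo"]
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = (
            f"bird's nest, confidence {det['confidence']:.2f}"
        )
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML lists coordinates as longitude,latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{det['lon']},{det['lat']}"
        )
    return ET.tostring(kml, encoding="unicode")


# Hypothetical detections; positions and file names are made up.
detections = [
    {"photo": "tower_01.jpg", "confidence": 0.62, "lat": 30.0001, "lon": 120.0001},
    {"photo": "tower_02.jpg", "confidence": 0.93, "lat": 30.0010, "lon": 120.0012},
]
kml_text = detections_to_kml(detections)
print(kml_text)
```

Only the second detection exceeds the threshold, so the resulting document contains a single placemark that staff could open in a map viewer to recheck the nest location.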

Conclusions
In this paper, a UAV system for the automatic inspection of bird's nests on transmission lines is designed to improve the efficiency of bird's nest inspection. Based on technologies such as autonomous navigation and deep learning, the implemented UAV system provides functions such as autonomous flight inspection, real-time automatic detection of bird's nests, and position export for transmission lines.
To improve the accuracy of the bird's nest detection model, we shot and collected 2000 bird's nest images in the actual power inspection environment to create a dataset. Parameter optimization and test comparison for bird's nest detection were carried out on three target detection models: YOLOv3, YOLOv5-s, and YOLOX-s. According to the test results, the YOLOv5-s model optimized for real-time bird's nest detection has a higher mAP and detection frame rate than the other two models. It was deployed to the onboard computer for real-time detection and verification during flight. The optimized YOLOv5-s bird's nest detection model can meet the daily inspection needs of transmission lines. Its mAP for bird's nest detection is 92.1%, and its real-time detection frame rate is 33.9 FPS, which will significantly shorten the time needed to export transmission line inspection results. The test results show that the UAV system can efficiently complete the bird's nest detection task in power inspection. The system is practical and can greatly reduce labor consumption in the power inspection process.
In the future, we plan to test the system in more complex power inspection scenarios. We will improve the robustness of the bird's nest detection model by expanding the bird's nest image dataset. We will also consider how to use lower-cost artificial intelligence onboard computers to achieve high-accuracy, high-speed real-time detection of bird nests so that this system can be more widely used.

Figure 1. An overview of object detection models. (a) Two-stage object detection; (b) One-stage object detection.

Figure 3. Schematic diagram of the location where the drone takes pictures and the camera angle setting.

Figure 4. Hardware schematic of the UAV system for bird's nest inspection.

YOLOv5-s detection algorithm
YOLOv5 is the fifth-generation model of the YOLO series. It is a target detection model based on the PyTorch framework and is improved from the YOLOv3 model. Its structure and process are shown in Figure 7.

Figure 13. Display of flight status and bird's nest detection results.

Figure 14. Display of flight status and bird's nest detection results.

Table 5. Parameter settings for model training.