Vehicle detection in aerial images has attracted great attention as an approach to providing the necessary information for transportation road network planning and traffic management. However, because of the low resolution, complex scene, occlusion, shadows, and high requirement for detection efficiency, implementing vehicle detection in aerial images is challenging. Therefore, we propose an efficient and scene-adaptive algorithm for vehicle detection in aerial images using an improved YOLOv3 framework, and it is applied to not only aerial still images but also videos composed of consecutive frame images. First, rather than directly using the traditional YOLOv3 network, we construct a new structure with fewer layers to improve the detection efficiency. Then, since complex scenes in aerial images can cause the partial occlusion of vehicles, we construct a context-aware-based feature map fusion to make full use of the information in the adjacent frames and accurately detect partially occluded vehicles. The traditional YOLOv3 network adopts a horizontal bounding box, which can attain the expected detection effects only for vehicles with small length–width ratio. Moreover, vehicles that are close to each other are liable to cause lower accuracy and a higher detection error rate. Hence, we design a sloping bounding box attached to the angle of the target vehicles. This modification is conducive to predicting not only the position but also the angle. Finally, two data sets were used to perform extensive experiments and comparisons. The results show that the proposed algorithm generates the desired and excellent performance.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited