Abstract
Small object detection in remote sensing images provides significant value for urban monitoring, aerospace reconnaissance, and other fields. However, detection accuracy still faces multiple challenges, including limited target information, weak feature representation, and complex backgrounds. This research aims to improve the performance of the YOLO11 model for small object detection in remote sensing imagery by addressing key issues in long-range spatial dependency modeling, adaptive multi-scale feature fusion, and computational efficiency. We construct a specialized Remote Sensing Airport-Plane Detection (RS-APD) dataset and use the public VisDrone2019 dataset for generalization verification. Building on the YOLO11 architecture, we propose the DAE-YOLO model with three novel modules: the Dynamic Spatial Sequence Module (DSSM), which strengthens the capture of long-range spatial dependencies; the Adaptive Multi-scale Feature Enhancement (AMFE) module, which adaptively adjusts receptive fields during multi-scale feature fusion; and the Efficient Dual-level Attention Mechanism (EDAM), which reduces computational complexity while preserving feature representation capability. Experimental results demonstrate that, compared to the baseline YOLO11, the proposed model improves mAP50 and mAP50:95 on the RS-APD dataset by 2.1% and 2.5%, respectively, with APs increasing by 2.8%. This research provides an efficient and reliable small object detection solution for remote sensing applications.