Aiming at the real-time detection of multiple objects and micro-objects in large-scene remote sensing images, a cascaded convolutional neural network real-time object-detection framework for remote sensing images is proposed, which integrates visual perception and convolutional memory network reasoning. The detection framework is composed of two fully convolutional networks, namely, the strengthened object self-attention pre-screening fully convolutional network (SOSA-FCN) and the object accurate detection fully convolutional network (AD-FCN). SOSA-FCN introduces a self-attention module to extract attention feature maps and constructs a depth feature pyramid to optimize the attention feature maps by combining convolutional long-term and short-term memory networks. It guides the acquisition of potential sub-regions of the object in the scene, reduces the computational complexity, and enhances the network’s ability to extract multi-scale object features. It adapts to the complex background and small object characteristics of a large-scene remote sensing image. In AD-FCN, the object mask and object orientation estimation layer are designed to achieve fine positioning of candidate frames. The performance of the proposed algorithm is compared with that of other advanced methods on NWPU_VHR-10, DOTA, UCAS-AOD, and other open datasets. The experimental results show that the proposed algorithm significantly improves the efficiency of object detection while ensuring detection accuracy and has high adaptability. It has extensive engineering application prospects.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited