This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

RSMamDet: Efficient UAV Remote Sensing Vehicle Detection via Linear State Space Models and Adaptive Multi-Level Feature Fusion

1
School of Information and Communication Engineering, Hainan University, Haikou 570228, China
2
School of Computer Science and Technology, Hainan University, Haikou 570228, China
3
Institute of Unmanned System Research, Beihang University, Beijing 100191, China
*
Author to whom correspondence should be addressed.
Drones 2026, 10(5), 396; https://doi.org/10.3390/drones10050396
Submission received: 11 April 2026 / Revised: 14 May 2026 / Accepted: 18 May 2026 / Published: 21 May 2026
(This article belongs to the Topic Advances in Autonomous Vehicles, Automation, and Robotics)

Abstract

Accurate and efficient vehicle detection from unmanned aerial vehicle (UAV) imagery is essential for intelligent transportation, urban monitoring, and public safety, yet this task remains challenging due to high target density, extreme scale variation, complex backgrounds, and stringent onboard computational constraints. Existing DETR-based detectors model global context through self-attention but incur quadratic O(N²) complexity that is prohibitive for high-resolution UAV images, while CNN-based methods lack the long-range contextual awareness needed for dense small-object scenarios. We propose RSMamDet, an efficient end-to-end detection framework built upon RT-DETR that replaces quadratic self-attention with linear O(N) State Space Model scanning. The framework integrates a MobileMamba backbone with a Selective Feature Scanning module for efficient global context modeling, a Dimension-Aware Selective Integration module for adaptive cross-scale feature fusion, a Poly Kernel Inception Network encoder for multi-receptive-field feature enrichment, and an Adaptive Multi-Level Feature Fusion module for content-aware dynamic upsampling, complemented by an Uncertainty-Minimal Composite loss for stable query selection in cluttered aerial scenes. Experiments on DroneVehicle and VisDrone2019 demonstrate that RSMamDet achieves mAP50 of 72.6% and 40.2%, surpassing state-of-the-art methods by 4.1% and 2.2%, respectively, while maintaining real-time inference at 186.2 FPS with only 19.8M parameters and 42.3 GFLOPs, representing a 6.14× reduction in computational cost and a 3.86× reduction in model parameters compared to the strongest baseline.
Keywords: UAV vehicle detection; remote sensing; state space model; Mamba; feature pyramid; DETR; real-time detection
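
The contrast the abstract draws between quadratic self-attention and linear state space scanning can be illustrated with a minimal sketch. This is not the RSMamDet Selective Feature Scanning module: it is a scalar linear recurrence with fixed parameters, whereas Mamba-style models make those parameters input-dependent. The function names, the parameters a, b, c, and the token count in the usage note are all hypothetical and chosen only for illustration.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state-space recurrence in one pass over the input.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Each token is visited once, so the work grows linearly with len(x).
    Assumption: fixed scalar a, b, c (a selective SSM would compute them
    from the input at every step).
    """
    h = 0.0  # hidden state carried across the sequence
    y = []
    for x_t in x:
        h = a * h + b * x_t  # fold the current token into the running state
        y.append(c * h)      # read out the state
    return y


def attention_pair_count(n: int) -> int:
    """Token-pair interactions scored by full self-attention over n tokens."""
    return n * n


def scan_step_count(n: int) -> int:
    """Recurrence updates performed by a single scan over n tokens."""
    return n
```

As a rough usage example, a hypothetical feature map of 80 × 80 = 6400 tokens gives attention_pair_count(6400) = 40,960,000 pairwise scores against scan_step_count(6400) = 6400 recurrence updates, which is the gap that grows with image resolution.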

Share and Cite

MDPI and ACS Style

Wu, M.; Liu, X.; Li, X.; Gan, W. RSMamDet: Efficient UAV Remote Sensing Vehicle Detection via Linear State Space Models and Adaptive Multi-Level Feature Fusion. Drones 2026, 10, 396. https://doi.org/10.3390/drones10050396

AMA Style

Wu M, Liu X, Li X, Gan W. RSMamDet: Efficient UAV Remote Sensing Vehicle Detection via Linear State Space Models and Adaptive Multi-Level Feature Fusion. Drones. 2026; 10(5):396. https://doi.org/10.3390/drones10050396

Chicago/Turabian Style

Wu, Man, Xiaozhang Liu, Xiulai Li, and Wenbiao Gan. 2026. "RSMamDet: Efficient UAV Remote Sensing Vehicle Detection via Linear State Space Models and Adaptive Multi-Level Feature Fusion" Drones 10, no. 5: 396. https://doi.org/10.3390/drones10050396

APA Style

Wu, M., Liu, X., Li, X., & Gan, W. (2026). RSMamDet: Efficient UAV Remote Sensing Vehicle Detection via Linear State Space Models and Adaptive Multi-Level Feature Fusion. Drones, 10(5), 396. https://doi.org/10.3390/drones10050396
