This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

HA-Tracker: A Hybrid Architecture Tracker with Spatiotemporal Mamba Motion Model for UAV-Based Video Multi-Object Tracking

1
Institute of Geographical Sciences, Hebei Academy of Sciences, Shijiazhuang 050011, China
2
Hebei Technology Innovation Center for Geographic Information Application, Shijiazhuang 050011, China
3
Key Laboratory for Geographical Process Analysis & Simulation of Hubei Province, College of Urban and Environmental Science, Central China Normal University, Wuhan 430079, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Remote Sens. 2026, 18(1), 133; https://doi.org/10.3390/rs18010133 (registering DOI)
Submission received: 14 October 2025 / Revised: 17 December 2025 / Accepted: 22 December 2025 / Published: 30 December 2025

Abstract

UAV-based video multi-object tracking (MOT) is an important task in remote sensing. However, current research still faces two critical issues: (1) detectors built on a single DNN architecture are inherently limited, which hinders further improvement in object detection; and (2) current linear models of spatiotemporal relationships fail to capture complex real-world motion patterns. To address these issues, we propose, for the first time, a hybrid architecture tracker (HA-Tracker) with a spatiotemporal Mamba motion model for UAV-based video MOT. It makes the following contributions: (1) a CNN–Transformer–Mamba detector (CTM detector) that enhances object detection through a novel synergistic fusion framework combining the local details of a CNN, the global context of a Transformer, and the long-range dependency modeling of Mamba; and (2) a spatiotemporal Mamba motion model (STM3) that improves tracking accuracy by modeling the nonlinear spatiotemporal motion relationships of object trajectories. Extensive experiments show that HA-Tracker achieves multiple object tracking accuracy (MOTA) scores of 44.76% and 52.22% and identity F1 scores (IDF1) of 60.33% and 72.34% on the VisDrone and UAVDT datasets, respectively. These results validate the effectiveness of HA-Tracker, which outperforms existing MOT networks.
Keywords: multi-object tracking; convolution neural network; Transformer network; Mamba network; motion model
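
For readers unfamiliar with the two metrics reported in the abstract, the sketch below shows how MOTA and IDF1 are conventionally computed from aggregate error counts, following the standard CLEAR MOT and identity-metric definitions. It is an illustration only, not the authors' evaluation code: the function names and the input counts are hypothetical, and the numbers are made up rather than taken from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1: 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    # Return 0.0 when there are no detections and no ground truth at all.
    return 2 * idtp / denom if denom else 0.0


# Made-up counts, for illustration only (not results from the paper).
print(f"MOTA = {mota(300, 150, 50, 1000):.2%}")
print(f"IDF1 = {idf1(600, 200, 400):.2%}")
```

MOTA penalizes missed objects, false detections, and identity switches together, whereas IDF1 measures how consistently each trajectory keeps the correct identity, which is why the two scores are usually reported side by side.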

Share and Cite

MDPI and ACS Style

Zhang, P.; Sun, L.; Li, C.; Wang, Q.; Hao, Q.; Lu, J.; Zuo, L.; Ma, X. HA-Tracker: A Hybrid Architecture Tracker with Spatiotemporal Mamba Motion Model for UAV-Based Video Multi-Object Tracking. Remote Sens. 2026, 18, 133. https://doi.org/10.3390/rs18010133

AMA Style

Zhang P, Sun L, Li C, Wang Q, Hao Q, Lu J, Zuo L, Ma X. HA-Tracker: A Hybrid Architecture Tracker with Spatiotemporal Mamba Motion Model for UAV-Based Video Multi-Object Tracking. Remote Sensing. 2026; 18(1):133. https://doi.org/10.3390/rs18010133

Chicago/Turabian Style

Zhang, Pengfei, Leigang Sun, Chang Li, Qinyi Wang, Qingtao Hao, Junjing Lu, Lu Zuo, and Xiaoqian Ma. 2026. "HA-Tracker: A Hybrid Architecture Tracker with Spatiotemporal Mamba Motion Model for UAV-Based Video Multi-Object Tracking" Remote Sensing 18, no. 1: 133. https://doi.org/10.3390/rs18010133

APA Style

Zhang, P., Sun, L., Li, C., Wang, Q., Hao, Q., Lu, J., Zuo, L., & Ma, X. (2026). HA-Tracker: A Hybrid Architecture Tracker with Spatiotemporal Mamba Motion Model for UAV-Based Video Multi-Object Tracking. Remote Sensing, 18(1), 133. https://doi.org/10.3390/rs18010133
