MemLoTrack: Enhancing TIR Anti-UAV Tracking with Memory-Integrated Low-Rank Adaptation
Abstract
1. Introduction
- Memory-Integrated PEFT Tracker: We introduce a parameter-efficient ViT architecture featuring a Memory Attention Layer (MAL) that leverages long-range temporal context while preserving the efficiency of LoRA-based fine-tuning.
- Dual-Gated Memory Write Policy: During inference, a compact FIFO memory bank is governed by a novel dual-gated policy that admits frames only upon passing both a confidence check and a Kalman-based motion consistency check. This strategy robustly mitigates memory contamination under occlusion (OC), out-of-view (OV), and dynamic background clutter (DBC) scenarios.
- State-of-the-Art Robustness on Anti-UAV410: MemLoTrack achieves state-of-the-art (SOTA) performance on the Anti-UAV410 benchmark, attaining an AUC of 63.6 and a state accuracy (SA) of 64.0, metrics that reflect overall success and re-acquisition capability. While FocusTrack [3] reports higher precision-based metrics (P/P-Norm), our method offers a notable advance in robustness, attributable primarily to the dual-gated memory policy, which curates the temporal context, prevents memory contamination, and significantly improves target re-acquisition during critical failure modes such as OC, OV, and DBC.
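The dual-gated write policy described above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the class name, thresholds, and the constant-velocity Kalman model over the box center are assumptions chosen for clarity.

```python
import numpy as np

class DualGatedMemoryBank:
    """Sketch of a FIFO memory bank whose writes pass two gates:
    (1) a tracker-confidence check and (2) a Kalman-based motion
    consistency check. All parameter values are illustrative."""

    def __init__(self, capacity=7, conf_thresh=0.5, gate_chi2=9.21):
        self.capacity = capacity        # memory bank size (e.g., MB = 7)
        self.conf_thresh = conf_thresh  # gate 1: minimum tracker confidence
        self.gate_chi2 = gate_chi2      # gate 2: chi-square bound, 2-DoF innovation
        self.frames = []                # FIFO buffer of (feature, box) entries
        # Constant-velocity Kalman state over the box center: [cx, cy, vx, vy]
        self.x = None
        self.P = np.eye(4) * 10.0
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = 1.0  # motion model
        self.H = np.eye(2, 4)                                  # observe center only
        self.Q = np.eye(4) * 0.01
        self.R = np.eye(2)

    def update(self, feature, box, confidence):
        """Admit the frame only if BOTH gates pass; returns True on a write."""
        cx, cy = box[0] + box[2] / 2, box[1] + box[3] / 2
        z = np.array([cx, cy])
        if self.x is None:              # first frame initializes the filter
            self.x = np.array([cx, cy, 0.0, 0.0])
            self._write(feature, box)
            return True
        # Kalman predict step
        x_pred = self.F @ self.x
        P_pred = self.F @ self.P @ self.F.T + self.Q
        # Motion-consistency gate: squared Mahalanobis distance of the innovation
        y = z - self.H @ x_pred
        S = self.H @ P_pred @ self.H.T + self.R
        consistent = float(y @ np.linalg.solve(S, y)) <= self.gate_chi2
        if consistent:                  # correct the filter with the measurement
            K = P_pred @ self.H.T @ np.linalg.inv(S)
            self.x = x_pred + K @ y
            self.P = (np.eye(4) - K @ self.H) @ P_pred
        else:                           # coast on the prediction
            self.x, self.P = x_pred, P_pred
        # Dual gate: confidence AND motion consistency must both pass
        if confidence >= self.conf_thresh and consistent:
            self._write(feature, box)
            return True
        return False

    def _write(self, feature, box):
        self.frames.append((feature, box))
        if len(self.frames) > self.capacity:
            self.frames.pop(0)          # FIFO eviction of the oldest frame
```

The chi-square bound of 9.21 corresponds to the 99% quantile for two degrees of freedom, so an observed center is rejected only when it is highly inconsistent with the predicted motion; low-confidence or motion-inconsistent frames are simply never written, which is how the bank avoids contamination under OC, OV, and DBC.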
2. Related Work
2.1. Transformer-Based SOT and Parameter-Efficient Adaptation
2.2. Anti-UAV Tracking in TIR: Benchmarks, Protocols, and Specialized Designs
2.3. Memory-Augmented Tracking and Motion-Aware Memory Selection
3. Method
3.1. Backbone and Trainable Scope
3.2. Inputs and Tokens
3.3. Training-Time Memory Frame Sampling
3.4. Memory Attention Layer (MAL)
3.5. Inference-Time Memory Bank (MB) with Dual Gating
3.6. Loss and Optimization
3.7. Prediction Head and Targets
3.8. Inference Procedure
4. Experiments and Results
4.1. Anti-UAV410 Protocol and Metrics
4.2. Anti-UAV300 Training and Evaluation
| Method | AUC | P | SA |
|---|---|---|---|
| ATOM [30] | 49.2 | 68.0 | 50.0 |
| TransT [31] | 51.3 | 68.1 | 52.1 |
| Super DiMP [32] | 55.8 | 77.2 | 56.7 |
| STARK [33] | 58.2 | 76.8 | 59.1 |
| GlobalTrack [4] | 63.3 | 85.5 | 64.3 |
| LoRAT-B [12] | 66.4 | 85.6 | 64.8 |
| MemLoTrack (MB = 7) | 66.6 | 86.3 | 66.4 |
| MemLoTrack (MB = 11) | 66.9 | 86.7 | 66.6 |
4.3. Overall Comparison on Anti-UAV410 and Cross Dataset Evaluation
| Method | Size | AUC | P | P-Norm | SA |
|---|---|---|---|---|---|
| Without training on Anti-UAV410 | |||||
| ETTrack [35] | 256 | 41.5 | 59.7 | 54.8 | 41.6 |
| GRM [15] | 256 | 42.3 | 58.5 | 55.1 | 42.2 |
| MixFormerV2-S [7] | 224 | 45.6 | 64.1 | 60.0 | 46.1 |
| ARTrack [16] | 256 | 48.2 | 67.2 | 62.9 | 48.5 |
| TransT [31] | 256 | 48.2 | 67.7 | 64.1 | 48.9 |
| JointNLT [36] | 320 | 48.4 | 69.0 | 64.5 | 48.9 |
| PromptVT [37] | 320 | 50.5 | 71.5 | 65.6 | 51.2 |
| SeqTrack [8] | 256 | 52.2 | 73.8 | 70.0 | 52.9 |
| LoRAT-B [12] | 224 | 57.2 | 77.1 | 74.0 | 57.1 |
| MemLoTrack (Ours) | 224 | 57.1 | 77.2 | 73.8 | 56.6 |
| After re-training on Anti-UAV410 | |||||
| TCTrack [38] | 287 | 41.1 | 60.4 | 56.0 | 41.6 |
| SwinTrack-Tiny [9] | 224 | 53.0 | 71.4 | 68.1 | 53.1 |
| OSTrack [6] | 256 | 53.7 | 73.9 | 70.9 | 54.7 |
| ToMP50 [32] | 288 | 54.1 | 73.8 | 70.2 | 55.1 |
| ToMP101 [32] | 288 | 54.2 | 75.0 | 70.5 | 55.1 |
| ROMTrack [39] | 256 | 54.7 | 74.5 | 71.7 | 55.7 |
| SwinTrack-Base [9] | 384 | 55.9 | 76.4 | 72.3 | 55.7 |
| Stark-ST101 [33] | 320 | 56.2 | 78.5 | 74.6 | 57.1 |
| ZoomTrack [40] | 256 | 58.4 | 81.2 | 77.4 | 59.4 |
| AiATrack [41] | 320 | 58.6 | 82.3 | 78.0 | 59.6 |
| MixFormerV2-B [7] | 288 | 58.7 | 80.5 | 76.8 | 59.6 |
| DropTrack [42] | 256 | 59.2 | 82.2 | 78.2 | 60.2 |
| LoRAT-B [12] | 224 | 62.2 | 80.9 | 78.1 | 62.5 |
| FocusTrack [3] | 256 | 62.8 | 86.2 | 82.8 | 63.9 |
| MemLoTrack (Ours) | 224 | 63.6 | 82.7 | 79.8 | 64.0 |
4.4. Attribute-Wise Analysis
4.5. Ablations: Memory and Thresholding
4.6. Qualitative Evaluation
4.7. Model Size and Computation
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Jiang, N.; Wang, K.; Peng, X.; Yu, X.; Wang, Q.; Xing, J.; Li, G.; Guo, G.; Zhao, J.; Han, Z. Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking. arXiv 2021, arXiv:2101.08466. [Google Scholar]
- Huang, B.; Li, J.; Chen, J.; Wang, G.; Zhao, J.; Xu, T. Anti-UAV410: A Thermal Infrared Benchmark and Customized Scheme for Tracking Drones in the Wild. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 2852–2865. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y.; Xu, T.; Li, J. FocusTrack: A Self-Adaptive Local Sampling Algorithm for Efficient Anti-UAV Tracking. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–14. [Google Scholar] [CrossRef]
- Huang, L.; Zhao, X.; Huang, K. Globaltrack: A simple and strong baseline for long-term tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11037–11044. [Google Scholar]
- Voigtlaender, P.; Luiten, J.; Torr, P.H.S.; Leibe, B. Siam R-CNN: Visual Tracking by Re-Detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6577–6587. [Google Scholar]
- Ye, B.; Chang, H.; Ma, B.; Shan, S. Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022. [Google Scholar]
- Cui, Y.; Song, T.S.; Wu, G.; Wang, L. MixFormerV2: Efficient Fully Transformer Tracking. arXiv 2023, arXiv:2305.15896. [Google Scholar]
- Chen, X.; Peng, H.; Wang, D.; Lu, H.; Hu, H. Seqtrack: Sequence to sequence learning for visual object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 14572–14581. [Google Scholar]
- Lin, L.; Fan, H.; Xu, Y.; Ling, H. SwinTrack: A Simple and Strong Baseline for Transformer Tracking. arXiv 2021, arXiv:2112.00995. [Google Scholar]
- Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. In Proceedings of the International Conference on Learning Representations (ICLR), 2022. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998. [Google Scholar]
- Lin, L.; Fan, H.; Zhang, Z.; Wang, Y.; Xu, Y.; Ling, H. Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024. [Google Scholar]
- Yang, C.Y.; Huang, H.W.; Chai, W.; Jiang, Z.; Hwang, J.N. SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory. arXiv 2024, arXiv:2411.11922. [Google Scholar]
- Oquab, M.; Darcet, T.; Moutakanni, T.; Vo, H.; Szafraniec, M.; Khalidov, V.; Fernandez, P.; Haziza, D.; Massa, F.; El-Nouby, A.; et al. Dinov2: Learning robust visual features without supervision. arXiv 2023, arXiv:2304.07193. [Google Scholar]
- Gao, S.; Zhou, C.; Zhang, J. Generalized Relation Modeling for Transformer Tracking. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 18686–18695. [Google Scholar] [CrossRef]
- Bai, Y.; Zhao, Z.; Gong, Y.; Wei, X. ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe. arXiv 2023, arXiv:2312.17133. [Google Scholar]
- Jia, M.; Tang, L.; Chen, B.C.; Cardie, C.; Belongie, S.; Hariharan, B.; Lim, S.N. Visual prompt tuning. In Computer Vision—ECCV 2022, Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 709–727. [Google Scholar]
- Houlsby, N.; Giurgiu, A.; Jastrzebski, S.; Morrone, B.; De Laroussilhe, Q.; Gesmundo, A.; Attariyan, M.; Gelly, S. Parameter-efficient transfer learning for NLP. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 2790–2799. [Google Scholar]
- Liu, H.; Tam, D.; Muqeeth, M.; Mohta, J.; Huang, T.; Bansal, M.; Raffel, C.A. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. Adv. Neural Inf. Process. Syst. 2022, 35, 1950–1965. [Google Scholar]
- Fan, H.; Bai, H.; Lin, L.; Yang, F.; Chu, P.; Deng, G.; Yu, S.; Harshit; Huang, M.; Liu, J.; et al. Lasot: A high-quality large-scale single object tracking benchmark. Int. J. Comput. Vis. 2021, 129, 439–461. [Google Scholar] [CrossRef]
- Muller, M.; Bibi, A.; Giancola, S.; Alsubaihi, S.; Ghanem, B. Trackingnet: A large-scale dataset and benchmark for object tracking in the wild. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 300–317. [Google Scholar]
- Huang, L.; Zhao, X.; Huang, K. Got-10k: A large high-diversity benchmark for generic object tracking in the wild. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 1562–1577. [Google Scholar]
- Wang, X.; Shu, X.; Zhang, Z.; Jiang, B.; Wang, Y.; Tian, Y.; Wu, F. Towards more flexible and accurate object tracking with natural language: Algorithms and benchmark. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13763–13773. [Google Scholar]
- Zhao, J.; Zhang, J.; Li, D.; Wang, D. Vision-based anti-uav detection and tracking. IEEE Trans. Intell. Transp. Syst. 2022, 23, 25323–25334. [Google Scholar] [CrossRef]
- Liu, Q.; He, Z.; Li, X.; Zheng, Y. PTB-TIR: A thermal infrared pedestrian tracking benchmark. IEEE Trans. Multimed. 2019, 22, 666–675. [Google Scholar]
- Liu, Q.; Li, X.; He, Z.; Li, C.; Li, J.; Zhou, Z.; Yuan, D.; Li, J.; Yang, K.; Fan, N.; et al. LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020. [Google Scholar]
- Liao, D.; Shu, X.; Li, Z.; Liu, Q.; Yuan, D.; Chang, X.; He, Z. Fine-Grained Feature and Template Reconstruction for TIR Object Tracking. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 9276–9286. [Google Scholar] [CrossRef]
- Ravi, N.; Gabeur, V.; Hu, Y.T.; Hu, R.; Ryali, C.K.; Ma, T.; Khedr, H.; Rädle, R.; Rolland, C.; Gustafson, L.; et al. SAM 2: Segment Anything in Images and Videos. arXiv 2024, arXiv:2408.00714. [Google Scholar]
- Zheng, Y.; Zhong, B.; Liang, Q.; Mo, Z.; Zhang, S.; Li, X. Odtrack: Online dense temporal token learning for visual tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 7588–7596. [Google Scholar]
- Danelljan, M.; Bhat, G.; Khan, F.S.; Felsberg, M. Atom: Accurate tracking by overlap maximization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4660–4669. [Google Scholar]
- Chen, X.; Yan, B.; Zhu, J.; Wang, D.; Yang, X.; Lu, H. Transformer Tracking. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8122–8131. [Google Scholar]
- Mayer, C.; Danelljan, M.; Bhat, G.; Paul, M.; Paudel, D.P.; Yu, F.; Van Gool, L. Transforming Model Prediction for Tracking. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8721–8730. [Google Scholar] [CrossRef]
- Yan, B.; Peng, H.; Fu, J.; Wang, D.; Lu, H. Learning Spatio-Temporal Transformer for Visual Tracking. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10428–10437. [Google Scholar]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 740–755. [Google Scholar]
- Han, X.; Oishi, N.; Tian, Y.; Ucurum, E.; Young, R.; Chatwin, C.; Birch, P. ETTrack: Enhanced temporal motion predictor for multi-object tracking. arXiv 2024, arXiv:2405.15755. [Google Scholar] [CrossRef]
- Zhou, L.; Zhou, Z.; Mao, K.; He, Z. Joint Visual Grounding and Tracking with Natural Language Specification. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 23151–23160. [Google Scholar]
- Zhang, M.; Zhang, Q.; Song, W.; Huang, D.; He, Q. PromptVT: Prompting for Efficient and Accurate Visual Tracking. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 7373–7385. [Google Scholar] [CrossRef]
- Cao, Z.; Huang, Z.; Pan, L.; Zhang, S.; Liu, Z.; Fu, C. TCTrack: Temporal Contexts for Aerial Tracking. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 14778–14788. [Google Scholar]
- Cai, Y.; Liu, J.; Tang, J.; Wu, G. Robust Object Modeling for Visual Tracking. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 9555–9566. [Google Scholar]
- Kou, Y.; Gao, J.; Li, B.; Wang, G.; Hu, W.; Wang, Y.; Li, L. ZoomTrack: Target-aware Non-uniform Resizing for Efficient Visual Tracking. arXiv 2023, arXiv:2310.10071. [Google Scholar]
- Gao, S.; Zhou, C.; Ma, C.; Wang, X.; Yuan, J. AiATrack: Attention in Attention for Transformer Visual Tracking. arXiv 2022, arXiv:2207.09603. [Google Scholar] [CrossRef]
- Wu, Q.; Yang, T.; Liu, Z.; Lin, W.; Wu, B.; Chan, A.B. DropMAE: Learning Representations via Masked Autoencoders with Spatial-Attention Dropout for Temporal Matching Tasks. arXiv 2023, arXiv:2304.00571. [Google Scholar] [CrossRef]
- Bertinetto, L.; Valmadre, J.; Henriques, J.F.; Vedaldi, A.; Torr, P.H. Fully-convolutional siamese networks for object tracking. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016; Springer: Cham, Switzerland, 2016; pp. 850–865. [Google Scholar]
- Bhat, G.; Danelljan, M.; Gool, L.V.; Timofte, R. Learning discriminative model prediction for tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6182–6191. [Google Scholar]
| Variant | AUC | P | P-Norm | SA | fps |
|---|---|---|---|---|---|
| MB = OFF | 62.9 | 81.8 | 79.2 | 63.3 | 424 |
| MB = 1 | 63.5 | 82.5 | 79.8 | 63.5 | 248 |
| MB = 3 | 63.4 | 82.4 | 79.7 | 63.7 | 207 |
| MB = 7 | 63.6 | 82.7 | 79.8 | 64.0 | 153 |
| MB = 11 | 62.5 | 81.1 | 78.5 | 64.3 | 127 |
| Attribute | AUC (MB = 1) | P (MB = 1) | AUC (MB = 3) | P (MB = 3) | AUC (MB = 7) | P (MB = 7) | AUC (MB = 11) | P (MB = 11) |
|---|---|---|---|---|---|---|---|---|
| Thermal Crossover | 59.7 | 79.7 | 60.0 | 80.2 | 60.5 | 81.0 | 60.7 | 81.1 |
| Out-of-View | 46.2 | 67.6 | 45.6 | 66.8 | 47.1 | 69.2 | 48.0 | 70.0 |
| Scale Variation | 53.2 | 73.4 | 52.7 | 72.9 | 54.1 | 75.0 | 53.1 | 73.3 |
| Fast Motion | 49.5 | 68.3 | 48.8 | 67.6 | 49.5 | 68.7 | 50.6 | 69.8 |
| Occlusion | 26.9 | 50.1 | 26.9 | 50.0 | 26.9 | 50.1 | 31.2 | 55.8 |
| Dynamic Background Clutter | 47.9 | 69.8 | 47.4 | 69.1 | 49.0 | 71.7 | 50.0 | 72.5 |
| Tiny Size | 47.8 | 68.8 | 46.8 | 67.5 | 48.7 | 70.3 | 49.5 | 71.2 |
| Small Size | 56.3 | 76.2 | 56.7 | 76.8 | 57.1 | 77.4 | 57.5 | 77.8 |
| Medium Size | 66.7 | 86.9 | 67.4 | 88.0 | 67.0 | 87.3 | 67.1 | 87.4 |
| Normal Size | 72.5 | 93.3 | 73.4 | 94.8 | 72.2 | 93.1 | 72.4 | 93.2 |
| Attribute | AUC (OFF) | AUC (ON) | ΔAUC (ON − OFF) | P (OFF) | P (ON) | ΔP (ON − OFF) |
|---|---|---|---|---|---|---|
| Thermal Crossover | 59.7 | 60.5 | +0.7 | 79.7 | 81.0 | +1.3 |
| Out-of-View | 43.0 | 47.1 | +4.1 | 63.3 | 69.2 | +5.9 |
| Scale Variation | 50.7 | 54.1 | +3.3 | 70.3 | 75.0 | +4.7 |
| Fast Motion | 49.6 | 49.5 | −0.1 | 68.8 | 68.7 | −0.1 |
| Occlusion | 18.8 | 26.9 | +8.2 | 38.6 | 50.1 | +11.4 |
| Dynamic Background Clutter | 42.5 | 49.0 | +6.5 | 62.2 | 71.7 | +9.5 |
| Tiny Size | 43.5 | 48.7 | +5.1 | 62.8 | 70.3 | +7.5 |
| Small Size | 56.3 | 57.1 | +0.8 | 76.2 | 77.4 | +1.2 |
| Medium Size | 66.9 | 67.0 | +0.1 | 87.2 | 87.3 | +0.1 |
| Normal Size | 72.1 | 72.2 | +0.1 | 92.8 | 93.1 | +0.3 |
| Confidence Threshold | Total Memory Bank Updates |
|---|---|
| 0.1 | 42,648 |
| 0.2 | 42,105 |
| 0.3 | 41,532 |
| 0.4 | 40,911 |
| 0.5 | 40,250 |
| 0.6 | 39,683 |
| 0.7 | 39,017 |
| 0.8 | 38,439 |
| 0.9 | 101 |
|  | Ours (Dual Gate) | Strict | Loose | Fixed (No Adapt.) | Kalman-Only | Conf-Only |
|---|---|---|---|---|---|---|
| State Accuracy (SA) | ||||||
| SA | 64.0 | 63.1 | 63.7 | 63.1 | 63.0 | 62.8 |
| Success AUC | ||||||
| DBC | 49.0 | 47.1 | 47.4 | 47.1 | 47.5 | 46.0 |
| FM | 49.5 | 49.2 | 49.7 | 49.2 | 48.5 | 48.8 |
| OC | 26.9 | 26.4 | 27.0 | 26.4 | 26.9 | 24.3 |
| OV | 47.1 | 45.8 | 46.3 | 45.8 | 44.7 | 45.0 |
| SV | 54.1 | 53.2 | 53.6 | 53.2 | 50.9 | 51.5 |
| TC | 60.4 | 59.1 | 59.6 | 59.1 | 59.0 | 58.7 |
| Tiny | 48.7 | 47.6 | 47.8 | 47.6 | 45.6 | 46.1 |
| Small | 57.1 | 55.8 | 56.4 | 55.8 | 55.7 | 55.5 |
| Medium | 67.0 | 66.1 | 66.7 | 66.1 | 66.7 | 66.4 |
| Normal | 72.2 | 72.4 | 72.6 | 72.4 | 71.7 | 71.9 |
| Precision (P) | ||||||
| DBC | 71.7 | 68.7 | 69.2 | 68.7 | 69.4 | 67.1 |
| FM | 68.7 | 68.0 | 68.7 | 68.0 | 67.1 | 67.5 |
| OC | 50.1 | 49.5 | 50.3 | 49.5 | 50.3 | 46.4 |
| OV | 69.2 | 67.0 | 67.7 | 67.0 | 65.6 | 66.0 |
| SV | 75.0 | 73.6 | 74.1 | 73.6 | 70.1 | 71.0 |
| TC | 81.0 | 79.1 | 79.9 | 79.0 | 78.9 | 78.3 |
| Tiny | 70.3 | 68.6 | 69.0 | 68.6 | 65.7 | 66.5 |
| Small | 77.4 | 75.7 | 76.5 | 75.7 | 75.3 | 75.1 |
| Medium | 87.3 | 86.2 | 87.0 | 86.2 | 87.0 | 86.4 |
| Normal | 93.1 | 93.3 | 93.6 | 93.3 | 92.1 | 92.5 |
| Metric | Value |
|---|---|
| Total parameters (B) | 0.12 |
| Trainable parameters (M) | 41.34 |
| MACs (G) | 37.5 |
| Tracker | Speed (fps) | MACs (G) | AUC | P | SA |
|---|---|---|---|---|---|
| OSTrack [6] | 267 | 29.1 | 53.7 | 73.9 | 54.7 |
| ROMTrack [39] | 9 | 34.5 | 54.7 | 74.5 | 55.7 |
| ZoomTrack [40] | 276 | 29.1 | 58.4 | 81.2 | 59.4 |
| DropTrack [42] | 154 | 48.4 | 59.2 | 82.2 | 60.2 |
| FocusTrack [3] | 28 | 30.1 | 62.8 | 86.2 | 63.9 |
| LoRAT-B [12] | 700 | 30.0 | 62.2 | 80.9 | 62.5 |
| MemLoTrack (MB = 7) | 153 | 37.5 | 63.6 | 82.7 | 64.0 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Park, J.K.; Han, J.-H. MemLoTrack: Enhancing TIR Anti-UAV Tracking with Memory-Integrated Low-Rank Adaptation. Sensors 2025, 25, 7359. https://doi.org/10.3390/s25237359