Near-Infrared Hyperspectral Target Tracking Based on Background Information and Spectral Position Prediction
Abstract
1. Introduction
- We propose a historical frame background extraction (HFBE) module to handle the complex backgrounds encountered in NIR hyperspectral target tracking. The module constructs background template features from the spectral information of different historical frames; a mask, generated by averaging the spectra of the areas adjacent to the target, coarsely separates the target from the background.
- We propose a background target (BT) routing algorithm that combines traditional capsule-network routing with the spectral information of target and background capsules. By computing feature correlations between foreground-background capsule pairs, the framework dynamically increases the weights of capsules whose features match the background representation. The resulting background response map is then inverted to obtain the target response map, localizing the target.
- We design a spectral information position prediction (SIP) module to determine the search area for the next frame. By integrating the spectral data of adjacent frames with the current prediction, the module adapts dynamically to changes in the target's spectrum, improving tracking robustness.
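To make the HFBE idea concrete, the following is a minimal sketch of coarse target/background separation: the spectra of the ring surrounding the target box are averaged into a background template, and pixels spectrally similar to that template are masked as background. The function name, the `margin`/`thresh` parameters, and the choice of cosine similarity are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def coarse_background_mask(cube, bbox, margin=8, thresh=0.9):
    """Coarse background mask from the mean spectrum of the area
    surrounding the target box (HFBE-style sketch, not the paper's code).

    cube : (H, W, B) hyperspectral frame
    bbox : (x, y, w, h) target box in pixel coordinates
    Returns a (H, W) boolean map; True marks likely background pixels.
    """
    H, W, B = cube.shape
    x, y, w, h = bbox
    # Outer window = target box dilated by `margin` pixels, clipped to the frame.
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    x1, y1 = min(x + w + margin, W), min(y + h + margin, H)
    ring = np.zeros((H, W), dtype=bool)
    ring[y0:y1, x0:x1] = True
    ring[y:y + h, x:x + w] = False          # exclude the target itself
    bg_template = cube[ring].mean(axis=0)   # (B,) mean background spectrum

    # Cosine similarity of every pixel spectrum to the background template.
    flat = cube.reshape(-1, B).astype(np.float64)
    num = flat @ bg_template
    den = np.linalg.norm(flat, axis=1) * np.linalg.norm(bg_template) + 1e-12
    sim = (num / den).reshape(H, W)
    return sim > thresh
```

In a tracker, the inverted mask (`~mask`) would restrict feature extraction to pixels that are spectrally dissimilar to the surrounding background.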
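The final step of the BT algorithm, inverting the background response to localize the target, can be sketched as below. Normalizing to [0, 1] before inverting is an assumption for illustration; the routing itself (correlation-weighted capsule updates) is omitted here.

```python
import numpy as np

def target_response_from_background(bg_response):
    """Invert a background response map into a target response map:
    normalize to [0, 1], then take the complement (BT-style sketch)."""
    r = np.asarray(bg_response, dtype=np.float64)
    r = (r - r.min()) / (r.max() - r.min() + 1e-12)  # min-max normalize
    return 1.0 - r                                    # high where background is weak
```

The argmax of the returned map would then give the predicted target position.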
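For the SIP module, a minimal stand-in for combining adjacent-frame information with the current prediction is constant-velocity extrapolation of the target center; the real module also folds in spectral data. The linear motion model and the `alpha` blending weight are assumptions made for this sketch.

```python
def predict_search_center(prev_centers, alpha=0.6):
    """Predict the next-frame search center from recent target centers
    (SIP-style sketch: blend current position with extrapolated motion).

    prev_centers : list of (x, y) centers from recent frames, newest last
    alpha        : weight on the extrapolated displacement
    """
    if len(prev_centers) < 2:
        return prev_centers[-1]            # no motion history yet
    (x0, y0), (x1, y1) = prev_centers[-2], prev_centers[-1]
    vx, vy = x1 - x0, y1 - y0              # velocity from the two newest frames
    return (x1 + alpha * vx, y1 + alpha * vy)
```

The search window for the next frame would then be cropped around the returned center.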
2. Related Work
2.1. Visual Object Tracking
2.2. Capsule Network
2.3. Background Information for Target Tracking
3. Proposed Algorithm
3.1. Historical Frame Background Extraction Module
3.2. Background Target Routing Algorithm
3.3. Spectral Information Position Prediction
3.4. Experimental Setup
3.5. Evaluation Metrics
4. Results and Analysis
4.1. Comparative Experiments
4.2. Ablation Experiments
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chen, X.; Yan, B.; Zhu, J.; Wang, D.; Yang, X.; Lu, H. Transformer tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8126–8135.
- Chen, X.; Peng, H.; Wang, D.; Lu, H.; Hu, H. Seqtrack: Sequence to sequence learning for visual object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 14572–14581.
- Yu, Y.; Xiong, Y.; Huang, W.; Scott, M.R. Deformable siamese attention networks for visual object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6728–6737.
- Feng, W.; Meng, F.; Yu, C.; You, A. Fusion of Multiple Attention Mechanisms and Background Feature Adaptive Update Strategies in Siamese Networks for Single-Object Tracking. Appl. Sci. 2024, 14, 8199.
- Kristan, M.; Leonardis, A.; Matas, J.; Felsberg, M.; Pflugfelder, R.; Kämäräinen, J.K.; Danelljan, M.; Zajc, L.Č.; Lukežič, A.; Drbohlav, O.; et al. The eighth visual object tracking VOT2020 challenge results. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 547–601.
- Zheng, L.; Tang, M.; Chen, Y.; Zhu, G.; Wang, J.; Lu, H. Improving multiple object tracking with single object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 2453–2462.
- Voulodimos, A.; Doulamis, N.; Doulamis, A.; Protopapadakis, E. Deep learning for computer vision: A brief review. Comput. Intell. Neurosci. 2018, 2018, 7068349.
- Bolme, D.S.; Beveridge, J.R.; Draper, B.A.; Lui, Y.M. Visual object tracking using adaptive correlation filters. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13–18 June 2010; pp. 2544–2550.
- Kim, C.; Fuxin, L.; Alotaibi, M.; Rehg, J.M. Discriminative appearance modeling with multi-track pooling for real-time multi-object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 9553–9562.
- Olabintan, A.B.; Abdullahi, A.S.; Yusuf, B.O.; Ganiyu, S.A.; Saleh, T.A.; Basheer, C. Prospects of polymer Nanocomposite-Based electrochemical sensors as analytical devices for environmental Monitoring: A review. Microchem. J. 2024, 204, 111053.
- Miranda, V.R.; Rezende, A.M.; Rocha, T.L.; Azpúrua, H.; Pimenta, L.C.; Freitas, G.M. Autonomous navigation system for a delivery drone. J. Control Autom. Electr. Syst. 2022, 33, 141–155.
- Shamshad, F.; Khan, S.; Zamir, S.W.; Khan, M.H.; Hayat, M.; Khan, F.S.; Fu, H. Transformers in medical imaging: A survey. Med. Image Anal. 2023, 88, 102802.
- Kiani Galoogahi, H.; Sim, T.; Lucey, S. Correlation filters with limited boundaries. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 4630–4638.
- Fan, H.; Ling, H. Siamese cascaded region proposal networks for real-time visual tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7952–7961.
- Chen, Z.; Zhong, B.; Li, G.; Zhang, S.; Ji, R. Siamese box adaptive network for visual tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6668–6677.
- Liu, H.; Zhao, Y.; Dong, P.; Guo, X.; Wang, Y. IOF-Tracker: A Two-Stage Multiple Targets Tracking Method Using Spatial-Temporal Fusion Algorithm. Appl. Sci. 2025, 15, 107.
- Uzkent, B.; Rangnekar, A.; Hoffman, M. Aerial vehicle tracking by adaptive fusion of hyperspectral likelihood maps. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 39–48.
- Jiang, X.; Wang, X.; Sun, C.; Zhu, Z.; Zhong, Y. A channel adaptive dual Siamese network for hyperspectral object tracking. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5403912.
- Wan, X.; Chen, F.; Liu, W.; He, Y. Artificial gorilla troops optimizer enfolded broad learning system for spatial-spectral hyperspectral image classification. Infrared Phys. Technol. 2024, 138, 105220.
- Hameed, A.A.; Jamil, A.; Seyyedabbasi, A. An optimized feature selection approach using sand Cat Swarm optimization for hyperspectral image classification. Infrared Phys. Technol. 2024, 141, 105449.
- Liu, C.; Chen, H.; Deng, L.; Guo, C.; Lu, X.; Yu, H.; Zhu, L.; Dong, M. Modality specific infrared and visible image fusion based on multi-scale rich feature representation under low-light environment. Infrared Phys. Technol. 2024, 140, 105351.
- Xiong, J.; Liu, G.; Tang, H.; Gu, X.; Bavirisetti, D.P. SeGFusion: A semantic saliency guided infrared and visible image fusion method. Infrared Phys. Technol. 2024, 140, 105344.
- Xiong, C.; Hu, M.; Lu, H.; Zhao, F. Distributed Multi-Sensor Fusion for Multi-Group/Extended Target Tracking with Different Limited Fields of View. Appl. Sci. 2024, 14, 9627.
- Jia, L.; Yang, F.; Chen, Y.; Peng, L.; Leng, H.; Zu, W.; Zang, Y.; Gao, L.; Zhao, M. Prediction of wetland soil carbon storage based on near infrared hyperspectral imaging and deep learning. Infrared Phys. Technol. 2024, 139, 105287.
- An, R.; Liu, G.; Qian, Y.; Xing, M.; Tang, H. MRASFusion: A multi-scale residual attention infrared and visible image fusion network based on semantic segmentation guidance. Infrared Phys. Technol. 2024, 139, 105343.
- Shao, Y.; Kang, X.; Ma, M.; Chen, C.; Wang, D. Robust infrared small target detection with multi-feature fusion. Infrared Phys. Technol. 2024, 139, 104975.
- Zhao, D.; Tang, L.; Arun, P.V.; Asano, Y.; Zhang, L.; Xiong, Y.; Tao, X.; Hu, J. City-scale distance estimation via near-infrared trispectral light extinction in bad weather. Infrared Phys. Technol. 2023, 128, 104507.
- Amiri, I.; Houssien, F.M.A.M.; Rashed, A.N.Z.; Mohammed, A.E.N.A. Temperature effects on characteristics and performance of near-infrared wide bandwidth for different avalanche photodiodes structures. Results Phys. 2019, 14, 102399.
- Renaud, D.; Assumpcao, D.R.; Joe, G.; Shams-Ansari, A.; Zhu, D.; Hu, Y.; Sinclair, N.; Loncar, M. Sub-1 Volt and high-bandwidth visible to near-infrared electro-optic modulators. Nat. Commun. 2023, 14, 1496.
- Zhao, D.; Zhou, L.; Li, Y.; He, W.; Arun, P.V.; Zhu, X.; Hu, J. Visibility estimation via near-infrared bispectral real-time imaging in bad weather. Infrared Phys. Technol. 2024, 136, 105008.
- Liu, Z.; Wang, X.; Zhong, Y.; Shu, M.; Sun, C. SiamHYPER: Learning a hyperspectral object tracker from an RGB-based tracker. IEEE Trans. Image Process. 2022, 31, 7116–7129.
- Sarker, I.H. Deep learning: A comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput. Sci. 2021, 2, 420.
- Zhang, Z.; Hu, B.; Wang, M.; Arun, P.V.; Zhao, D.; Zhu, X.; Hu, J.; Li, H.; Zhou, H.; Qian, K. Hyperspectral Video Tracker Based on Spectral Deviation Reduction and a Double Siamese Network. Remote Sens. 2023, 15, 1579.
- Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019.
- Li, Z.; Xiong, F.; Zhou, J.; Lu, J.; Qian, Y. Learning a deep ensemble network with band importance for hyperspectral object tracking. IEEE Trans. Image Process. 2023, 32, 2901–2914.
- Tang, Y.; Liu, Y.; Huang, H. Target-aware and spatial-spectral discriminant feature joint correlation filters for hyperspectral video object tracking. Comput. Vis. Image Underst. 2022, 223, 103535.
- Wei, X.; Bai, Y.; Zheng, Y.; Shi, D.; Gong, Y. Autoregressive visual tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 9697–9706.
- Zhu, Z.; Wu, W.; Zou, W.; Yan, J. End-to-end flow correlation tracking with spatial-temporal attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 548–557.
- Li, W.; Hou, Z.; Zhou, J.; Tao, R. SiamBAG: Band attention grouping-based Siamese object tracking network for hyperspectral videos. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5514712.
- Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2017; Volume 30, pp. 3859–3869.
- Ma, D.; Wu, X. Capsule-based regression tracking via background inpainting. IEEE Trans. Image Process. 2023, 32, 2867–2878.
- Paoletti, M.E.; Haut, J.M.; Fernandez-Beltran, R.; Plaza, J.; Plaza, A.; Li, J.; Pla, F. Capsule networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 2145–2160.
- Mei, Z.; Yin, Z.; Kong, X.; Wang, L.; Ren, H. Cascade residual capsule network for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 3089–3106.
- Ma, D.; Wu, X. Cascaded Tracking via Pyramid Dense Capsules. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 683–696.
- Guo, D.; Wang, J.; Cui, Y.; Wang, Z.; Chen, S. SiamCAR: Siamese fully convolutional classification and regression for visual tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6269–6277.
- Guo, D.; Shao, Y.; Cui, Y.; Wang, Z.; Zhang, L.; Shen, C. Graph attention tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 9543–9552.
- Yan, B.; Peng, H.; Fu, J.; Wang, D.; Lu, H. Learning spatio-temporal transformer for visual tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10448–10457.
- Wang, Y.; Huo, L.; Fan, Y.; Wang, G. A thermal infrared target tracking based on multi-feature fusion and adaptive model update. Infrared Phys. Technol. 2024, 139, 105345.
- Xiong, F.; Zhou, J.; Qian, Y. Material based object tracking in hyperspectral videos. IEEE Trans. Image Process. 2020, 29, 3719–3733.
- Zhu, Z.; Wang, Q.; Li, B.; Wu, W.; Yan, J.; Hu, W. Distractor-aware siamese networks for visual object tracking. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 101–117.
- Bhat, G.; Danelljan, M.; Van Gool, L.; Timofte, R. Know your surroundings: Exploiting scene information for object tracking. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 205–221.
- Yang, Z.; Wei, Y.; Yang, Y. Collaborative video object segmentation by multi-scale foreground-background integration. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 4701–4712.
- Yang, Z.; Wei, Y.; Yang, Y. Associating objects with transformers for video object segmentation. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2021; Volume 34, pp. 2491–2502.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Chen, Y.; Yuan, Q.; Tang, Y.; Xiao, Y.; He, J.; Liu, Z. SENSE: Hyperspectral video object tracker via fusing material and motion cues. Inf. Fusion 2024, 109, 102395.
- Chen, Y.; Yuan, Q.; Tang, Y.; Xiao, Y.; He, J.; Han, T.; Liu, Z.; Zhang, L. SSTtrack: A unified hyperspectral video tracking framework via modeling spectral-spatial-temporal conditions. Inf. Fusion 2025, 114, 102658.
Sequences | car11 | car78 | rider11 | rider17 | board | kangaroo | ball&mirror9 | rainystreet10
---|---|---|---|---|---|---|---|---
Frames | 101 | 119 | 142 | 165 | 471 | 117 | 250 | 275
Resolution | | | | | | | |
Initial Size | | | | | | | |
Challenges | OCC, SV, FM | OCC, SV, FM | SV | OCC, SV, LR | IPR, OPR, OCC, BC, SV | BC, SV, DEF, OPR, MB | REF | OCC
Challenges | MHT | SiamBAN | SiamCAR | SiamGAT | STARK | TransT | Ours |
---|---|---|---|---|---|---|---|
BC | 0.019 | 0.043 | 0.021 | 0.582 | 0.300 | 0.030 | 0.071 |
FM | 0.444 | 0.609 | 0.661 | 0.649 | 0.643 | 0.659 | 0.768 |
LR | 0.262 | 0.407 | 0.454 | 0.512 | 0.527 | 0.443 | 0.651 |
OCC | 0.369 | 0.506 | 0.535 | 0.566 | 0.520 | 0.537 | 0.657 |
SV | 0.448 | 0.598 | 0.638 | 0.640 | 0.614 | 0.647 | 0.740 |
Challenges | MHT | SiamBAN | SiamCAR | SiamGAT | STARK | TransT | Ours |
---|---|---|---|---|---|---|---|
BC | 0.529 | 0.505 | 0.592 | 0.504 | 0.531 | 0.565 | 0.602 |
CM | 0.422 | 0.501 | 0.537 | 0.521 | 0.646 | 0.596 | 0.540 |
DEF | 0.662 | 0.664 | 0.719 | 0.688 | 0.701 | 0.724 | 0.723 |
FM | 0.544 | 0.614 | 0.662 | 0.675 | 0.613 | 0.631 | 0.674 |
IPR | 0.666 | 0.649 | 0.676 | 0.627 | 0.691 | 0.725 | 0.743 |
IV | 0.368 | 0.409 | 0.441 | 0.464 | 0.490 | 0.488 | 0.463 |
LC | 0.036 | 0.165 | 0.176 | 0.181 | 0.397 | 0.337 | 0.219 |
LL | 0.093 | 0.152 | 0.123 | 0.147 | 0.130 | 0.172 | 0.161 |
LR | 0.452 | 0.535 | 0.553 | 0.562 | 0.569 | 0.550 | 0.594 |
MB | 0.564 | 0.672 | 0.665 | 0.687 | 0.655 | 0.679 | 0.666 |
OCC | 0.449 | 0.472 | 0.510 | 0.531 | 0.547 | 0.551 | 0.558 |
OPR | 0.617 | 0.639 | 0.667 | 0.628 | 0.680 | 0.694 | 0.715 |
OV | 0.326 | 0.397 | 0.409 | 0.391 | 0.602 | 0.559 | 0.531 |
REF | 0.594 | 0.733 | 0.726 | 0.713 | 0.709 | 0.605 | 0.722 |
SC | 0.642 | 0.722 | 0.679 | 0.729 | 0.595 | 0.666 | 0.669 |
ST | 0.427 | 0.671 | 0.624 | 0.660 | 0.594 | 0.515 | 0.649 |
SV | 0.555 | 0.577 | 0.590 | 0.579 | 0.610 | 0.663 | 0.646 |
TLO | 0.213 | 0.336 | 0.321 | 0.381 | 0.212 | 0.511 | 0.229 |
TOl | 0.273 | 0.178 | 0.261 | 0.177 | 0.189 | 0.224 | 0.222 |
Method | Success Rate | Precision | FPS |
---|---|---|---|
Ours | 0.699 | 0.872 | 42.6 |
SiamGAT | 0.637 | 0.857 | 27.8 |
TransT | 0.641 | 0.823 | 36.8 |
SiamCAR | 0.623 | 0.830 | 78.6 |
STARK | 0.596 | 0.822 | 64.5 |
SiamBAN | 0.584 | 0.845 | 71.3 |
MHT | 0.512 | 0.789 | 3.2 |
Variations | HFBE | BT | SIP | MobileNet | Success Rate | Precision | FPS |
---|---|---|---|---|---|---|---|
i | ✓ | ✓ | ✓ | ✗ | 0.703 | 0.884 | 40.5 |
ii | ✓ | ✓ | ✗ | ✗ | 0.657 | 0.803 | 13.8 |
iii | ✓ | ✗ | ✓ | ✗ | 0.513 | 0.624 | 50.2 |
iv | ✓ | ✗ | ✗ | ✗ | 0.501 | 0.622 | 36.7 |
v | ✗ | ✓ | ✓ | ✗ | 0.563 | 0.612 | 69.0 |
vi | ✗ | ✓ | ✗ | ✗ | 0.526 | 0.637 | 39.3 |
vii | ✗ | ✗ | ✓ | ✗ | 0.326 | 0.412 | 112.6 |
viii | ✓ | ✗ | ✓ | ✓ | 0.498 | 0.610 | 132.8 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wu, L.; Wang, M.; Zhong, W.; Huang, K.; Jiang, W.; Li, J.; Zhao, D. Near-Infrared Hyperspectral Target Tracking Based on Background Information and Spectral Position Prediction. Appl. Sci. 2025, 15, 4275. https://doi.org/10.3390/app15084275