Learning to Track Aircraft in Infrared Imagery
Abstract
1. Introduction
- We propose an approach that learns features online and adapts them to the current video domain, without pre-training on large datasets.
- The general feature representations and the domain-specific features learned online are integrated into a unified framework to ensure reliable tracking performance.
- The proposed method can be embedded as a flexible module in correlation-filter-based frameworks to improve their performance.
- Experiments on airborne infrared imagery demonstrate that the proposed tracking algorithm achieves competitive performance compared with benchmark trackers.
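
As background for the correlation-filter frameworks mentioned in the contributions, the following is a minimal sketch of a textbook single-channel, MOSSE-style correlation filter, reduced to one dimension and written with a naive DFT so that it needs only the standard library. It is not the tracker proposed in this paper: the function names, the regularization value `lam`, the Gaussian label width, and the toy signal are all assumptions made for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse divides by n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gaussian_label(n, sigma=1.0):
    """Desired response: a circular Gaussian peaked at index 0."""
    return [math.exp(-0.5 * (min(t, n - t) / sigma) ** 2) for t in range(n)]


def train_filter(patch, label, lam=1e-3):
    """Closed-form ridge regression in the Fourier domain.

    H = G * conj(F) / (F * conj(F) + lam), evaluated per frequency bin.
    """
    F = dft(patch)
    G = dft(label)
    return [g * f.conjugate() / (f * f.conjugate() + lam)
            for f, g in zip(F, G)]


def respond(filt, patch):
    """Apply the filter to a new patch and return the real response map."""
    Z = dft(patch)
    product = [h * z for h, z in zip(filt, Z)]
    return [v.real for v in dft(product, inverse=True)]


def peak_shift(response):
    """Index of the response maximum, read as the circular shift."""
    return max(range(len(response)), key=response.__getitem__)


if __name__ == "__main__":
    # Hypothetical 1-D "appearance" signal standing in for an image patch.
    template = [0.1, 0.9, 0.4, 0.0, 0.7, 0.2, 0.8, 0.3,
                0.5, 0.6, 0.05, 0.95, 0.15, 0.35, 0.65, 0.25]
    filt = train_filter(template, gaussian_label(len(template)))
    moved = template[-3:] + template[:-3]  # target displaced by 3 samples
    print(peak_shift(respond(filt, moved)))
```

In a real tracker the same idea is applied in two dimensions to multi-channel feature maps, which is where hand-crafted, pretrained, or online-learned features would plug in.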
2. Related Work
3. Proposed Algorithm
Algorithm 1: Proposed tracking algorithm.
Input: Initial position and size of the aircraft [x0 y0 w0 h0].
Output: Estimated aircraft states [xi yi wi hi].
3.1. Learning via Convolutional Regression
3.2. Network Architecture
3.3. Tracking Algorithm
4. Experiments
4.1. Experimental Setup
4.2. Ablation Studies
4.3. Evaluating the Tracking Benchmark
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Layer | Kernel Size | Channel | Stride | Padding |
---|---|---|---|---|
Conv1 | 3 × 3 | 18 | 1 | 1 |
Conv2 | 8 × 8 | 18 | 4 | 2 |
Conv3 | 2 × 2 | 108 | 1 | 0 |
Conv4 | 2 × 2 | 27 | 1 | 1 |
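
To make the layer table above easier to read, here is a small sketch that traces feature-map sizes through the four layers using the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. It assumes that the "Channel" column gives output channels, that inputs are square, and that no pooling or dilation sits between layers. The 128-pixel input size is a hypothetical value chosen for illustration, not a patch size stated here.

```python
from typing import NamedTuple


class ConvLayer(NamedTuple):
    name: str
    kernel: int
    channels: int  # assumed to be the number of output channels
    stride: int
    padding: int


# Values copied from the layer table; interpretation is an assumption.
LAYERS = [
    ConvLayer("Conv1", 3, 18, 1, 1),
    ConvLayer("Conv2", 8, 18, 4, 2),
    ConvLayer("Conv3", 2, 108, 1, 0),
    ConvLayer("Conv4", 2, 27, 1, 1),
]


def conv_output_size(size: int, layer: ConvLayer) -> int:
    """Spatial output size of a convolution without dilation."""
    return (size + 2 * layer.padding - layer.kernel) // layer.stride + 1


def trace_shapes(input_size: int, layers=LAYERS):
    """Return (name, channels, height, width) after each layer."""
    shapes = []
    size = input_size
    for layer in layers:
        size = conv_output_size(size, layer)
        shapes.append((layer.name, layer.channels, size, size))
    return shapes


if __name__ == "__main__":
    for name, c, h, w in trace_shapes(128):  # 128 is a hypothetical input
        print(f"{name}: {c} x {h} x {w}")
```

Under these assumptions, only Conv2 changes the resolution substantially (by its stride of 4), so the final 27-channel map is one quarter of the input size along each axis.
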
Datasets | Number of Sequences | Max Frames | Min Frames | Total Frames | Bit Depth | Resolution |
---|---|---|---|---|---|---|
Synthetic imagery | 6 | 450 | 316 | 2304 | 8 | 128 × 128 |
Real imagery | 4 | 2831 | 821 | 8832 | 8 | 640 × 512 |
Group | Tracker | Feature | Search | MU | Precision | Overlap |
---|---|---|---|---|---|---|
DM, CF based | KCF | HOG | DS | Y | 0.668 | 0.348 |
DM, CF based | HCFT | Pretrained CNN | DS | Y | 0.741 | 0.377 |
DM, CF based | CFOL | CNN learned online | DS | Y | 0.748 | 0.439 |
DM, CF based | ECO | Pretrained CNN | DS | Y | 0.898 | 0.764 |
DM, CF based | ECOOL | CNN learned online | DS | Y | 0.932 | 0.834 |
Boosting based | OAB | Haar, BP, OH | DS | Y | 0.701 | 0.408 |
Boosting based | SemiT | Haar | DS | Y | 0.652 | 0.359 |
Boosting based | Struck | Haar | DS | Y | 0.749 | 0.471 |
Boosting based | TLD | BP | DS | Y | 0.841 | 0.434 |
GM | ASLA | Sparse | PF | Y | 0.489 | 0.384 |
GM | L1APG | Sparse | PF | Y | 0.515 | 0.356 |
Siamese | SiamRPN | Pretrained CNN | DS | N | 0.638 | 0.448 |
Siamese | SiamFC | Pretrained CNN | DS | N | 0.643 | 0.471 |
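
The Precision and Overlap columns above are reported without their definitions. As a hedged illustration, the sketch below computes the two scores the way tracking benchmarks commonly define them: the fraction of frames whose center location error is within a pixel threshold, and the fraction whose bounding-box intersection-over-union exceeds an overlap threshold. The 20-pixel and 0.5 thresholds are conventional defaults, not values confirmed for this paper, and the boxes are assumed to be `(x, y, w, h)` with `(x, y)` as the top-left corner.

```python
import math


def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_at(pred, gt, threshold=20.0):
    """Fraction of frames with center error within `threshold` pixels."""
    hits = sum(center_error(p, g) <= threshold for p, g in zip(pred, gt))
    return hits / len(gt)


def success_at(pred, gt, threshold=0.5):
    """Fraction of frames with IoU above `threshold`."""
    hits = sum(iou(p, g) > threshold for p, g in zip(pred, gt))
    return hits / len(gt)


if __name__ == "__main__":
    # Hypothetical two-frame sequence: one perfect hit, one clear miss.
    gt = [(10, 10, 20, 20), (12, 10, 20, 20)]
    pred = [(10, 10, 20, 20), (40, 40, 20, 20)]
    print(precision_at(pred, gt), success_at(pred, gt))
```

Benchmarks often sweep the threshold and report a curve or its area rather than a single operating point, so the numbers in the table may correspond to either convention.
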
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Wu, S.; Zhang, K.; Li, S.; Yan, J. Learning to Track Aircraft in Infrared Imagery. Remote Sens. 2020, 12, 3995. https://doi.org/10.3390/rs12233995