Semantic Space Analysis for Zero-Shot Learning on SAR Images
Abstract
1. Introduction
- We systematically investigate the effects of three popular semantic spaces on SAR ZSL, which is, to our knowledge, the first such study in the literature;
- We introduce three benchmarks for SAR ZSL, constructed with multiple semantic features and multiple data settings, to support future research on SAR ZSL.
2. Materials and Methods
2.1. Semantic Spaces and Baseline Method
2.1.1. Semantic Spaces
2.1.2. Baseline Method
2.2. Datasets
2.2.1. Unicorn-ZSL
2.2.2. COS10-ZSL
2.2.3. COS15-ZSL
2.3. Experimental Setup
3. Results
3.1. ZSL Performance
4. Discussion
4.1. Factors to Improve Semantic Features
4.2. Bias Phenomenon
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
SAR | Synthetic aperture radar |
ZSL | Zero-shot learning |
DNNs | Deep neural networks |
VGG11/VGG16/ResNet18 | Deep convolutional neural network architectures |
RS | Remote sensing |
Unicorn-ZSL | A dataset for SAR ZSL consisting of 10 classes collected from the open Unicorn dataset |
COS10-ZSL | A dataset for SAR ZSL consisting of five classes collected by ourselves and five classes collected from the open Unicorn dataset |
COS15-ZSL | A dataset for SAR ZSL combining Unicorn-ZSL with COS10-ZSL |
OA | Overall accuracy, computed over all testing samples |
APA | Average per-class accuracy; the mean of the per-class accuracies over all testing classes |
Dataset | Sedan | SUV | Pickup Truck | Van | Box Truck | Motorcycle | Flatbed Truck | Bus | Pickup Truck with Trailer | Flatbed Truck with Trailer |
---|---|---|---|---|---|---|---|---|---|---|
Unicorn | 234,209 | 28,089 | 15,301 | 10,655 | 1741 | 852 | 828 | 624 | 840 | 633 |
Unicorn-ZSL | 2000 | 2000 | 2000 | 2000 | 1741 | 852 | 828 | 624 | 840 | 633 |
Unicorn-ZSL-RS | 1000 | 1000 | 1000 | 1000 | 1000 | 852 | 828 | 624 | 840 | 633 |
Unicorn-ZSL-Web | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 |
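The Unicorn-ZSL rows can be read as the raw Unicorn class counts capped at a fixed per-class budget: 2000 samples per class for Unicorn-ZSL and 1000 for Unicorn-ZSL-RS, with smaller classes kept whole. A minimal Python sketch of that construction; the function name and class keys are our own illustration, and the paper does not specify how the retained samples are selected:

```python
# Raw per-class counts of the open Unicorn dataset (from the table above).
unicorn_counts = {
    "sedan": 234209, "suv": 28089, "pickup_truck": 15301, "van": 10655,
    "box_truck": 1741, "motorcycle": 852, "flatbed_truck": 828, "bus": 624,
    "pickup_truck_with_trailer": 840, "flatbed_truck_with_trailer": 633,
}

def cap_per_class(counts, cap):
    """Keep at most `cap` samples per class; smaller classes are kept whole."""
    return {cls: min(n, cap) for cls, n in counts.items()}

unicorn_zsl = cap_per_class(unicorn_counts, 2000)     # reproduces the Unicorn-ZSL row
unicorn_zsl_rs = cap_per_class(unicorn_counts, 1000)  # reproduces the Unicorn-ZSL-RS row
```

Unicorn-ZSL-Web is not derived from these counts; it instead uses a fixed 60 web-collected images per class.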
Dataset | Split | Unseen Classes |
---|---|---|
Unicorn-ZSL | 2-1 | Pickup truck with trailer, flatbed truck with trailer |
Unicorn-ZSL | 2-2 | SUV, flatbed truck |
Unicorn-ZSL | 2-3 | SUV, bus |
Unicorn-ZSL | 3-1 | Sedan, pickup truck with trailer, flatbed truck with trailer |
Unicorn-ZSL | 3-2 | Motorcycle, pickup truck with trailer, flatbed truck with trailer |
Unicorn-ZSL | 3-3 | SUV, bus, pickup truck with trailer |
COS10-ZSL | 2-1 | Flyover, flatbed truck |
COS10-ZSL | 2-2 | Flyover, pickup truck |
COS10-ZSL | 2-3 | Flyover, SUV |
COS10-ZSL | 3-1 | Flyover, sedan, flatbed truck |
COS10-ZSL | 3-2 | Flyover, sedan, pickup truck |
COS10-ZSL | 3-3 | Flyover, pickup truck, bus |
COS15-ZSL | 3-1 | Flyover, pickup truck with trailer, flatbed truck with trailer |
COS15-ZSL | 3-2 | Flyover, SUV, pickup truck with trailer |
COS15-ZSL | 3-3 | Flyover, sedan, flatbed truck with trailer |
Dataset | Airplane | Bridge | Building | Flyover | Oiltank | Sedan | SUV | Pickup Truck | Flatbed Truck | Bus |
---|---|---|---|---|---|---|---|---|---|---|
COS10-ZSL | 190 | 400 | 500 | 495 | 655 | 800 | 800 | 800 | 800 | 624 |
COS10-ZSL-RS | 700 | 700 | 700 | 700 | 700 | 800 | 800 | 800 | 800 | 624 |
COS10-ZSL-Web | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 |
Semantic Space | 2-1 | 2-2 | 2-3 | 3-1 | 3-2 | 3-3 |
---|---|---|---|---|---|---|
Word vector | 50.0 | 50.0 | 66.2 | 33.3 | 33.3 | 37.5 |
RS optical image | 95.1 | 71.7 | 82.8 | 51.2 | 65.1 | 60.9 |
Web optical image | 52.4 | 65.5 | 68.1 | 39.2 | 36.5 | 51.0 |
Semantic Space | 2-1 | 2-2 | 2-3 | 3-1 | 3-2 | 3-3 |
---|---|---|---|---|---|---|
Word vector | 100.0 | 100.0 | 100.0 | 66.7 | 66.7 | 68.0 |
RS optical image | 100.0 | 100.0 | 100.0 | 66.7 | 73.5 | 76.4 |
Web optical image | 100.0 | 100.0 | 100.0 | 66.7 | 66.7 | 67.2 |
Semantic Space | 3-1 | 3-2 | 3-3 |
---|---|---|---|
Word vector | 66.8 | 75.4 | 74.8 |
RS optical image | 79.1 | 66.7 | 84.0 |
Web optical image | 82.1 | 90.0 | 71.1 |
Setting | 2-1 | 2-2 | 2-3 | 2-ave | 3-1 | 3-2 | 3-3 | 3-ave |
---|---|---|---|---|---|---|---|---|
Default | 52.4 | 65.5 | 68.1 | 62.0 | 39.2 | 36.5 | 51.0 | 42.2 |
VGG11 | 54.0 | 50.0 | 83.6 | 62.5 | 44.1 | 57.0 | 58.5 | 53.2 |
VGG16 | 53.7 | 50.0 | 76.0 | 59.9 | 45.5 | 33.3 | 33.3 | 37.4 |
Scale () | 59.5 | 50.0 | 66.8 | 58.8 | 41.9 | 51.2 | 41.7 | 44.9 |
Scale () | 68.3 | 50.4 | 69.6 | 62.8 | 47.3 | 50.4 | 61.6 | 53.1 |
w/o PT | 50.0 | 50.0 | 58.5 | 52.8 | 33.3 | 33.3 | 43.5 | 36.7 |
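The 2-ave and 3-ave columns are the arithmetic means of the three two-class and three three-class split scores, rounded to one decimal place. A quick Python check against the Default row of the table above (the function name is ours):

```python
def split_average(scores):
    """Mean of the per-split accuracies, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)

# "Default" row of the table above:
assert split_average([52.4, 65.5, 68.1]) == 62.0  # matches the 2-ave column
assert split_average([39.2, 36.5, 51.0]) == 42.2  # matches the 3-ave column
```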
Setting | 2-1 | 2-2 | 2-3 | 2-ave | 3-1 | 3-2 | 3-3 | 3-ave |
---|---|---|---|---|---|---|---|---|
Default | 95.1 | 71.7 | 82.8 | 83.2 | 56.3 | 41.8 | 58.7 | 52.3 |
VGG16 | 74.0 | 60.5 | 61.1 | 65.2 | 52.5 | 65.4 | 33.3 | 50.4 |
ResNet18 | 67.1 | 75.6 | 71.7 | 71.5 | 41.5 | 33.3 | 65.1 | 46.6 |
Scale () | 94.7 | 68.5 | 86.9 | 83.4 | 48.4 | 50.9 | 51.0 | 50.1 |
Scale () | 50.3 | 71.8 | 66.6 | 62.9 | 43.3 | 47.9 | 41.5 | 44.2 |
PT | 62.6 | 68.9 | 77.0 | 69.5 | 44.2 | 35.9 | 46.6 | 42.2 |
Semantic Space | 2-1: C1 (168) | 2-1: C2 (126) | 2-1: OA | 2-1: APA | 2-2: C1 (400) | 2-2: C2 (165) | 2-2: OA | 2-2: APA | 2-3: C1 (400) | 2-3: C2 (124) | 2-3: OA | 2-3: APA |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Word vector | 100.0 | 0.0 | 57.1 | 50.0 | 100.0 | 0.0 | 70.8 | 50.0 | 47.8 | 84.7 | 56.5 | 66.2 |
RS optical image | 91.1 | 99.2 | 94.6 | 95.1 | 68.3 | 75.2 | 70.3 | 71.7 | 90.0 | 75.8 | 86.4 | 82.8 |
Web optical image | 100.0 | 4.8 | 59.2 | 52.4 | 89.3 | 41.8 | 75.4 | 65.5 | 95.0 | 41.1 | 82.3 | 68.1 |
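As defined in the Abbreviations list, OA weights each class's accuracy by its number of testing samples, while APA averages the per-class accuracies with equal weight; the two diverge sharply when nearly all samples are assigned to one unseen class, as in the word-vector row of split 2-1 above. A short Python sketch under that reading (the function names are ours):

```python
def overall_accuracy(acc, n):
    """OA: per-class accuracies (%) weighted by per-class test-sample counts."""
    return round(sum(a * c for a, c in zip(acc, n)) / sum(n), 1)

def average_per_class_accuracy(acc):
    """APA: unweighted mean of the per-class accuracies (%)."""
    return round(sum(acc) / len(acc), 1)

# Word-vector row, split 2-1 (C1 has 168 test samples, C2 has 126):
assert overall_accuracy([100.0, 0.0], [168, 126]) == 57.1
assert average_per_class_accuracy([100.0, 0.0]) == 50.0
```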
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, B.; Xu, J.; Zeng, H.; Dong, Q.; Hu, Z. Semantic Space Analysis for Zero-Shot Learning on SAR Images. Remote Sens. 2024, 16, 2627. https://doi.org/10.3390/rs16142627