A Novel Ensemble Architecture of Residual Attention-Based Deep Metric Learning for Remote Sensing Image Retrieval
Abstract
1. Introduction
- A novel residual attention-based deep metric learning architecture is developed. The number and position of the residual attention modules are adjusted so that the Residual Attention Branch extracts more distinctive features, while the global features extracted by the Main Branch are retained to learn more similarity relationships among remote sensing images without extra parameters. In addition, the feature vectors used in the subsequent similarity metric stage are dynamically weighted according to feature discrimination, which improves the retrieval results.
- The traditional ensemble method is improved by merging multiple descriptors rather than dividing the embedding space into subspaces, which makes remote sensing features more distinguishable and ultimately improves RSIR performance. Merging descriptors decreases the computational complexity in the embedding space and significantly reduces time and memory consumption during the training phase.
- The proposed model is trained in an end-to-end manner, and extensive experiments are conducted on three remote sensing benchmark datasets: UCMD [31], SIRI-WHU [32], and AID [33]. Comparisons of the proposed method with the SOTA methods BIER [34], A-BIER [35], DCES [36], and ABE [37] demonstrate that EARA reduces the retrieval time and FLOPs to nearly 20% and 8% of ABE on AID.
2. Related Work
2.1. Deep Metric Learning for RSIR
2.1.1. Loss-Based Methods
2.1.2. Ensemble-Based Methods
- Embedding Ensemble Method: Embedding ensemble methods, the conventional form of ensemble learning, divide the last embedding layer of a CNN into multiple embedding spaces, train the corresponding sub-learners individually, and then concatenate the sub-learners to improve image retrieval performance. Opitz et al. [34] used online gradient boosting to train each nonoverlapping learner in the ensemble, a method called BIER, to achieve higher image retrieval accuracy. Training BIER requires a high learning rate; however, the lack of auxiliary loss functions in BIER leads to network attenuation. To counter this attenuation, their subsequent work [35] combined BIER with an adversarial loss to make the network more stable. Sanakoyeu et al. [36] jointly divided the embedding space and the data into K smaller subproblems to reduce the correlation among sub-learners and to converge faster than A-BIER [35]. However, additional parameters are unavoidably introduced in A-BIER to yield the sub-learners, especially in high-dimensional embeddings, which requires long training time and high computation cost [44].
- Descriptor Ensemble Method: Descriptor ensemble methods were proposed to avoid the high computational complexity that embedding ensemble methods induce in the embedding space. Their effectiveness has been demonstrated in natural image retrieval. For example, Lin et al. [45] boosted image retrieval performance by combining different global descriptors that were trained individually. However, descriptor ensemble methods have not yet been widely applied to RSIR, mainly for two reasons. First, they are not trained in an end-to-end manner, which might lead to multistage goal deviations on remote sensing images, which are characterized by inter-class similarities and intra-class differences. Second, the lack of constraints on the descriptors leads to only marginal improvement in the feature discrimination of remote sensing images.
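The difference between the two ensemble styles can be illustrated with a minimal sketch. It is not the implementation of any cited method: the helper names (`split_embedding`, `merge_descriptors`) and the choice of average and max pooling as the two global descriptors are assumptions made only for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def split_embedding(embedding, k):
    """Embedding ensemble: cut one embedding into k equal, non-overlapping parts."""
    if len(embedding) % k != 0:
        raise ValueError("embedding length must be divisible by k")
    size = len(embedding) // k
    return [embedding[i * size:(i + 1) * size] for i in range(k)]

def avg_pool(feature_map):
    """Global average pooling: one value per channel."""
    return [sum(channel) / len(channel) for channel in feature_map]

def max_pool(feature_map):
    """Global max pooling: one value per channel."""
    return [max(channel) for channel in feature_map]

def merge_descriptors(feature_map, poolers):
    """Descriptor ensemble: pool the same feature map in several ways,
    L2-normalize each descriptor, concatenate them, and normalize the result."""
    merged = []
    for pool in poolers:
        merged.extend(l2_normalize(pool(feature_map)))
    return l2_normalize(merged)

# Toy feature map: 2 channels, each flattened to 3 spatial positions.
feature_map = [[1.0, 2.0, 3.0], [0.0, 4.0, 2.0]]
sub_embeddings = split_embedding([0.1, 0.2, 0.3, 0.4], k=2)
merged = merge_descriptors(feature_map, [avg_pool, max_pool])
```

In the first case each slice of the embedding would be handed to its own sub-learner; in the second, a single shared feature map yields several descriptors that are combined into one vector.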
2.2. Attention Mechanism for RSIR
3. Methodology
3.1. Submodule
3.1.1. Main Branch
3.1.2. Residual Attention Branch
3.2. Loss Function
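Algorithm 1: Merged submodules for our architecture
1: Input: [formula missing]
2: Output: [formula missing]
3: /* forward propagation */
4: The submodule:
5: for i = 1 to n do
6: [formula missing]
7: In Residual Attention Branch:
8: [formula missing]
9: [formula missing]
10: [formula missing]
11: end for
12: for j = 1 to L do
13: [formula missing]
14: [formula missing]
15: end for
16: Calculation of [formula missing]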
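As a reminder of what a residual attention step usually looks like, the sketch below applies the formulation from the Residual Attention Network [30], H = (1 + M) * T, where T is the trunk feature and M is a soft mask. This is an assumption about the general technique only; the function name and the sigmoid mask are illustrative, and the exact formulas used in Algorithm 1 may differ.

```python
import math

def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def residual_attention(trunk, mask_logits):
    """Apply H = (1 + M) * T element-wise, with M = sigmoid(mask_logits).

    The identity term keeps the trunk features intact even where the mask
    is close to zero, so attention can only re-weight, never erase, them.
    """
    if len(trunk) != len(mask_logits):
        raise ValueError("trunk and mask must have the same length")
    return [(1.0 + sigmoid(m)) * t for t, m in zip(trunk, mask_logits)]

# Hypothetical flattened trunk features and mask logits.
trunk = [2.0, -1.0, 0.5]
mask_logits = [0.0, 0.0, 0.0]
attended = residual_attention(trunk, mask_logits)
```

With all mask logits at zero the mask equals 0.5 everywhere, so every trunk value is scaled by 1.5.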
4. Experiments
4.1. Datasets
- UCMD: The UC Merced Land Use Database (UCMD) [31] is a land use/land cover dataset widely used as an RSIR benchmark. It is highly challenging because some categories overlap strongly, such as dense residential and intersection. It contains 21 classes, each with 100 images of 256 × 256 pixels at a spatial resolution of approximately 0.3 m. The images were downloaded from the United States Geological Survey (USGS) by the team at the University of California, Merced, and cover various US urban areas.
- SIRI-WHU: The Google image dataset of SIRI-WHU [32] contains 2400 remote sensing images with a size of 200 × 200 pixels and a spatial resolution of 2 m. It covers 12 geographic categories with 200 images each. The number of images per category is twice that of UCMD, while the total number of images is approximately the same (2400 versus 2100), which helps show how the quantity of images per class affects the discrimination of the features extracted by the architecture.
- AID: The Aerial Image Dataset (AID) [33] is specifically designed for remote sensing image classification and retrieval tasks. It contains 10,000 images divided into 30 semantic classes, such as commercial, dense residential, and viaduct. All images are 600 × 600 pixels in the RGB space, with spatial resolutions ranging from 8 m to 0.5 m, and each semantic class contains between 220 and 420 images. AID is more than four times the size of both UCMD (2100 images) and SIRI-WHU (2400 images).
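The dataset sizes quoted in this list can be cross-checked with a few lines of arithmetic. The dictionary layout and the helper name `size_ratio` are illustrative; the counts themselves are taken from the descriptions above.

```python
# Class and image counts as stated in the dataset descriptions.
datasets = {
    "UCMD": {"classes": 21, "images": 21 * 100},      # 100 images per class
    "SIRI-WHU": {"classes": 12, "images": 12 * 200},  # 200 images per class
    "AID": {"classes": 30, "images": 10_000},         # 220 to 420 images per class
}

def size_ratio(a, b):
    """Return how many times larger dataset a is than dataset b, in images."""
    return datasets[a]["images"] / datasets[b]["images"]

for name, info in datasets.items():
    print(f"{name}: {info['classes']} classes, {info['images']} images")

print(f"AID / UCMD      = {size_ratio('AID', 'UCMD'):.2f}")
print(f"AID / SIRI-WHU  = {size_ratio('AID', 'SIRI-WHU'):.2f}")
print(f"SIRI-WHU / UCMD = {size_ratio('SIRI-WHU', 'UCMD'):.2f}")
```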
4.2. Configurations of Architecture
4.3. Ablation Experiments
4.3.1. Activation Functions of Attention in Residual Attention Branch
4.3.2. Impact of Type and Numbers of Descriptors on RSIR Performance
4.3.3. Comparison with SOTA Ranking-Motivated Losses
4.4. Comparative Experiments with SOTA Methods
4.4.1. Comparison with Multiple DML-Based Ensemble Methods in RSIR
4.4.2. Comparison in Overall Results and Per-Class Results
4.4.3. Comparison with DML-Based Ensemble Methods in Retrieval Execution Complexity
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Peijun, D.U.; Yunhao, C.; Hong, T.; Tao, F. Study on Content-Based Remote Sensing Image Retrieval. In Proceedings of the 2005 IEEE International Geoscience and Remote Sensing Symposium, IGARSS’05, Seoul, Korea, 29 July 2005; Volume 2, p. 4.
- Özkan, S.; Ateş, T.; Tola, E.; Soysal, M.; Esen, E. Performance Analysis of State-of-the-Art Representation Methods for Geographical Image Retrieval and Categorization. IEEE Geosci. Remote Sens. Lett. 2014, 11, 1996–2000.
- Li, D.; Tian, Y. Survey and Experimental Study on Metric Learning Methods. Neural Netw. 2018, 105, 447–462.
- Fernandez-Beltran, R.; Latorre-Carmona, P.; Pla, F. Single-Frame Super-Resolution in Remote Sensing: A Practical Overview. Int. J. Remote Sens. 2017, 38, 314–354.
- Zhang, B.; Chen, Z.; Peng, D.; Benediktsson, J.A.; Liu, B.; Zou, L.; Li, J.; Plaza, A. Remotely Sensed Big Data: Evolution in Model Development for Information Extraction [Point of View]. Proc. IEEE 2019, 107, 2294–2301.
- Yang, Y.; Newsam, S. Geographic Image Retrieval Using Local Invariant Features. IEEE Trans. Geosci. Remote Sens. 2012, 51, 818–832.
- Li, E.; Du, P.; Samat, A.; Meng, Y.; Che, M. Mid-Level Feature Representation via Sparse Autoencoder for Remotely Sensed Scene Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 10, 1068–1081.
- Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep Feature Extraction and Classification of Hyperspectral Images Based on Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251.
- Manjunath, B.S.; Ma, W.-Y. Texture Features for Browsing and Retrieval of Image Data. IEEE Trans. Pattern Anal. Mach. Intell. 1996, 18, 837–842.
- Bretschneider, T.; Cavet, R.; Kao, O. Retrieval of Remotely Sensed Imagery Using Spectral Information Content. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Toronto, ON, Canada, 24–28 June 2002; Volume 4, pp. 2253–2255.
- Bratasanu, D.; Nedelcu, I.; Datcu, M. Bridging the Semantic Gap for Satellite Image Annotation and Automatic Mapping Applications. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2010, 4, 193–204.
- Ma, Y.; Wu, H.; Wang, L.; Huang, B.; Ranjan, R.; Zomaya, A.; Jie, W. Remote Sensing Big Data Computing: Challenges and Opportunities. Future Gener. Comput. Syst. 2015, 51, 47–60.
- Li, Y.; Zhang, Y.; Tao, C.; Zhu, H. Content-Based High-Resolution Remote Sensing Image Retrieval via Unsupervised Feature Learning and Collaborative Affinity Metric Fusion. Remote Sens. 2016, 8, 709.
- Ge, Y.; Jiang, S.; Xu, Q.; Jiang, C.; Ye, F. Exploiting Representations from Pre-Trained Convolutional Neural Networks for High-Resolution Remote Sensing Image Retrieval. Multimed. Tools Appl. 2018, 77, 17489–17515.
- Pires de Lima, R.; Marfurt, K. Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis. Remote Sens. 2020, 12, 86.
- Yang, L.; Jin, R. Distance Metric Learning: A Comprehensive Survey. Mich. State Univ. 2006, 2, 4.
- Ye, F.; Xiao, H.; Zhao, X.; Dong, M.; Luo, W.; Min, W. Remote Sensing Image Retrieval Using Convolutional Neural Network Features and Weighted Distance. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1535–1539.
- Hu, J.; Lu, J.; Tan, Y.-P. Deep Transfer Metric Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 325–333.
- Schroff, F.; Kalenichenko, D.; Philbin, J. Facenet: A Unified Embedding for Face Recognition and Clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823.
- Chopra, S.; Hadsell, R.; LeCun, Y. Learning a Similarity Metric Discriminatively, with Application to Face Verification. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 539–546.
- Oh Song, H.; Xiang, Y.; Jegelka, S.; Savarese, S. Deep Metric Learning via Lifted Structured Feature Embedding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 4004–4012.
- Cheng, G.; Yang, C.; Yao, X.; Guo, L.; Han, J. When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2811–2821.
- Roy, S.; Sangineto, E.; Demir, B.; Sebe, N. Deep Metric and Hash-Code Learning for Content-Based Retrieval of Remote Sensing Images. In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 4539–4542.
- Cao, R.; Zhang, Q.; Zhu, J.; Li, Q.; Qiu, G. Enhancing Remote Sensing Image Retrieval with Triplet Deep Metric Learning Network. arXiv 2019, arXiv:1902.05818.
- Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality Reduction by Learning an Invariant Mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742.
- Law, M.T.; Thome, N.; Cord, M. Quadruplet-Wise Image Similarity Learning. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 249–256.
- Liu, P.; Gou, G.; Shan, X.; Tao, D.; Zhou, Q. Global Optimal Structured Embedding Learning for Remote Sensing Image Retrieval. Sensors 2020, 20, 291.
- Zhao, H.; Yuan, L.; Zhao, H. Similarity Retention Loss (SRL) Based on Deep Metric Learning for Remote Sensing Image Retrieval. ISPRS Int. J. Geo Inf. 2020, 9, 61.
- Chi, M.; Plaza, A.; Benediktsson, J.A.; Sun, Z.; Shen, J.; Zhu, Y. Big Data for Remote Sensing: Challenges and Opportunities. Proc. IEEE 2016, 104, 2207–2219.
- Wang, F.; Jiang, M.; Qian, C.; Yang, S.; Li, C.; Zhang, H.; Wang, X.; Tang, X. Residual Attention Network for Image Classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3156–3164.
- Yang, Y.; Newsam, S. Bag-of-Visual-Words and Spatial Extensions for Land-Use Classification. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA, 2–5 November 2010; pp. 270–279.
- Zhu, Q.; Zhong, Y.; Zhao, B.; Xia, G.-S.; Zhang, L. Bag-of-Visual-Words Scene Classifier with Local and Global Features for High Spatial Resolution Remote Sensing Imagery. IEEE Geosci. Remote Sens. Lett. 2016, 13, 747–751.
- Xia, G.-S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L.; Lu, X. AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981.
- Opitz, M.; Possegger, H.; Bischof, H. Efficient Model Averaging for Deep Neural Networks. In Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan, 20–24 November 2016; pp. 205–220.
- Opitz, M.; Waltner, G.; Possegger, H.; Bischof, H. Deep Metric Learning with Bier: Boosting Independent Embeddings Robustly. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 42, 276–290.
- Sanakoyeu, A.; Tschernezki, V.; Buchler, U.; Ommer, B. Divide and Conquer the Embedding Space for Metric Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 471–480.
- Kim, W.; Goyal, B.; Chawla, K.; Lee, J.; Kwon, K. Attention-Based Ensemble for Deep Metric Learning. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 736–751.
- Ba, J.; Mnih, V.; Kavukcuoglu, K. Multiple Object Recognition with Visual Attention. arXiv 2014, arXiv:1412.7755.
- Sohn, K. Improved Deep Metric Learning with Multi-Class n-Pair Loss Objective. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 1857–1865.
- Kim, S.; Seo, M.; Laptev, I.; Cho, M.; Kwak, S. Deep Metric Learning beyond Binary Supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2288–2297.
- Movshovitz-Attias, Y.; Toshev, A.; Leung, T.K.; Ioffe, S.; Singh, S. No Fuss Distance Metric Learning Using Proxies. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 29 October 2017; pp. 360–368.
- Fan, L.; Zhao, H.; Zhao, H.; Liu, P.; Hu, H. Distribution Structure Learning Loss (DSLL) Based on Deep Metric Learning for Image Retrieval. Entropy 2019, 21, 1121.
- Sudha, S.K.; Aji, S. A Review on Recent Advances in Remote Sensing Image Retrieval Techniques. J. Indian Soc. Remote Sens. 2019, 47, 2129–2139.
- Zhu, S.; Dong, X.; Su, H. Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4923–4932.
- Lin, Z.; Yang, Z.; Huang, F.; Chen, J. Regional Maximum Activations of Convolutions with Attention for Cross-Domain Beauty and Personal Care Product Retrieval. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Korea, 22–26 October 2018; pp. 2073–2077.
- Mnih, V.; Heess, N.; Graves, A. Recurrent Models of Visual Attention. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 2204–2212.
- Sermanet, P.; Frome, A.; Real, E. Attention for Fine-Grained Categorization. arXiv 2014, arXiv:1412.7054.
- Zhao, B.; Wu, X.; Feng, J.; Peng, Q.; Yan, S. Diversified Visual Attention Networks for Fine-Grained Object Classification. IEEE Trans. Multimed. 2017, 19, 1245–1256.
- Jaderberg, M.; Simonyan, K.; Zisserman, A. Spatial Transformer Networks. Adv. Neural Inf. Process. Syst. 2015, 28, 2017–2025.
- Chen, L.-C.; Yang, Y.; Wang, J.; Xu, W.; Yuille, A.L. Attention to Scale: Scale-Aware Semantic Image Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3640–3649.
- Sumbul, G.; Demir, B. A Novel Multi-Attention Driven System for Multi-Label Remote Sensing Image Classification. In Proceedings of the IGARSS 2019—2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 5726–5729.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.

| Activation Function | Attention Type | mAP (UCMD) | mAP (SIRI-WHU) | mAP (AID) |
|---|---|---|---|---|
| | Channel attention | 97.68 | 95.18 | 93.51 |
| | Spatial attention | 97.56 | 95.09 | 93.36 |
| | Mixed attention | 97.77 | 95.25 | 93.84 |

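
The three attention types compared here are commonly distinguished by the activation applied to the mask. The sketch below follows the definitions given in the Residual Attention Network paper [30] (sigmoid for mixed attention, L2 normalization over channels for channel attention, per-channel standardization followed by a sigmoid for spatial attention). Whether this article uses exactly these forms is an assumption, and the `eps` term is added here only to avoid division by zero.

```python
import math

def mixed_attention(x):
    """Sigmoid applied independently to every channel and position."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in channel] for channel in x]

def channel_attention(x, eps=1e-12):
    """L2-normalize across channels at each spatial position."""
    n_pos = len(x[0])
    norms = [math.sqrt(sum(ch[i] ** 2 for ch in x)) + eps for i in range(n_pos)]
    return [[ch[i] / norms[i] for i in range(n_pos)] for ch in x]

def spatial_attention(x, eps=1e-12):
    """Standardize each channel over its positions, then apply a sigmoid."""
    out = []
    for ch in x:
        mean = sum(ch) / len(ch)
        std = math.sqrt(sum((v - mean) ** 2 for v in ch) / len(ch)) + eps
        out.append([1.0 / (1.0 + math.exp(-(v - mean) / std)) for v in ch])
    return out

# Toy mask input: 2 channels, each with 2 spatial positions.
x = [[3.0, 0.0], [4.0, 0.0]]
mixed = mixed_attention(x)
channel = channel_attention(x)
spatial = spatial_attention(x)
```

Mixed attention places no constraint across channels or positions, whereas the other two normalize along one axis before the mask is applied.
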

Configuration | Dimension | Recall@1 (UCMD) | Recall@1 (SIRI-WHU) | Recall@1 (AID) |
---|---|---|---|---|
S | 1536 | 96.83 | 94.73 | 92.99 |
M | 1536 | 96.69 | 94.56 | 92.67 |
G | 1536 | 96.91 | 94.68 | 92.86 |
SM | 768 + 768 | 97.38 | 94.89 | 93.28 |
SG | 768 + 768 | 97.77 | 95.25 | 93.84 |
MG | 768 + 768 | 97.52 | 95.17 | 93.55 |
SGM | 512 + 512 + 512 | 97.65 | 95.21 | 93.78 |


Loss | mAP (UCMD) | mAP (SIRI-WHU) | mAP (AID) | Recall@1 (UCMD) | Recall@1 (SIRI-WHU) | Recall@1 (AID) |
---|---|---|---|---|---|---|
N-pairs Loss | 95.22 | 94.78 | 92.83 | 95.38 | 92.15 | 90.93 |
Proxy-NCA Loss | 96.36 | 96.01 | 93.66 | 96.39 | 93.06 | 91.85 |
Lifted Struct Loss | 97.13 | 96.52 | 94.95 | 97.12 | 94.29 | 92.77 |
Batch Hard Triplet Loss | 97.25 | 96.81 | 95.37 | 97.77 | 95.25 | 93.84 |

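
For readers unfamiliar with the best-performing loss in this comparison, the following is a minimal sketch of batch hard triplet mining as it is generally defined: for every anchor, take the farthest same-class sample and the closest different-class sample in the batch. The margin value, the Euclidean distance, and the function names are assumptions for illustration, not the configuration used in these experiments.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def batch_hard_triplet_loss(embeddings, labels, margin=0.5):
    """Mean over anchors of max(0, margin + hardest_positive - hardest_negative)."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [euclidean(anchor, e)
                     for j, (e, l) in enumerate(zip(embeddings, labels))
                     if l == label and j != i]
        negatives = [euclidean(anchor, e)
                     for e, l in zip(embeddings, labels) if l != label]
        if not positives or not negatives:
            continue  # this anchor cannot form a triplet in the batch
        losses.append(max(0.0, margin + max(positives) - min(negatives)))
    return sum(losses) / len(losses) if losses else 0.0

labels = [0, 0, 1, 1]
# Interleaved classes: every anchor has a negative closer than its positive.
hard_batch = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
# Well-separated classes: every triplet already satisfies the margin.
easy_batch = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]]

hard_loss = batch_hard_triplet_loss(hard_batch, labels)
easy_loss = batch_hard_triplet_loss(easy_batch, labels)
```

Because only the hardest positive and negative per anchor contribute, the loss vanishes once every class is separated from the others by at least the margin.
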

Dataset | Methods | mAP | R@1 | R@2 | R@4 | R@8 |
---|---|---|---|---|---|---|
UCMD | BIER | 85.79 | 80.31 | 85.28 | 90.11 | 91.65 |
UCMD | A-BIER | 90.32 | 86.52 | 89.96 | 92.61 | 94.76 |
UCMD | DCES | 93.53 | 87.45 | 91.02 | 94.27 | 96.32 |
UCMD | ABE | 96.53 | 93.71 | 95.57 | 96.96 | 98.32 |
UCMD | Ours | 97.25 | 97.77 | 98.57 | 98.89 | 99.21 |
SIRI-WHU | BIER | 82.09 | 81.32 | 82.63 | 87.29 | 90.10 |
SIRI-WHU | A-BIER | 85.37 | 83.67 | 86.83 | 90.17 | 95.08 |
SIRI-WHU | DCES | 86.06 | 86.80 | 92.04 | 95.11 | 97.29 |
SIRI-WHU | ABE | 86.22 | 87.35 | 92.93 | 96.02 | 97.45 |
SIRI-WHU | Ours | 96.81 | 95.25 | 97.58 | 98.50 | 99.08 |
AID | BIER | 79.92 | 80.72 | 86.39 | 92.01 | 95.38 |
AID | A-BIER | 82.33 | 82.28 | 89.51 | 93.55 | 96.37 |
AID | DCES | 85.06 | 85.39 | 91.02 | 95.27 | 96.63 |
AID | ABE | 88.75 | 88.33 | 91.39 | 95.56 | 96.89 |
AID | Ours | 95.37 | 93.84 | 96.28 | 97.84 | 98.52 |

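
The R@K columns report Recall@K. A minimal sketch of how this metric is commonly computed in deep metric learning follows, assuming that each image queries all other images and counts as a hit when at least one of its K nearest neighbors shares its label. The exact evaluation protocol of the article is not stated in this section, so this protocol and the function names are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance; sufficient for ranking neighbors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recall_at_k(embeddings, labels, k):
    """Percentage of queries with a same-label item among their k nearest neighbors."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Rank every other item by distance to the query (the query is excluded).
        ranked = sorted((j for j in range(len(embeddings)) if j != i),
                        key=lambda j: squared_distance(query, embeddings[j]))
        if any(labels[j] == labels[i] for j in ranked[:k]):
            hits += 1
    return 100.0 * hits / len(embeddings)

# Toy 1-D embeddings: the last item sits between the two class clusters.
embeddings = [[0.0], [0.5], [3.0], [3.4], [1.6]]
labels = [0, 0, 1, 1, 1]
r1 = recall_at_k(embeddings, labels, 1)
r2 = recall_at_k(embeddings, labels, 2)
```

In this toy case only the item at 1.6 has a wrong-class nearest neighbor, so Recall@1 is lower than Recall@2.
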
Categories | BIER | A-BIER | DCES | ABE | Ours |
---|---|---|---|---|---|
Agricultural | 94.94 | 94.32 | 98.08 | 98.55 | 99.45 |
Airplane | 88.65 | 88.87 | 92.63 | 97.82 | 98.76 |
Baseball Diamond | 87.11 | 90.42 | 93.63 | 96.66 | 98.89 |
Beach | 89.17 | 98.48 | 98.24 | 98.24 | 99.61 |
Buildings | 75.52 | 84.43 | 86.19 | 88.15 | 93.96 |
Chaparral | 87.72 | 98.05 | 99.81 | 99.69 | 99.18 |
Dense Residential | 74.46 | 84.05 | 87.81 | 89.85 | 89.18 |
Forest | 84.41 | 94.89 | 98.65 | 98.98 | 99.25 |
Freeway | 78.61 | 85.82 | 89.68 | 95.64 | 93.79 |
Golf Course | 83.90 | 87.92 | 91.68 | 98.64 | 97.79 |
Harbor | 83.96 | 88.25 | 91.39 | 95.38 | 96.72 |
Intersection | 89.66 | 85.67 | 89.07 | 92.06 | 93.82 |
Medium Residential | 88.05 | 90.77 | 94.53 | 98.56 | 98.44 |
Mobile Home Park | 97.02 | 92.01 | 95.72 | 96.75 | 99.83 |
Overpass | 98.42 | 97.59 | 99.35 | 99.35 | 99.72 |
Parking Lot | 88.66 | 85.33 | 89.09 | 97.12 | 97.54 |
River | 79.45 | 92.45 | 96.21 | 97.24 | 97.58 |
Runway | 85.09 | 93.56 | 97.32 | 98.35 | 98.69 |
Sparse Residential | 79.81 | 86.32 | 90.08 | 96.17 | 96.45 |
Storage Tanks | 87.64 | 83.48 | 87.24 | 96.33 | 94.61 |
Tennis Court | 79.39 | 94.07 | 97.67 | 97.67 | 99.04 |
Average | 85.79 | 90.32 | 93.53 | 96.53 | 97.25 |
Categories | BIER | A-BIER | DCES | ABE | Ours |
---|---|---|---|---|---|
Agriculture | 92.68 | 90.21 | 91.58 | 91.78 | 97.68 |
Commercial | 82.58 | 82.77 | 82.45 | 84.36 | 94.79 |
Harbor | 80.54 | 85.69 | 86.62 | 87.12 | 98.43 |
Idle land | 82.07 | 88.52 | 89.04 | 88.93 | 99.18 |
Industrial | 82.66 | 85.85 | 87.25 | 87.02 | 99.63 |
Meadow | 82.74 | 86.09 | 87.94 | 86.52 | 96.73 |
Overpass | 76.63 | 81.43 | 83.07 | 83.25 | 95.03 |
Park | 82.94 | 87.74 | 86.02 | 86.31 | 96.39 |
Pond | 84.61 | 86.85 | 87.16 | 86.19 | 97.33 |
Residential | 74.81 | 80.51 | 81.66 | 81.87 | 93.94 |
River | 84.82 | 87.05 | 87.35 | 87.75 | 96.93 |
Water | 77.99 | 81.78 | 82.56 | 83.58 | 95.65 |
Average | 82.09 | 85.37 | 86.06 | 86.22 | 96.81 |
Categories | BIER | A-BIER | DCES | ABE | Ours |
---|---|---|---|---|---|
Airport | 85.22 | 85.22 | 90.48 | 91.87 | 96.89 |
Bare land | 83.22 | 83.35 | 88.48 | 91.26 | 98.28 |
Baseball field | 79.39 | 79.39 | 84.65 | 88.82 | 96.33 |
Beach | 81.82 | 81.82 | 85.08 | 89.25 | 97.25 |
Bridge | 82.41 | 85.41 | 86.67 | 90.84 | 95.86 |
Center | 75.41 | 79.08 | 80.67 | 82.06 | 90.87 |
Church | 75.42 | 78.36 | 80.68 | 80.68 | 90.21 |
Commercial | 74.75 | 76.92 | 80.01 | 83.23 | 90.05 |
Dense Residential | 75.68 | 77.67 | 83.94 | 83.94 | 94.96 |
Desert | 81.05 | 82.29 | 87.09 | 91.69 | 98.71 |
Farmland | 80.93 | 86.33 | 85.97 | 92.18 | 98.11 |
Forest | 84.73 | 88.36 | 91.14 | 92.53 | 97.55 |
Industrial | 75.41 | 79.04 | 81.82 | 83.21 | 93.84 |
Meadow | 79.54 | 83.17 | 85.95 | 87.34 | 95.97 |
Medium Residential | 77.55 | 81.18 | 83.96 | 85.35 | 90.37 |
Mountain | 74.71 | 79.97 | 81.12 | 87.67 | 92.69 |
Park | 90.54 | 93.88 | 96.95 | 96.95 | 95.97 |
Parking | 78.72 | 83.35 | 85.13 | 89.57 | 98.59 |
Playground | 75.06 | 80.92 | 85.47 | 89.64 | 94.66 |
Pond | 78.16 | 79.79 | 81.01 | 88.62 | 93.64 |
Port | 80.91 | 82.54 | 85.69 | 89.86 | 95.43 |
Railway station | 80.28 | 82.91 | 85.06 | 90.53 | 95.55 |
Resort | 78.91 | 80.54 | 83.69 | 87.86 | 94.88 |
River | 89.79 | 91.42 | 91.79 | 98.74 | 97.37 |
School | 80.21 | 81.84 | 83.84 | 90.79 | 97.03 |
Sparse Residential | 75.88 | 77.51 | 79.51 | 83.67 | 97.13 |
Square | 76.86 | 78.49 | 80.49 | 84.66 | 95.13 |
Stadium | 78.98 | 80.61 | 82.61 | 86.78 | 95.63 |
Storage tanks | 83.88 | 85.07 | 86.58 | 92.75 | 96.38 |
Viaduct | 82.11 | 83.53 | 86.09 | 90.26 | 95.89 |
Average | 79.92 | 82.33 | 85.06 | 88.75 | 95.37 |

Datasets | Retrieval Process | BIER | A-BIER | DCES | ABE | Ours |
---|---|---|---|---|---|---|
UCMD | deep features extraction | 55 | 48 | 62 | 69 | 25 |
UCMD | similarity metric (1536) | 29.12 | 28.78 | 18.72 | 12.08 | 0.57 |
SIRI-WHU | deep features extraction | 61 | 60 | 69 | 88 | 29 |
SIRI-WHU | similarity metric (1536) | 32.52 | 30.06 | 20.75 | 15.63 | 0.93 |
AID | deep features extraction | 89 | 83 | 75 | 96 | 39 |
AID | similarity metric (1536) | 58.96 | 51.66 | 39.05 | 29.72 | 2.37 |

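
The relative cost of the proposed method against ABE can be read off this table with simple arithmetic. The sketch below copies the ABE and Ours columns and computes the ratio per stage; the table does not state units, so only unitless ratios are derived, and the dictionary layout and the name `ratio_to_abe` are illustrative.

```python
# ABE and Ours values copied from the table, per dataset and retrieval stage.
timings = {
    "UCMD": {"extraction": {"ABE": 69, "Ours": 25},
             "similarity": {"ABE": 12.08, "Ours": 0.57}},
    "SIRI-WHU": {"extraction": {"ABE": 88, "Ours": 29},
                 "similarity": {"ABE": 15.63, "Ours": 0.93}},
    "AID": {"extraction": {"ABE": 96, "Ours": 39},
            "similarity": {"ABE": 29.72, "Ours": 2.37}},
}

def ratio_to_abe(dataset, stage):
    """Return the cost of the proposed method as a fraction of ABE's cost."""
    row = timings[dataset][stage]
    return row["Ours"] / row["ABE"]

for dataset in timings:
    print(f"{dataset}: extraction {ratio_to_abe(dataset, 'extraction'):.1%}, "
          f"similarity {ratio_to_abe(dataset, 'similarity'):.1%}")
```
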
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cheng, Q.; Gan, D.; Fu, P.; Huang, H.; Zhou, Y. A Novel Ensemble Architecture of Residual Attention-Based Deep Metric Learning for Remote Sensing Image Retrieval. Remote Sens. 2021, 13, 3445. https://doi.org/10.3390/rs13173445