Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

School of Geodesy and Geomatics, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
* Author to whom correspondence should be addressed.
Remote Sens. 2019, 11(17), 1996; https://doi.org/10.3390/rs11171996
Received: 12 July 2019 / Revised: 20 August 2019 / Accepted: 21 August 2019 / Published: 23 August 2019
(This article belongs to the Section Remote Sensing Image Processing)
Scene classification of high-resolution remote sensing images (HRRSI) is one of the most important means of land-cover classification. Deep learning techniques, especially convolutional neural networks (CNNs), have been widely applied to the scene classification of HRRSI thanks to the advancement of graphics processing units (GPUs). However, they tend to extract features from whole images rather than from discriminative regions. The visual attention mechanism can force a CNN to focus on discriminative regions, but it may suffer from intra-class diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that comprises three parts: attention maps generated by Gradient-weighted Class Activation Mapping (Grad-CAM), a multiplicative fusion of deep features, and a center-based cross-entropy loss function. First, we feed the attention maps generated by Grad-CAM to the network as an explicit input in order to force it to concentrate on discriminative regions. Then, the deep features derived from the original images and from the attention maps are combined by multiplicative fusion, which improves the ability to distinguish scenes with repeated texture while preserving the salient regions. Finally, the fused features are backpropagated with a center-based cross-entropy loss that combines the cross-entropy loss and the center loss, reducing the effect of intra-class diversity on the feature representations. The proposed ADFF architecture is evaluated on three benchmark datasets. The experiments confirm that it outperforms most competing scene classification methods, with an average overall accuracy of 94% under different training ratios.
Keywords: remote sensing; scene classification; attention maps; multiplicative fusion of deep features; center loss
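
To make the front end of the pipeline concrete, the following is a minimal PyTorch sketch of the first two components described in the abstract: Grad-CAM attention maps computed from a classification backbone, and multiplicative fusion of the deep features extracted from the original image and from the attention map. The ResNet-18 backbones, the hook-based Grad-CAM implementation, and names such as `grad_cam` and `attn_branch` are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumed setup): Grad-CAM attention maps used as an explicit
# network input, followed by multiplicative fusion of two deep feature streams.
import torch
import torch.nn.functional as F
from torchvision import models


def grad_cam(model, target_layer, images, class_idx=None):
    """Return Grad-CAM attention maps, normalized to [0, 1], one per image."""
    activations, gradients = [], []
    h1 = target_layer.register_forward_hook(
        lambda m, i, o: activations.append(o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: gradients.append(go[0]))

    logits = model(images)
    if class_idx is None:
        class_idx = logits.argmax(dim=1)             # use the predicted class
    model.zero_grad()
    logits.gather(1, class_idx.unsqueeze(1)).sum().backward()
    h1.remove(); h2.remove()

    acts, grads = activations[0], gradients[0]       # (B, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)   # GAP over spatial dims
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=images.shape[2:], mode="bilinear",
                        align_corners=False)
    cam = cam - cam.amin(dim=(2, 3), keepdim=True)   # min-max normalize
    return cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)


# Two feature extractors; whether their weights are shared is an assumption
# left open here.
def feature_extractor():
    net = models.resnet18(weights=None)              # torchvision >= 0.13 API
    net.fc = torch.nn.Identity()                     # 512-d deep features
    return net

backbone, attn_branch = feature_extractor(), feature_extractor()
cam_model = models.resnet18(weights=None)            # source of Grad-CAM maps

images = torch.randn(4, 3, 224, 224)
cam = grad_cam(cam_model, cam_model.layer4, images).detach()  # fixed input

feat_orig = backbone(images)                         # features of original image
feat_attn = attn_branch(cam.repeat(1, 3, 1, 1))      # attention map as input
fused = feat_orig * feat_attn                        # multiplicative fusion
```

Element-wise multiplication suppresses feature dimensions to which the attention branch assigns low responses, which is one reading of the "multiplicative fusion" in the abstract.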
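
The third component, the center-based cross-entropy loss, can be read as standard cross-entropy on the classifier logits plus a center-loss term in the style of Wen et al. (2016) that pulls each fused feature toward a learned per-class center. The module name, the weight `lambda_c`, and the 30-class, 512-dimensional setup below are assumptions for illustration.

```python
# Hedged sketch of a center-based cross-entropy loss: cross-entropy plus a
# center-loss penalty on the distance between features and their class center.
import torch
import torch.nn as nn


class CenterCrossEntropyLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, lambda_c=0.01):
        super().__init__()
        # One learnable center per class; updated by the optimizer.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.lambda_c = lambda_c
        self.ce = nn.CrossEntropyLoss()

    def forward(self, features, logits, labels):
        ce_loss = self.ce(logits, labels)
        centers_batch = self.centers[labels]          # (B, feat_dim)
        # Center loss: mean squared distance to the matching class center.
        center_loss = 0.5 * (features - centers_batch).pow(2).sum(dim=1).mean()
        return ce_loss + self.lambda_c * center_loss


# Usage with the fused features from the previous sketch.
classifier = nn.Linear(512, 30)          # e.g., 30 scene classes, as in AID
criterion = CenterCrossEntropyLoss(num_classes=30, feat_dim=512)

features = torch.randn(4, 512)           # stand-in for the fused features
labels = torch.randint(0, 30, (4,))
loss = criterion(features, classifier(features), labels)
loss.backward()
```

Compared with cross-entropy alone, the center term shrinks the intra-class spread of the fused features, which matches the abstract's stated goal of reducing the effect of intra-class diversity on the feature representations.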
MDPI and ACS Style

Zhu, R.; Yan, L.; Mo, N.; Liu, Y. Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images. Remote Sens. 2019, 11, 1996.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
