A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification
Abstract
1. Introduction
- Taking blocks of two different sizes from the HSI, we employ a mixed-fusion multi-scale module to extract shallow spatial–spectral features. This module consists primarily of two multi-scale convolutional neural networks, one for each input size, which use convolutional kernels of different sizes to extract shallow feature information at multiple scales (a minimal sketch of this idea follows the list below).
- We designed an efficient transformer encoder in which 2D convolution and dilated convolution are applied to the tokens to obtain two sets of Q, K, and V carrying different scale information. This allows the cross-attention mechanism not only to learn deeper features and promote the interaction of deep semantic information but also to effectively fuse the different-sized features from the two branches (see the attention sketch after this list).
- We designed an innovative dual-branch network for classification in small-sample scenarios. The network efficiently integrates a multi-scale CNN with a transformer encoder to fully exploit the multi-scale spatial–spectral features of HSI. We validated it on three datasets, and the experimental results show that the proposed network is competitive with state-of-the-art methods.
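To make the first contribution concrete, the following is a minimal PyTorch sketch of one multi-scale shallow feature branch. The channel counts, the 3/5/7 kernel sizes, and the 1 × 1 fusion convolution are our own illustrative assumptions, not the exact configuration used in the paper.

```python
import torch
import torch.nn as nn

class MultiScaleShallowBranch(nn.Module):
    """One branch of the dual-branch shallow extractor: parallel 2D convolutions
    with different kernel sizes capture spatial context at several scales."""
    def __init__(self, in_channels: int, out_channels: int = 64):
        super().__init__()
        # Parallel convolutions with 3x3, 5x5, and 7x7 kernels (illustrative sizes).
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, out_channels, kernel_size=k, padding=k // 2)
            for k in (3, 5, 7)
        ])
        # 1x1 convolution fuses the concatenated multi-scale responses.
        self.fuse = nn.Conv2d(3 * out_channels, out_channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [self.act(conv(x)) for conv in self.branches]
        return self.act(self.fuse(torch.cat(feats, dim=1)))

# Two branches process the two patch sizes (13x13 and 7x7 cubes with 30 PCA bands).
large = MultiScaleShallowBranch(in_channels=30)(torch.randn(8, 30, 13, 13))
small = MultiScaleShallowBranch(in_channels=30)(torch.randn(8, 30, 7, 7))
print(large.shape, small.shape)  # torch.Size([8, 64, 13, 13]) torch.Size([8, 64, 7, 7])
```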
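The CNN-enhanced cross-attention of the second contribution can be sketched in the same spirit. This is a single-head simplification under our own assumptions: a plain 3 × 3 convolution and a dilated 3 × 3 convolution generate the two Q/K/V sets on the token grid, and each set attends to the other's keys and values. The paper's actual encoder may differ in the number of heads, normalization, and projections.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNEnhancedCrossAttention(nn.Module):
    """Single-head sketch: a standard and a dilated 2D convolution produce two
    Q/K/V sets from the token grid; each set attends to the other's K and V."""
    def __init__(self, dim: int):
        super().__init__()
        # Standard 3x3 convolution -> first Q/K/V set (local scale).
        self.qkv_std = nn.Conv2d(dim, 3 * dim, kernel_size=3, padding=1)
        # Dilated 3x3 convolution -> second Q/K/V set (enlarged receptive field).
        self.qkv_dil = nn.Conv2d(dim, 3 * dim, kernel_size=3, padding=2, dilation=2)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, tokens: torch.Tensor, h: int, w: int) -> torch.Tensor:
        b, n, d = tokens.shape                       # tokens laid out on an h x w grid
        fmap = tokens.transpose(1, 2).reshape(b, d, h, w)
        q1, k1, v1 = self.qkv_std(fmap).flatten(2).transpose(1, 2).chunk(3, dim=-1)
        q2, k2, v2 = self.qkv_dil(fmap).flatten(2).transpose(1, 2).chunk(3, dim=-1)
        scale = d ** -0.5
        # Cross-attention: each scale queries the other scale's keys and values.
        out1 = F.softmax(q1 @ k2.transpose(-2, -1) * scale, dim=-1) @ v2
        out2 = F.softmax(q2 @ k1.transpose(-2, -1) * scale, dim=-1) @ v1
        return self.proj(torch.cat([out1, out2], dim=-1))

tokens = torch.randn(8, 49, 64)                      # e.g., a 7 x 7 token grid, dim 64
fused = CNNEnhancedCrossAttention(dim=64)(tokens, h=7, w=7)
print(fused.shape)                                   # torch.Size([8, 49, 64])
```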
2. Materials and Methods
2.1. HSI Data Preprocessing
2.2. Dual-Branch Multi-Scale Shallow Feature Extraction Module
2.3. Feature-Maps-to-Tokens Conversion Module
2.4. Transformer with CNN-Enhanced Cross-Attention Module
2.5. Classifier Head
Algorithm 1: Multi-scale Feature Transformer with CNN-Enhanced Cross-Attention Model
Input: HSI data and ground-truth labels. The spectral dimension of the original data is reduced to r = 30 with PCA, and two sets of small cubes with patch sizes of 13 and 7 are extracted. The training set of the model is then randomly sampled at a sampling rate of 1%.
Output: Predicted labels for the test dataset.
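A minimal sketch of the preprocessing step in Algorithm 1 (PCA to r = 30 components, extraction of patches of the two sizes around each labeled pixel, and random sampling of roughly 1% of the labeled pixels for training) is given below. The function name, the reflect padding, and the return layout are our own illustrative choices, not taken from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

def preprocess_hsi(cube: np.ndarray, labels: np.ndarray,
                   r: int = 30, sizes=(13, 7), train_ratio: float = 0.01, seed: int = 0):
    """Reduce the spectral dimension with PCA, extract patches of two sizes around
    each labeled pixel, and randomly sample ~1% of them for training."""
    h, w, _ = cube.shape
    reduced = PCA(n_components=r).fit_transform(cube.reshape(-1, cube.shape[-1]))
    reduced = reduced.reshape(h, w, r)

    coords = np.argwhere(labels > 0)                     # labeled pixels only
    patches = {}
    for s in sizes:
        pad = s // 2
        padded = np.pad(reduced, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
        patches[s] = np.stack([padded[i:i + s, j:j + s, :] for i, j in coords])

    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(coords))
    n_train = max(1, int(train_ratio * len(coords)))
    return patches, labels[coords[:, 0], coords[:, 1]], idx[:n_train], idx[n_train:]
```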
Number of training and test samples per class for the Houston2013, Trento, and Pavia University datasets.

| No. | Houston2013 Class | Training | Test | Trento Class | Training | Test | Pavia University Class | Training | Test |
|---|---|---|---|---|---|---|---|---|---|
| #1 | Healthy Grass | 13 | 1238 | Apple Trees | 40 | 3994 | Asphalt | 66 | 6565 |
| #2 | Stressed Grass | 13 | 1241 | Buildings | 29 | 2874 | Meadows | 186 | 18,463 |
| #3 | Synthetic Grass | 7 | 690 | Ground | 5 | 474 | Gravel | 21 | 2078 |
| #4 | Tree | 12 | 1232 | Woods | 91 | 9032 | Trees | 31 | 3033 |
| #5 | Soil | 12 | 1230 | Vineyard | 105 | 10,396 | Metal Sheets | 13 | 1332 |
| #6 | Water | 3 | 322 | Roads | 31 | 3143 | Bare Soil | 50 | 4979 |
| #7 | Residential | 13 | 1255 | | | | Bitumen | 13 | 1317 |
| #8 | Commercial | 12 | 1232 | | | | Bricks | 37 | 3645 |
| #9 | Road | 13 | 1239 | | | | Shadows | 9 | 938 |
| #10 | Highway | 12 | 1215 | | | | | | |
| #11 | Railway | 12 | 1223 | | | | | | |
| #12 | Parking Lot 1 | 12 | 1221 | | | | | | |
| #13 | Parking Lot 2 | 5 | 464 | | | | | | |
| #14 | Tennis Court | 4 | 424 | | | | | | |
| #15 | Running Track | 7 | 653 | | | | | | |
| Total | | 150 | 14,879 | | 301 | 29,913 | | 426 | 42,350 |
3. Results
3.1. Data Description
3.2. Parameter Analysis
3.3. Classification Results and Analysis
3.4. Analysis of Inference Speed
3.5. Ablation Analysis
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| HSI | Hyperspectral image |
| RF | Random Forest |
| SVM | Support Vector Machine |
| LDA | Linear Discriminant Analysis |
| PCA | Principal Component Analysis |
| CNN | Convolutional Neural Network |
| GAN | Generative Adversarial Network |
| GCN | Graph Convolutional Network |
| RNN | Recurrent Neural Network |
| ResNet | Residual Network |
| TE | Transformer encoder |
| Q | Queries |
| K | Keys |
| V | Values |
| MLP | Multi-layer perceptron |
| LN | Layer normalization |
References
1. He, C.; Cao, Q.; Xu, Y.; Sun, L.; Wu, Z.; Wei, Z. Weighted Order-p Tensor Nuclear Norm Minimization and Its Application to Hyperspectral Image Mixed Denoising. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5510505.
2. Sun, L.; Wang, Q.; Chen, Y.; Zheng, Y.; Wu, Z.; Fu, L.; Jeon, B. CRNet: Channel-Enhanced Remodeling-Based Network for Salient Object Detection in Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5618314.
3. Gao, H.; Zhang, Y.; Chen, Z.; Xu, S.; Hong, D.; Zhang, B. A Multidepth and Multibranch Network for Hyperspectral Target Detection Based on Band Selection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5506818.
4. Gao, H.; Zhang, Y.; Chen, Z.; Xu, F.; Hong, D.; Zhang, B. Hyperspectral Target Detection via Spectral Aggregation and Separation Network With Target Band Random Mask. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5515516.
5. Gevaert, C.M.; Suomalainen, J.; Tang, J.; Kooistra, L. Generation of Spectral–Temporal Response Surfaces by Combining Multispectral Satellite and Hyperspectral UAV Imagery for Precision Agriculture Applications. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3140–3146.
6. Gong, P.; Li, Z.; Huang, H.; Sun, G.; Wang, L. ICESat GLAS Data for Urban Environment Monitoring. IEEE Trans. Geosci. Remote Sens. 2011, 49, 1158–1172.
7. Wang, J.; Zhang, L.; Tong, Q.; Sun, X. The Spectral Crust project—Research on new mineral exploration technology. In Proceedings of the 2012 4th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), Shanghai, China, 4–7 June 2012; pp. 1–4.
8. Ardouin, J.P.; Levesque, J.; Rea, T.A. A demonstration of hyperspectral image exploitation for military applications. In Proceedings of the 2007 10th International Conference on Information Fusion, Québec, QC, Canada, 9–12 July 2007; pp. 1–8.
9. Su, Y.; Gao, L.; Jiang, M.; Plaza, A.; Sun, X.; Zhang, B. NSCKL: Normalized Spectral Clustering With Kernel-Based Learning for Semisupervised Hyperspectral Image Classification. IEEE Trans. Cybern. 2023, 53, 6649–6662.
10. Su, Y.; Chen, J.; Gao, L.; Plaza, A.; Jiang, M.; Xu, X.; Sun, X.; Li, P. ACGT-Net: Adaptive Cuckoo Refinement-Based Graph Transfer Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5521314.
11. Yu, H.; Gao, L.; Liao, W.; Zhang, B.; Zhuang, L.; Song, M.; Chanussot, J. Global Spatial and Local Spectral Similarity-Based Manifold Learning Group Sparse Representation for Hyperspectral Imagery Classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3043–3056.
12. Gao, H.; Yang, Y.; Li, C.; Gao, L.; Zhang, B. Multiscale Residual Network With Mixed Depthwise Convolution for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3396–3408.
13. Yan, L.; Fan, B.; Liu, H.; Huo, C.; Xiang, S.; Pan, C. Triplet Adversarial Domain Adaptation for Pixel-Level Classification of VHR Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3558–3573.
14. Melgani, F.; Bruzzone, L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790.
15. Ye, Q.; Huang, P.; Zhang, Z.; Zheng, Y.; Fu, L.; Yang, W. Multiview Learning With Robust Double-Sided Twin SVM. IEEE Trans. Cybern. 2022, 52, 12745–12758.
16. Ham, J.; Chen, Y.; Crawford, M.; Ghosh, J. Investigation of the random forest framework for classification of hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2005, 43, 492–501.
17. Guo, Y.; Han, S.; Li, Y.; Zhang, C.; Bai, Y. K-Nearest Neighbor combined with guided filter for hyperspectral image classification. Procedia Comput. Sci. 2018, 129, 159–165.
18. Bandos, T.V.; Bruzzone, L.; Camps-Valls, G. Classification of Hyperspectral Images With Regularized Linear Discriminant Analysis. IEEE Trans. Geosci. Remote Sens. 2009, 47, 862–873.
19. Dalla Mura, M.; Villa, A.; Benediktsson, J.A.; Chanussot, J.; Bruzzone, L. Classification of Hyperspectral Images by Using Extended Morphological Attribute Profiles and Independent Component Analysis. IEEE Geosci. Remote Sens. Lett. 2011, 8, 542–546.
20. Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep Learning for Hyperspectral Image Classification: An Overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709.
21. Lu, W.; Wang, X.; Sun, L.; Zheng, Y. Spectral–Spatial Feature Extraction for Hyperspectral Image Classification Using Enhanced Transformer with Large-Kernel Attention. Remote Sens. 2024, 16, 67.
22. Hu, W.; Huang, Y.; Wei, L.; Zhang, F.; Li, H. Deep convolutional neural networks for hyperspectral image classification. J. Sens. 2015, 2015, 258619.
23. Zhao, W.; Du, S. Spectral–Spatial Feature Extraction for Hyperspectral Image Classification: A Dimension Reduction and Deep Learning Approach. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4544–4554.
24. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep Feature Extraction and Classification of Hyperspectral Images Based on Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251.
25. He, M.; Li, B.; Chen, H. Multi-scale 3D deep convolutional neural network for hyperspectral image classification. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3904–3908.
26. Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D–2-D CNN Feature Hierarchy for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2020, 17, 277–281.
27. Zhu, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Generative Adversarial Networks for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5046–5063.
28. Mou, L.; Ghamisi, P.; Zhu, X.X. Deep Recurrent Neural Networks for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3639–3655.
29. Wan, S.; Gong, C.; Zhong, P.; Du, B.; Zhang, L.; Yang, J. Multiscale Dynamic Graph Convolutional Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3162–3177.
30. Haut, J.M.; Paoletti, M.E.; Plaza, J.; Plaza, A.; Li, J. Visual Attention-Driven Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8065–8080.
31. Sun, H.; Zheng, X.; Lu, X.; Wu, S. Spectral–Spatial Attention Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3232–3245.
32. Hang, R.; Li, Z.; Liu, Q.; Ghamisi, P.; Bhattacharyya, S.S. Hyperspectral Image Classification With Attention-Aided CNNs. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2281–2293.
33. Ma, W.; Yang, Q.; Wu, Y.; Zhao, W.; Zhang, X. Double-Branch Multi-Attention Mechanism Network for Hyperspectral Image Classification. Remote Sens. 2019, 11, 1307.
34. Zhu, M.; Jiao, L.; Liu, F.; Yang, S.; Wang, J. Residual Spectral–Spatial Attention Network for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 449–462.
35. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
36. Sun, L.; Wang, X.; Zheng, Y.; Wu, Z.; Fu, L. Multiscale 3-D–2-D Mixed CNN and Lightweight Attention-Free Transformer for Hyperspectral and LiDAR Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 2100116.
37. Hong, D.; Han, Z.; Yao, J.; Gao, L.; Zhang, B.; Plaza, A.; Chanussot, J. SpectralFormer: Rethinking Hyperspectral Image Classification With Transformers. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5518615.
38. He, X.; Chen, Y.; Lin, Z. Spatial-Spectral Transformer for Hyperspectral Image Classification. Remote Sens. 2021, 13, 498.
39. Sun, L.; Zhao, G.; Zheng, Y.; Wu, Z. Spectral–Spatial Feature Tokenization Transformer for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5522214.
40. Mei, S.; Song, C.; Ma, M.; Xu, F. Hyperspectral Image Classification Using Group-Aware Hierarchical Transformer. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5539014.
41. Fang, Y.; Ye, Q.; Sun, L.; Zheng, Y.; Wu, Z. Multiattention Joint Convolution Feature Representation With Lightweight Transformer for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5513814.
42. Roy, S.K.; Deria, A.; Shah, C.; Haut, J.M.; Du, Q.; Plaza, A. Spectral–Spatial Morphological Attention Transformer for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5503615.
43. Gao, H.; Chen, Z.; Xu, F. Adaptive spectral-spatial feature fusion network for hyperspectral image classification using limited training samples. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102687.
44. Ben Hamida, A.; Benoit, A.; Lambert, P.; Ben Amar, C. 3-D Deep Learning Approach for Remote Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4420–4434.
Classification results (%) on the Houston2013 dataset.

| Class | SVM [14] | 1D-CNN [22] | 3D-CNN [24] | M3D-CNN [25] | 3D-DLA [44] | Hybrid [26] | SSFTT [39] | morphFormer [42] | TNCCA |
|---|---|---|---|---|---|---|---|---|---|
| #1 | 85.78 ± 0.00 | 85.70 ± 0.00 | 73.99 ± 6.96 | 94.74 ± 5.10 | 85.11 ± 0.28 | 89.68 ± 2.88 | 85.78 ± 6.71 | 96.66 ± 1.79 | 94.82 ± 2.49 |
| #2 | 1.39 ± 2.41 | 0.00 ± 0.00 | 41.94 ± 0.17 | 81.35 ± 5.60 | 75.10 ± 7.70 | 83.42 ± 2.56 | 89.79 ± 6.94 | 96.21 ± 1.69 | 96.13 ± 1.78 |
| #3 | 0.00 ± 0.00 | 0.00 ± 0.00 | 47.89 ± 4.20 | 90.33 ± 3.02 | 92.89 ± 1.01 | 73.04 ± 11.29 | 92.41 ± 10.15 | 98.26 ± 0.72 | 99.34 ± 0.32 |
| #4 | 42.77 ± 28.19 | 37.85 ± 8.81 | 48.98 ± 16.58 | 84.68 ± 3.12 | 93.37 ± 3.11 | 72.64 ± 21.09 | 90.99 ± 3.37 | 93.15 ± 1.05 | 90.85 ± 2.58 |
| #5 | 61.76 ± 52.71 | 95.09 ± 0.92 | 78.78 ± 2.06 | 87.66 ± 8.48 | 96.61 ± 1.74 | 99.72 ± 0.46 | 99.75 ± 0.21 | 92.62 ± 5.57 | 100 ± 0.00 |
| #6 | 0.00 ± 0.00 | 0.00 ± 0.00 | 16.77 ± 2.63 | 38.61 ± 7.19 | 38.81 ± 16.97 | 78.98 ± 11.47 | 82.60 ± 1.35 | 79.60 ± 3.43 | 91.55 ± 3.31 |
| #7 | 87.94 ± 1.98 | 95.75 ± 0.04 | 48.96 ± 2.08 | 53.30 ± 6.21 | 49.61 ± 3.50 | 53.01 ± 3.43 | 74.42 ± 3.87 | 77.71 ± 4.59 | 82.54 ± 2.94 |
| #8 | 30.65 ± 12.64 | 0.00 ± 0.00 | 29.54 ± 3.55 | 54.49 ± 7.91 | 45.83 ± 2.60 | 70.94 ± 2.03 | 69.23 ± 2.59 | 67.28 ± 1.78 | 82.88 ± 3.53 |
| #9 | 7.02 ± 6.74 | 86.54 ± 2.98 | 39.87 ± 16.20 | 58.59 ± 6.57 | 68.44 ± 0.42 | 55.82 ± 0.80 | 87.27 ± 3.51 | 88.86 ± 3.91 | 84.57 ± 4.02 |
| #10 | 17.55 ± 30.41 | 0.21 ± 0.38 | 45.59 ± 7.91 | 60.90 ± 4.45 | 57.17 ± 33.02 | 77.91 ± 3.03 | 95.08 ± 1.07 | 87.57 ± 9.54 | 91.59 ± 3.60 |
| #11 | 23.73 ± 28.90 | 0.00 ± 0.00 | 39.98 ± 10.29 | 38.37 ± 8.64 | 55.79 ± 30.36 | 72.03 ± 5.33 | 92.58 ± 5.11 | 87.18 ± 1.06 | 81.26 ± 5.85 |
| #12 | 7.88 ± 13.66 | 2.48 ± 3.55 | 39.68 ± 13.03 | 73.16 ± 9.21 | 73.32 ± 16.06 | 89.62 ± 2.33 | 83.48 ± 5.40 | 74.50 ± 3.30 | 89.46 ± 2.26 |
| #13 | 0.50 ± 0.69 | 0.00 ± 0.00 | 41.48 ± 4.72 | 39.87 ± 10.37 | 29.45 ± 8.02 | 52.94 ± 7.05 | 84.69 ± 5.40 | 84.77 ± 3.14 | 90.77 ± 3.21 |
| #14 | 0.00 ± 0.00 | 0.00 ± 0.00 | 40.33 ± 26.68 | 51.02 ± 6.60 | 74.92 ± 11.14 | 100 ± 0.00 | 99.76 ± 0.23 | 90.09 ± 3.42 | 100 ± 0.00 |
| #15 | 62.68 ± 54.52 | 0.00 ± 0.00 | 67.99 ± 15.59 | 85.96 ± 7.39 | 97.54 ± 1.86 | 100 ± 0.00 | 100 ± 0.00 | 97.65 ± 0.57 | 100 ± 0.00 |
| OA (%) | 33.24 ± 5.37 | 33.63 ± 0.67 | 48.39 ± 2.48 | 68.44 ± 1.81 | 70.49 ± 1.27 | 77.29 ± 1.19 | 87.85 ± 1.20 | 87.17 ± 0.80 | 90.72 ± 0.89 |
| AA (%) | 28.64 ± 4.57 | 26.91 ± 0.54 | 46.78 ± 1.52 | 66.20 ± 1.54 | 68.93 ± 0.42 | 77.98 ± 1.21 | 88.52 ± 0.82 | 87.47 ± 0.79 | 91.72 ± 0.74 |
| κ × 100 | 27.46 ± 5.69 | 27.58 ± 0.73 | 44.12 ± 2.63 | 65.82 ± 1.95 | 68.06 ± 1.37 | 75.44 ± 1.29 | 86.87 ± 1.29 | 86.13 ± 0.87 | 89.97 ± 0.97 |
Classification results (%) on the Trento dataset.

| Class | SVM [14] | 1D-CNN [22] | 3D-CNN [24] | M3D-CNN [25] | 3D-DLA [44] | Hybrid [26] | SSFTT [39] | morphFormer [42] | TNCCA |
|---|---|---|---|---|---|---|---|---|---|
| #1 | 0.37 ± 0.32 | 0.00 ± 0.00 | 78.58 ± 34.28 | 97.72 ± 0.55 | 86.28 ± 4.62 | 99.04 ± 0.56 | 99.64 ± 0.23 | 99.49 ± 0.23 | 99.58 ± 0.25 |
| #2 | 66.96 ± 5.56 | 73.56 ± 0.67 | 75.59 ± 11.99 | 80.15 ± 3.32 | 82.65 ± 1.42 | 67.16 ± 12.93 | 98.08 ± 0.38 | 91.66 ± 1.63 | 98.32 ± 0.31 |
| #3 | 0.00 ± 0.00 | 0.00 ± 0.00 | 45.44 ± 16.43 | 71.49 ± 13.08 | 57.63 ± 18.06 | 35.43 ± 14.97 | 51.26 ± 2.53 | 91.20 ± 5.05 | 97.79 ± 1.90 |
| #4 | 92.87 ± 0.93 | 89.39 ± 0.53 | 98.11 ± 2.59 | 98.82 ± 0.55 | 97.75 ± 0.40 | 100 ± 0.00 | 100 ± 0.00 | 99.97 ± 0.01 | 100 ± 0.00 |
| #5 | 75.15 ± 1.61 | 84.40 ± 1.00 | 99.54 ± 0.13 | 99.49 ± 0.45 | 99.49 ± 0.07 | 100 ± 0.00 | 99.91 ± 0.09 | 99.92 ± 0.11 | 100 ± 0.00 |
| #6 | 67.53 ± 2.70 | 70.08 ± 1.67 | 81.61 ± 8.08 | 82.04 ± 4.15 | 80.40 ± 2.62 | 66.89 ± 2.86 | 89.71 ± 2.49 | 92.84 ± 1.33 | 93.17 ± 1.48 |
| OA (%) | 67.74 ± 0.52 | 70.75 ± 0.22 | 91.27 ± 6.45 | 94.91 ± 0.56 | 92.91 ± 0.65 | 92.21 ± 1.19 | 97.88 ± 0.25 | 98.20 ± 0.12 | 98.98 ± 0.22 |
| AA (%) | 50.48 ± 1.17 | 52.90 ± 0.01 | 79.81 ± 10.23 | 88.28 ± 3.10 | 84.03 ± 2.45 | 78.09 ± 0.34 | 89.77 ± 0.61 | 95.85 ± 0.93 | 97.64 ± 0.62 |
| κ × 100 | 55.45 ± 0.80 | 59.46 ± 0.29 | 88.22 ± 8.81 | 93.21 ± 0.76 | 90.49 ± 0.88 | 89.54 ± 1.59 | 97.17 ± 0.33 | 97.60 ± 0.16 | 98.64 ± 0.30 |
Classification results (%) on the Pavia University dataset.

| Class | SVM [14] | 1D-CNN [22] | 3D-CNN [24] | M3D-CNN [25] | 3D-DLA [44] | Hybrid [26] | SSFTT [39] | morphFormer [42] | TNCCA |
|---|---|---|---|---|---|---|---|---|---|
| #1 | 94.76 ± 0.61 | 91.32 ± 0.27 | 83.24 ± 3.03 | 94.44 ± 1.69 | 88.32 ± 5.04 | 92.46 ± 0.93 | 97.91 ± 0.66 | 96.75 ± 0.98 | 98.61 ± 0.57 |
| #2 | 92.45 ± 1.20 | 95.58 ± 1.22 | 93.89 ± 4.27 | 98.14 ± 1.35 | 96.42 ± 1.06 | 99.95 ± 0.07 | 98.39 ± 0.33 | 99.75 ± 0.20 | 99.98 ± 0.02 |
| #3 | 0.00 ± 0.00 | 0.00 ± 0.00 | 54.52 ± 20.93 | 68.65 ± 5.04 | 80.95 ± 1.39 | 94.80 ± 0.50 | 82.53 ± 1.10 | 82.17 ± 1.63 | 87.11 ± 0.87 |
| #4 | 15.81 ± 2.28 | 60.44 ± 4.77 | 66.00 ± 21.73 | 95.57 ± 1.52 | 91.10 ± 1.73 | 76.81 ± 4.40 | 95.73 ± 1.67 | 96.03 ± 1.11 | 98.48 ± 0.55 |
| #5 | 99.07 ± 0.18 | 99.44 ± 0.17 | 90.29 ± 15.20 | 99.62 ± 0.52 | 97.99 ± 1.36 | 86.76 ± 19.29 | 100 ± 0.00 | 99.82 ± 0.30 | 100 ± 0.00 |
| #6 | 18.51 ± 6.58 | 9.58 ± 1.72 | 78.17 ± 8.13 | 77.51 ± 11.15 | 74.37 ± 2.27 | 99.43 ± 0.87 | 99.66 ± 0.42 | 99.16 ± 1.18 | 99.69 ± 0.14 |
| #7 | 0.00 ± 0.00 | 0.00 ± 0.00 | 57.27 ± 5.24 | 81.87 ± 7.21 | 81.67 ± 6.23 | 81.67 ± 21.37 | 99.16 ± 0.62 | 79.87 ± 4.37 | 99.56 ± 0.31 |
| #8 | 86.91 ± 2.98 | 92.42 ± 1.27 | 73.79 ± 8.03 | 92.83 ± 2.12 | 77.66 ± 10.19 | 72.84 ± 7.42 | 95.40 ± 1.81 | 95.70 ± 1.19 | 95.93 ± 1.50 |
| #9 | 0.00 ± 0.00 | 98.36 ± 0.53 | 57.78 ± 21.11 | 96.97 ± 1.61 | 94.34 ± 2.30 | 64.81 ± 14.40 | 82.37 ± 7.04 | 93.85 ± 1.60 | 98.11 ± 0.70 |
| OA (%) | 68.90 ± 0.76 | 74.54 ± 0.28 | 82.68 ± 1.84 | 92.56 ± 1.48 | 89.36 ± 1.32 | 92.72 ± 1.96 | 96.96 ± 0.41 | 96.99 ± 0.47 | 98.59 ± 0.12 |
| AA (%) | 45.28 ± 0.61 | 60.79 ± 0.23 | 72.77 ± 1.63 | 89.51 ± 2.76 | 86.98 ± 2.04 | 85.50 ± 6.27 | 94.57 ± 0.84 | 93.68 ± 0.97 | 97.50 ± 0.17 |
| κ × 100 | 56.26 ± 0.98 | 64.42 ± 0.24 | 76.85 ± 2.48 | 90.04 ± 2.07 | 85.81 ± 1.77 | 90.29 ± 2.63 | 95.98 ± 0.54 | 96.01 ± 0.63 | 98.14 ± 0.16 |
Training and testing time (min) for the Houston2013, Trento, and Pavia University datasets.
Ablation analysis of the main components on the Houston2013 dataset (√ = component included, × = excluded; "2D-Conv" indicates that the multi-scale 2D convolution is replaced by a standard 2D convolution).

| Case | 3D-Conv | Ms2D-Conv | Tokenizer | TE | OA (%) | AA (%) | κ × 100 |
|---|---|---|---|---|---|---|---|
| 1 | √ | × | × | × | 48.39 | 46.78 | 44.12 |
| 2 | √ | √ | √ | × | 85.81 | 85.63 | 84.65 |
| 3 | × | 2D-Conv | √ | √ | 89.58 | 90.06 | 88.73 |
| 4 | √ | 2D-Conv | √ | √ | 90.55 | 91.52 | 89.56 |
| 5 | √ | √ | √ | √ | 90.72 | 91.72 | 89.97 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cite as: Wang, X.; Sun, L.; Lu, C.; Li, B. A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification. Remote Sens. 2024, 16, 1180. https://doi.org/10.3390/rs16071180