Article

Hyperspectral Image Classification Promotion Using Dynamic Convolution Based on Structural Re-Parameterization

1 School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an 710129, China
2 Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an 710121, China
3 Xi’an Key Laboratory of Big Data and Intelligent Computing, Xi’an 710121, China
4 Shaanxi Key Lab of Speech and Image Information Processing (SAIIP), School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710129, China
5 National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Xi’an 710129, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(23), 5561; https://doi.org/10.3390/rs15235561
Submission received: 20 September 2023 / Revised: 17 November 2023 / Accepted: 27 November 2023 / Published: 29 November 2023
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

In a hyperspectral image classification (HSIC) task, manually labeling samples requires considerable manpower and material resources, so achieving HSIC with small numbers of training samples is of great significance. Recently, convolutional neural networks (CNNs) have shown remarkable performance in HSIC, but they still have some areas for improvement. (1) Convolutional kernel weights are determined through initialization and cannot be adaptively adjusted based on the input data, which makes it difficult to adaptively learn the structural features of the input. (2) Each layer uses a single convolutional kernel size, which leads to the loss of local information for a large kernel or global information for a small kernel. In order to solve the above problems, we propose a plug-and-play method called dynamic convolution based on structural re-parameterization (DCSRP). The contributions of this method are as follows. Firstly, compared with traditional convolution, dynamic convolution is a non-linear function, so it has more representation power; in addition, it can adaptively capture the contextual information of the input data. Secondly, a large convolutional kernel and a small convolutional kernel are integrated into a new large convolutional kernel that shares the advantages of both, capturing global and local information at the same time. Results on three publicly available HSIC datasets show the effectiveness of DCSRP.

Graphical Abstract

1. Introduction

Hyperspectral images (HSIs) record the information of objects captured by hyperspectral imaging sensors across hundreds of spectral bands. They have a high spectral dimension and high spatial resolution, and they carry both image information and spectral information. The image information can be used to distinguish the external characteristics of objects, such as size and shape, while the spectral information can characterize various properties of an object, such as its physical, chemical and material characteristics. This comprehensive information lays the foundation for hyperspectral image classification (HSIC) technologies. HSIC is an important task that assigns a category to each pixel in a hyperspectral image (HSI) using the rich spectral and spatial information. With their continued development, HSIC technologies are widely applied in geological exploration [1,2], agriculture [3], forestry [4], environmental monitoring [5] and the military industry [6,7,8]. So far, researchers have proposed numerous methods for HSIC, which can mainly be divided into non-deep-learning-based methods and deep-learning-based methods.
In the task of HSIC, researchers have proposed various non-deep-learning-based methods to fully utilize the spectral information of HSIs, including random forest (RF) models [9], Bayesian models [10], k-nearest neighbors (KNN) [11], etc. Although these spectral algorithms are effective to some extent, they overlook the spatial structural relationships between neighboring pixels of HSIs, and these structural relationships can enhance classification performance. In order to leverage the spatial structural relationships of HSIs and achieve satisfactory results, researchers have proposed various methods that utilize spectral-spatial information, including Markov random fields [12], sparse representation [13], metric learning [14] and composite kernels [15,16]. HSIs also suffer from spectral information redundancy; to alleviate this problem, researchers have proposed dimensionality reduction techniques such as principal component analysis (PCA) [17,18].
Compared with non-deep-learning methods, deep-learning-based approaches show great advantages in HSIC because they can automatically learn complex features of HSIs. These methods include stacked autoencoder (SAE) models [19,20], Boltzmann machines [21], deep belief networks (DBN) [22,23], etc. Chen et al. [24] first introduced the SAE into HSIC, proposing a spectral-spatial joint deep neural network that hierarchically extracts deep-level features and leads to higher classification accuracy. In [19], the noise of the image was reduced by a band-by-band nonlinear diffusion method, and a restricted Boltzmann machine was used to extract higher-level features; the experiments prove the effectiveness of the method. Li et al. [22] employed a DBN in HSIC, utilizing a multi-layer DBN to extract features and achieving good results. In addition, tensor-based models are suitable for feature extraction and classification of HSIs [25]. In recent years, convolutional neural networks (CNNs) [24,26,27] have become one of the most essential neural network architectures and have attracted widespread attention from researchers. CNNs not only promote the development of the HSIC task through HSI restoration [28] and denoising [29], but they also improve HSIC accuracy through their excellent performance.
In [30], the proposed method can learn discriminative features among a large number of spectral-spatial features and achieves excellent results. Roy et al. proposed HybridSN [31], which can learn spectral-spatial features more effectively and also learns more abstract spatial features, which helps to improve classification accuracy. Li et al. [22] proposed a two-stream 2D CNN architecture, which can simultaneously acquire spectral features, local spatial features and global spatial features; moreover, this method can adaptively fuse the feature weights of the two parallel streams, thereby improving the expressiveness of the network. In [32], an end-to-end residual spectral-spatial attention network (RSSAN) was proposed. This method can effectively suppress useless band information and spatial information through its spectral attention and spatial attention modules, refining features in the feature-learning process; in addition, a sequential spectral-spatial attention module is implemented in the residual block to avoid overfitting. In [33], the proposed method can effectively fuse feature maps of different levels and scales and can adaptively capture direction-aware and position-sensitive information, thereby improving classification accuracy. In [34], the authors proposed a spatial-spectral dense convolutional neural network framework, FADCNN, based on a feedback attention mechanism. This method uses semantic features to enhance the attention map through the feedback attention mechanism; in addition, the network refines features by better mining and fusing spectral-spatial features.
Although the above CNNs have achieved satisfactory results, like most methods they stack many small convolution kernels (typically 3 × 3) to obtain a larger receptive field. Some researchers have recently shown that large convolution kernels can build a large effective receptive field [35], which can improve the performance of CNNs. However, large kernels are not widespread in CNN architectures; only a few architectures employ large spatial convolutions (larger than 3 × 3), such as AlexNet [36], Inception [37,38,39] and some architectures derived from neural architecture search [40,41,42,43]. Recently, some researchers have employed large kernels in order to design advanced CNN architectures. For example, Liu et al. [44] successfully increased the convolution kernel size to 7 × 7 and significantly improved the performance of the network. Shang et al. [45] proposed a method based on Multi-Scale Cross-Branch Response and Second-Order Channel Attention (MCRSCA), which utilizes 5 × 5 kernels to capture rich and complementary spatial information. The authors of [46] introduced the Multi-Scale Random Convolution Broad Learning System (MRCBLS), which uses convolution kernels of several large sizes; by combining multi-scale spatial features extracted with different kernel sizes, it achieves good HSIC results. In [47], the proposed method employs 5 × 5 convolution kernels in an end-to-end manner to extract spectral-spatial features. In [48], the proposed model aggregates convolution kernel features of different sizes, which helps to improve the performance of the network.
To obtain a large and effective receptive field and further enhance the performance of CNNs under small sample conditions, we propose a plug-and-play method called dynamic convolution based on structural re-parameterization (DCSRP). The contributions of this method are as follows: (1) Dynamic convolution is a non-linear function; it has more representation power and enhances the feature representation capability of the convolution layer. (2) Fusing a small kernel into a large kernel allows the large kernel to capture both global and local features, hence improving the performance of the large kernel.
The structure of this paper is as follows. Section 2 provides a detailed introduction to the proposed method. Section 3 presents the experimental results and analysis. Section 4 presents the discussion, and Section 5 concludes the paper.

2. Materials and Methods

In this section, we provide an elaborate explanation of DCSRP. We first describe the data preprocessing. Next, we summarize the complete architecture of DCSRP and show how it can be used. Then, we introduce dynamic convolution in detail in Section 2.3 and structural re-parameterization in Section 2.4.

2.1. Data Preprocessing

In our experiments, the input data take the form of a data cube. We use $I \in \mathbb{R}^{H \times W \times C}$ to represent the original input HSI cube, where H is the height of the cube, W is its width, and C is its spectral dimension. Firstly, PCA is applied to reduce the spectral dimension of the data. Then, a 3D patch is extracted around each pixel, with that pixel at its center. The class of the center pixel serves as the ground truth of the patch.
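A minimal sketch of this preprocessing step is given below, assuming the HSI cube and ground-truth map are NumPy arrays; the helper names (reduce_bands, extract_patches) are illustrative rather than taken from the paper, and scikit-learn's PCA stands in for whatever implementation was actually used.

```python
import numpy as np
from sklearn.decomposition import PCA

def reduce_bands(cube, n_components):
    """Apply PCA along the spectral axis of an (H, W, C) cube."""
    h, w, c = cube.shape
    flat = cube.reshape(-1, c)                      # one spectrum per row
    reduced = PCA(n_components=n_components).fit_transform(flat)
    return reduced.reshape(h, w, n_components)

def extract_patches(cube, labels, patch_size):
    """Cut a patch around every labeled pixel; the center pixel's class labels the patch."""
    m = patch_size // 2
    padded = np.pad(cube, ((m, m), (m, m), (0, 0)), mode="reflect")
    patches, targets = [], []
    for i, j in zip(*np.nonzero(labels)):           # skip background pixels (label 0)
        patches.append(padded[i:i + patch_size, j:j + patch_size, :])
        targets.append(labels[i, j] - 1)            # classes become 0-based
    return np.stack(patches), np.array(targets)
```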

2.2. Overall Framework of Proposed Method

CNNs have achieved advanced results in HSIC. Compared to small convolution kernels, large convolution kernels can enlarge the receptive field more effectively and bring a higher shape bias rather than texture bias [49], which makes fuller use of the spatial information of HSIs. At present, the typical way to enlarge the receptive field is to stack many small-kernel convolution layers [50,51,52]. However, some research findings indicate that too many layers lead to the opposite result [53]. To obtain a large and effective receptive field and further enhance the HSIC accuracy of CNNs under small sample conditions, we propose DCSRP. DCSRP increases the model complexity without increasing the depth of the network, improving the performance of the model. The better performance of the proposed approach can be attributed to two factors. One is using multiple convolution kernels in each layer instead of a single kernel and aggregating them in a non-linear manner, which enhances the representation of HSI information. The other is the addition of a smaller kernel to the larger kernel, enabling the larger kernel to capture both smaller-scale features and slightly larger global features of the HSI, which enhances the feature extraction capability of the model. Figure 1 shows the structure of DCSRP. Next, we describe the data flow of the proposed method.
We define x as the input HSI data block of size H × W × C. The data x first go through the dynamic convolution part, which enhances the representation of HSI information. The detailed process is as follows. First, the attention weights of the convolution kernels are computed from x through squeeze-and-excitation (SE) attention, which extracts features that are beneficial to HSI classification accuracy by establishing connections between channels: the spatial information of x is compressed by a global average pooling layer, giving a tensor of size 1 × 1 × C. Second, this tensor passes through two FC layers with a ReLU activation in between, producing tensors of size 1 × 1 × C/r, 1 × 1 × C/r and 1 × 1 × k in turn, where k denotes the number of attention weights and is set to 2. Third, the attention weights are generated through a softmax. Fourth, the generated weights act on the k convolution kernels; specifically, the k weights are multiplied by the parameters of the k convolution kernels, respectively. Finally, the k weighted kernels are added and aggregated into one convolution kernel. Since the generated weights change as x changes, the aggregated kernel weights also change, which is where the name "dynamic" comes from. Dynamic convolution generates n groups of convolution kernels of sizes 7 × 7 and 3 × 3, where n can be freely set. After dynamic convolution comes the structural re-parameterization part, which enhances the feature extraction capability of the model. Specifically, the 7 × 7 and 3 × 3 convolution kernels are each fused with their Batch Normalization (BN) layers. Then, the n groups of 7 × 7 and 3 × 3 kernels are integrated into n new 7 × 7 kernels: each 3 × 3 kernel is expanded into a 7 × 7 kernel by zero padding, and the two 7 × 7 kernels are added and aggregated into one kernel. Because there are n groups of 7 × 7 and 3 × 3 kernels, n 7 × 7 kernels are generated, and these are added to produce a single 7 × 7 convolution kernel, which then extracts the features of the HSI data. After a BN layer and an activation function, the output y of size H × W × C is obtained.
As shown in Figure 2, DCSRP is a plug-and-play method and is simple to use. To enhance the performance of an existing network, one simply substitutes its traditional convolution layers with DCSRP; when building a new network, DCSRP is used in the same way as a traditional convolution layer, as illustrated by the sketch below. Figure 2 gives an example of using DCSRP and displays the network structures used in the verification. In this network, the input data first pass through two convolution layers, with the data size remaining constant; then, the data go through a global average pooling layer; finally, the classification result is obtained through a fully connected layer. Figure 2a is the original convolution network structure, Figure 2b uses the dynamic convolution structure, and Figure 2c uses the DCSRP structure.
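The following sketch illustrates this plug-and-play usage on the two-layer verification network of Figure 2, assuming a drop-in module that accepts the same arguments as nn.Conv2d; build_classifier and DCSRPConv2d are hypothetical names introduced here for illustration only.

```python
import torch.nn as nn

def build_classifier(conv_layer, in_channels, num_classes, width=64):
    """Two convolution layers -> global average pooling -> fully connected classifier.
    `conv_layer` is any callable with the nn.Conv2d signature, so a traditional
    convolution can be swapped for DCSRP without touching the rest of the network."""
    return nn.Sequential(
        conv_layer(in_channels, width, kernel_size=7, padding=3),
        nn.BatchNorm2d(width), nn.ReLU(),
        conv_layer(width, width, kernel_size=7, padding=3),
        nn.BatchNorm2d(width), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(width, num_classes),
    )

# Figure 2a: traditional convolution
baseline = build_classifier(nn.Conv2d, in_channels=40, num_classes=16)
# Figure 2c: the same network with DCSRP plugged in (DCSRPConv2d is a hypothetical
# module combining the sketches in Sections 2.3 and 2.4):
# dcsrp_net = build_classifier(DCSRPConv2d, in_channels=40, num_classes=16)
```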

2.3. Dynamic Convolution

To improve the accuracy of the network in the HSIC task, the currently popular approach is to increase the complexity of the neural network by increasing the depth of the network. However, too many layers can lead to disappointing results. Thus, we propose DCSRP. This approach increases model complexity in a new way and improves the feature representation capability of the network. Figure 3 illustrates the structure of the dynamic convolution.
Figure 3 shows a dynamic convolution layer. There are k convolution kernels in dynamic convolution, and these kernels share the same kernel size and input and output dimensions. We define x as the input HSI data block. The input x first undergoes SE attention, which extracts features that are beneficial to HSI classification accuracy by establishing connections between channels. For an HSI data block of size H × W × C, SE attention produces an attention vector of size 1 × 1 × k. SE attention makes full use of the spatial and spectral information of the HSI data and generates the attention weights. These attention weights are then applied to the k convolution kernels by multiplication. Third, the weighted kernels are aggregated into one convolution kernel by addition, which extracts HSI information with more expressive capability. Finally, after a BN layer and an activation function, the output y is obtained. Owing to zero padding, the output HSI data have the same size as the input HSI data. Dynamic convolution enhances the representation of HSI information by aggregating multiple convolution kernels that can extract effective channel information. It should be noted that the finally aggregated convolutional kernel changes with the input, hence the name dynamic convolution.
$y = g(W^{T}x + b)$ represents the traditional convolution process, where $W$ represents the weight matrix, $b$ represents the bias, and $g$ represents the activation function (e.g., ReLU). $\{\tilde{W}_k^{T}x + \tilde{b}_k\}$ represents the process of dynamic convolution, which aggregates several ($K$) linear functions. The process can be described as follows:

$$y = g\big(\tilde{W}^{T}(x)\,x + \tilde{b}(x)\big),$$

$$\tilde{W}(x) = \sum_{k=1}^{K} \pi_k(x)\,\tilde{W}_k,$$

$$\tilde{b}(x) = \sum_{k=1}^{K} \pi_k(x)\,\tilde{b}_k,$$

$$\mathrm{s.t.}\quad 0 \le \pi_k(x) \le 1,\quad \sum_{k=1}^{K} \pi_k(x) = 1,$$

where $\pi_k(x)$ represents the attention weight for the $k$-th linear function $\tilde{W}_k^{T}x + \tilde{b}_k$. It should be noted that the total weight $\tilde{W}(x)$ and total bias $\tilde{b}(x)$ have the same attention weights. The attention weights $\{\pi_k(x)\}$ are not fixed; they change when the input $x$ changes, and they represent the optimal set of linear models $\{\tilde{W}_k^{T}x + \tilde{b}_k\}$ for that input. The aggregated model $\tilde{W}^{T}(x)\,x + \tilde{b}(x)$ is a non-linear function. Therefore, dynamic convolution has a stronger capacity for feature representation.
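A compact PyTorch sketch of such a layer is shown below, under the assumption that the attention branch is the SE-style pipeline described in this section (global average pooling, two FC layers, softmax); the class name DynamicConv2d and the grouped-convolution trick used to apply a different aggregated kernel to each sample are implementation choices of this sketch, not details prescribed by the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """Aggregates K parallel kernels with input-dependent attention weights (Eqs. (1)-(4))."""
    def __init__(self, in_ch, out_ch, kernel_size, K=2, reduction=4):
        super().__init__()
        self.K, self.kernel_size = K, kernel_size
        # K parallel kernels and biases sharing the same shape
        self.weight = nn.Parameter(torch.randn(K, out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(K, out_ch))
        # SE-style branch producing the K attention weights pi_k(x)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, max(in_ch // reduction, 1)), nn.ReLU(),
            nn.Linear(max(in_ch // reduction, 1), K),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        pi = F.softmax(self.attn(x), dim=1)                          # (B, K), sums to 1 per sample
        # Aggregate kernels and biases per sample: W~(x) = sum_k pi_k(x) W_k
        w_agg = torch.einsum("bk,koihw->boihw", pi, self.weight)
        w_agg = w_agg.reshape(-1, c, self.kernel_size, self.kernel_size)
        b_agg = torch.einsum("bk,ko->bo", pi, self.bias).reshape(-1)
        # Grouped convolution applies a different aggregated kernel to each sample
        out = F.conv2d(x.reshape(1, b * c, h, w), w_agg, b_agg,
                       padding=self.kernel_size // 2, groups=b)
        return out.reshape(b, -1, h, w)
```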
We employ the SE [43] method to generate the attention weights for the $k$ convolution kernels. SE can extract features that are beneficial to improving the accuracy of HSI classification by establishing connections between channels. Figure 4 shows the structure of the SE method. We use $X \in \mathbb{R}^{H \times W \times C}$ to denote the original feature maps of the HSI for a data block of size H × W × C. Firstly, the global spatial information of the HSI data is squeezed ($F_{sq}$) by global average pooling, obtaining a column vector $Z$ of size 1 × 1 × C that contains one weight per channel. This process can be represented as:

$$Z_c = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_c(i, j),$$
This operation has some advantages. The first is that filters using a local receptive field can utilize extra-regional context information to a certain extent. Secondly, the excitation ($F_{ex}$) operation uses the information aggregated in the squeeze operation, aiming to capture channel-wise dependencies. This operation consists of two fully connected layers and a ReLU function, which can learn non-linear interactions between channels and ensure emphasis on multiple channels. Through this operation, a column vector $S$ is obtained. Thirdly, the final output of the features is obtained by rescaling ($F_{scale}$) the transformation output. The process can also be described by the following formula:

$$\tilde{X}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c,$$

where $\tilde{X} = [\tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_C]$ and $F_{scale}(u_c, s_c)$ denotes channel-wise multiplication between the feature map $u_c \in \mathbb{R}^{H \times W}$ and the scalar $s_c$.
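For completeness, a minimal PyTorch version of this SE block, written directly from Equations (5) and (6), might look as follows; SEBlock and the reduction ratio of 16 are illustrative choices of this sketch.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: squeeze (global average pool), excite (two FC layers), rescale."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (B, C, H, W)
        z = x.mean(dim=(2, 3))                     # squeeze: Z_c averaged over spatial positions (Eq. (5))
        s = self.fc(z)                             # excitation: per-channel weights s_c in (0, 1)
        return x * s.unsqueeze(-1).unsqueeze(-1)   # rescale: X~_c = s_c * u_c (Eq. (6))
```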

2.4. Structural Re-Parameterization

CNNs have achieved satisfactory results in HSIC, but most networks are designed with small kernels. In recent years, researchers have shown that, in the design of convolutional neural network models, large convolution kernels can build larger and more effective receptive fields than small kernels and can extract more spatial information from HSIs. Therefore, some advanced convolution models apply large convolution kernels and show attractive performance and efficiency. Although a large convolution kernel shows better performance, directly increasing the kernel size may lead to the opposite result in HSIC. In order to solve this problem, we propose DCSRP, which better extracts the detailed features and the slightly larger global features of HSIs. Specifically, the large convolution kernel and the small convolution kernel are each fused with their BN layers. Then, the small convolution kernel is brought to the same size as the large convolution kernel by zero padding. Finally, the two large convolution kernels are added and aggregated into one convolution kernel. The large kernel can therefore also capture small-scale patterns, thus improving the performance of the model.
Figure 5 is an example of re-parameterizing a small kernel (e.g., 3 × 3) into a larger kernel (e.g., 7 × 7). We denote the kernel of a 3 × 3 convolutional layer with $C_1$ input channels and $C_2$ output channels as $W^{(3)} \in \mathbb{R}^{C_2 \times C_1 \times 3 \times 3}$, and similarly the kernel of a 7 × 7 convolutional layer with $C_1$ input channels and $C_2$ output channels as $W^{(7)} \in \mathbb{R}^{C_2 \times C_1 \times 7 \times 7}$. For example, an HSI data block of size $H \times W \times C_1$ becomes $H \times W \times C_2$ after the convolution operation. For ease of exposition, we set $C_1 = C_2 = 2$. We use $\mu^{(3)}$, $\sigma^{(3)}$, $\gamma^{(3)}$ and $\beta^{(3)}$ to represent the accumulated mean, standard deviation, learned scaling factor and bias of the BN layer following the 3 × 3 convolution layer, and $\mu^{(7)}$, $\sigma^{(7)}$, $\gamma^{(7)}$ and $\beta^{(7)}$ to represent the corresponding quantities of the BN layer following the 7 × 7 convolution layer. Let $M^{(1)} \in \mathbb{R}^{N \times C_{in} \times H_{in} \times W_{in}}$ and $M^{(2)} \in \mathbb{R}^{N \times C_{out} \times H_{out} \times W_{out}}$ be the input and output HSI data, respectively, and let $*$ denote the convolution operator. In the example, $C_{in} = C_{out} = 2$, $H_{in} = H_{out}$, and $W_{in} = W_{out}$. The output features $M^{(2)}$ can be represented as:

$$M^{(2)} = \mathrm{bn}\big(M^{(1)} * W^{(7)}, \mu^{(7)}, \sigma^{(7)}, \gamma^{(7)}, \beta^{(7)}\big) + \mathrm{bn}\big(M^{(1)} * W^{(3)}, \mu^{(3)}, \sigma^{(3)}, \gamma^{(3)}, \beta^{(3)}\big),$$
The calculation formula for the feature maps after the batch normalization (BN) layer is as follows:
$$\mathrm{bn}(M, \mu, \sigma, \gamma, \beta)_{:,i,:,:} = \big(M_{:,i,:,:} - \mu_i\big)\,\frac{\gamma_i}{\sigma_i} + \beta_i,$$

where $M$ represents the HSI feature maps input to the BN layer and $i$ denotes the $i$-th channel of $M$, with $i \le C_2$. After fusing the BN layer and the convolution layer into a single convolution layer, let $\{W', b'\}$ be the kernel and bias converted from $\{W, \mu, \sigma, \gamma, \beta\}$. We have:

$$W'_{i,:,:,:} = \frac{\gamma_i}{\sigma_i}\,W_{i,:,:,:},$$

$$b'_i = -\frac{\mu_i \gamma_i}{\sigma_i} + \beta_i,$$

so that, for every channel $i$,

$$\mathrm{bn}(M * W, \mu, \sigma, \gamma, \beta)_{:,i,:,:} = (M * W')_{:,i,:,:} + b'_i.$$
The above process can be described as follows. The first step is to fuse the 7 × 7 convolution layer and the 3 × 3 convolution layer with the BN layer, respectively. After such transformations, one 7 × 7 kernel, one 3 × 3 kernel and two bias vectors are obtained. Then, the two independent bias vectors are fused into a bias vector by addition. The final 7 × 7 kernel is obtained by adding the 3 × 3 kernel onto the 7 × 7 kernel center. The detailed process is first to pad the 3 × 3 kernel to 7 × 7 with zero-padding; then, the two 7 × 7 kernels are added up.
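The two steps, folding each BN layer into its convolution (Equations (10) and (11)) and merging the zero-padded 3 × 3 kernel into the 7 × 7 kernel, can be sketched in PyTorch as follows; fuse_conv_bn and merge_small_into_large are illustrative helper names, and the sketch assumes bias-free convolutions followed by BN layers whose running statistics are used (inference mode).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """Fold a BN layer into the preceding convolution: W' = (gamma/sigma) W, b' = beta - mu*gamma/sigma."""
    sigma = torch.sqrt(bn.running_var + bn.eps)
    scale = bn.weight / sigma                                   # gamma_i / sigma_i
    fused_w = conv.weight * scale.reshape(-1, 1, 1, 1)          # Eq. (10)
    fused_b = bn.bias - bn.running_mean * scale                 # Eq. (11)
    return fused_w, fused_b

def merge_small_into_large(w7, b7, w3, b3):
    """Zero-pad the 3x3 kernel to 7x7, then add kernels and biases into one equivalent 7x7 kernel."""
    w3_padded = F.pad(w3, [2, 2, 2, 2])                         # center the 3x3 kernel inside 7x7
    return w7 + w3_padded, b7 + b3

# Minimal usage: two parallel branches collapsed into a single 7x7 convolution
# (equivalence holds when the BN layers use their running statistics, i.e., eval mode)
conv7, bn7 = nn.Conv2d(2, 2, 7, padding=3, bias=False), nn.BatchNorm2d(2)
conv3, bn3 = nn.Conv2d(2, 2, 3, padding=1, bias=False), nn.BatchNorm2d(2)
w, b = merge_small_into_large(*fuse_conv_bn(conv7, bn7), *fuse_conv_bn(conv3, bn3))
reparam = nn.Conv2d(2, 2, 7, padding=3)
with torch.no_grad():
    reparam.weight.copy_(w)
    reparam.bias.copy_(b)
```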

3. Experiments

In the first part of this section, we provide a detailed description of the three public HSI datasets that are used in the experiments. In the second part, we introduce the experimental configuration. The last part gives the experiment results.

3.1. Data Description

3.1.1. Indian Pines

The Indian Pines (IP) dataset was acquired with the AVIRIS sensor over the Indian Pines test site in northwestern Indiana. The spatial size of the dataset is 145 × 145, and the number of spectral bands is 224. The dataset creators further refined the dataset by removing bands that caused interference, resulting in a final spectral dimension of 200. Excluding background pixels, the IP dataset contains a total of 10,249 pixels that can be used for classification, which are categorized into 16 classes. In Figure 6, (a) is the false color image, (b) is the label map, and (c) is the corresponding color legend. Detailed information on the training and testing samples of the IP dataset is shown in Table 1.

3.1.2. University of Pavia

The University of Pavia (PU) dataset was acquired with the ROSIS sensor over the University of Pavia in northern Italy. The spatial size of the dataset is 610 × 340, and the number of spectral bands is 115. The dataset creators further refined the dataset by removing bands that caused interference, resulting in a final spectral dimension of 103. Excluding background pixels, the PU dataset contains a total of 42,776 pixels that can be used for classification, which are categorized into 9 classes. In Figure 7, (a) is the false color image, (b) is the label map, and (c) is the corresponding color legend. Detailed information on the training and testing samples of the PU dataset is shown in Table 2.

3.1.3. Salinas

The Salinas (SA) dataset was acquired with the AVIRIS sensor over the Salinas Valley in California. The spatial size of the dataset is 512 × 217, and the number of spectral bands is 224. The dataset creators further refined the dataset by removing bands that caused interference, resulting in a final spectral dimension of 204. Excluding background pixels, the SA dataset contains a total of 54,129 pixels that can be used for classification, which are categorized into 16 classes. In Figure 8, (a) is the false color image, (b) is the label map, and (c) is the corresponding color legend. Detailed information on the training and testing samples of the SA dataset is shown in Table 3.
All the experiments in this paper were conducted on the IP, PU and SA datasets. The experimental data for the three datasets are shown in Table 1, Table 2 and Table 3. The training samples are randomly selected and do not overlap with the test samples.
In the experiments in this paper, 5 samples were selected from each class in the HSI as training samples, and the remaining samples were used as test samples. The numbers of training samples in the IP, PU and SA datasets are 80, 45 and 80, respectively, and the numbers of test samples are 10,169, 42,731 and 54,049, respectively.
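As an illustration of this sampling protocol, the per-class split can be produced as follows; split_per_class is a hypothetical helper, and the paper does not specify the random seed.

```python
import numpy as np

def split_per_class(labels, per_class=5, seed=0):
    """Randomly pick `per_class` labeled pixels of every class for training; the rest are test pixels."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    coords = np.flatnonzero(labels)                        # labeled (non-background) pixels, flattened
    for cls in np.unique(labels[labels > 0]):
        cls_idx = coords[labels.ravel()[coords] == cls]
        chosen = rng.choice(cls_idx, size=per_class, replace=False)
        train_idx.append(chosen)
        test_idx.append(np.setdiff1d(cls_idx, chosen))
    return np.concatenate(train_idx), np.concatenate(test_idx)
```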

3.2. Experimental Configuration

The experiments were run on an Intel(R) Core(TM) i9-10900K CPU @ 3.70 GHz and an Nvidia RTX 4090 graphics processing unit under Windows 10. The deep-learning framework and programming language used are PyTorch 1.8.1 and Python 3.9, respectively.
In the experiments, we use the Kappa coefficient (Kappa), average accuracy (AA) and overall accuracy (OA) to evaluate the effectiveness of the methods. Kappa measures the agreement between the model predictions and the actual classes beyond chance. AA is the mean of the per-class accuracies. OA is the proportion of correctly classified pixels among all test pixels.
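A small sketch of how the three metrics can be computed from predicted and true labels is given below; classification_scores is an illustrative helper and is not code from the paper.

```python
import numpy as np

def classification_scores(y_true, y_pred, num_classes):
    """Overall accuracy, average (per-class) accuracy and Cohen's Kappa from a confusion matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)                     # row = true class, column = predicted class
    oa = np.trace(cm) / cm.sum()                           # correctly classified / total pixels
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))             # mean of per-class accuracies
    pe = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / cm.sum() ** 2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa
```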

3.3. Experimental Results

3.3.1. Ablation Studies

To demonstrate the superiority of the DCSRP method, we conducted ablation experiments. The results are presented in Table 4.
To validate the effectiveness of the proposed DCSRP method, experiments are performed on the three datasets with a simple two-layer convolutional network, with 5 samples randomly selected from each class as training samples. Firstly, we use traditional convolution as a baseline. According to the experimental results of Section 4, we adopt the optimal experimental parameters for each dataset. The specific parameter information for each dataset is as follows. In the IP dataset, the size of the input data blocks is 13 × 13 × 40, the number of parallel convolution kernels is 3, and the kernel size is 9 × 9. In the PU dataset, the size of the input data blocks is 25 × 25 × 10, the number of parallel convolution kernels is 5, and the kernel size is 9 × 9. In the SA dataset, the size of the input data blocks is 23 × 23 × 50, the number of parallel convolution kernels is 5, and the kernel size is 7 × 7. Secondly, we use dynamic convolution to obtain results on the three datasets and compare dynamic convolution with traditional convolution. The results demonstrate that dynamic convolution can improve the performance of the model. This is because multiple convolution kernels are used in the convolution layer, and these parallel kernels are aggregated in a non-linear manner using attention mechanisms, thereby improving the feature representation capability of the model. Finally, we use the proposed method to obtain results on the three datasets and compare the proposed method with dynamic convolution. The results show that OA, AA and Kappa increase to different degrees in the three datasets. This is because the proposed method enhances the performance of the convolution kernel: adding a relatively smaller kernel into a larger kernel enables the larger kernel to capture smaller-scale features, resulting in improved classification accuracy.
It can be seen from Table 4 that replacing traditional static convolution with dynamic convolution can improve the model’s performance. In the IP dataset, there is an increase of 2.70% in OA, 1.67% in AA and 3.16% in Kappa compared to the baseline. In the PU dataset, the three metrics increased by 2.34%, 1.70% and 2.63%, respectively. Similarly, in the SA dataset, there are improvements in OA, AA and Kappa by 1.46%, 1.86% and 1.65%, respectively. Moreover, when replacing dynamic convolution in the model with the proposed method, the performance is further enhanced. In the IP, PU and SA datasets, there are respective increases of 1.80%, 0.52% and 0.35% in OA, 1.67%, 0.55% and 0.05% in AA, and 1.84%, 0.56% and 0.42% in Kappa. Overall, the results of the ablation experiments demonstrate that the proposed method not only improves the model’s feature representation capability but also enhances the performance of large kernels, thereby strengthening the final classification results.

3.3.2. Compared Results in Different Methods

In order to ascertain the validity of the proposed method, we conducted experiments utilizing the following models, including SSRN [30], HybridSN [31], BSNET [54], 3D2DCNN [55], SSAtt [56], SpectralNET [57], JigsawHSI [48] and HPCA [33]. We compared the results obtained from the original networks with the results obtained by replacing the 2D convolution layers in the original networks with the dynamic convolution and proposed method. Table 5 presents the overall accuracy of the different methods on different models in the three datasets. The detailed experimental results are presented in Table 6, Table 7, Table 8. In Table 6, Table 7, Table 8, TC represents using the traditional convolution network structure, DC represents using the dynamic convolution network structure, and PM represents using the proposed method network structure. According to the experimental results of Section 4, on each dataset, we adopt the optimal experimental parameters in the experiments. The following includes the specific parameter information of each dataset. In the IP dataset, the size of the input data blocks is 13 × 13 × 40, the number of parallel convolution kernels is 3, and the kernel size is 9 × 9. In the PU dataset, the size of the input data blocks is 25 × 25 × 10, the number of parallel convolution kernels is 5, and the kernel size is 9 × 9. In the SA dataset, the size of the input data blocks is 23 × 23 × 50, the number of parallel convolution kernels is 5, and the kernel size is 7 × 7.
In the IP dataset, Table 6 displays the classification results of the different methods on the different models. As evident from Table 6, under the same experimental conditions, when the dynamic convolution is added, compared with the experimental results of traditional convolution, the specific results of the increase in OA are 2.24%, 0.75%, 0.52%, 0.70%, 0.51%, 0.93%, 2.21% and 0.83% in SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. When the proposed method is added, compared with the experimental results of dynamic convolution, it further enhances the OA by 0.87%, 1.45%, 3.11%, 1.56%, 0.89%, 1.30%, 0.83% and 2.59% for SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA, respectively. We compare the results by adding the proposed method and traditional method. The specific results of the increase in OA are 3.11%, 2.20%, 3.63%, 2.26%, 1.40%, 2.23%, 3.04% and 3.42% in SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. In the PU dataset, Table 7 displays the classification results of different methods on different models. As evident from Table 7, under the same experimental conditions, when the dynamic convolution is added, compared with the experimental results of traditional convolution, the specific results of the increase in OA are 2.39%, 0.69%, 1.01%, 0.41%, 0.86%, 1.42%, 0.89% and 0.64% in SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. When the proposed method is added, we compare the results adding the proposed method and dynamic convolution, which further enhances the OA by 2.19%, 1.45%, 0.35%, 2.18%, 0.78%, 1.21%, 0.67% and 2.01% for SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. We compare the results by adding the proposed method and the traditional method. The specific results of the increase in OA are 4.58%, 2.14%, 1.36%, 2.59%, 1.64%, 2.63%, 1.56% and 2.65% in SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. In the SA dataset, Table 8 displays the classification results of the different methods on the different models. As evident from Table 8, under the same experimental conditions, when the dynamic convolution is added, compared with the experimental results of traditional convolution, the specific results of the increase in OA are 0.43%, 0.87%, 0.29%, 0.37%, 0.36%, 1.00%, 0.63% and 0.42% in SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. When the proposed method is added, compared with the experimental results of dynamic convolution, it further enhances the OA by 0.50%, 0.65%, 1.37%, 0.59%, 0.94%, 0.51%, 0.68% and 0.61% for SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA. We compare the results by adding the proposed method and the traditional method. The specific results of the increase in OA are 0.93%, 1.52%, 1.66%, 0.96%, 1.30%, 1.51%, 1.31% and 1.03% in SSRN, HybridSN, BSNET, 3D2DCNN, SSAtt, SpectralNET, JigsawHSI and HPCA.
Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15, Figure 16 and Figure 17 visually display the different models’ comparative results of the experiments in the IP, PU and SA datasets and show the classification maps of the IP, PU and SA datasets. As evident from Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15, Figure 16 and Figure 17, when adding the proposed method, the classification reduction maps have smoother boundaries and edges, effectively demonstrating the superiority of the proposed method.
Figure 18, Figure 19, Figure 20 more intuitively show the changed situation in the OA of the different models when adding the dynamic convolution and the proposed method in IP, PU and SA. Purple corresponds to the experimental results for each model when employing the traditional convolution. Green corresponds to the experimental results for each model when employing the dynamic convolution. Orange corresponds to the experimental results for each model when employing the proposed method.

3.3.3. Experimental Results of Different Sizes of Convolutional Kernels

In this section, we verify the mutual encouragement between the large convolutional kernel and the small convolutional kernel. We adopt the optimal experimental parameters on the different datasets. Table 9, Table 10, Table 11, Table 12, Table 13, Table 14 show the experimental results on the three datasets. SC represents the results of using only the small convolutional kernel; BC represents the results of using only the large convolutional kernel; SBC represents the results of concatenating the outputs of the small and the large convolutional kernels; PRO represents the results of the proposed method.
We enumerate the comparison results in different models and describe the results in the order of the OA of the proposed method with the OA of SC, BC and SBC. Table 9 and Table 10 are the results of the IP dataset. In SSRN, the OA increased by 2.19%, 3.54% and 1.49%, respectively.
In HybridSN, the OA increased by 1.97%, 2.20% and 1.47%, respectively. In BSNET, the OA increased by 3.24%, 3.63% and 4.91%, respectively. In 3D2DCNN, the OA increased by 0.20%, 2.26% and 1.39%, respectively. In SSAtt, the OA increased by 1.39%, 1.40% and 3.25%, respectively. In SpectralNET, the OA increased by 1.64%, 2.23% and 2.23%, respectively. In JigsawHSI, the OA increased by 2.23%, 0.05% and 3.04%, respectively. In HPCA, the OA increased by 2.32%, 3.42% and 2.17%, respectively.
Table 11 and Table 12 show the results of the PU dataset. In SSRN, the OA increased by 7.16%, 4.58% and 3.65%, respectively. In HybridSN, the OA increased by 1.66%, 2.14% and −0.78%, respectively. In BSNET, the OA increased by 2.54%, 1.36% and 0.36%, respectively. In 3D2DCNN, the OA increased by 2.42%, 2.59% and 1.59%, respectively. In SSAtt, the OA increased by 5.19%, 1.64% and 2.03%, respectively. In SpectralNET, the OA increased by 3.12%, 2.63% and 0.75%, respectively. In JigsawHSI, the OA increased by 1.32%, 1.56% and 0.93%, respectively. In HPCA, the OA increased by 0.93%, 2.65% and 1.88%, respectively. Although the proposed method gives results that are lower than those of SBC in HybridSN, its results are higher in all the other models.
Table 13 and Table 14 show the results of the SA dataset. In SSRN, the OA increased by 2.04%, 0.93% and 0.44%, respectively. In HybridSN, the OA increased by 1.37%, 1.55% and −0.16%, respectively. In BSNET, the OA increased by 2.64%, 1.66% and 1.55%, respectively. In 3D2DCNN, the OA increased by 2.42%, 0.96% and −0.37%, respectively. In SSAtt, the OA increased by 0.45%, 1.30% and 3.60%, respectively. In SpectralNET, the OA increased by 3.84%, 1.51% and 1.91%, respectively. In JigsawHSI, the OA increased by 3.22%, 1.31% and 0.73%, respectively. In HPCA, the OA increased by 0.83%, 1.03% and 0.86%, respectively. Although the results of the proposed method are lower than the results of SBC in HybridSN and 3D2DCNN, the results of the proposed method are competitive. At the same time, the results of the proposed method are higher than those in the other models.
In summary, the experimental results can prove the mutual encouragement between the large convolutional kernel and the small convolutional kernel of the proposed method.

4. Discussion

4.1. The Influence of the Number of Parallel Re-Parameterization Kernels on Experimental Results

Different inputs produce different attention weights, which leads to different weights for the re-parameterized convolution kernel. The number of parallel re-parameterized convolution kernels also has a certain impact on the experimental results. In order to determine the optimal number of kernels, we compared the results obtained with different numbers of convolution kernels. Six sets of experiments were conducted, with the number of parallel kernels set to one, two, three, four, five and six, respectively. Figure 21 displays the experimental results.
As evident from Figure 21, in the IP dataset, the best result is obtained when the number of parallel convolutional kernels is three. In the PU and SA datasets, the best results are obtained when the number of parallel convolutional kernels is five. Therefore, the relationship between the number of convolution kernels and the experimental results is not monotonic.

4.2. The Influence of Different Channel and Spatial Sizes on Experimental Results

We compared the results of different spatial sizes and spectral dimensions, aiming to identify the optimal spatial size and spectral dimension for each dataset.
The number of spectral dimensions in the extracted 3D patches indicates how much spectral information is available for HSIC. In the experiments, the spectral dimensions were set to {10, 20, 30, 40, 50, 60, 70}. In order to obtain the optimal number of dimensions, we use the control variable method and set the spatial size to 17 × 17 in the IP, PU and SA.
Figure 22 displays the experimental results of the OA with different spectral dimensions in different datasets. In the IP dataset, increasing the spectral dimension initially leads to an improvement in OA, followed by a subsequent decrease. Notably, the highest OA is observed when the spectral dimension is 40. Conversely, the PU dataset exhibits a different trend. Initially, the OA decreases with an increase in the spectral dimension, followed by an increase and ultimately another decrease. The optimal number of channels for the highest OA is observed at 10. Moving on to the SA dataset, as the number of channels increases, the OA first improves and then declines. The highest OA is achieved when the number of channels is 50. Consequently, the optimal spectral dimension varies across the three datasets. Specifically, for the IP, PU and SA datasets, the proposed method determined the optimal spectral dimensions to be 40, 10 and 50, respectively.
The spatial size in the extracted 3D patches indicates how much spatial information is available for HSIC. In this paper, the effect of spatial size on the performance of the proposed method is verified in three datasets. In the experiments, the spatial size was set to {11 × 11, 13 × 13, 15 × 15, 17 × 17, 19 × 19, 21 × 21, 23 × 23, 25 × 25, 27 × 27, 29 × 29}, and the number of channels of the IP, PU and SA datasets was set to 40, 10 and 50, respectively.
Figure 23 displays the experimental results of the OA with different spatial sizes in different datasets. For the IP dataset, there is a clear decreasing trend in OA with an increase in spatial size. The highest OA is achieved when the spatial size is 13 × 13. The PU dataset does not exhibit a discernible trend in OA as the spatial size increases. Notably, the OA achieves its maximum when the spatial size is configured as 25 × 25. Turning to the SA dataset, the OA demonstrates a pattern of first decreasing, then increasing and subsequently decreasing again with the increase in spatial size. The highest OA is attained when the spatial size is 23 × 23. Consequently, the optimal spatial size varies across the three datasets. Specifically, for the IP, PU and SA datasets, the results determine the optimal spatial sizes to be 13 × 13, 25 × 25 and 23 × 23, respectively.

4.3. The Influence of Different Re-Parameterization Kernel Sizes

Adding relatively smaller kernels to larger kernels can allow the large kernel to capture small-scale features. Therefore, the experimental results will change with the size of the large kernel. To determine the optimal size of the kernel, we compared the results of adding a 3 × 3 kernel to larger kernels of different sizes, such as {5 × 5, 7 × 7, 9 × 9, 11 × 11 and 13 × 13}. Figure 24 displays the experimental results.
In Figure 24, the experimental results generally exhibit a pattern of initially increasing and subsequently decreasing in the three datasets. This trend may be attributed to the limitation of the input spatial sizes. When the kernel size is 13 × 13, the experimental results decrease significantly, possibly because the convolution kernel is nearly as large as the input data. The optimal result is attained with a kernel size of 7 × 7 in the SA dataset, whereas in the IP and PU datasets, the best results are obtained when the kernel size is 9 × 9. Therefore, the relationship between the kernel size and the experimental results is not monotonic.

5. Conclusions

This paper proposes a plug-and-play method called dynamic convolution based on structural re-parameterization (DCSRP). Firstly, dynamic convolution is a non-linear function, so it has more representation power; in addition, it can adaptively capture the contextual features of the input data. Secondly, the large convolutional kernel and the small convolutional kernel are integrated into a new large convolutional kernel, which can capture global information and local information at the same time. To validate the effectiveness of the proposed method, we compared the experimental results of the original networks with the results obtained by adding the proposed method to the original networks on three datasets. The results illustrate that the proposed method is proficient at extracting spatial features and improving the models' performance.

Author Contributions

Conceptualization: C.D. and X.L.; methodology: C.D. and X.L.; validation: X.L., J.C., M.Z. and Y.X.; investigation: X.L., J.C., M.Z. and Y.X.; writing—original draft preparation: C.D. and L.Z.; writing—review and editing: C.D. and X.L.; supervision: C.D. and L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (grant nos. 61901369 and 62101454) and the Xi’an University of Posts and Telecommunications Graduate Innovation Foundation (grant no. CXJJYL2022040).

Data Availability Statement

The Indian Pines, University of Pavia and Salinas datasets are available online at https://ehu.eus/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes#userconsent (accessed on 1 May 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Flores, H.; Lorenz, S.; Jackisch, R.; Tusa, L.; Contreras, I.C.; Zimmermann, R.; Gloaguen, R. UAS-based hyperspectral environmental monitoring of acid mine drainage affected waters. Minerals 2021, 11, 182. [Google Scholar] [CrossRef]
  2. Gao, A.F.; Rasmussen, B.; Kulits, P.; Scheller, E.L.; Greenberger, R.; Ehlmann, B.L. Generalized unsupervised clustering of hyperspectral images of geological targets in the near infrared. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 4294–4303. [Google Scholar]
  3. Zhou, S.; Sun, L.; Ji, Y. Germination Prediction of Sugar Beet Seeds Based on HSI and SVM-RBF. In Proceedings of the 4th International Conference on Measurement, Information and Control (ICMIC), Harbin, China, 23–25 August 2019; pp. 93–97. [Google Scholar]
  4. Haq, M.A.; Rahaman, G.; Baral, P.; Ghosh, A. Deep learning based supervised image classification using UAV images for forest areas classification. J. Indian Soc. Remote Sens. 2021, 49, 601–606. [Google Scholar] [CrossRef]
  5. Pan, B.; Shi, Z.; An, Z.; Jiang, Z.; Ma, Y. A novel spectral-unmixing-based green algae area estimation method for GOCI data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 10, 437–449. [Google Scholar] [CrossRef]
  6. Makki, I.; Younes, R.; Francis, C.; Bianchi, T.; Zucchetti, M. A survey of landmine detection using hyperspectral imaging. ISPRS J. Photogramm. Remote Sens. 2017, 124, 40–53. [Google Scholar] [CrossRef]
  7. Shimoni, M.; Haelterman, R.; Perneel, C. Hypersectral imaging for military and security applications: Combining myriad processing and sensing techniques. IEEE Geosci. Remote Sens. Mag. 2019, 7, 101–117. [Google Scholar] [CrossRef]
  8. Nigam, R.; Bhattacharya, B.K.; Kot, R.; Chattopadhyay, C. Wheat blast detection and assessment combining ground-based hyperspectral and satellite based multispectral data. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 473–475. [Google Scholar] [CrossRef]
  9. Gualtieri, J.A.; Cromp, R.F. Support Vector Machines for Hyperspectral Remote Sensing Classification. In Proceedings of the 27th AIPR Workshop: Advances in Computer-Assisted Recognition, Washington, DC, USA, 14–16 October 1998; SPIE: Bellingham, WA, USA, 1999; Volume 3584, pp. 221–232. [Google Scholar]
  10. Jain, V.; Phophalia, A. Exponential Weighted Random Forest for Hyperspectral Image Classification. In Proceedings of the IGARSS 2019—2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 3297–3300. [Google Scholar]
  11. Chen, Y.; Lin, Z.; Zhao, X. Riemannian manifold learning based k-nearest-neighbor for hyperspectral image classification. In Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium-IGARSS, Melbourne, Australia, 21–26 July 2013; pp. 1975–1978. [Google Scholar]
  12. Wu, Z.; Shi, L.; Li, J.; Wang, Q.; Sun, L.; Wei, Z.; Plaza, J.; Plaza, A. GPU parallel implementation of spatially adaptive hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 11, 1131–1143. [Google Scholar] [CrossRef]
  13. Chen, Y.; Nasrabadi, N.M.; Tran, T.D. Hyperspectral image classification using dictionary-based sparse representation. IEEE Trans. Geosci. Remote Sens. 2011, 49, 3973–3985. [Google Scholar] [CrossRef]
  14. Cheng, G.; Li, Z.; Han, J.; Yao, X.; Guo, L. Exploring hierarchical convolutional features for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6712–6722. [Google Scholar] [CrossRef]
  15. Li, J.; Marpu, P.R.; Plaza, A.; Bioucas-Dias, J.M.; Benediktsson, J.A. Generalized composite kernel framework for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2013, 51, 4816–4829. [Google Scholar] [CrossRef]
  16. Zhou, Y.; Peng, J.; Chen, C.L.P. Extreme learning machine with composite kernels for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 8, 2351–2360. [Google Scholar] [CrossRef]
  17. Fan, W.; Zhang, R.; Wu, Q. Hyperspectral image classification based on PCA network. In Proceedings of the 2016 8th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), Los Angeles, CA, USA, 21–24 August 2016. [Google Scholar]
  18. Gao, F.; Dong, J.; Li, B.; Xu, Q. Automatic change detection in synthetic aperture radar images based on PCANet. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1792–1796. [Google Scholar] [CrossRef]
  19. Zhou, S.; Xue, Z.; Du, P. Semisupervised stacked autoencoder with cotraining for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 3813–3826. [Google Scholar] [CrossRef]
  20. Shi, C.; Pun, C.M. Multiscale superpixel-based hyperspectral image classification using recurrent neural networks with stacked autoencoders. IEEE Trans. Multimed. 2019, 22, 487–501. [Google Scholar] [CrossRef]
  21. Midhun, M.E.; Nair, S.R.; Prabhakar, V.T.N.; Kumar, S.S. Deep Model for Classification of Hyperspectral Image Using Restricted Boltzmann Machine. In Proceedings of the International Conference on Interdisciplinary Advances in Applied Computing, Amritapuri, India, 10–11 October 2014; pp. 1–7. [Google Scholar]
  22. Li, T.; Zhang, J.; Zhang, Y. Classification of Hyperspectral Image Based on Deep Belief Networks. In Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 5132–5136. [Google Scholar]
  23. Jiang, Z.; Pan, W.D.; Shen, H. Universal Golomb–Rice Coding Parameter Estimation Using Deep Belief Networks for Hyperspectral Image Compression. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3830–3840. [Google Scholar] [CrossRef]
  24. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef]
  25. Yang, J.; Xiao, L.; Zhao, Y.Q.; Chan, J.C.W. Variational regularization network with attentive deep prior for hyperspectral–multispectral image fusion. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–17. [Google Scholar] [CrossRef]
  26. Paoletti, M.E.; Haut, J.M.; Fernandez-Beltran, R.; Plaza, J.; Plaza, A.J.; Pla, F. Deep pyramidal residual networks for spectral–spatial hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 740–754. [Google Scholar] [CrossRef]
  27. Paoletti, M.E.; Haut, J.M.; Fernandez-Beltran, R.; Plaza, J.; Plaza, A.; Li, J.; Pla, F. Capsule networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 2145–2160. [Google Scholar] [CrossRef]
  28. Li, X.; Ding, M.; Gu, Y.; Pižurica, A. An End-to-End Framework for Joint Denoising and Classification of Hyperspectral Images. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3269–3283. [Google Scholar] [CrossRef]
  29. Xue, J.; Zhao, Y.; Huang, S.; Liao, W.; Chan, J.; Kong, S. Multilayer sparsity-based tensor decomposition for low-rank tensor completion. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6916–6930. [Google Scholar] [CrossRef] [PubMed]
  30. Zilong, Z.; Li, J.; Luo, Z.; Chapman, M. Spectral–spatial residual network for hyperspectral image classification: A 3-D deep learning framework. IEEE Trans. Geosci. Remote Sens. 2017, 56, 847–858. [Google Scholar]
  31. Kumar, R.S.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D-2-D CNN feature hierarchy for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2019, 17, 277–281. [Google Scholar]
  32. Li, X.; Ding, M.; Pizurca, A. Deep feature fusion via two-stream convolutional neural network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 2615–2629. [Google Scholar] [CrossRef]
  33. Ding, C.; Chen, Y.; Li, R.; Wen, D.; Xie, X.; Zhang, L.; Wei, W.; Zhang, Y. Integrating hybrid pyramid feature fusion and coordinate attention for effective small sample hyperspectral image classification. Remote Sens. 2022, 14, 2355. [Google Scholar] [CrossRef]
  34. Yu, C.; Han, R.; Song, M.; Liu, C.; Chang, C.-I. Feedback attentionbased dense CNN for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5501916. [Google Scholar]
  35. Luo, W.; Li, Y.; Urtasun, R.; Zemel, R. Understanding the effective receptive field in deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2016, 29, 2476. [Google Scholar]
  36. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
  37. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31. [Google Scholar]
  38. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
  39. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  40. Guo, Z.; Zhang, X.; Mu, H.; Heng, W.; Liu, Z.; Wei, Y.; Sun, J. Single path one-shot neural architecture search with uniform sampling. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 544–560. [Google Scholar]
  41. Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V.; et al. Searching for Mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324. [Google Scholar]
  42. Liu, H.; Simonyan, K.; Yang, Y. Darts: Differentiable architecture search. arXiv 2018, arXiv:1806.09055. [Google Scholar]
43. Zoph, B.; Le, Q.V. Neural architecture search with reinforcement learning. arXiv 2016, arXiv:1611.01578. [Google Scholar]
  44. Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986. [Google Scholar]
  45. Shang, R.; Chang, H.; Zhang, W.; Feng, J.; Li, Y. Hyperspectral image classification based on multiscale cross-branch response and second-order channel attention. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5532016. [Google Scholar] [CrossRef]
  46. Ma, Y.; Liu, Z.; Chen, C.L.P. Multiscale random convolution broad learning system for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2021, 19, 5503605. [Google Scholar] [CrossRef]
  47. Jayasree, S.; Khanna, Y.; Mukhopadhyay, J. A CNN with Multiscale Convolution for Hyperspectral Image Classification Using Target-Pixel-Orientation Scheme. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021. [Google Scholar]
  48. Moraga, J.; Duzgun, H.S. JigsawHSI: A network for hyperspectral image classification. arXiv 2022, arXiv:2206.02327. [Google Scholar]
  49. Ding, X.; Zhang, X.; Han, J.; Ding, G. Scaling up Your Kernels to 31 × 31: Revisiting Large Kernel Design in CNNs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11963–11975. [Google Scholar]
  50. Radosavovic, I.; Kosaraju, R.P.; Girshick, R.; He, K.; Dollár, P. Designing Network Design Spaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10428–10436. [Google Scholar]
  51. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. Shufflenet: An extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856. [Google Scholar]
  52. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  53. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  54. Cai, Y.; Liu, X.; Cai, Z. BS-Nets: An end-to-end framework for band selection of hyperspectral image. IEEE Trans. Geosci. Remote Sens. 2019, 58, 1969–1984. [Google Scholar] [CrossRef]
  55. Ge, Z.; Cao, G.; Li, X.; Fu, P. Hyperspectral image classification method based on 2D–3D CNN and multibranch feature fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5776–5788. [Google Scholar] [CrossRef]
  56. Hang, R.; Li, Z.; Liu, Q.; Ghamisi, P.; Bhattacharyya, S.S. Hyperspectral image classification with attention-aided CNNs. IEEE Trans. Geosci. Remote Sens. 2020, 59, 2281–2293. [Google Scholar] [CrossRef]
57. Chakraborty, T.; Trehan, U. SpectralNET: Exploring spatial-spectral WaveletCNN for hyperspectral image classification. arXiv 2021, arXiv:2104.00341. [Google Scholar]
Figure 1. Overall framework of proposed method.
Figure 2. An example of using DCSRP. (a) Original network structure; (b) using DYConv network structure; (c) using DCSRP network structure.
Figure 3. The framework of dynamic convolution.
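For readers who want a concrete reference point for Figure 3, the sketch below illustrates the general dynamic-convolution idea: several parallel kernels are aggregated with input-dependent attention weights, so the effective kernel adapts to each sample and the layer is no longer a linear function of its parameters. It is a minimal PyTorch sketch under assumed settings; the class name, number of kernels, reduction ratio, stride of 1, and odd kernel size are illustrative choices, not the exact layer used in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """Minimal dynamic-convolution sketch: K parallel kernels fused by
    per-sample attention weights (assumes stride 1 and an odd kernel size)."""
    def __init__(self, in_ch, out_ch, kernel_size, num_kernels=4, reduction=4):
        super().__init__()
        self.num_kernels = num_kernels
        self.kernel_size = kernel_size
        self.out_ch = out_ch
        # K parallel convolution kernels stored in one parameter tensor.
        self.weight = nn.Parameter(
            0.01 * torch.randn(num_kernels, out_ch, in_ch, kernel_size, kernel_size))
        self.bias = nn.Parameter(torch.zeros(num_kernels, out_ch))
        # Attention branch: global pooling, a bottleneck, and K logits.
        hidden = max(in_ch // reduction, 1)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, num_kernels, 1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # Per-sample attention over the K kernels; softmax keeps the weights summing to 1.
        pi = F.softmax(self.attn(x).view(b, self.num_kernels), dim=1)
        # Aggregate the K kernels (and biases) into one kernel per sample.
        w_agg = torch.einsum('bk,koihw->boihw', pi, self.weight)
        b_agg = torch.einsum('bk,ko->bo', pi, self.bias).reshape(-1)
        # Grouped-convolution trick: treat the batch dimension as groups so that
        # every sample is convolved with its own aggregated kernel.
        out = F.conv2d(
            x.reshape(1, b * c, h, w),
            w_agg.reshape(b * self.out_ch, c, self.kernel_size, self.kernel_size),
            b_agg, padding=self.kernel_size // 2, groups=b)
        return out.view(b, self.out_ch, h, w)
```

Because the attention weights depend on the input, the aggregated kernel changes from sample to sample, which is what distinguishes this layer from an ordinary convolution with fixed weights.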
Figure 4. Framework of squeeze-and-excitation method.
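Figure 4 shows the squeeze-and-excitation mechanism used to produce attention weights. Below is a minimal sketch of a standard squeeze-and-excitation block for reference; the reduction ratio and names are illustrative assumptions rather than the exact configuration used here.

```python
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Standard squeeze-and-excitation block: global average pooling (squeeze)
    followed by a bottleneck MLP with sigmoid gating (excitation)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)               # squeeze: B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = self.pool(x).view(b, c)                       # per-channel descriptor
        w = self.fc(s).view(b, c, 1, 1)                   # gating weights in (0, 1)
        return x * w                                      # excitation: rescale channels
```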
Figure 5. An example of structural re-parameterization. For ease of visualization, we assume that the numbers of input and output channels are both 2; thus, the 3 × 3 layer has four 3 × 3 kernel matrices and the 7 × 7 layer has four 7 × 7 kernel matrices.
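The equivalence behind Figure 5 follows from the linearity of convolution: a small-kernel branch can be zero-padded to the large kernel size (centre-aligned) and added onto the large kernel, so one merged convolution reproduces the sum of the two parallel branches at inference time. The snippet below is a minimal sketch of this merging with plain convolution weights (no BatchNorm folding), assuming odd kernel sizes, stride 1, and 'same' padding; the function name is illustrative.

```python
import torch
import torch.nn.functional as F

def merge_small_into_large(large_w, large_b, small_w, small_b):
    """Fold a parallel small-kernel branch into a large-kernel convolution so that
    conv(x, merged) == conv(x, large) + conv(x, small) under 'same' padding."""
    pad = (large_w.shape[-1] - small_w.shape[-1]) // 2
    # Zero-pad the small kernel to the large kernel size, centre-aligned, then add.
    merged_w = large_w + F.pad(small_w, [pad, pad, pad, pad])
    merged_b = large_b + small_b
    return merged_w, merged_b

# Quick numerical check of the equivalence (2 input and 2 output channels, as in Figure 5).
if __name__ == "__main__":
    x = torch.randn(1, 2, 16, 16)
    w7, b7 = torch.randn(2, 2, 7, 7), torch.randn(2)
    w3, b3 = torch.randn(2, 2, 3, 3), torch.randn(2)
    two_branch = F.conv2d(x, w7, b7, padding=3) + F.conv2d(x, w3, b3, padding=1)
    wm, bm = merge_small_into_large(w7, b7, w3, b3)
    one_branch = F.conv2d(x, wm, bm, padding=3)
    print(torch.allclose(two_branch, one_branch, atol=1e-5))  # expected to print True
```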
Figure 6. The Indian Pines dataset. (a) False color image. (b) Label map. (c) The corresponding color labels.
Figure 7. The University of Pavia dataset. (a) False color image. (b) Label map. (c) The corresponding color labels.
Figure 8. The Salinas dataset. (a) False color image. (b) Label map. (c) The corresponding color labels.
Figure 9. (a–i) are the classification maps of different methods in the IP dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 10. (a–i) are the classification maps obtained when dynamic convolution is added to the different methods in the IP dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 11. (a–i) are the classification maps obtained when the proposed method is added to the different methods in the IP dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 12. (a–i) are the classification maps of different methods in the PU dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 13. (a–i) are the classification maps obtained when dynamic convolution is added to the different methods in the PU dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 14. (a–i) are the classification maps obtained when the proposed method is added to the different methods in the PU dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 15. (a–i) are the classification maps of different methods in the SA dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 16. (a–i) are the classification maps obtained when dynamic convolution is added to the different methods in the SA dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 17. (a–i) are the classification maps obtained when the proposed method is added to the different methods in the SA dataset. (a) Label map; (b) SSRN; (c) HybridSN; (d) BSNET; (e) 3D2DCNN; (f) SSAtt; (g) SpectralNET; (h) JigsawHSI; (i) HPCA.
Figure 18. The OA of the different models in the IP dataset when the different methods are added.
Figure 19. The OA of the different models in the PU dataset when the different methods are added.
Figure 20. The OA of the different models in the SA dataset when the different methods are added.
Figure 21. The results obtained with different numbers of parallel convolution kernels in the three datasets.
Figure 22. The OA results obtained with different spectral dimensions in the three datasets.
Figure 23. The OA results obtained with different spatial sizes in the three datasets.
Figure 24. The results obtained with different re-parameterization kernel sizes in the three datasets.
Table 1. The number of samples used for training and testing in the IP dataset.
Class | Name | Total Samples | Train Samples | Test Samples
1 | Alfalfa | 46 | 5 | 41
2 | Corn-notill | 1428 | 5 | 1423
3 | Corn-mintill | 830 | 5 | 825
4 | Corn | 237 | 5 | 232
5 | Grass-pasture | 483 | 5 | 478
6 | Grass-trees | 730 | 5 | 725
7 | Grass-pasture-mowed | 28 | 5 | 23
8 | Background | 478 | 5 | 473
9 | Oats | 20 | 5 | 15
10 | Soybean-notill | 972 | 5 | 967
11 | Soybean-mintill | 2455 | 5 | 2450
12 | Soybean-clean | 593 | 5 | 588
13 | Wheat | 205 | 5 | 200
14 | Woods | 1265 | 5 | 1260
15 | Buildings-grass-trees-drives | 386 | 5 | 381
16 | Stone-steel-towers | 93 | 5 | 88
Total |  | 10,249 | 80 | 10,169
Table 2. The number of samples used for training and testing in the PU dataset.
Class | Name | Total Samples | Train Samples | Test Samples
1 | Asphalt | 6631 | 5 | 6626
2 | Meadows | 18,649 | 5 | 18,644
3 | Gravel | 2099 | 5 | 2094
4 | Trees | 3064 | 5 | 3059
5 | Painted metal sheets | 1345 | 5 | 1340
6 | Bare soil | 5029 | 5 | 5024
7 | Bitumen | 1330 | 5 | 1325
8 | Self-blocking bricks | 3682 | 5 | 3677
9 | Shadows | 947 | 5 | 942
Total |  | 42,776 | 45 | 42,731
Table 3. The number of samples used for training and testing in the SA dataset.
Class | Name | Total Samples | Train Samples | Test Samples
1 | Brocoli_green_weeds_1 | 2009 | 5 | 2004
2 | Brocoli_green_weeds_2 | 3726 | 5 | 3721
3 | Fallow | 1976 | 5 | 1971
4 | Fallow rough plow | 1394 | 5 | 1389
5 | Fallow smooth | 2678 | 5 | 2673
6 | Stubble | 3959 | 5 | 3954
7 | Celery | 3579 | 5 | 3574
8 | Grapes untrained | 11,271 | 5 | 11,266
9 | Soil vineyard develop | 6203 | 5 | 6198
10 | Corn senesced green weeds | 3278 | 5 | 3273
11 | Lettuce_romaine_4wkl | 1068 | 5 | 1063
12 | Lettuce_romaine_5wkl | 1927 | 5 | 1922
13 | Lettuce_romaine_6wkl | 916 | 5 | 911
14 | Lettuce_romaine_7wkl | 1070 | 5 | 1065
15 | Vineyard untrained | 7268 | 5 | 7263
16 | Vineyard vertical trellis | 1807 | 5 | 1802
Total |  | 54,129 | 80 | 54,049
Table 4. Experimental results of ablation studies.
Method | IP OA (%) | IP AA (%) | PU OA (%) | PU AA (%) | SA OA (%) | SA AA (%)
Traditional convolution | 62.70 | 76.69 | 70.52 | 74.20 | 90.02 | 92.69
Dynamic convolution | 65.40 | 78.36 | 72.86 | 75.90 | 91.48 | 94.55
Proposed method | 67.25 | 80.03 | 73.38 | 76.45 | 91.83 | 94.60
Table 5. The overall accuracy of different methods on different models in the three datasets.
Model | Method | IP OA (%) | PU OA (%) | SA OA (%)
SSRN [47] | Traditional convolution | 66.50 | 70.97 | 91.05
SSRN [47] | Dynamic convolution | 68.74 | 73.36 | 91.48
SSRN [47] | Proposed method | 69.61 | 75.55 | 91.98
HybridSN [48] | Traditional convolution | 63.12 | 66.92 | 86.70
HybridSN [48] | Dynamic convolution | 63.87 | 67.61 | 87.57
HybridSN [48] | Proposed method | 65.32 | 69.06 | 88.25
BSNET [50] | Traditional convolution | 60.84 | 67.69 | 88.75
BSNET [50] | Dynamic convolution | 61.36 | 68.70 | 89.04
BSNET [50] | Proposed method | 64.47 | 69.05 | 90.41
3D2D-CNN [46] | Traditional convolution | 68.12 | 69.09 | 93.16
3D2D-CNN [46] | Dynamic convolution | 68.82 | 69.50 | 93.53
3D2D-CNN [46] | Proposed method | 70.38 | 71.68 | 94.12
SSAtt [53] | Traditional convolution | 67.05 | 67.55 | 87.85
SSAtt [53] | Dynamic convolution | 67.56 | 68.41 | 88.21
SSAtt [53] | Proposed method | 68.45 | 69.19 | 89.15
SpectralNET [52] | Traditional convolution | 66.86 | 68.12 | 89.78
SpectralNET [52] | Dynamic convolution | 67.79 | 69.54 | 90.78
SpectralNET [52] | Proposed method | 69.09 | 70.75 | 91.29
JigsawHSI [51] | Traditional convolution | 64.30 | 69.04 | 89.68
JigsawHSI [51] | Dynamic convolution | 66.51 | 69.93 | 90.31
JigsawHSI [51] | Proposed method | 67.34 | 70.60 | 90.99
HPCA [49] | Traditional convolution | 67.17 | 69.24 | 91.26
HPCA [49] | Dynamic convolution | 68.00 | 69.88 | 91.68
HPCA [49] | Proposed method | 70.59 | 71.89 | 92.29
Table 6. The classification results of different methods on different models in the IP dataset.
Class | SSRN (TC / DC / PM) | HybridSN (TC / DC / PM) | BSNET (TC / DC / PM) | 3D2DCNN (TC / DC / PM) | SSAtt (TC / DC / PM) | SpectralNET (TC / DC / PM) | JigsawHSI (TC / DC / PM) | HPCA (TC / DC / PM)
(TC = traditional convolution, DC = dynamic convolution, PM = proposed method; rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
128.6763.0839.0527.4035.6516.1830.2353.0623.6844.5758.5783.3388.1065.5769.8166.1382.0038.6894.7476.9286.3676.9266.1393.18
278.6276.8968.1872.7468.9450.2852.5761.7957.7983.1578.4885.8374.6889.1067.2662.1546.7467.5849.0776.5366.1783.0491.2681.08
360.1171.8956.9352.6751.5866.8642.4638.5763.8235.5450.5787.8235.6335.4253.5054.2157.0154.8577.7069.5352.4491.1139.1175.80
475.7451.7277.9134.1871.0368.7029.1141.5045.77100.0098.73100.0031.4068.1643.5631.4270.4267.3065.6079.8674.7977.2917.8963.60
597.7463.2581.0290.4677.2382.1776.8673.2790.8872.7385.0484.8585.5887.6978.8959.7270.7760.9562.4384.5484.4775.6283.1382.96
678.6773.2991.4487.0188.1691.1463.1257.1360.6669.6579.5080.5074.9476.2181.6671.0061.7064.3796.5590.0863.4769.8274.0282.36
75.9321.1021.30100.0033.9640.746.9514.7130.2619.6625.0015.2312.9220.1816.7913.1474.1914.944.8022.1239.6614.2024.7321.10
898.81100.0098.9465.0286.5295.5699.0488.1498.84100.0098.7499.79100.0100.0100.097.73100.0099.58100.0095.30100.00100.00100.00100.00
921.4316.489.623.0992.3165.2213.513.107.4617.6511.2871.4312.1018.185.8425.0018.5221.1342.8618.8415.3131.9110.879.74
1049.4466.3976.9871.2263.8961.6954.6781.7481.3768.2670.5354.8279.2176.4468.2993.1972.5979.5363.0152.8663.1748.2091.4889.46
1177.8382.0181.6474.1975.8280.2286.9075.3672.0783.4683.8378.2177.4266.0878.8895.1276.5082.3471.0860.5273.9279.2681.8077.87
1226.7532.9332.1724.7623.3930.7537.7760.5430.2734.4625.7630.6839.1434.0535.7132.5934.2940.6669.6834.4932.6632.8538.4128.66
1362.7466.4555.8782.1794.9586.9078.7153.7661.3572.9959.5279.0566.9071.9485.7852.4967.8077.8251.4294.2987.3960.3067.9459.00
1499.9199.00100.0095.4577.06100.0093.4196.2396.78100.0096.8493.2098.9098.3499.7998.2098.0099.3487.3785.4889.4198.3899.9099.18
1582.6259.1856.5752.6344.6643.7648.1165.0543.9762.5063.2788.2071.5271.9051.3179.1570.0066.9190.6464.0776.1865.2557.1465.43
1636.8228.3053.6626.9176.0037.1340.9311.8032.7134.3838.6013.9247.8351.7657.8959.0674.5821.9531.5428.1219.2119.3829.6325.73
OA (%)66.5068.7469.6163.1263.8765.3260.8461.3664.4768.1268.8270.3867.0567.5668.4566.8667.7969.0964.3066.5167.3467.1768.0070.59
AA (%)77.6781.7280.5473.5972.3676.8570.9369.6974.8079.8280.8580.8177.4678.0779.4879.1580.6581.4672.5574.0475.3680.5881.3181.39
Kappa × 10062.4365.1165.9758.6359.4261.1856.5956.8659.9664.3765.2366.7663.0463.2164.6563.1463.7559.9159.9161.8063.0463.5264.4867.05
Table 7. The classification results of different methods on different models in the PU dataset.
Class | SSRN (TC / DC / PM) | HybridSN (TC / DC / PM) | BSNET (TC / DC / PM) | 3D2DCNN (TC / DC / PM) | SSAtt (TC / DC / PM) | SpectralNET (TC / DC / PM) | JigsawHSI (TC / DC / PM) | HPCA (TC / DC / PM)
(TC = traditional convolution, DC = dynamic convolution, PM = proposed method; rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
186.6277.6988.3971.0469.2467.0586.9091.5888.9478.9682.7282.0084.8491.2274.1369.7271.4063.9172.4485.0683.3977.2386.4787.48
296.3392.7499.3686.7884.2793.1695.6381.0590.0897.4197.2095.0895.9393.1287.9698.6487.5698.3496.0791.6184.8093.8490.1399.44
366.6163.7068.7241.5983.9733.6435.4631.4246.6757.5184.0666.0230.6153.9966.3844.4972.6544.2837.6828.1641.1476.2071.1969.53
473.0931.8273.8876.1650.6576.2057.5176.6445.0241.2842.6240.8166.1363.8935.1380.0630.7452.1054.0759.6758.3033.7026.6836.79
558.9967.2688.9180.4181.9686.6096.4562.7088.0281.7482.9988.5967.6383.2384.2281.4173.7987.1281.2586.4377.1985.3078.8470.42
640.9379.6760.3460.0845.8148.7855.1554.2256.6749.7446.8152.2146.3244.0668.2774.4777.1166.6962.1269.2981.4261.8768.1255.49
754.4963.5942.9847.979.2931.0419.4827.4169.3449.1156.7867.4957.4033.2618.2656.4934.9938.5925.1035.7327.4244.2183.4159.39
881.2681.8270.0229.3759.3638.7668.3965.4075.9274.0974.7875.5967.8056.7076.0665.7464.0277.0359.1059.7273.5970.2998.9666.87
927.4539.1723.1014.6487.9448.4425.9469.2819.0140.6835.8731.3552.5578.2347.099.8278.3926.2837.0834.3046.7827.4523.1549.66
OA (%)70.9773.3675.5566.9267.6169.0667.6968.7069.0569.0969.5071.6867.5568.4169.1968.1269.6470.7569.0469.9370.6069.2469.8871.89
AA (%)74.0174.0678.7160.0158.6162.8365.1062.3267.4375.6178.4576.6570.6871.8265.2370.6170.1872.3064.6165.4866.3075.2771.2777.14
Kappa × 10063.9966.4069.5656.5457.7589.9059.0958.0060.4262.0462.8664.7359.5260.3360.2460.7261.0563.4460.7461.2160.9862.0560.0365.28
Table 8. The classification results of different methods on different models in the SA dataset.
Class | SSRN (TC / DC / PM) | HybridSN (TC / DC / PM) | BSNET (TC / DC / PM) | 3D2DCNN (TC / DC / PM) | SSAtt (TC / DC / PM) | SpectralNET (TC / DC / PM) | JigsawHSI (TC / DC / PM) | HPCA (TC / DC / PM)
(TC = traditional convolution, DC = dynamic convolution, PM = proposed method; rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
178.6598.91100.0099.8598.7791.3469.63100.0085.4997.3892.6396.21100.099.75100.076.2399.4095.9898.1892.3995.4783.5197.28100.00
299.0999.9797.4699.4998.13100.0099.7297.15100.00100.0099.9299.6897.9093.3295.17100.099.5092.0899.7999.7399.3399.9499.3199.89
399.9099.70100.0081.4692.5885.7696.1895.6081.7999.34100.0099.8094.1196.7488.2299.8499.52100.099.9091.7196.83100.00100.00100.00
484.1197.0396.5889.9190.7589.8698.0577.5395.0594.9595.9093.7998.9077.1284.9655.6586.7281.3696.1677.9499.6091.5392.5895.66
597.5693.2598.4389.0586.5897.1596.5096.5391.9199.3699.6396.7494.8197.4395.4196.2393.3884.0090.7996.9493.9687.5775.6290.93
696.2897.2998.3585.6896.3697.8591.9394.8999.8099.2799.2599.4299.9099.4099.7597.5699.6599.2298.6590.3496.1399.77100.0099.75
799.5799.9299.83100.00100.0098.4696.9198.4597.51100.0099.8399.52100.0100.0100.099.61100.0100.093.9987.1888.77100.0099.78100.00
886.1785.4286.3993.3891.4090.5598.4299.9598.1987.1488.0294.9995.5181.5987.8790.5293.3297.4782.6298.5297.8696.1797.7396.85
998.8897.6598.75100.0098.9998.3599.0593.0792.0198.6399.4798.8599.4499.6499.61100.099.9797.6598.0495.6590.2496.23100.0097.76
10100.0097.5299.6785.0988.3397.0885.4399.1697.65100.0099.9396.7697.8897.8298.7399.1298.9399.8399.1098.97100.0096.65100.0097.74
1194.6681.4682.5859.7860.7873.8080.2955.8665.6895.5984.4388.2274.7586.8386.8482.8985.1879.9190.6188.6197.4478.1689.0178.80
12100.0099.4297.9999.4798.2599.4276.5498.1698.23100.00100.00100.0097.7293.52100.098.5699.6399.5093.8999.9498.6097.5786.8198.20
1395.4397.2396.9695.7466.7996.3689.6953.1399.6690.0292.6898.9198.0399.0194.9087.7297.6288.1799.3799.8978.9297.0964.1797.77
1486.2784.6891.4577.6293.3863.0783.2960.7388.5088.6887.2189.0363.4872.7095.9493.8098.7995.9787.2279.8992.5395.3885.9695.09
1574.4274.2573.3469.2264.0266.2672.2272.2178.4377.8280.1677.7859.3566.5962.0074.3663.7772.3169.1072.7972.4071.3875.2270.07
1698.8899.6496.4274.1499.8888.8697.5699.8884.5595.1999.7899.8898.5789.2599.78100.099.88100.094.0097.58100.00100.0098.75100.00
OA (%)91.0591.4891.9886.7087.5788.2588.7589.0490.4193.1693.5394.1287.8588.2189.1589.7890.7891.2989.6890.3190.9991.2691.6892.29
AA (%)94.5895.0395.7691.9591.5291.7091.1789.0792.1096.4396.7096.8993.4993.0294.6991.8195.7293.5293.3191.8592.3093.7790.5495.34
Kappa × 10090.0590.5291.0885.3186.2486.9987.5587.8689.3792.3992.8193.4686.5886.9187.9888.6689.7890.3488.5189.2789.9990.3090.7691.45
Table 9. Comparison of the results of four models with different convolutional kernel sizes in the IP dataset.
Class | SSAtt (SC / BC / SBC / Proposed) | SpectralNET (SC / BC / SBC / Proposed) | JigsawHSI (SC / BC / SBC / Proposed) | HPCA (SC / BC / SBC / Proposed)
(Rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
156.5288.1025.6669.8160.2966.1341.4938.6880.0012.3494.7486.3640.5976.9237.1493.18
258.3874.6872.4867.2646.6562.1560.7567.5871.5977.9949.0766.1774.1483.0469.1981.08
366.3835.6394.8153.5099.4454.2156.4354.8561.1669.9477.7052.4473.2091.1149.6675.80
466.3331.4041.1743.56100.0031.4245.2267.3056.7633.3365.6074.7985.0377.2942.3463.60
591.4685.5872.9778.8999.3359.7281.3060.95100.0073.9962.4384.4766.4075.6271.7982.96
682.2474.9472.5681.6671.0571.0087.7364.3783.2981.6396.5563.4778.7469.8285.9582.36
726.4412.9218.1116.7913.6913.1419.4914.9431.1513.454.8039.6611.0014.2021.9021.10
8100.00100.0100.00100.099.7897.73100.0099.5888.87100.00100.00100.0099.78100.00100.00100.00
96.5812.104.985.8415.1525.006.6721.1326.6775.0042.8615.319.6831.9110.569.74
1055.4379.2151.2968.2970.2693.1970.0379.5368.4268.5763.0163.1760.8748.2068.0789.46
1181.6277.4264.2878.8881.0695.1278.6782.3475.2583.6271.0873.9281.6079.2679.2877.87
1236.7239.1435.7535.7133.7332.5924.4640.6628.4229.8469.6832.6642.9232.8532.0528.66
1330.5366.9085.7885.7889.6452.4944.4477.8295.2942.0251.4287.3966.3360.3048.0859.00
1499.3998.9095.7099.7995.7498.2098.4299.3486.6291.1487.3789.4192.5298.3898.5399.18
1574.3571.5277.1351.3191.8079.1582.7666.9165.6890.5390.6476.1888.0765.2575.0865.43
1652.0747.8355.3557.8957.1459.0624.7921.9534.0325.0731.5419.2112.0119.3828.8525.73
OA (%)66.6367.0565.2068.4567.4566.8667.2069.0967.0767.2964.3067.3468.2767.1768.4270.59
AA (%)76.4377.4675.7679.4877.1379.1576.2781.4671.8176.1172.5575.3678.4880.5878.0381.39
Kappa × 10062.6163.0460.5464.6563.3963.1463.1759.9162.8363.5759.9163.0468.2763.5264.5567.05
Table 10. Comparison of the results of the remaining four models with different convolutional kernel sizes in the IP dataset.
Class | SSRN (SC / BC / SBC / Proposed) | HybridSN (SC / BC / SBC / Proposed) | BSNET (SC / BC / SBC / Proposed) | 3D2DCNN (SC / BC / SBC / Proposed)
(Rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
163.0828.6729.0839.0567.8627.4093.9416.1896.1530.2332.7423.68100.0044.5731.5483.33
270.3378.6255.2768.1842.5572.7450.5350.2861.7652.5748.2957.7953.9883.1572.9385.83
370.5160.1159.8056.9354.1052.6737.4266.8629.4942.4631.0563.8297.7635.5477.1687.82
484.1475.7497.3777.9179.0834.1821.7068.7073.1229.1135.8845.77100.00100.0095.65100.00
561.0597.7456.7281.0274.6690.4661.3182.1785.9276.8642.5290.88100.0072.7387.0084.85
683.6378.6793.0991.4494.7387.0188.0391.1486.3963.1270.7060.6682.5169.6560.2680.50
714.655.9346.9421.3047.73100.0017.2940.740.006.9525.0030.2613.1419.6624.7315.23
8100.0098.8198.1298.9498.9065.02100.0095.5682.8799.0499.7798.8499.36100.0099.7999.79
922.3921.4317.059.6222.733.0913.0065.221.9913.5112.247.4620.5517.6515.7971.43
1087.2849.4487.5976.9874.5971.2276.4361.6988.4954.6797.8381.3793.7468.2684.5954.82
1181.3277.8373.1881.6463.9774.1993.4280.2269.9786.9083.2072.0780.6583.4674.9978.21
1224.0126.7548.4332.1729.7424.7638.3030.7527.9537.7726.3630.2730.9134.4640.3230.68
1355.1062.7461.3555.8763.4982.1793.4386.9065.5778.7161.0461.3582.6472.9937.1179.05
1494.0299.9197.93100.0087.8995.4592.52100.0089.5393.4185.2596.78100.00100.0092.0093.20
1586.3382.6260.3656.5754.8952.6350.9043.7652.5248.1169.9443.9783.6062.5050.7688.20
1631.4336.8232.1253.6623.2826.9138.2637.1355.2640.9335.7732.7162.8634.3814.1013.92
OA (%)67.4966.5068.8769.6163.3563.1263.8565.3261.2360.8459.5664.4770.1868.1268.9970.38
AA (%)79.4277.6779.1980.5476.2273.5974.6276.8564.2470.9372.3874.8078.3679.8275.9180.81
Kappa × 10063.7862.4364.8465.9758.7058.6359.8361.1856.2156.5954.9559.9666.2464.3765.0166.76
Table 11. Comparison of the results of four models with different convolutional kernel sizes in the PU dataset.
Class | SSAtt (SC / BC / SBC / Proposed) | SpectralNET (SC / BC / SBC / Proposed) | JigsawHSI (SC / BC / SBC / Proposed) | HPCA (SC / BC / SBC / Proposed)
(Rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
183.8084.8478.2474.1382.9669.7282.9763.9165.0072.4476.9283.3979.2877.2383.4687.48
289.1295.9395.9787.9696.4998.6498.1398.3497.4596.0798.0984.8094.9093.8491.5999.44
365.5230.6162.9266.3847.6444.4954.6944.2843.0537.6845.7941.1471.5976.2066.5069.53
429.3766.1340.2635.1368.2380.0662.1552.1052.6054.0766.8858.3037.7533.7037.1436.79
586.2967.6387.9984.2285.2681.4184.7887.1285.2581.2582.0577.1984.3185.3065.2770.42
642.1446.3245.7168.2740.8174.4752.3366.6972.5862.1248.1581.4242.0761.8760.3855.49
730.7957.4041.3818.2624.5256.4927.3138.5924.8925.1087.6827.4257.7144.2134.1559.39
878.7667.8074.8176.0661.4865.7456.8377.0359.7859.1071.6973.5977.1370.2973.4566.87
959.3252.5544.0747.0960.809.8230.6026.2812.7237.0819.6646.7850.2527.4543.5049.66
OA (%)64.0067.5567.1669.1967.6368.1270.0070.7569.2869.0469.6770.6067.4269.2470.0171.89
AA (%)70.3370.6874.9165.2368.4870.6171.1372.3060.7464.6171.2966.3073.9275.2770.8277.14
Kappa × 10055.8859.5259.7360.2459.5760.7262.4863.4460.5960.7462.0960.9860.1562.0562.2565.28
Table 12. Comparison of the results of the remaining four models with different convolutional kernel sizes in the PU dataset.
Class | SSRN (SC / BC / SBC / Proposed) | HybridSN (SC / BC / SBC / Proposed) | BSNET (SC / BC / SBC / Proposed) | 3D2DCNN (SC / BC / SBC / Proposed)
(Rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
183.0286.6283.6488.3981.3471.0480.8167.0582.0286.9078.9488.9480.5478.9679.0582.00
293.3396.3396.9299.3699.3886.7897.1393.1698.1095.6396.0690.0898.4397.4196.2295.08
363.9966.6159.3268.7236.9441.5941.2133.6473.2235.4663.1946.6761.9957.5164.7366.02
437.2873.0935.4273.8856.2976.1666.4276.2040.4557.5139.6745.0246.5941.2839.2740.81
586.8758.9985.9288.9187.6680.4187.0086.6087.9896.4588.1788.0288.3581.7488.1088.59
642.4640.9353.1760.3446.4660.0846.1648.7841.0155.1548.8956.6746.2949.7451.8952.21
746.4854.4948.0142.9840.4547.9738.8831.0439.6319.4842.3969.3463.9849.1143.7767.49
872.7981.2667.7870.0256.2129.3748.7938.7682.9268.3973.8975.9273.9774.0974.2975.59
967.6227.4524.7723.1042.6914.6468.3748.4459.3825.9447.7119.0126.7840.6848.5031.35
OA (%)68.3970.9771.9075.5567.4066.9269.8469.0666.5167.6968.6969.0569.2669.0970.0971.68
AA (%)73.6574.0172.9178.7168.2860.0170.8262.8369.5565.1070.3967.4375.7675.6175.9876.65
Kappa × 10060.8963.9964.8169.5659.5356.5462.0189.9059.5159.0961.3860.4262.4362.0462.9364.73
Table 13. Comparison of the results of four models with different convolutional kernel sizes in the SA dataset.
Class | SSAtt (SC / BC / SBC / Proposed) | SpectralNET (SC / BC / SBC / Proposed) | JigsawHSI (SC / BC / SBC / Proposed) | HPCA (SC / BC / SBC / Proposed)
(Rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
196.16100.0100.00100.084.1776.2399.4595.9893.3098.1898.5795.4763.2783.5199.95100.00
297.7897.9097.1795.17100.00100.099.7892.0899.6099.7999.6899.33100.0099.9499.7999.89
398.1694.11100.0088.22100.0099.8499.85100.086.2499.9098.9596.8399.64100.0097.96100.00
497.1798.9088.1184.9666.0455.6595.5081.3678.6296.1693.1499.6096.5691.5393.4995.66
591.2594.8182.2495.4186.1496.2394.3584.0097.4490.7995.3193.9692.3387.5793.4490.93
699.3799.9099.1099.7596.9297.5699.7299.2296.0598.6594.4696.1399.9099.7799.4499.75
793.27100.0100.00100.0100.0099.6197.99100.092.9393.9993.8988.77100.00100.00100.00100.00
884.9795.5190.6587.8780.3390.5294.2797.47100.0082.6298.0497.86100.0096.1798.7796.85
999.1099.4490.4399.6196.53100.093.4197.6597.3598.0499.4490.2499.8496.2399.9797.76
1091.4897.8897.2698.7395.3699.1298.5199.8388.7099.1080.59100.0097.1796.6596.3697.74
1194.2974.7583.6586.8492.8482.8979.0779.9159.1490.6156.3697.4495.1478.1675.0278.80
1296.6197.72100.00100.090.0798.5699.0699.5099.6693.8998.4898.6095.0197.5799.6498.20
1398.7498.0385.5494.9072.9287.7298.5488.1796.2599.3798.1178.9283.5197.0999.0297.77
1474.8463.4866.9395.9493.0893.8071.3795.9747.4887.2293.0792.5370.7095.3881.8795.09
1564.7359.3561.5862.0086.9074.3664.2372.3171.4869.1073.6972.4075.3571.3867.2670.07
16100.0098.5772.2599.7855.23100.099.83100.092.4994.0090.17100.0098.47100.0099.14100.00
OA (%)88.7087.8585.5589.1587.4589.7889.3891.2987.7789.6890.2690.9991.4691.2691.4392.29
AA (%)94.2993.4989.8594.6990.0691.8194.1593.5291.5993.3193.4592.3093.2193.7795.5495.34
Kappa × 10087.4886.5884.0187.9886.0888.6688.2490.3486.5288.5189.2389.9990.5490.3090.5291.45
Table 14. Comparison of the results of the remaining four models with different convolutional kernel sizes in the SA dataset.
Class | SSRN (SC / BC / SBC / Proposed) | HybridSN (SC / BC / SBC / Proposed) | BSNET (SC / BC / SBC / Proposed) | 3D2DCNN (SC / BC / SBC / Proposed)
(Rows list per-class accuracies in %, followed by OA (%), AA (%), and Kappa × 100.)
193.0378.65100.00100.00100.0099.8599.9091.3498.0469.6387.4785.4998.6297.3899.3596.21
2100.0099.0997.2597.46100.0099.4999.92100.0099.7399.72100.00100.0099.81100.0099.8499.68
3100.0099.90100.00100.0097.8481.4698.6685.7697.6296.18100.0081.79100.0099.3499.0499.80
472.1484.1177.8096.5892.0689.9192.1989.8693.7098.0569.0795.0580.3794.9589.9493.79
598.3897.5699.2298.4395.0189.0593.4597.1593.5396.5088.1991.9199.6699.3698.7796.74
690.9896.2897.8398.3599.5085.6899.4797.8599.7291.9397.7599.8096.9499.2799.4599.42
798.9499.5779.5699.83100.00100.00100.0098.4698.5796.91100.0097.51100.00100.00100.0099.52
880.6386.1796.1086.3997.3793.3897.4290.5596.0698.4281.3898.1982.5187.1497.6894.99
999.2098.88100.0098.7594.83100.0091.5898.3588.6699.0597.0492.01100.0098.6399.2198.85
1098.39100.0089.3199.6799.5485.0999.8797.0899.8985.4397.0497.65100.00100.0095.4896.76
1187.0094.6690.3182.5854.1859.7870.8073.8076.3080.2992.8465.6891.8395.5991.6488.22
1296.53100.0087.0897.9999.8899.4799.8999.4299.5776.5491.1498.2394.80100.0099.06100.00
1365.2095.4393.2796.9699.7795.7499.8996.3699.6689.6975.9299.6692.9690.0299.1198.91
1486.1286.2797.5991.4576.2377.6274.6863.0757.5983.2990.8788.5074.5288.6883.6789.03
1586.9074.4285.8073.3459.3269.2262.7166.2664.0772.2285.8378.4380.1777.8277.8677.78
1687.2698.8868.5396.4289.4374.1490.2688.8694.1097.5663.2584.5597.9695.19100.0099.88
OA (%)89.9491.0591.5491.9886.8886.7088.4188.2587.7788.7588.8690.4191.7093.1694.4994.12
AA (%)91.6494.5892.4395.7691.5191.9592.3991.7091.5391.1791.3592.1094.8996.4397.0396.89
Kappa × 10088.7990.0590.6291.0885.4985.3187.1686.9986.4587.5587.6389.3790.7692.3993.8893.46
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
