Article

PGNN-Net: Parallel Graph Neural Networks for Hyperspectral Image Classification Using Multiple Spatial-Spectral Features

1 School of Space Information, Space Engineering University, Beijing 101400, China
2 Beijing Institute of Tracking and Telecommunication Technology, Beijing 100094, China
3 State Key Laboratory of Remote Sensing Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(18), 3531; https://doi.org/10.3390/rs16183531
Submission received: 23 July 2024 / Revised: 8 September 2024 / Accepted: 21 September 2024 / Published: 23 September 2024
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

Hyperspectral imagery (HSI) shows great potential for remote sensing applications due to its rich spectral information and fine spatial resolution. However, the high dimensionality, nonlinearity, and complex relationship between the spectral and spatial features of HSI pose challenges to accurate classification. Traditional convolutional neural network (CNN)-based methods suffer from detail loss during feature extraction, and Transformer-based methods rely heavily on the quantity and quality of HSI, whereas graph neural network (GNN)-based methods provide new impetus for HSI classification thanks to their excellent ability to handle irregular data. To address these challenges and exploit the advantages of GNN, we propose a network of parallel GNNs called PGNN-Net. The network first extracts the key spatial-spectral features of HSI using principal component analysis, followed by preprocessing to obtain two primary features and a normalized adjacency matrix. Then, a parallel architecture is constructed using an improved GCN and ChebNet to extract local and global spatial-spectral features, respectively. Finally, the discriminative features obtained through the fusion strategy are fed into the classifier to obtain the classification results. In addition, to alleviate over-fitting, label smoothing is embedded in the cross-entropy loss function. The experimental results show that the average overall accuracy obtained by our method on Indian Pines, Kennedy Space Center, Pavia University Scene, and Botswana reaches 97.35%, 99.40%, 99.64%, and 98.46%, respectively, which is better than several state-of-the-art methods.

1. Introduction

HSI are captured by advanced imaging spectrometers deployed on various space platforms and contain both spatial and spectral information about the target [1]. Each pixel of such images contains a complete spectral curve that accurately depicts the spectral characteristics of the target at that pixel location. These properties give HSI a clear advantage in characterizing the physical, chemical, and biological properties of ground targets [2]. HSI classification aims to assign ground targets to their classes more accurately by analyzing HSI [3]. The technology has shown great practical value in important fields such as land resource exploration, surface environment monitoring, and precision agriculture management [4,5,6].
As we know, HSI contains many fine spectral bands, which enriches the data but also introduces high correlation and information redundancy between adjacent bands. Moreover, the same substance may exhibit different spectral features under different conditions, while different substances may show similar spectral features in specific bands, which increases the difficulty of classification and complicates the precise differentiation of ground targets. Therefore, HSI classification needs to cope with these inherent complexities and uncertainties [7]. In the early stages, traditional machine learning methods dominated HSI classification [8]. Initially, these methods relied mainly on the spectral information of ground targets and classified spectral features with simple algorithms such as K-Nearest Neighbors (KNN) [9] and Support Vector Machines (SVM) [10]. However, while extracting representative features, these algorithms also exposed limitations: a high demand for training samples, poor handling of class imbalance, and neglect of spatial information [11]. With deeper studies, researchers realized the importance of spatial information in classification, especially in local areas where ground targets tend to have similar spectral values and high class consistency. Thus, they fused spatial and spectral information and proposed new methods, such as those based on Markov random fields [12] and recursive filtering [13]. While these improve classification performance, they also introduce new challenges, such as an intensified impact of noise and outliers on model performance and increased computation and model complexity [14].
Nowadays, with the development of neural network technology, deep learning (DL) methods have gradually shown great potential in the field of HSI classification [15]. Unlike traditional methods, DL can automatically extract more advanced and representative features through a series of complex nonlinear transformations of HSI data in its multilevel neural network structure, which can then be used for the final decision of the classifier. The features extracted by DL methods are often more discriminative and accurate than those obtained by traditional methods [16]. Thanks to its powerful feature extraction capability and ability to process high-dimensional data, DL has been widely used in HSI classification and has shown increasingly better classification performance. Commonly used DL network structures mainly include CNN-based and Transformer-based architectures, among others [17].
Among these structures, the CNN is the most commonly used for HSI classification; it not only integrates convolutional computation but is also a feed-forward neural network with a deep structure [18]. It has three significant features. First, it realizes local sensing and weight sharing, capturing and extracting local features by performing convolutional operations on the HSI with a convolutional kernel; through weight sharing, the CNN effectively reduces the number of network parameters and the computational complexity. Second, the CNN possesses hierarchical feature extraction, extracting and combining HSI features layer by layer through stacked convolutional, pooling, and fully connected layers. Finally, the CNN demonstrates robustness to transformations such as translation, rotation, and scaling, i.e., translation invariance, which makes it more stable and reliable when processing HSI [19]. However, it also faces some challenges in practical applications. For example, when pooling layers reduce the spatial size of the feature map to extract higher-level features, some detail information may be lost, which can affect HSI classification accuracy [20].
Unlike CNNs, the Transformer treats HSI as an input sequence and transforms these sequences into information-rich vector features through its encoder [21]. The encoder consists of multiple self-attention layers, each of which computes the correlation coefficients between all the locations in the input sequence and uses these coefficients to generate a weighted sum, which in turn yields an output representation for each location. This mechanism performs well in capturing long-distance dependencies in HSI, but it may also cause the model to pay insufficient attention to immediately neighboring elements when processing local information [22]. In addition, the self-attention mechanism makes the network consider the information of the whole HSI sequence in each computation, which enhances global perception but may also weaken the model's ability to capture local details. Moreover, the performance of these structures largely depends on the quantity and quality of the training data, and the training and inference process requires a large amount of computational and storage resources, so there are limitations on training speed and inference efficiency [23].
GNN-based methods provide new impetus for the rapid development of HSI classification by virtue of their excellent ability to handle non-Euclidean structured data. As we know, GNN is designed to capture more comprehensive information about the graph structure by aggregating the features of each node with those of its surrounding nodes and iterating to form a richer node representation [24]. This makes GNN outstanding in coping with the complexity of HSI, improving classification accuracy, and alleviating the mixed-pixel problem [25]. In HSI classification, spatial relationships and spectral features are crucial for improving classification accuracy. Currently, although several DL techniques have been applied to HSI classification, most methods either focus on extracting spectral features while ignoring the value of spatial information or fail to achieve efficient and complementary information utilization when fusing spectral and spatial features. GCN simulates spatial relationships between pixels by constructing a graph structure, which can effectively capture the spatial context information of HSI. ChebNet uses Chebyshev polynomials to approximate the convolution kernel, which achieves efficient convolution operations in the frequency domain and can effectively capture HSI features in the spectral dimension. Therefore, to effectively deal with the inherent challenges of HSI, such as band redundancy and the phenomena of "the same substance with different spectra" and "different substances with the same spectrum", and also to improve the utilization efficiency of spatial-spectral features and reduce the loss of multi-class information, we combine GCN and ChebNet to design a network called PGNN-Net for better HSI classification results. Our main contributions are as follows:
  • We designed a HSI classification network with a parallel-style architecture called PGNN-Net, which enriches and develops the application of GNN in HSI classification.
  • To compensate for the limitations of GCN in global spatial-spectral feature capture for HSI, we introduced ChebNet and adapted it, and finally constructed a more comprehensive representation of HSI features through a feature fusion strategy.
  • We validated the effectiveness and superiority of PGNN-Net on four classical HSI datasets and analyzed the performance of the model under different parameter settings.

2. Related Work

2.1. CNN-Based Classification

As DL technology continues to develop, the application of CNN in HSI classification is also deepening. Researchers have continuously improved the performance of CNN in HSI classification through means such as improving the network structure and optimizing algorithm parameters. The methods can be divided into three types according to how convolution is utilized. The first utilizes only CNNs. E.g., ref. [26] proposed to utilize 1- and 2-dimensional CNNs to automatically extract spatial and spectral features of HSI at multiple levels. Similarly, ref. [27] encodes both spatial and spectral information of HSI with the help of residual networks and utilizes a multilayer perceptron for classification. Ref. [28] proposed 3D convolutional networks to improve the classification accuracy of HSI. The second adds attention mechanisms (AM). E.g., ref. [29] designed a channel-based AM to achieve fast classification of complex scenes in HSI. The third fuses CNNs with other methods. E.g., ref. [30] fused convolutional operations with a bidirectional long short-term memory network, which effectively extracted the spatial and spectral feature information of HSI and improved the classification performance. Ref. [31] fused manual features to design a small-sample HSI classification network. Although these methods achieve satisfactory classification performance, the loss of information in HSI class boundary regions and the difficulty of modeling long-range spatial relationships in HSI still affect the results.

2.2. Transformer-Based Classification

To fully utilize the spectral and spatial information of HSI, researchers have proposed Transformer-based methods. These methods realize the joint extraction and fusion of spectral and spatial features by designing specific network structures. The methods can be divided into two types according to the network architecture. The first comprises Transformer-only methods. E.g., ref. [32] introduced the Transformer into HSI classification, which utilized the sequence property of spectral features and achieved better results. Ref. [33] designed vision transformer modules for extracting spatial and spectral features, respectively, and achieved high-accuracy classification by fusing the feature maps. The second combines the Transformer with other networks. E.g., ref. [34] fused convolution and Transformer in serial and parallel network architectures to achieve full utilization of HSI. Ref. [35] combines grouped convolution with the Transformer to improve the extraction accuracy of global and local features for HSI. These methods have achieved better results, but their ability to extract local HSI features is still insufficient. In addition, the higher computational cost and the need for larger datasets remain problems that must be faced in practice.

2.3. GNN-Based Classification

Considering the inherent shortcomings of CNN and Transformer, researchers have proposed the use of GNN for HSI classification. GNN is not only suitable for irregular data with a non-Euclidean structure, which allows it to flexibly retain the irregular class boundary information in HSI, but is also able to directly model the spatial relationship between distant pixels. Recently, GNN has achieved great success in graph data analysis and processing tasks by virtue of its powerful learning capability. The methods can be divided into two types according to whether they fuse with other neural networks. The first uses only GNN. E.g., ref. [36] proposed a two-stream GNN, which maximizes feature utilization by fusing inputs of two different dimensions and obtains better HSI classification results. Ref. [37] added a graph filter to the GNN to suppress the image distortion problem. The second fuses GNN with other neural networks. E.g., ref. [38] approximated the neighboring nodes in HSI as a convolution and proposed a graph-based CNN. Ref. [39] fused edge convolution into GNN to obtain discriminative features of HSI through learning. Ref. [40] jointly designed a hybrid network with CNN, Transformer, and graph convolution to classify HSI. Ref. [41] designed a parallel HSI classification network using a graph convolutional network and a CNN. Although GNN has demonstrated remarkable effectiveness in HSI classification, it still faces some challenges. Specifically, these include: first, how to efficiently deal with the inherent noise components and redundant information of HSI to optimize the model's ability to extract effective information; second, how to improve the model's joint utilization efficiency of HSI spatial and spectral features, i.e., to achieve deep spatial-spectral feature fusion and efficient parsing, so as to further enhance the model's ability to understand and characterize complex scenes.

3. Methodology

For the inherent characteristics of HSI, in order to further improve the utilization efficiency of spatial and spectral features and to maximize the advantages of GNN in processing HSI, we innovatively combined GCN [42], which has advantages in spatial relationship modeling and nonlinear feature extraction, and ChebNet [43], which is excellent in spectral feature extraction and generalization ability, to build a novel type of HSI classifier called PGNN-Net (Figure 1).
Through the learning of HSI, PGNN-Net can not only fully extract the local spatial-spectral features and global spatial-spectral features of HSI but also obtain the discriminative features of HSI with comprehensive information and finally obtain a better classification effect. Its overall structure consists of four parts, namely, (A) data preprocessing; (B) local spatial-spectral feature extraction; (C) global spatial-spectral feature extraction; and (D) feature fusion and classification, which are described as follows:

3.1. Data Preprocessing

The core task of this section is the refined preprocessing of the spatial and spectral features of HSI, aiming to lay a solid foundation for the subsequent stages of effectively capturing local features with GCN and deeply extracting global features with ChebNet. The preprocessing is divided into three key steps, which are described as follows:
(1) Dimensionality reduction of spectral features. Considering that HSI usually contains a large number of spectral bands and complex spatial structures, direct processing may face computational inefficiency and over-fitting risk. Therefore, dimensionality reduction of spectral features is first performed using principal component analysis [44], which retains the most representative spectral information while significantly reducing the data dimensionality.
Firstly, the data of each band of the HSI are normalized to obtain $X_{HSI}$. Secondly, the mean pixel value $U_{HSI}$ of each band and the covariance matrix of the samples are computed. Thirdly, eigenvalue decomposition of the covariance matrix is performed to obtain the eigenvalues and the corresponding eigenvectors. Fourthly, the principal components are selected and the transformation matrix $U_T$ is constructed. Finally, $X_{pca}$ is obtained via (1).

$X_{pca} = U_T (X_{HSI} - U_{HSI})$     (1)
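For concreteness, the following is a minimal sketch of this PCA-based reduction step in Python/NumPy, assuming the HSI cube is stored as an (H, W, B) array and that n_components principal components are retained; the exact normalization and the number of retained components are configuration choices not fixed by Eq. (1).

```python
import numpy as np

def pca_reduce(X_hsi, n_components=30):
    """Reduce the spectral dimension of an HSI cube (H, W, B) via PCA, cf. Eq. (1)."""
    H, W, B = X_hsi.shape
    X = X_hsi.reshape(-1, B).astype(np.float64)

    # Step 1: normalize each band (zero mean, unit variance is assumed here).
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

    # Steps 2-3: covariance matrix and its eigen-decomposition.
    cov = np.cov(X, rowvar=False)                      # (B, B)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Step 4: keep the eigenvectors with the largest eigenvalues (U_T).
    order = np.argsort(eigvals)[::-1][:n_components]
    U_T = eigvecs[:, order]                            # (B, n_components)

    # Step 5: project the mean-removed data, X_pca = U_T^T (X_HSI - U_HSI).
    X_pca = X @ U_T
    return X_pca.reshape(H, W, n_components)
```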
(2) Preprocessing of the primary features. For the spatial features, a spatial cutting operation is performed, i.e., the HSI is divided into a number of subregions or superpixels, so that the subsequent network model can focus more on local details while reducing computational complexity. Specifically, we cut image cubes $X_{cub}$ of the same size from $X_{pca}$; the corresponding label images are cut in the same way. To fully utilize the feature information of $X_{cub}$, we designed two encoding algorithms for the spatial-spectral features. The first directly reshapes the input image block from $[b, c, h, w]$ to $[b, c, h \times w]$ using the reshape function, then exchanges the second and third dimensions with the transpose function to obtain the shape $[b, h \times w, c]$, and finally applies normalization to obtain the primary feature $X_{GGF}$. The second applies a convolution with a kernel size of $1 \times 1$ and a stride of 1 before the same shape adjustment, which yields the other feature $X_{CGF}$.
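A minimal PyTorch sketch of the two encodings is given below; the use of L2 normalization and the channel counts of the 1 × 1 convolution are illustrative assumptions, since only the reshape/transpose/normalize sequence and the 1 × 1 convolution are specified above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def encode_primary_features(x_cub, conv1x1=None):
    """Build the two primary features from an image-cube batch x_cub of shape [b, c, h, w]."""
    b, c, h, w = x_cub.shape

    # Encoding 1: reshape to [b, c, h*w], swap dims to [b, h*w, c], then normalize.
    x_ggf = x_cub.reshape(b, c, h * w).transpose(1, 2)
    x_ggf = F.normalize(x_ggf, dim=-1)   # L2 normalization is an assumption

    # Encoding 2: 1x1 convolution (stride 1) first, then the same reshaping.
    if conv1x1 is None:
        conv1x1 = nn.Conv2d(c, c, kernel_size=1, stride=1)  # channel count assumed
    x_cgf = conv1x1(x_cub).reshape(b, c, h * w).transpose(1, 2)
    return x_ggf, x_cgf
```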
(3) Normalization of the adjacency matrix. Since graph neural networks rely on a graph representation when processing non-Euclidean structured data (e.g., pixels or superpixels in HSI), constructing an accurate adjacency matrix is a critical step. The adjacency matrix not only reflects the spatial proximity between data points but may also incorporate spectral similarity information to fully portray the complex structure of HSI. Therefore, to effectively extract key information from the HSI graph structure, we construct a graph model based on an $e \times e$ grid layout and convert it into a standardized adjacency matrix suitable for graph neural network processing, which lays the foundation for subsequent training and inference. Specifically, we first count the number of edges and extract the edge index information. Subsequently, using this information, we construct a sparse adjacency matrix with the coo_matrix function. This step not only effectively compresses the storage requirements but also preserves the key information of the graph structure. To improve the stability of subsequent processing, we add an identity matrix to alleviate the node isolation problem. We then normalize each row to ensure a balanced transfer of information. Finally, the todense function is used to convert the matrix to its dense representation $A \in \mathbb{R}^{n \times n}$.
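The following sketch illustrates this construction with SciPy, assuming 4-neighbour connectivity on the e × e grid; the exact connectivity pattern is an implementation choice not fixed above.

```python
import numpy as np
import scipy.sparse as sp

def build_normalized_adjacency(e):
    """Row-normalized adjacency matrix A for an e x e grid graph (4-neighbour assumption)."""
    n = e * e
    rows, cols = [], []
    for i in range(e):
        for j in range(e):
            u = i * e + j
            for di, dj in ((0, 1), (1, 0)):            # right and down neighbours
                ni, nj = i + di, j + dj
                if ni < e and nj < e:
                    v = ni * e + nj
                    rows += [u, v]                      # add both edge directions
                    cols += [v, u]
    data = np.ones(len(rows))
    adj = sp.coo_matrix((data, (rows, cols)), shape=(n, n))

    adj = adj + sp.eye(n)                               # self-loops alleviate node isolation
    row_sum = np.asarray(adj.sum(axis=1)).flatten()
    d_inv = sp.diags(1.0 / row_sum)
    adj = d_inv @ adj                                   # row normalization
    return np.asarray(adj.todense())                    # dense A of shape (n, n)
```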

3.2. Local Spatial-Spectral Feature Extraction

GCN enables each node to aggregate the feature information of its neighboring nodes by designing a specific graph convolutional layer, thus realizing the update and enhancement of node features. This process not only enhances the ability of node representation but also facilitates the modeling of complex dependencies in graph-structured data, providing support for effectively capturing and characterizing the local structural properties embedded in the graph. Therefore, to better extract the local spatial-spectral features of HSI, we input $X_{CGF}$ into the improved GCN; the specific steps are as follows:
(1) Linear Transformation Processing. For each layer of the GCN, the node feature matrix is transformed by a linear transformation matrix to generate a new representation of the node features. This step allows the GCN to learn linear combinations of node features that capture potential relationships between nodes. First, $X_{CGF}$ is linearly transformed using a weight matrix $w$ and a bias vector $b$. Specifically, for each element of the input, the following operation is performed:

$Y_{CGF} = X_{CGF} w^{T} + b$     (2)

where $Y_{CGF}$ denotes the result of linearly transforming $X_{CGF}$.
(2) Compute the Normalized Adjacency Matrix. The normalization process not only avoids gradient vanishing or explosion but also enables each node to consider the importance of neighboring nodes when aggregating neighbor information. Define an identity matrix $I_{CGF} \in \mathbb{R}^{n \times n}$ based on the number of nodes in $A$ and add it to $A$ to obtain the new adjacency matrix $\tilde{A}$.

$\tilde{A} = A + I_{CGF}$     (3)

Sum the elements of each row of $\tilde{A}$ to obtain the degree matrix:

$D_{CGF} = \mathrm{diag}(\tilde{A}\mathbf{1})$     (4)

where $\mathbf{1}$ is a column vector of ones. The diag function places these row sums on the diagonal of a diagonal matrix. Taking the reciprocal of each diagonal element of $D_{CGF}$ forms the new diagonal matrix $D_{CGF}^{-1}$. Matrix multiplication of $D_{CGF}^{-1}$ and $\tilde{A}$ yields the normalized adjacency matrix $N_{CGF}$:

$N_{CGF} = D_{CGF}^{-1} \tilde{A}$     (5)
(3) Generate the Local Spatial-Spectral Feature. During the graph convolution operation, the normalized adjacency matrix and the node feature matrix are used to compute a new node feature representation. Specifically, the features of each node are aggregated with the features of the neighboring nodes through the normalized adjacency matrix to generate a new feature representation that contains the neighbor information. We first matrix-multiply $N_{CGF}$ with $Y_{CGF}$, and then apply the ReLU activation function to introduce nonlinearity and limit the range of output values, obtaining the local spatial-spectral feature $F_{CGF}$ that can be learned and can represent the node classes.

$F_{CGF} = \mathrm{ReLU}(N_{CGF} Y_{CGF})$     (6)
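A compact PyTorch sketch of one layer of this improved GCN branch, following Eqs. (2)-(6), is given below; the feature dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LocalGCNLayer(nn.Module):
    """One layer of the improved GCN branch: linear transform, adjacency
    normalization with self-loops, neighbor aggregation, ReLU (Eqs. (2)-(6))."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)               # Y_CGF = X_CGF w^T + b

    def forward(self, x_cgf, A):
        # x_cgf: [b, n, in_dim] node features; A: [n, n] adjacency matrix.
        y_cgf = self.linear(x_cgf)                              # Eq. (2)
        A_tilde = A + torch.eye(A.size(0), device=A.device)     # Eq. (3)
        deg = A_tilde.sum(dim=1)                                # Eq. (4), diagonal of D_CGF
        n_cgf = A_tilde / deg.unsqueeze(1)                      # Eq. (5): D^{-1} A_tilde
        f_cgf = torch.relu(n_cgf @ y_cgf)                       # Eq. (6)
        return f_cgf
```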

3.3. Global Spatial-Spectral Feature Extraction

HSI usually contains a large number of continuous and subdivided spectral bands, which cover the spectral range from the visible to the infrared. ChebNet utilizes the high-order approximation capability of Chebyshev polynomials (CP), which not only captures the key spectral variations in the HSI but also maps the raw spectral data to the high-dimensional feature space. In the high-dimensional feature space, ChebNet not only preserves the global statistical properties of the spectral data, such as the overall trend and shape of the spectral curves, but also captures the local detail information, such as the small fluctuations of the spectral reflectance or emissivity. Therefore, given the tight coupling of spatial and spectral features in HSI, we adopt ChebNet for global feature extraction in order to better capture its global spatial-spectral features. Specifically, it consists of the following steps:
(1) Compute the Normalized Laplacian Matrix. This step processes the graph-structured data to improve the stability and convergence speed of the algorithm. Sum $A$ along its columns to obtain the degree of each node. Then, for each element of the degree vector, take its negative square root and convert the result to a diagonal matrix using the diag function to obtain the degree matrix $D_{GGF}$:

$D_{GGF} = \mathrm{diag}\left(\left(\textstyle\sum_{dim=1} A\right)^{-1/2}\right)$     (7)

Multiply $D_{GGF}$ by $A$ and then by $D_{GGF}$ again to obtain the normalized graph matrix $\hat{A}_{GGF}$:

$\hat{A}_{GGF} = D_{GGF} A D_{GGF}$     (8)

Create an identity matrix $I_{GGF}$ matching the dimension of $A$. Subtracting $\hat{A}_{GGF}$ from $I_{GGF}$ yields the normalized Laplacian matrix $L_{GGF}$:

$L_{GGF} = I_{GGF} - \hat{A}_{GGF}$     (9)
(2) Compute the k-order CP. This part approximates the convolution kernel in the spectral domain by CP interpolation, which avoids the complex eigendecomposition process and reduces the computational complexity. The Laplacian matrix of a graph can be thought of, in graph signal processing, as a frequency-domain representation of the graph, and the CP provides a way to define filters in the frequency domain. By computing the CP of multiple orders, we obtain a series of matrices.
These matrices can be viewed as filters of different orders, and these filters can be used to perform convolution operations on the nodal features of the graph to capture local structural information on the graph.
The CP are computed as in (10):

$T_k(x) = \begin{cases} 1 & k = 0 \\ x & k = 1 \\ 2\,x\,T_{k-1}(x) - T_{k-2}(x) & k = 2, 3, \ldots \end{cases}$     (10)

Replacing $x$ in (10) with $L_{GGF}$ gives the CP in matrix form, $T_k(L_{GGF})$.
(3) Generate the Global Spatial-Spectral Feature. This part extracts features from the graph data by performing convolution operations on the graph-structured data. These features incorporate not only global information from the graph structural relationships but also local information of the nodes. To facilitate computation with the graph matrix, we use the unsqueeze function to insert a dimension at the second position of $T_k(L_{GGF})$, giving it the shape $(k+1, 1, N, N)$. Then, matrix multiplication of $T_k(L_{GGF})$ with $X_{GGF}$ and with the model weight parameter $W_{GGF}$ yields $\hat{T}_k(L_{GGF})$:

$\hat{T}_k(L_{GGF}) = T_k(L_{GGF})\, X_{GGF}\, W_{GGF}$     (11)

Finally, summing $\hat{T}_k(L_{GGF})$ along its first dimension and adding $b_{GGF}$ gives the global spatial-spectral feature $F_{GGF}$:

$F_{GGF} = \sum_{dim=0} \hat{T}_k(L_{GGF}) + b_{GGF}$     (12)
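The ChebNet branch described by Eqs. (7)-(12) can be sketched in PyTorch as follows; the order K and the feature dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GlobalChebLayer(nn.Module):
    """ChebNet-style branch: normalized Laplacian, k-order Chebyshev polynomials,
    graph convolution, and summation over orders (Eqs. (7)-(12))."""

    def __init__(self, in_dim, out_dim, K=3):
        super().__init__()
        self.K = K
        self.weight = nn.Parameter(torch.randn(K + 1, in_dim, out_dim) * 0.01)  # W_GGF
        self.bias = nn.Parameter(torch.zeros(out_dim))                           # b_GGF

    @staticmethod
    def normalized_laplacian(A):
        deg = A.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.clamp(min=1e-8).pow(-0.5))   # Eq. (7)
        A_norm = d_inv_sqrt @ A @ d_inv_sqrt                      # Eq. (8)
        return torch.eye(A.size(0), device=A.device) - A_norm     # Eq. (9)

    def forward(self, x_ggf, A):
        # x_ggf: [b, n, in_dim] node features; A: [n, n] adjacency matrix.
        L = self.normalized_laplacian(A)
        n = L.size(0)
        # Eq. (10): T_0 = I, T_1 = L, T_k = 2 L T_{k-1} - T_{k-2}.
        T = [torch.eye(n, device=A.device), L]
        for _ in range(2, self.K + 1):
            T.append(2 * L @ T[-1] - T[-2])
        # Eqs. (11)-(12): sum_k T_k(L) X W_k, then add the bias.
        out = sum((T[k] @ x_ggf) @ self.weight[k] for k in range(self.K + 1))
        return out + self.bias                                     # F_GGF
```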

3.4. Feature Fusion and Classification

To efficiently exploit the potential of GNN in processing HSI, we have crafted a fusion strategy that combines the local spatial-spectral feature obtained through the GCN and the global spatial-spectral feature obtained through the ChebNet. Specifically, we apply a matrix multiplication operation to $F_{CGF}$ and $F_{GGF}$ to create the more discriminative fused feature $F_{HSI}$. This fusion not only enhances the representativeness of the features but also achieves a deeper integration of information.

$F_{HSI} = F_{CGF} F_{GGF}$     (13)
Prior to fusion, we apply preprocessing to ensure that the fusion effect is optimal. First, we normalize $F_{CGF}$ and $F_{GGF}$. Then, we swap the positions of their second and third dimensions to more efficiently utilize the intrinsic structure of the data. Finally, we apply adaptive mean pooling to the features and reshape them to one-dimensional form to obtain the final $F_{CGF}$ and $F_{GGF}$. The cross-entropy loss function is commonly used in the classification process:

$L_{HSI} = -\sum_{c} y_c \log \hat{y}_c$     (14)

where $c$ indexes the classes, $y_c$ denotes the true label, and $\hat{y}_c$ denotes the predicted probability for class $c$. The true labels are represented as one-hot vectors, in which only one element is 1 (indicating the true class) and the rest are 0. However, this formulation may lead to over-fitting, as the model relies too heavily on the labels in the training data. To avoid over-fitting, we employ Label Smoothing [45], which mixes the probability distribution of the true labels with that of the other classes, so that the model does not completely ignore information from non-true classes in its predictions. After repeated training to find the minimum loss, the best classification model is obtained.
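A minimal PyTorch sketch of the fusion and loss computation is shown below. The classifier head, the smoothing factor, and the reading of Eq. (13) as a batched outer product of the two pooled one-dimensional features are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def fuse_and_classify(f_cgf, f_ggf, logits_head, labels, smoothing=0.1):
    """Normalize both branch features, swap the last two dimensions, pool,
    fuse by matrix multiplication (Eq. (13)), and compute the label-smoothed
    cross-entropy loss (Eq. (14) with smoothing)."""
    # Pre-fusion processing of the two branch outputs ([b, n, d] each).
    f_cgf = F.normalize(f_cgf, dim=-1).transpose(1, 2)       # [b, d, n]
    f_ggf = F.normalize(f_ggf, dim=-1).transpose(1, 2)
    f_cgf = F.adaptive_avg_pool1d(f_cgf, 1).flatten(1)       # [b, d] one-dimensional feature
    f_ggf = F.adaptive_avg_pool1d(f_ggf, 1).flatten(1)

    # Eq. (13): matrix-multiplication fusion, read here as a batched outer product.
    f_hsi = torch.bmm(f_cgf.unsqueeze(2), f_ggf.unsqueeze(1)).flatten(1)  # [b, d*d]

    logits = logits_head(f_hsi)            # e.g. nn.Linear(d*d, n_classes), assumed head
    loss = F.cross_entropy(logits, labels, label_smoothing=smoothing)
    return logits, loss
```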

4. Results

4.1. Experimental Datasets

To comprehensively verify the effectiveness of our proposed method, we selected four widely used HSI datasets. Table 1 lists the key specifications of each dataset.

4.1.1. Indian Pines (IP)

It was acquired with the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). The data originally contained 220 consecutive bands, and after the correction process, 200 bands were used in the final experiments. As shown in Table 2, to clearly differentiate and identify them, we assigned different color labels to the 16 classes of pixels within the corrected dataset, which correspond to the 16 ground features captured by AVIRIS.

4.1.2. Kennedy Space Center (KSC)

It is also obtained from AVIRIS acquisitions. The data covered 224 bands, but after correction and screening, 176 bands were finally identified as valid data. Similarly, to show the features more clearly, Table 2 details the correspondence between the 13 features and their respective corresponding pixels.

4.1.3. Pavia University Scene (PUS)

It is acquired with the Reflective Optics System Imaging Spectrometer. The raw data contains 115 bands, and after processing, 103 bands were used for the experiment. Table 3 lists the correspondence between the 9 features and their respective corresponding pixels.

4.1.4. Botswana (BOT)

It was acquired with the Hyperion sensor. The data originally contained 242 bands, and after processing, 145 bands were used for this study. For similar purposes, Table 3 lists the correspondence between the 14 features in this dataset and their respective pixels.

4.2. Evaluation Metrics

4.2.1. Overall Accuracy (OA)

It represents the ratio of the number of correctly classified pixels to the total number of pixels after prediction. Specifically, OA is calculated as shown in (15):

$\mathrm{OA} = \frac{1}{N}\sum_{n=1}^{N} \mathbb{1}\left(f(x_n) = y_n\right)$     (15)

where $N$ denotes the total number of labeled pixels, $\mathbb{1}(\cdot)$ is the indicator function, $f(x_n)$ denotes the predicted class of the $n$-th pixel, and $y_n$ denotes the true class of the $n$-th pixel.

4.2.2. Average Accuracy (AA)

It denotes the mean of the per-class prediction accuracies over all classes. The calculation is shown in (16):

$\mathrm{AA} = \frac{1}{M}\sum_{m=1}^{M}\left(\frac{1}{N_m}\sum_{n_m=1}^{N_m} \mathbb{1}\left(f(x_{n_m}) = y_{n_m}\right)\right)$     (16)

where $M$ denotes the total number of classes, $N_m$ denotes the total number of pixels in the $m$-th class, $f(x_{n_m})$ denotes the predicted class of the $n_m$-th pixel of the $m$-th class, and $y_{n_m}$ denotes its true class.

4.2.3. Kappa

It is a statistical measure of classification consistency. According to common practice in consistency assessment, a value above 0.80 indicates that the predicted results are almost completely consistent with the actual classification. The specific calculation is shown in (17):

$\mathrm{Kappa} = \frac{N^2 \times \mathrm{OA} - \sum_{i=1}^{M}(P_i \times T_i)}{N^2 - \sum_{i=1}^{M}(P_i \times T_i)}$     (17)

where $P_i$ denotes the number of pixels predicted as the $i$-th class and $T_i$ denotes the number of pixels whose true class is the $i$-th class.
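The three metrics can be computed from a confusion matrix as in the following sketch; note that Eq. (17) is algebraically equivalent to (OA − p_e)/(1 − p_e), where p_e is the expected chance agreement.

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Compute OA, AA, and Kappa (Eqs. (15)-(17)) from true and predicted labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    N = y_true.size

    # Confusion matrix: rows = true class, columns = predicted class.
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1

    oa = np.trace(cm) / N                                     # Eq. (15)
    per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)   # per-class accuracy
    aa = per_class.mean()                                     # Eq. (16)

    # Eq. (17): P_i = pixels predicted as class i, T_i = pixels truly in class i.
    P = cm.sum(axis=0)
    T = cm.sum(axis=1)
    pe = (P * T).sum() / (N * N)
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa
```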

4.3. Experimental Settings

4.3.1. Platform Settings

To verify the superior performance of our proposed method in HSI classification tasks, we implemented the PGNN-Net architecture under the PyTorch framework with Python 3.7. The experiments were conducted on the Windows 10 operating system, and an NVIDIA RTX A6000 GPU was used to accelerate the computations.

4.3.2. Training Details

To evaluate the classification performance of PGNN-Net objectively, we extracted 5% of the samples of each class from the four datasets as the training set and retained the remaining 95% as the test set. To ensure consistency of the inputs, we uniformly set the spatial size of all image cubes to 13 × 13, and the batch size during training was 128. For a more accurate assessment of the stability and generalization ability of the model, we performed five independent training runs for each dataset; each run consists of 100 epochs, uses the Adam algorithm to update the network parameters, and uses a learning rate of 0.001. Finally, we calculated the mean values of OA, AA, and Kappa and their standard deviations.
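A sketch of a training loop matching these settings is shown below; the model's forward signature and the label-smoothing factor are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def train_one_dataset(model, train_cubes, train_labels, adj, epochs=100, lr=1e-3):
    """Training loop following the settings above: batch size 128, 100 epochs,
    Adam with lr = 0.001, label-smoothed cross-entropy. The 5%/95% split is
    assumed to be done before calling this function."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model, adj = model.to(device), adj.to(device)
    loader = DataLoader(TensorDataset(train_cubes, train_labels),
                        batch_size=128, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)  # smoothing factor assumed

    for epoch in range(epochs):
        model.train()
        for cubes, labels in loader:
            cubes, labels = cubes.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(cubes, adj)        # assumed forward signature
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()
    return model
```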

4.4. Experimental Results

To demonstrate the effectiveness and superiority of PGNN-Net in HSI classification tasks, we selected seven advanced classification algorithms for comparative experiments: SVM [10] is a machine learning method; DPRN [27] and M3CNN [28] are CNN-based methods; GAHT [35], SF [32], and CS2DT [33] are Transformer-based methods; and MVAHN [40] is a GNN-based method.

4.4.1. Results for the IP

The quantitative experimental results of the different methods on IP are shown in Table 4. From the table, it can be seen that the proposed PGNN-Net achieves better results in OA, AA, and Kappa on the IP dataset compared to the other models. Among the 16 classes of IP, only the accuracy obtained by PGNN-Net for the 9th class, "Oats", falls below 90%, at 61.05%. This is because the number of pixels in this class is very small; only one pixel is involved in model training. However, despite its low accuracy, it still compares favorably with the other methods. The qualitative results of the different methods on IP are shown in Figure 2. Compared with the ground truth, PGNN-Net produces fewer misclassified pixels on IP and smoother boundaries between the different classes.

4.4.2. Results for the KSC

As can be seen from Table 5, the proposed PGNN-Net achieved the best results in OA, AA, and Kappa on KSC. Among the 13 classes on KSC, the accuracy is 100% for all classes except the 5th class "Slash-pine" and the 6th class "Oak/Broadleaf". This is related to the relatively small number of training samples for these classes. Nevertheless, the 5th and 6th classes still lead in classification accuracy. The qualitative results on KSC are shown in Figure 3; the visualizations of PGNN-Net and MVAHN on KSC are very similar to the ground truth. In addition, because of the small number of labeled pixels, the differences among the other methods are subtle from a visual point of view.

4.4.3. Results for the PUS

The results of the quantitative experiments on PUS are shown in Table 6. PGNN-Net achieved the best results in OA, AA, and Kappa on PUS. Among the 9 classes on PUS, the accuracy of every class is greater than 96%, and the 1st class "Asphalt", the 2nd class "Meadows", and the 6th class "Bare Soil" reach 100%. Although our accuracy on the 4th class is lower than that of the M3CNN, CS2DT, and MVAHN methods, PGNN-Net performs somewhat better on the other classes. From the qualitative results (Figure 4), the visualizations of PGNN-Net, MVAHN, GAHT, and CS2DT on PUS are very similar to the ground truth. The other methods show some misclassification in classes such as "Gravel" and "Bare Soil".

4.4.4. Results for the BOT

The results of the quantitative experiments on BOT are shown in Table 7. It can be seen that PGNN-Net achieved the second-best results in OA, AA, and Kappa on BOT. Among the prediction results of the 14 classes on BOT, the accuracy reaches 100% for 11 classes. The relatively low classification accuracy for the 1st and 14th classes is partly related to the relatively small number of pixels involved in training and partly an indication that our method is susceptible to noise on BOT, which has fewer samples. However, in terms of overall results, our method is very close to MVAHN on the BOT dataset, demonstrating the effectiveness and advancement of PGNN-Net.
From the qualitative results (Figure 5), all of these methods are able to achieve relatively good classification results on BOT. Among them, the classification maps of MVAHN and PGNN-Net are closest to the ground truth, while SVM, DPRN, M3CNN, GAHT, SF, and CS2DT all suffer from pixel loss or misclassification of class features. Visually, our method reaches a relatively leading position in terms of classification results. This shows that our method can also achieve relatively good classification on datasets with fewer samples.

4.5. Ablation Study

To explore the specific influence of each module in PGNN-Net on the HSI classification effect, we specifically focus on analyzing and discussing four key influencing factors from the constituent structure of the model.

4.5.1. Effect of the Edges in A

During the data preprocessing stage, the edge plays an important role in generating the graph matrix and can affect the accuracy of feature extraction by GCN and ChebNet. To maximize the utilization of the spatial and spectral features of HSI, we discuss the effect of different edge sizes on classification accuracy while keeping the other conditions constant. Specifically, we set the edge size to 3, 6, 9, and 12, and the OA values obtained after training on the four datasets are shown in Figure 6a.
It can be seen that different edge sizes affect the different HSI datasets differently. The IP and BOT datasets are more noticeably affected by the edge size than the KSC and PUS datasets. The experimental results for the four datasets show that the model achieves relatively high OA values when $e$ is set to 3.

4.5.2. Effect of k-Order in the CP

The CP plays a crucial role in the global spatial-spectral feature extraction stage. Different k-orders can make the graph features extracted from ChebNet branches contain different amounts of information. To understand its impact, we specifically explored the specific effects of different k-orders in CP on the classification results while keeping other conditions constant. Specifically, we set the k-order to 1, 2, 3, and 4, and trained on four different datasets. The OA values obtained after training are shown in Figure 6b. It is clear from the figure that different k-orders perform distinctively on the datasets. When k = 1, the model performs better on the PUS dataset, whereas when k = 3, the model performs better on the KSC and BOT datasets; for the IP dataset, k = 4 is a more appropriate choice.

4.5.3. Effect of the Loss Function

The loss function plays a crucial role in HSI classification. It defines the degree of discrepancy between model predictions and actual labels and is a key metric in the model optimization process. By minimizing the loss function, we can adjust the parameters of the model so that it can classify more accurately. To understand its impact on PGNN-Net, we have specifically explored the specific effects of different loss functions on the classification results while keeping all other conditions constant.
Figure 7a illustrates the OAs obtained after training on the four datasets using cross-entropy loss (CE), focal loss (FL), and label smoothing (LS). It can be seen that for the IP and BOT datasets, the effect of label smoothing is more obvious compared to the KSC and PUS datasets. From the overall classification effect of the four datasets, the LS loss function obtains a relatively high OA value and performs better in HSI classification.

4.5.4. Effect of the Cube Size

The size of the image cubes in the data preprocessing stage also has a significant effect on the effectiveness of PGNN-Net on HSI classification. Different sizes of image cubes affect the ability of GCN and ChebNet branches to capture features of images. To explore the classification accuracy of our method under different image cube sizes, we set the size of image cubes to 9, 11, 13, 15, and 17, respectively, while keeping other conditions unchanged, and made predictions on four datasets. The results are shown in Figure 7b. From the figure, we can find that the image cube size has different effects on OA for different datasets with different amounts of HSI training data. For the IP dataset, the more appropriate image cube size is 9; for KSC, it is 13; for BOT, it is 11; and for PUS, it is 9.

5. Discussion

5.1. Applicability

The proposed fusion strategy combining GCN and ChebNet for HSI classification has wide applicability in various fields. Imaging of ground cover using hyperspectral technology yields rich spectral information and high spatial resolution, and the technique has wide applications in fields such as precision agriculture, environmental monitoring, and land planning. Our approach exploits the complementary advantages of spatial modeling with GCN and spectral feature extraction with ChebNet and is well suited for these applications, as spectral and spatial information are crucial for accurate classification. This fusion not only enriches the feature representation of the ground cover but also facilitates deeper information integration, as evidenced by the improved classification accuracy.
Furthermore, the inclusion of label smoothing during the training process plays a key role in preventing over-fitting and improving the generalization ability of the model, making it particularly suitable for real-world surface scenarios where the data may be noisy or unbalanced. The ability of our approach to handle this complexity and produce reliable classification results underscores its practical value and applicability.

5.2. Limitations

Although our method achieves high classification accuracy and has wide applicability, there are still some limitations that require further research and improvement. First, the performance of our method depends heavily on the quality and quantity of HSI data. The presence of noise or missing labels, as well as the limited number of labeled samples, can negatively affect the model’s classification ability. Future research should explore techniques to mitigate these issues, such as semi-supervised learning.
Furthermore, while our approach achieved better performance on the dataset used in this study, it is important to note that these results may not be applicable to all HSI classification tasks. The variability of spectral features, spatial patterns, and noise characteristics across HSI datasets can have a significant impact on the performance of any classification model. Therefore, the applicability of our method should be evaluated on a wider range of datasets to fully understand its strengths and limitations.

6. Conclusions

In this paper, we propose an advanced network architecture called PGNN-Net that aims to significantly improve the accuracy of HSI classification. The architecture utilizes a parallel dual-branch feature extraction structure with multiple spatial-spectral feature inputs. First, we utilize PCA to effectively eliminate the redundant and invalid dimensions of HSI and retain the key feature dimensions. Subsequently, through data preprocessing strategies, we generate two input features as well as a graph matrix for constructing the graph structure. The spatial-spectral feature extraction architecture consists of a modified GCN and ChebNet in parallel. GCN excels at capturing deep hidden-layer representations of local graph structures and node features. ChebNet computes graph convolution by polynomial approximation and is able to deal with sparse graph structures and mine global graph information. To maximize their complementary strengths, we design an efficient feature fusion strategy that fuses the features extracted by the two branches into a more comprehensive representation. Furthermore, during the training process, we embed label smoothing in the cross-entropy loss function, which effectively mitigates the over-fitting problem. The experimental results show that PGNN-Net exhibits better classification performance on Indian Pines, Kennedy Space Center, Pavia University Scene, and Botswana, which validates the effectiveness of the design. In the future, we will continue to deepen our research and explore ways to improve the accuracy of HSI classification under semi-supervised and unsupervised conditions.

Author Contributions

N.G. and M.J.: Methodology, Software, Writing—original draft; D.W., Y.J., K.L., Y.Z., M.W. and J.L.: Validation, Investigation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Internal Parenting Program (Grant number: 145AXL250004000X).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used for this study is publicly available. The IP, KSC, PUS, and BOT datasets can be downloaded at https://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes, accessed on 5 July 2024.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Qian, S.E. Overview of hyperspectral imaging remote sensing from satellites. In Advances in Hyperspectral Image Processing Techniques; Wiley: Hoboken, NJ, USA, 2022; pp. 41–66. [Google Scholar]
  2. Stamford, J.; Aciksoz, S.B.; Lawson, T. Remote sensing techniques: Hyperspectral imaging and data analysis. In Photosynthesis: Methods and Protocols; Springer: Berlin/Heidelberg, Germany, 2024; pp. 373–390. [Google Scholar]
  3. Liu, H.; Li, W.; Xia, X.-G.; Zhang, M.; Gao, C.-Z.; Tao, R. Central attention network for hyperspectral imagery classification. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 8989–9003. [Google Scholar] [CrossRef] [PubMed]
  4. Lu, B.; Dao, P.D.; Liu, J.; He, Y.; Shang, J. Recent advances of hyperspectral imaging technology and applications in agriculture. Remote Sens. 2020, 12, 2659. [Google Scholar] [CrossRef]
  5. Agilandeeswari, L.; Prabukumar, M.; Radhesyam, V.; Phaneendra, K.L.B.; Farhan, A. Crop classification for agricultural applications in hyperspectral remote sensing images. Appl. Sci. 2022, 12, 1670. [Google Scholar] [CrossRef]
  6. Zhang, M.; Li, W.; Zhao, X.; Liu, H.; Tao, R.; Du, Q. Morphological transformation and spatial-logical aggregation for tree species classification using hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5501212. [Google Scholar] [CrossRef]
  7. Datta, D.; Mallick, P.K.; Bhoi, A.K.; Ijaz, M.F.; Shafi, J.; Choi, J. Hyperspectral image classification: Potentials, challenges, and future directions. Comput. Intell. Neurosci. 2022, 2022, 3854635. [Google Scholar] [CrossRef] [PubMed]
  8. Gewali, U.B.; Monteiro, S.T.; Saber, E. Machine learning based hyperspectral image analysis: A survey. arXiv 2018, arXiv:1802.08701. [Google Scholar]
  9. Huang, K.; Li, S.; Kang, X.; Fang, L. Spectral–spatial hyperspectral image classification based on KNN. Sens. Imaging 2016, 17, 1. [Google Scholar] [CrossRef]
  10. Wong, W.-T.; Hsu, S.-H. Application of SVM and ANN for image retrieval. Eur. J. Oper. Res. 2006, 173, 938–950. [Google Scholar] [CrossRef]
  11. Hasanlou, M.; Samadzadegan, F.; Homayouni, S. SVM-based hyperspectral image classification using intrinsic dimension. Arab. J. Geosci. 2015, 8, 477–487. [Google Scholar] [CrossRef]
  12. Yu, H.; Gao, L.; Li, J.; Li, S.S.; Zhang, B.; Benediktsson, J.A. Spectral-spatial hyperspectral image classification using subspace-based support vector machines and adaptive Markov random fields. Remote Sens. 2016, 8, 355. [Google Scholar] [CrossRef]
  13. Uchaev, D.; Uchaev, D. Small sample hyperspectral image classification based on the random patches network and recursive filtering. Sensors 2023, 23, 2499. [Google Scholar] [CrossRef] [PubMed]
  14. Khan, M.J.; Khan, H.S.; Yousaf, A.; Khurshid, K.; Abbas, A. Modern trends in hyperspectral image analysis: A review. IEEE Access 2018, 6, 14118–14129. [Google Scholar] [CrossRef]
  15. Audebert, N.; Le Saux, B.; Lefèvre, S. Deep learning for classification of hyperspectral data: A comparative review. IEEE Geosci. Remote Sens. Mag. 2019, 7, 159–173. [Google Scholar] [CrossRef]
  16. Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep learning for hyperspectral image classification: An overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709. [Google Scholar] [CrossRef]
  17. Li, X.; Li, Z.; Qiu, H.; Hou, G.; Fan, P. An overview of hyperspectral image feature extraction, classification methods and the methods based on small samples. Appl. Spectrosc. Rev. 2023, 58, 367–400. [Google Scholar] [CrossRef]
  18. Vaddi, R.; Manoharan, P. Hyperspectral image classification using CNN with spectral and spatial features integration. Infrared Phys. Technol. 2020, 107, 103296. [Google Scholar] [CrossRef]
  19. Lee, H.; Kwon, H. Going deeper with contextual CNN for hyperspectral image classification. IEEE Trans. Image Process. 2017, 26, 4843–4855. [Google Scholar] [CrossRef]
  20. Sun, M.; Song, Z.; Jiang, X.; Pan, J.; Pang, Y. Learning pooling for convolutional neural network. Neurocomputing 2017, 224, 96–104. [Google Scholar] [CrossRef]
  21. Yang, X.; Cao, W.; Lu, Y.; Zhou, Y. Hyperspectral image transformer classification networks. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5528715. [Google Scholar] [CrossRef]
  22. Qing, Y.; Liu, W.; Feng, L.; Gao, W. Improved transformer net for hyperspectral image classification. Remote Sens. 2021, 13, 2216. [Google Scholar] [CrossRef]
  23. Dubey, S.R.; Singh, S.K. Transformer-based generative adversarial networks in computer vision: A comprehensive survey. IEEE Trans. Artif. Intell. 2024, 1–16. [Google Scholar] [CrossRef]
  24. Liang, L.; Jin, L.; Xu, Y. Adaptive GNN for image analysis and editing. Adv. Neural Inf. Process. Syst. 2019, 32, 1–12. [Google Scholar]
  25. Yao, D.; Zhi-li, Z.; Xiao-feng, Z.; Wei, C.; Fang, H.; Yao-ming, C.; Cai, W.-W. Deep hybrid: Multi-graph neural network collaboration for hyperspectral image classification. Def. Technol. 2023, 23, 164–176. [Google Scholar] [CrossRef]
  26. Zhang, H.; Li, Y.; Zhang, Y.; Shen, Q. Spectral-spatial classification of hyperspectral imagery using a dual-channel convolutional neural network. Remote Sens. Lett. 2017, 8, 438–447. [Google Scholar] [CrossRef]
  27. Paoletti, M.E.; Haut, J.M.; Fernandez-Beltran, R.; Plaza, J.; Plaza, A.J.; Pla, F. Deep pyramidal residual networks for spectral–spatial hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 740–754. [Google Scholar] [CrossRef]
  28. He, M.; Li, B.; Chen, H. Multi-Scale 3D Deep Convolutional Neural Network for Hyperspectral Image Classification. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 3904–3908. [Google Scholar]
  29. Paoletti, M.E.; Moreno-Álvarez, S.; Xue, Y.; Haut, J.M.; Plaza, A. AAtt-CNN: Automatic attention-based convolutional neural networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5511118. [Google Scholar] [CrossRef]
  30. Mei, S.; Li, X.; Liu, X.; Cai, H.; Du, Q. Hyperspectral image classification using attention-based bidirectional long short-term memory network. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5509612. [Google Scholar] [CrossRef]
  31. Tang, H.; Li, Y.; Huang, Z.; Zhang, L.; Xie, W. Fusion of multidimensional CNN and handcrafted features for small-sample hyperspectral image classification. Remote Sens. 2022, 14, 3796. [Google Scholar] [CrossRef]
  32. Hong, D.; Han, Z.; Yao, J.; Gao, L.; Zhang, B.; Plaza, A.; Chanussot, J. SpectralFormer: Rethinking hyperspectral image classification with transformers. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5518615. [Google Scholar] [CrossRef]
  33. Xu, H.; Zeng, Z.; Yao, W.; Lu, J. CS2DT: Cross spatial–spectral dense transformer for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5510105. [Google Scholar] [CrossRef]
  34. Yang, L.; Yang, Y.; Yang, J.; Zhao, N.; Wu, L.; Wang, L.; Wang, T. FusionNet: A convolution–transformer fusion network for hyperspectral image classification. Remote Sens. 2022, 14, 4066. [Google Scholar] [CrossRef]
  35. Mei, S.; Song, C.; Ma, M.; Xu, F. Hyperspectral image classification using group-aware hierarchical transformer. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5539014. [Google Scholar] [CrossRef]
  36. Li, W.; Liu, Q.; Fan, S.; Xu, C.A.; Bai, H. Dual-stream GNN fusion network for hyperspectral classification. Appl. Intell. 2023, 53, 26542–26567. [Google Scholar] [CrossRef]
  37. Niruban, R.; Deepa, R. Graph neural network-based remote target classification in hyperspectral imaging. Int. J. Remote Sens. 2023, 44, 4465–4485. [Google Scholar] [CrossRef]
  38. Qin, A.; Shang, Z.; Tian, J.; Wang, Y.; Zhang, T.; Tang, Y.Y. Spectral–spatial graph convolutional networks for semisupervised hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2018, 16, 241–245. [Google Scholar] [CrossRef]
  39. Hu, H.; Yao, M.; He, F.; Zhang, F. Graph neural network via edge convolution for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2021, 19, 5508905. [Google Scholar] [CrossRef]
  40. Zhao, F.; Zhang, J.; Meng, Z.; Liu, H.; Chang, Z.; Fan, J. Multiple vision architectures-based hybrid network for hyperspectral image classification. Expert Syst. Appl. 2023, 234, 121032. [Google Scholar] [CrossRef]
  41. Liu, Q.; Dong, Y.; Zhang, Y.; Luo, H. A fast dynamic graph convolutional network and CNN parallel network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5530215. [Google Scholar] [CrossRef]
  42. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
  43. Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. arXiv 2016, arXiv:1606.09375. [Google Scholar]
  44. Greenacre, M.; Groenen, P.J.; Hastie, T.; d’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Primers 2022, 2, 100. [Google Scholar] [CrossRef]
  45. Müller, R.; Kornblith, S.; Hinton, G.E. When does label smoothing help? arXiv 2019, arXiv:1906.02629. [Google Scholar]
Figure 1. Algorithmic framework of the PGNN-Net. (A) Data preprocessing. (B) Local spatial-spectral feature extraction. (C) Global spatial-spectral feature extraction. (D) Feature fusion module.
Figure 2. Qualitative comparison of classification results on IP. (a) Pseudo-color Image. (b) Ground Truth. (c) SVM. (d) DPRN. (e) M3CNN. (f) GAHT. (g) SF. (h) CS2DT. (i) MVAHN. (j) PGNN-Net.
Figure 3. Qualitative comparison of classification results on KSC. (a) Pseudo-color Image. (b) Ground Truth. (c) SVM. (d) DPRN. (e) M3CNN. (f) GAHT. (g) SF. (h) CS2DT. (i) MVAHN. (j) PGNN-Net.
Figure 4. Qualitative comparison of classification results on PUS. (a) Pseudo-color Image. (b) Ground Truth. (c) SVM. (d) DPRN. (e) M3CNN. (f) GAHT. (g) SF. (h) CS2DT. (i) MVAHN. (j) PGNN-Net.
Figure 5. Qualitative comparison of classification results on BOT. (a) Pseudo-color Image. (b) Ground Truth. (c) SVM. (d) DPRN. (e) M3CNN. (f) GAHT. (g) SF. (h) CS2DT. (i) MVAHN. (j) PGNN-Net.
Figure 6. The OAs obtained for different edge sizes (a) and k-orders (b).
Figure 7. The OAs obtained for different loss functions (a) and cube sizes (b).
Table 1. Basic Specifications of the Four HSI Datasets.
| Dataset | Classes | Labeled Pixels | Image Size | Bands |
|---|---|---|---|---|
| Indian Pines | 16 | 10,249 | 145 × 145 | 200 |
| Kennedy Space Center | 13 | 5211 | 512 × 614 | 176 |
| Pavia University Scene | 9 | 42,776 | 610 × 340 | 115 |
| Botswana | 14 | 3248 | 1476 × 256 | 145 |
Table 2. Classes, Features and Pixel Counts for IP and KSC.
| Class No. | Feature Name (IP) | Pixel Count (IP) | Feature Name (KSC) | Pixel Count (KSC) |
|---|---|---|---|---|
| 1 | Alfalfa | 46 | Scrub | 761 |
| 2 | Corn-notill | 1428 | Willow-swamp | 243 |
| 3 | Corn-mintill | 830 | CP-hammock | 256 |
| 4 | Corn | 237 | CP/Oak | 252 |
| 5 | Grass-pasture | 483 | Slash-pine | 161 |
| 6 | Grass-trees | 730 | Oak/Broadleaf | 229 |
| 7 | Grass-pasture-mowed | 28 | Hardwood-swamp | 105 |
| 8 | Hay-windrowed | 478 | Graminoid-marsh | 431 |
| 9 | Oats | 20 | Spartina-marsh | 520 |
| 10 | Soybean-notill | 972 | Cattail-marsh | 404 |
| 11 | Soybean-mintill | 2455 | Salt-marsh | 419 |
| 12 | Soybean-clean | 593 | Mud-flats | 503 |
| 13 | Wheat | 205 | Water | 927 |
| 14 | Woods | 1265 | / | / |
| 15 | Buildings-Grass-Trees-Drives | 386 | / | / |
| 16 | Stone-Steel-Towers | 93 | / | / |
Table 3. Classes, Features and Pixel Counts for PUS and BOT.
| Class No. | Feature Name (PUS) | Pixel Count (PUS) | Feature Name (BOT) | Pixel Count (BOT) |
|---|---|---|---|---|
| 1 | Asphalt | 6631 | Water | 270 |
| 2 | Meadows | 18,649 | Hippo-grass | 101 |
| 3 | Gravel | 2099 | Floodplain-grasses-1 | 251 |
| 4 | Trees | 3064 | Floodplain-grasses-2 | 215 |
| 5 | Painted metal sheets | 1345 | Reeds | 269 |
| 6 | Bare Soil | 5029 | Riparian | 269 |
| 7 | Bitumen | 1330 | Firescar | 259 |
| 8 | Self-Blocking-Bricks | 3682 | Island-interior | 203 |
| 9 | Shadows | 947 | Acacia-woodlands | 314 |
| 10 | / | / | Acacia-shrublands | 248 |
| 11 | / | / | Acacia-grasslands | 305 |
| 12 | / | / | Short-mopane | 181 |
| 13 | / | / | Mixed-mopane | 268 |
| 14 | / | / | Chalcedony | 95 |
Table 4. Quantitative experimental results of different methods on IP (%).
| Class | SVM [10] | DPRN [27] | M3CNN [28] | GAHT [35] | SF [32] | CS2DT [33] | MVAHN [40] | Ours |
|---|---|---|---|---|---|---|---|---|
| 1 | 48.18 ± 18.50 | 28.64 ± 7.69 | 8.18 ± 2.32 | 80.00 ± 0.91 | 49.55 ± 0.91 | 32.73 ± 6.52 | 95.00 ± 5.45 | 100 ± 0.00 |
| 2 | 68.08 ± 2.57 | 85.06 ± 6.21 | 73.94 ± 2.36 | 96.57 ± 0.21 | 88.15 ± 0.03 | 95.42 ± 0.61 | 96.57 ± 0.11 | 96.71 ± 0.20 |
| 3 | 51.62 ± 1.20 | 78.05 ± 2.74 | 63.22 ± 2.86 | 97.74 ± 0.25 | 95.13 ± 0.26 | 95.44 ± 1.47 | 97.19 ± 0.39 | 94.63 ± 0.83 |
| 4 | 42.49 ± 8.38 | 57.60 ± 12.01 | 14.31 ± 4.41 | 96.53 ± 0.18 | 93.16 ± 0.36 | 88.80 ± 4.14 | 100 ± 0.00 | 96.98 ± 0.76 |
| 5 | 86.71 ± 1.14 | 79.26 ± 10.36 | 77.69 ± 3.03 | 92.07 ± 0.17 | 94.25 ± 0.22 | 92.07 ± 0.49 | 94.60 ± 1.94 | 94.81 ± 0.37 |
| 6 | 92.94 ± 3.92 | 92.58 ± 2.00 | 94.46 ± 2.02 | 97.86 ± 0.11 | 98.50 ± 0.07 | 98.21 ± 0.35 | 98.47 ± 0.30 | 96.88 ± 0.12 |
| 7 | 50.37 ± 25.21 | 22.22 ± 27.32 | 1.48 ± 2.96 | 79.26 ± 7.98 | 85.93 ± 1.48 | 5.19 ± 6.87 | 66.67 ± 22.22 | 94.81 ± 1.81 |
| 8 | 96.30 ± 0.46 | 92.47 ± 6.25 | 99.47 ± 0.11 | 100 ± 0.00 | 100 ± 0.00 | 99.87 ± 0.11 | 100 ± 0.00 | 100 ± 0.00 |
| 9 | 21.05 ± 11.76 | 16.84 ± 7.74 | 0.00 ± 0.00 | 66.32 ± 12.28 | 26.32 ± 0.00 | 5.26 ± 5.77 | 50.53 ± 17.49 | 61.05 ± 6.32 |
| 10 | 64.01 ± 2.51 | 79.98 ± 5.87 | 68.73 ± 4.62 | 93.78 ± 0.22 | 94.17 ± 0.04 | 92.48 ± 1.20 | 96.49 ± 0.20 | 97.12 ± 0.15 |
| 11 | 79.70 ± 2.36 | 93.41 ± 0.68 | 86.63 ± 2.40 | 97.05 ± 0.05 | 96.74 ± 0.07 | 97.64 ± 0.39 | 97.84 ± 0.10 | 98.69 ± 0.16 |
| 12 | 59.36 ± 3.51 | 58.79 ± 13.86 | 40.99 ± 4.74 | 89.38 ± 0.54 | 94.28 ± 0.17 | 92.54 ± 2.23 | 92.90 ± 1.87 | 94.78 ± 0.55 |
| 13 | 93.33 ± 2.60 | 86.36 ± 8.72 | 89.85 ± 6.55 | 95.90 ± 0.00 | 99.08 ± 0.21 | 99.28 ± 0.77 | 99.18 ± 0.89 | 100 ± 0.00 |
| 14 | 94.29 ± 1.63 | 95.26 ± 0.48 | 97.64 ± 0.73 | 98.50 ± 0.07 | 99.23 ± 0.12 | 99.62 ± 0.07 | 99.58 ± 0 | 99.20 ± 0.04 |
| 15 | 42.13 ± 7.68 | 67.68 ± 14.59 | 65.50 ± 11.50 | 99.18 ± 0.00 | 95.69 ± 0.11 | 97.33 ± 1.00 | 97.60 ± 0.61 | 98.53 ± 0.13 |
| 16 | 81.36 ± 4.43 | 82.73 ± 3.48 | 73.64 ± 17.01 | 77.95 ± 0.56 | 85.00 ± 1.11 | 77.50 ± 9.81 | 88.18 ± 1.70 | 90.23 ± 0.91 |
| OA | 74.71 ± 0.92 | 84.51 ± 4.72 | 77.44 ± 1.85 | 96.11 ± 0.04 | 94.93 ± 0.02 | 95.40 ± 0.09 | 97.19 ± 0.06 | 97.35 ± 0.09 |
| AA | 67.02 ± 2.34 | 69.81 ± 8.05 | 59.73 ± 2.24 | 91.13 ± 1.12 | 87.20 ± 0.14 | 79.34 ± 0.98 | 91.92 ± 2.54 | 94.65 ± 0.40 |
| Kappa | 71.00 ± 1.07 | 82.15 ± 5.50 | 74.02 ± 2.15 | 95.57 ± 0.05 | 94.21 ± 0.02 | 94.75 ± 0.10 | 96.80 ± 0.07 | 96.98 ± 0.10 |
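For ease of reading Tables 4, 5, 6 and 7, the three summary metrics follow their standard definitions; the notation below (confusion matrix M, number of classes C, number of test samples N, chance agreement p_e) is introduced here only for clarity and is not part of the original tables:

$$\mathrm{OA}=\frac{1}{N}\sum_{i=1}^{C}M_{ii},\qquad \mathrm{AA}=\frac{1}{C}\sum_{i=1}^{C}\frac{M_{ii}}{\sum_{j=1}^{C}M_{ij}},\qquad \kappa=\frac{\mathrm{OA}-p_{e}}{1-p_{e}},\quad p_{e}=\frac{1}{N^{2}}\sum_{i=1}^{C}\Big(\sum_{j=1}^{C}M_{ij}\Big)\Big(\sum_{j=1}^{C}M_{ji}\Big),$$

where $M_{ij}$ counts test samples of class $i$ predicted as class $j$, so OA is the fraction of correctly classified samples, AA is the mean per-class recall, and the Kappa coefficient corrects OA for chance agreement.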
Table 5. Quantitative experimental results of different methods on KSC (%).
| Class | SVM [10] | DPRN [27] | M3CNN [28] | GAHT [35] | SF [32] | CS2DT [33] | MVAHN [40] | Ours |
|---|---|---|---|---|---|---|---|---|
| 1 | 94.97 ± 1.50 | 99.42 ± 0.71 | 100 ± 0.00 | 99.89 ± 0.16 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 2 | 85.63 ± 1.05 | 89.01 ± 4.39 | 89.61 ± 2.31 | 99.05 ± 0.96 | 97.06 ± 0.69 | 95.15 ± 0.92 | 99.39 ± 1.01 | 100 ± 0.00 |
| 3 | 69.79 ± 12.35 | 85.93 ± 5.11 | 97.70 ± 0.33 | 97.61 ± 3.46 | 99.34 ± 0.33 | 97.45 ± 0.88 | 100 ± 0.00 | 100 ± 0.00 |
| 4 | 54.48 ± 8.07 | 58.08 ± 13.02 | 34.14 ± 4.63 | 93.47 ± 4.39 | 97.74 ± 0.20 | 88.20 ± 2.09 | 97.57 ± 2.43 | 100 ± 0.00 |
| 5 | 41.31 ± 20.39 | 38.95 ± 12.24 | 60.39 ± 2.43 | 78.43 ± 1.65 | 79.08 ± 0.00 | 73.86 ± 1.49 | 81.04 ± 0.00 | 81.05 ± 0.00 |
| 6 | 23.49 ± 13.17 | 54.13 ± 24.91 | 27.16 ± 8.72 | 93.58 ± 4.44 | 98.17 ± 0.00 | 96.79 ± 1.12 | 98.35 ± 0.55 | 99.72 ± 0.37 |
| 7 | 77.00 ± 8.57 | 78.00 ± 20.91 | 65.20 ± 6.97 | 98.20 ± 0.75 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 8 | 68.41 ± 10.12 | 87.09 ± 10.13 | 99.85 ± 0.29 | 98.88 ± 1.78 | 100 ± 0.00 | 100 ± 0.00 | 99.90 ± 0.20 | 100 ± 0.00 |
| 9 | 90.97 ± 5.01 | 96.72 ± 0.72 | 99.68 ± 0.10 | 99.76 ± 0.32 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 10 | 84.58 ± 2.34 | 100 ± 0.00 | 100 ± 0.00 | 99.79 ± 0.30 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 11 | 95.63 ± 1.43 | 97.09 ± 1.81 | 92.01 ± 1.29 | 99.80 ± 1.89 | 99.75 ± 0.00 | 99.40 ± 0.12 | 99.90 ± 0.20 | 100 ± 0.00 |
| 12 | 83.43 ± 2.23 | 92.01 ± 4.65 | 99.41 ± 0.20 | 99.79 ± 0.19 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 13 | 98.09 ± 0.55 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| OA | 82.27 ± 1.78 | 89.94 ± 2.81 | 90.35 ± 0.85 | 98.35 ± 0.32 | 98.97 ± 0.05 | 98.08 ± 0.12 | 99.18 ± 0.10 | 99.40 ± 0.02 |
| AA | 74.44 ± 2.57 | 82.80 ± 4.89 | 81.93 ± 1.39 | 96.79 ± 0.50 | 97.78 ± 0.09 | 96.22 ± 0.23 | 98.17 ± 0.16 | 98.52 ± 0.03 |
| Kappa | 80.21 ± 1.98 | 88.74 ± 3.19 | 89.19 ± 0.96 | 98.16 ± 0.35 | 98.86 ± 0.06 | 97.86 ± 0.13 | 99.09 ± 0.11 | 99.33 ± 0.02 |
Table 6. Quantitative experimental results of different methods on PUS (%).
| Class | SVM [10] | DPRN [27] | M3CNN [28] | GAHT [35] | SF [32] | CS2DT [33] | MVAHN [40] | Ours |
|---|---|---|---|---|---|---|---|---|
| 1 | 93.22 ± 0.95 | 97.26 ± 0.38 | 97.82 ± 1.23 | 99.45 ± 0.06 | 97.35 ± 0.31 | 99.20 ± 0.22 | 99.42 ± 0.06 | 100 ± 0.00 |
| 2 | 98.45 ± 0.13 | 99.99 ± 0.01 | 99.69 ± 0.28 | 99.94 ± 0.01 | 99.89 ± 0.01 | 99.99 ± 0.00 | 99.98 ± 0.01 | 100 ± 0.00 |
| 3 | 77.92 ± 1.10 | 90.24 ± 1.75 | 83.55 ± 6.53 | 98.72 ± 0.67 | 94.74 ± 1.55 | 97.49 ± 1.27 | 99.59 ± 0.13 | 99.69 ± 0.02 |
| 4 | 92.52 ± 0.81 | 96.43 ± 0.30 | 97.08 ± 1.55 | 95.45 ± 0.21 | 96.65 ± 0.22 | 98.60 ± 0.07 | 97.33 ± 0.47 | 96.90 ± 0.19 |
| 5 | 99.44 ± 0.08 | 97.36 ± 2.13 | 99.97 ± 0.06 | 98.90 ± 0.51 | 99.83 ± 0.03 | 100 ± 0.00 | 99.64 ± 0.19 | 99.73 ± 0.12 |
| 6 | 84.17 ± 1.01 | 99.23 ± 0.28 | 98.02 ± 1.59 | 99.99 ± 0.01 | 99.92 ± 0.07 | 99.92 ± 0.02 | 100 ± 0.00 | 100 ± 0.00 |
| 7 | 82.99 ± 1.52 | 99.64 ± 0.55 | 96.01 ± 4.53 | 98.59 ± 0.71 | 100 ± 0.00 | 99.94 ± 0.08 | 99.97 ± 0.04 | 99.98 ± 0.03 |
| 8 | 88.77 ± 0.98 | 93.41 ± 1.78 | 93.73 ± 1.69 | 98.99 ± 0.07 | 92.85 ± 0.57 | 96.92 ± 0.35 | 98.90 ± 0.44 | 99.41 ± 0.04 |
| 9 | 99.82 ± 0.15 | 97.93 ± 0.75 | 98.04 ± 1.69 | 94.07 ± 0.78 | 98.09 ± 0.39 | 99.20 ± 0.29 | 95.71 ± 1.29 | 97.16 ± 0.45 |
| OA | 93.27 ± 0.05 | 98.04 ± 0.25 | 97.57 ± 1.08 | 99.20 ± 0.01 | 98.37 ± 0.06 | 99.36 ± 0.04 | 99.49 ± 0.02 | 99.64 ± 0.02 |
| AA | 90.81 ± 0.17 | 96.83 ± 0.38 | 95.99 ± 2.04 | 98.24 ± 0.03 | 97.70 ± 0.10 | 99.03 ± 0.12 | 98.95 ± 0.08 | 99.21 ± 0.06 |
| Kappa | 91.02 ± 0.07 | 97.40 ± 0.34 | 96.77 ± 1.44 | 98.94 ± 0.01 | 97.84 ± 0.05 | 99.15 ± 0.05 | 99.32 ± 0.02 | 99.52 ± 0.03 |
Table 7. Quantitative experimental results of different methods on BOT (%).
| Class | SVM [10] | DPRN [27] | M3CNN [28] | GAHT [35] | SF [32] | CS2DT [33] | MVAHN [40] | Ours |
|---|---|---|---|---|---|---|---|---|
| 1 | 98.98 ± 1.36 | 96.09 ± 3.13 | 80.47 ± 1.34 | 91.88 ± 5.50 | 93.52 ± 0.31 | 98.91 ± 0.57 | 98.05 ± 0.43 | 88.83 ± 0.84 |
| 2 | 84.79 ± 15.58 | 92.50 ± 11.96 | 55.63 ± 6.34 | 100 ± 0.00 | 99.58 ± 0.51 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 3 | 98.32 ± 1.30 | 91.85 ± 6.22 | 81.34 ± 1.54 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 4 | 90.39 ± 3.74 | 94.22 ± 8.87 | 88.92 ± 1.82 | 100 ± 0.00 | 100 ± 0.00 | 99.90 ± 0.20 | 100 ± 0.00 | 100 ± 0.00 |
| 5 | 81.17 ± 3.26 | 72.11 ± 7.69 | 68.91 ± 3.93 | 96.56 ± 2.97 | 87.50 ± 1.31 | 85.86 ± 2.57 | 98.83 ± 0.99 | 98.83 ± 0.43 |
| 6 | 67.73 ± 9.89 | 90.55 ± 10.92 | 71.17 ± 5.05 | 99.69 ± 0.16 | 89.53 ± 1.27 | 98.59 ± 0.80 | 98.44 ± 0.00 | 100 ± 0.00 |
| 7 | 95.12 ± 2.28 | 100 ± 0.00 | 95.20 ± 0.65 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 8 | 91.09 ± 7.81 | 93.58 ± 12.85 | 83.52 ± 1.52 | 100 ± 0.00 | 99.59 ± 0.21 | 99.69 ± 0.41 | 100 ± 0.00 | 100 ± 0.00 |
| 9 | 84.63 ± 1.84 | 99.33 ± 0.37 | 96.04 ± 1.17 | 100 ± 0.00 | 99.60 ± 0.13 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 10 | 88.39 ± 5.72 | 92.12 ± 3.00 | 78.31 ± 7.29 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 11 | 87.72 ± 2.67 | 85.59 ± 0.55 | 82.69 ± 1.61 | 91.45 ± 4.28 | 91.59 ± 1.10 | 92.62 ± 4.15 | 100 ± 0.00 | 100 ± 0.00 |
| 12 | 93.95 ± 2.54 | 89.77 ± 16.50 | 71.63 ± 4.93 | 100 ± 0.00 | 95.35 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 13 | 84.55 ± 4.65 | 93.25 ± 1.17 | 96.71 ± 1.80 | 98.98 ± 2.04 | 97.02 ± 0.31 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 |
| 14 | 90.67 ± 11.62 | 76.89 ± 2.27 | 62.00 ± 1.09 | 93.11 ± 5.51 | 84.67 ± 1.78 | 82.22 ± 0.00 | 87.56 ± 0.44 | 82.22 ± 0.00 |
| OA | 88.08 ± 0.86 | 91.11 ± 3.38 | 81.81 ± 1.22 | 97.93 ± 0.39 | 95.74 ± 0.32 | 97.38 ± 0.11 | 99.25 ± 0.13 | 98.46 ± 0.07 |
| AA | 88.39 ± 1.42 | 90.56 ± 4.09 | 79.47 ± 1.37 | 97.98 ± 0.39 | 95.57 ± 0.20 | 96.99 ± 0.07 | 98.78 ± 0.13 | 97.85 ± 0.06 |
| Kappa | 87.09 ± 0.93 | 90.36 ± 3.68 | 80.26 ± 1.33 | 97.75 ± 0.42 | 95.38 ± 0.34 | 97.16 ± 0.11 | 99.19 ± 0.14 | 98.33 ± 0.08 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
