Dimensionality Reduction of Hyperspectral Image Using Spatial-Spectral Regularized Sparse Hypergraph Embedding

Many graph embedding methods have been developed for dimensionality reduction (DR) of hyperspectral images (HSIs), but they use only spectral features to reflect point-to-point intrinsic relations and ignore the complex spatial-spectral structure in HSI. A new DR method termed spatial-spectral regularized sparse hypergraph embedding (SSRHE) is proposed for HSI classification. SSRHE explores sparse coefficients to adaptively select neighbors for constructing dual sparse hypergraphs. Based on the spatial coherence property of HSI, a local spatial neighborhood scatter is computed to preserve the local structure, and a total scatter is computed to represent the global structure of HSI. An optimal discriminant projection is then obtained that possesses better intraclass compactness and interclass separability, which is beneficial for classification. Experiments on the Indian Pines and PaviaU hyperspectral datasets illustrate that SSRHE achieves better classification performance than traditional spectral-based DR algorithms.


Introduction
With spectral sampling from the visible to the short-wave infrared region, a hyperspectral image (HSI) records a spatial scene in hundreds of narrow contiguous spectral channels [1,2]. HSI data with high spectral resolution provide fine spectral details for different ground objects, and they have been widely applied in many fields such as geological survey, environmental monitoring, precision agriculture, and mineral exploration [3,4]. Classification of each pixel in HSI plays a crucial role in these real applications. However, the high-dimensional characteristic of HSI poses a huge challenge to traditional classification methods, and the Hughes effect may occur if only limited training samples are available [5,6].
In general, dimensionality reduction (DR) is an effective way to reduce the volume of high-dimensional data with minimum loss of useful information by feature extraction or band selection, and it benefits classification by producing discriminating embedding features [7][8][9]. Many DR methods based on feature extraction have been proposed to reduce the dimension of high-dimensional data. Principal component analysis (PCA) [10] and linear discriminant analysis (LDA) [11] are the most popular subspace methods, but these two linear methods cannot discover the underlying manifold structure embedded in the original high-dimensional space. Many manifold learning-based DR methods have been introduced to reveal the nonlinear structure in high-dimensional data, such as locally linear embedding (LLE) [12], Laplacian eigenmap (LE) [13], isometric mapping (ISOMAP) [14], neighborhood preserving embedding (NPE) [15], and locality preserving projection (LPP) [16]. The above methods can be unified under the graph embedding (GE) framework, and the difference between them is how the similarity matrix of an intrinsic graph and the constraint matrix of a penalty graph are defined [17][18][19][20]. On the basis of GE, some supervised DR methods have been designed to exploit the prior knowledge of training samples for improving classification performance, such as marginal Fisher analysis (MFA) [21], local Fisher discriminant analysis (LFDA) [22], and regularized local discriminant embedding (RLDE) [23]. However, these direct graph-based DR methods only consider the pairwise relationship between data points, while HSI data usually possess complex relationships such as one sample versus multiple samples (different classes) or one class versus multiple samples. Therefore, the pairwise relation cannot discover the complex relations in HSI, which limits the discriminability of the embedding features for classification [24,25].
To explore the multiple adjacency relationships in high-dimensional data, hypergraph learning has been introduced to discover the complex geometric structure between HSI pixels [26,27]. In [28], discriminant hyper-Laplacian projection (DHLP) was proposed using the hypergraph Laplacian for exploring the high-order geometric relationships of samples. Semi-supervised hypergraph embedding (SHGE) learns the discriminant structure from both labeled and unlabeled data, and it reveals the complex relationships of HSI pixels by building a semi-supervised hypergraph [29]. For analyzing the intrinsic properties of HSI pixels, a hypergraph Laplacian sparse coding method was constructed to capture the similarity among data points within the same hyperedge [30]. In addition, heterogeneous networks have been explored to measure the relatedness of heterogeneous objects. Pio et al. [31] introduced heterogeneous networks with an arbitrary structure to evaluate their performance for both clustering and classification tasks. Serafino et al. [32] proposed an ensemble learning approach to classify objects of different classes, which is based on heterogeneous networks for extracting both the correlation and the autocorrelation that involve the observed objects.
The aforementioned methods are spectral-based DR methods, in which the spatial relationship between a pixel and its spatial neighborhood is not taken into consideration for DR. Recent investigations show that incorporating spatial information into traditional spectral-based DR methods can further improve the performance of HSI classification [33][34][35][36][37]. Wu et al. presented a spatially adaptive model to extract the spectral and spatial-contextual information, which significantly enhances land cover classification performance in both accuracy and computational efficiency [38]. Local pixel NPE (LPNPE) [23] and spatial consistency LLE [39] were proposed to reveal the local manifold structure in HSI data by using the distance between different spatial blocks instead of the Euclidean distance between pixels. As an extension of LPNPE, spatial and spectral information-based RLDE (SSRLDE) tries to maximize the ratio between the local spatial-spectral data scatter and the global spatial-spectral data scatter to enhance the representation ability of the embedding features [23]. The spatial-spectral coordination embedding (SSCE) method defines a spatial-spectral coordination distance for neighbor selection, which can reduce the probability that heterogeneous objects are selected as nearest neighbors [40]. Discriminative spectral-spatial margin (DSSM) exploits spatial-spectral neighbors to obtain the low-dimensional embedding by preserving the local spatial-spectral relationships of HSI data [41]. These spatial-spectral DR methods have difficulty discovering the complex relationships in HSI data due to their pairwise nature.
Recently, the spatial information in HSI has been explored to construct spatial-spectral hypergraph models [42][43][44]. Sun et al. proposed an adaptive hyperedge weight estimation scheme to preserve the prominent hyperedges, which is better for improving classification accuracy [45]. Yuan et al. introduced a hypergraph embedding model for feature extraction, which can represent higher-order relationships [46]. However, these spatial-spectral hypergraph methods are unsupervised and do not use the prior information in HSI data, which is not conducive to extracting discriminant features for enhancing classification performance.
Motivated by the above limitations, a new hypergraph embedding method termed spatial-spectral joint regularized sparse hypergraph embedding (SSRHE) is proposed for DR of HSI data. SSRHE explores the sparse coefficients and label information of pixels to adaptively select neighbors for constructing a regularized sparse intraclass hypergraph and a regularized sparse interclass hypergraph, which can effectively represent the complex relationships in HSI data. Then, a local spatial neighborhood preserving scatter matrix and a total scatter matrix are computed to preserve the neighborhood structure in the spatial domain and the global structure in the spectral domain, respectively. Finally, an optimal objective function is designed to extract spatial-spectral discriminant features, which not only preserves the local spatial structure of HSI data, but also compacts the samples belonging to the same class and separates the samples from different classes simultaneously. Therefore, the embedding features achieve good discriminative power for HSI classification.
The rest of this paper is organized as follows. In Section 2, some related works are briefly introduced. Section 3 gives a detailed description of the proposed SSRHE method. In Section 4, experimental results on two real HSI datasets are reported to demonstrate the effectiveness of SSRHE. Finally, Section 5 summarizes this paper and provides some suggestions for future work.

Related Works
In this section, we provide a brief review of the GE framework and the hypergraph model. For convenience, suppose an HSI dataset X = {x_1, x_2, ..., x_N} ∈ R^(D×N), where D is the number of spectral bands and N is the number of pixels. The label of x_i is denoted as l_i ∈ {1, 2, ..., c}, where c is the number of land cover classes. The goal of dimensionality reduction is to map X into a low-dimensional representation Y = {y_1, y_2, ..., y_N} ∈ R^(d×N), where d << D. For linear DR methods, Y can be obtained as Y = P^T X with a projection matrix P ∈ R^(D×d).

Graph Embedding
The graph embedding (GE) framework offers a unified view for understanding and explaining many popular DR algorithms such as PCA, LDA, ISOMAP, LLE, LE, NPE, and LPP. In GE, an intrinsic graph G^I(X, W^I) represents certain desired statistical or geometrical properties of the data, and a penalty graph G^P(X, W^P) describes characteristics or relationships that should be avoided. W^I and W^P are the weight matrices of the undirected graphs G^I and G^P. w^I_ij and w^P_ij describe the similarity and dissimilarity characteristics between vertices x_i and x_j in G^I and G^P, respectively.
The purpose of GE is to map each vertex of the graph into a low-dimensional space that preserves the similarities between vertex pairs. The low-dimensional embedding can be obtained by solving the following objective function:

min_{tr(Y H Y^T) = C} Σ_{i≠j} ||y_i − y_j||^2 w^I_ij = min_{tr(Y H Y^T) = C} tr(Y L^I Y^T)

where D^I is a diagonal matrix with D^I_ii = Σ_j w^I_ij, L^I = D^I − W^I is the Laplacian matrix of G^I, C is a constant, and H is typically a diagonal matrix for scale normalization; it may also be the Laplacian matrix of a penalty graph G^P, i.e., H = L^P = D^P − W^P, where D^P_ii = Σ_j w^P_ij.
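As a concrete illustration of the Laplacian construction above, the following minimal NumPy sketch builds L = D − W from a small weight matrix; the 3-vertex graph and its weights are invented purely for illustration:

```python
import numpy as np

def graph_laplacian(W):
    """Graph Laplacian L = D - W for a symmetric weight matrix W."""
    D = np.diag(W.sum(axis=1))   # diagonal degree matrix, D_ii = sum_j w_ij
    return D - W

# Hypothetical 3-vertex intrinsic graph: vertices 0 and 1 are strongly similar.
W = np.array([[0.0, 1.0, 0.1],
              [1.0, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
L = graph_laplacian(W)
print(np.allclose(L.sum(axis=1), 0))  # rows of a Laplacian sum to zero: True
```

Minimizing tr(Y L Y^T) with this L pulls the embeddings of strongly weighted vertex pairs (here, vertices 0 and 1) close together.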

Hypergraph Model
The hypergraph can represent complex relations between high-dimensional data, and every hyperedge connects multiple vertices. A hypergraph G = (V_H, E_H, W_H) can be constructed to formulate the relationships among data samples, where V_H is the set of vertices, E_H is the set of hyperedges, and W_H is a diagonal matrix whose diagonal elements denote the weights of the hyperedges. Each hyperedge e_i ∈ E_H is assigned a weight w(e_i) ∈ W_H.
To represent the relationships in G, an incidence matrix H = [h(e_m, v_n)] ∈ R^(|E_H|×|V_H|) is defined as

h(e_m, v_n) = 1, if v_n ∈ e_m; 0, otherwise.

Furthermore, the degree of hyperedge e_m and the degree of vertex v_n can be defined as

δ(e_m) = Σ_n h(e_m, v_n), d(v_n) = Σ_m w(e_m) h(e_m, v_n). (4)

An example illustrating the hypergraph is given in Figure 1. A vertex is denoted as a circle (such as v_1, v_2, v_5). As shown in Figure 1a, a simple graph only holds two vertices per edge, which just describes one-to-one relationships (such as between v_1 and v_2, and between v_1 and v_3). In Figure 1b, each hyperedge is represented by a curve (for example, hyperedge e_1 is constructed from v_1, v_2, and v_3), which represents complex multiple relations among pixels. That is, a hypergraph can describe the local neighborhood structure well and preserve the complex relationships within the neighborhood. This hypergraph consists of six vertices and four hyperedges, and the corresponding incidence matrix is shown in Figure 1c. The incidence matrix intuitively represents the affinity relationships between vertices and hyperedges: a non-zero element in a row indicates that the hyperedge is associated with the vertex; otherwise, the vertex and the hyperedge are not related. However, the hypergraph model only represents complex higher-order relations between pixels in the spectral domain, and it fails to consider the spatial information in the hyperspectral image, which limits the discriminant ability of the embedding features for land cover classification.
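The incidence matrix and the two degree definitions can be sketched in a few lines of NumPy. The hyperedge memberships and weights below loosely mirror the six-vertex, four-hyperedge example of Figure 1, but the exact memberships and weight values are illustrative assumptions:

```python
import numpy as np

# Incidence matrix H (hyperedges x vertices); entries are illustrative.
H = np.array([[1, 1, 1, 0, 0, 0],   # e1 = {v1, v2, v3}
              [0, 0, 1, 1, 0, 0],   # e2 = {v3, v4}
              [0, 0, 0, 1, 1, 1],   # e3 = {v4, v5, v6}
              [1, 0, 0, 0, 0, 1]])  # e4 = {v1, v6}
w = np.array([1.0, 0.5, 1.0, 2.0])  # hyperedge weights w(e_m)

# Degree of each hyperedge: number of vertices it contains.
delta_e = H.sum(axis=1)
# Degree of each vertex: sum of the weights of its incident hyperedges.
d_v = (w[:, None] * H).sum(axis=0)

print(delta_e)  # [3 2 3 2]
print(d_v)      # vertex degrees, e.g. v1 is in e1 and e4, so d(v1) = 1 + 2 = 3
```

These quantities feed directly into the hypergraph Laplacian used later, where the diagonal matrices D_e and D_v are built from δ(e_m) and d(v_n).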

SSRHE
To reveal the complex structure in HSI, we propose a new hypergraph learning method called spatial-spectral joint regularized sparse hypergraph embedding (SSRHE) for dimensionality reduction of HSI data. First, SSRHE constructs a regularized sparse intraclass hypergraph and a regularized sparse interclass hypergraph by exploiting sparse representation (SR) and prior knowledge. After that, it exploits the spatial consistency and global structure of HSI by computing a local spatial neighborhood preserving scatter and a total scatter, which benefits the combination of spatial structure and spectral information for DR. Finally, an optimal objective function is designed to learn a spatial-spectral discriminant projection by minimizing the regularized sparse intraclass hypergraph scatter and the local spatial neighborhood preserving scatter, while simultaneously maximizing the regularized sparse interclass hypergraph scatter and the total scatter of the samples. The flowchart of the proposed SSRHE method is shown in Figure 2.

The Regularized Sparse Hypergraph Model
To discover the complex structure of HSI data, a hypergraph model is exploited to reveal the intrinsic relations between pixels. However, it remains difficult to choose a proper neighborhood size for constructing hypergraphs. Since sparse representation has a natural discriminating power to adaptively reveal the inherent relationships of data [47][48][49], a sparse hypergraph model is designed based on SR theory.
Inspired by the observation that the most compact expression of a certain sample is generally given by similar samples, sparse coefficients are explored to find neighbors adaptively. Suppose S ∈ R^(N×N) is the sparse coefficient matrix of the data samples. The sparse coefficients can be calculated as follows:

min_{s_i} ||s_i||_1 s.t. ||x_i − X s_i||_2 ≤ ε

where ε denotes the sparse error tolerance and s_i represents the sparse coefficients of pixel x_i, which can be optimized using the alternating direction method of multipliers (ADMM) framework [50,51]. With the sparse coefficient matrix S, a hypergraph can be constructed with the criterion that nodes x_i and x_j are connected in a hyperedge if the sparse coefficient s_ij is not equal to 0. Since sparse coefficients reflect the similarity between data points, a non-zero coefficient indicates correlation between pixels, and a large value indicates high similarity. Compared with the Euclidean metric, sparse coefficients can more effectively select neighbors of HSI data.
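For readers who want to experiment, the sparse coefficients of a single pixel can be approximated with a simple ISTA loop on the Lagrangian form 0.5·||x − Ds||² + λ||s||₁. This is a stand-in for the ADMM solver referenced in the text, and the dictionary, λ, and iteration count below are illustrative choices:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code_ista(D, x, lam=0.05, n_iter=500):
    """Minimize 0.5*||x - D s||^2 + lam*||s||_1 by ISTA (proximal gradient),
    a simple stand-in for the ADMM solver used in the paper."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # step size from the Lipschitz constant
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ s - x)             # gradient of the quadratic term
        s = soft_threshold(s - step * grad, step * lam)
    return s

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 10))            # hypothetical dictionary (e.g., other pixels)
x = 1.5 * D[:, 2] - 0.8 * D[:, 7]            # signal built from atoms 2 and 7
s = sparse_code_ista(D, x)
print(np.sort(np.argsort(np.abs(s))[-2:]))   # indices of the two largest coefficients
```

The two dominant coefficients identify the atoms that generated the signal, which is exactly the neighbor-selection role the sparse coefficients play when constructing the hyperedges.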
According to the sparse coefficients and label information of samples, we construct a within-class sparse hypergraph G_w(X, E^w_H, W^w_H) and a between-class sparse hypergraph G_b(X, E^b_H, W^b_H), where X is the vertex set, E^w_H and E^b_H are the sets of intraclass and interclass hyperedges, and W^w_H and W^b_H are the corresponding hyperedge weight matrices. In G_w, the intraclass hyperedge e^w is formed by connecting pixel x_i with its same-class neighbors whose sparse coefficients are non-zero, and the interclass hyperedge e^b is formed analogously from non-zero-coefficient neighbors with different class labels. The sub-edge weights of e^w and e^b are defined from the sparse coefficients (Equations (8) and (9)), in which s_ij is the sparse coefficient between pixels x_i and x_j and the parameter ϕ (ϕ > 1) is used to enhance the contribution of samples from the same class for improving the discriminant power. Figure 3 illustrates the construction of a sparse hypergraph on a simple classification model. As shown in Figure 3, a simple graph considers only the pairwise relation between two observed samples, while the sparse hypergraph, through sparse representation, can select neighbors of samples adaptively and represent complex multiple relations among HSI pixels. The weight matrices corresponding to the hyperedge regions are defined accordingly.

The incidence matrix H^w of G_w is defined with heat-kernel entries:

H^w_ij = exp(−||x_i − x̄_j||^2 / t), if x_i ∈ e^w_j; 0, otherwise

where t is the heat-kernel parameter, x̄_j is the mean value of the pixels in hyperedge e^w_j, and N(e_i) denotes the number of vertices in each hyperedge region.
According to H^w and w(e^w), the degree of vertex x_i ∈ X and the degree of intraclass hyperedge e^w_i ∈ E^w are computed as

d^w(x_i) = Σ_k w(e^w_k) H^w_ik, δ(e^w_i) = Σ_j H^w_ij.

For the between-class hypergraph G_b, the between-class incidence matrix H^b is defined analogously. Based on H^b and w(e^b), the degree of vertex x_i and the degree of interclass hyperedge e^b_i can be defined as

d^b(x_i) = Σ_k w(e^b_k) H^b_ik, δ(e^b_i) = Σ_j H^b_ij.

In the low-dimensional embedding space, pixels from the same class should be as compact as possible under the intraclass sparse hypergraph, whereas pixels from different classes should be as separated as possible under the interclass sparse hypergraph. Therefore, the objective functions with the intraclass and interclass sparse hypergraph constraints are constructed as

arg min_P Σ_{i,j,k} [w(e^w_k) H^w_ik H^w_jk / δ(e^w_k)] ||P^T x_i − P^T x_j||^2 = arg min_P tr(P^T X L_w X^T P)

arg max_P Σ_{i,j,k} [w(e^b_k) H^b_ik H^b_jk / δ(e^b_k)] ||P^T x_i − P^T x_j||^2 = arg max_P tr(P^T X L_b X^T P)

where L_w = D^w_v − H^w W^w (D^w_e)^(−1) (H^w)^T is the intraclass sparse hypergraph Laplacian matrix and L_b = D^b_v − H^b W^b (D^b_e)^(−1) (H^b)^T is the interclass sparse hypergraph Laplacian matrix. Let M_b = X L_b X^T represent the between-class scatter of G_b and M_w = X L_w X^T the within-class scatter of G_w. Then, the mapping matrix P can be obtained by solving the following optimization problem:

arg max_P tr(P^T M_b P) / tr(P^T M_w P) (24)

In real applications, to avoid singularity in the case of small training sets, the above objective can be further extended as

arg max_P tr[P^T (M_b + β X X^T) P] / tr[P^T (M_w + β diag(M_w)) P] (25)

where β is a tradeoff parameter. The regularization term X X^T represents the maximal data variance, which ensures the diversity of HSI pixels; the diagonal regularization of M_w overcomes the singularity problem when the number of training samples is small. With regularization, Equation (25) becomes more stable and effectively preserves the useful discriminant information.
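A minimal numerical sketch of solving a regularized trace-ratio problem of this kind is given below; it reduces the generalized eigenproblem A p = λ B p to a standard symmetric eigenproblem via a Cholesky factorization. The specific regularization A = M_b + β·XX^T and B = M_w + β·diag(M_w) is one plausible reading of Equation (25), and the scatter matrices here are random stand-ins rather than real hypergraph scatters:

```python
import numpy as np

def trace_ratio_projection(Mb, Mw, X, beta=0.5, d=2):
    """Maximize tr(P^T A P)/tr(P^T B P) with A = Mb + beta*X X^T and
    B = Mw + beta*diag(Mw), via the generalized eigenproblem A p = lambda B p."""
    A = Mb + beta * X @ X.T
    B = Mw + beta * np.diag(np.diag(Mw))
    R = np.linalg.cholesky(B)             # B = R R^T, R lower triangular
    Rinv = np.linalg.inv(R)
    C = Rinv @ A @ Rinv.T                 # standard symmetric eigenproblem C u = lambda u
    vals, U = np.linalg.eigh(C)           # eigenvalues in ascending order
    return Rinv.T @ U[:, ::-1][:, :d]     # back-transform; d largest eigenvalues first

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 40))          # 5 "bands", 40 "pixels" (random stand-in)
Sb = rng.standard_normal((5, 5)); Mb = Sb @ Sb.T
Sw = rng.standard_normal((5, 5)); Mw = Sw @ Sw.T + np.eye(5)
P = trace_ratio_projection(Mb, Mw, X, beta=0.3, d=2)
Y = P.T @ X                               # low-dimensional embedding
print(Y.shape)  # (2, 40)
```

The Cholesky route only requires B to be positive definite, which is exactly what the diagonal regularization of M_w is meant to guarantee for small training sets.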

Spatial-Spectral Hypergraph Embedding
Due to the spatial consistency of HSI, pixels within a small neighborhood are usually spatially related and tend to share the spatial distribution of the same ground object. Therefore, neighboring pixels can be utilized to learn spatial-spectral combined features.
Suppose that pixel x_i is the center pixel and T is the spatial neighborhood window size. (u_i, v_i) is the spatial coordinate of pixel x_i in the image, and the spatial neighborhood set Ω(x_i) with window size T (a positive odd number) can be written as

Ω(x_i) = { x_im : (u_m, v_m), |u_m − u_i| ≤ z, |v_m − v_i| ≤ z }

where z = (T − 1)/2 and x_im : (u_m, v_m) corresponds to the m-th pixel in the spatial neighborhood. Ω(x_i) contains a total of T × T pixels. Thus, the distance measure in the spatial neighborhood block defines a weight w_im that measures the similarity between the central pixel x_i and its spatial-spectral neighbors. For all training samples in the HSI data, the local spatial neighborhood preserving scatter matrix is calculated as

S_w = Σ_i Σ_{x_im ∈ Ω(x_i)} w_im (x_i − x_im)(x_i − x_im)^T.

To preserve the global structure of the samples, a total scatter matrix is defined as

S_t = Σ_i (x_i − x̄)(x_i − x̄)^T

where x̄ is the mean of all samples.
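The local spatial neighborhood preserving scatter can be sketched as follows. The heat-kernel weight w_im = exp(−||x_i − x_im||²/t) is an assumed form of the similarity measure, and the small random cube stands in for real HSI data:

```python
import numpy as np

def local_spatial_scatter(cube, T=3, t=1.0):
    """Local spatial neighborhood preserving scatter (illustrative sketch):
    for each interior pixel, accumulate heat-kernel-weighted outer products
    of the spectral difference to every pixel in its T x T window.
    cube: (rows, cols, bands) hyperspectral image."""
    rows, cols, D = cube.shape
    z = (T - 1) // 2
    S = np.zeros((D, D))
    for u in range(z, rows - z):
        for v in range(z, cols - z):
            xi = cube[u, v]
            for du in range(-z, z + 1):
                for dv in range(-z, z + 1):
                    if du == 0 and dv == 0:
                        continue
                    diff = xi - cube[u + du, v + dv]
                    w = np.exp(-np.dot(diff, diff) / t)  # heat-kernel weight (assumed form)
                    S += w * np.outer(diff, diff)
    return S

cube = np.random.default_rng(2).standard_normal((6, 6, 4))
S = local_spatial_scatter(cube, T=3)
print(S.shape, np.allclose(S, S.T))  # (4, 4) True
```

Because S is a non-negatively weighted sum of outer products, it is symmetric positive semi-definite by construction, which keeps the denominator of the later trace-ratio objective well behaved.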
To extract low-dimensional spatial-spectral joint features, an objective function should be designed that preserves the local spatial neighborhood while simultaneously compacting the samples connected by the intraclass hypergraph and separating the samples connected by the interclass hypergraph. Therefore, Equations (25), (28), and (29) are combined into a single optimization function, which can be further simplified into the trace-ratio form of Equation (31): its numerator augments the interclass hypergraph scatter with the total scatter, and its denominator augments the intraclass hypergraph scatter with the local spatial neighborhood preserving scatter, with the tradeoff parameter α balancing the spectral and spatial terms. According to the Lagrange multiplier method, Equation (31) can be solved as a generalized eigenvalue problem (Equation (32)), where λ denotes an eigenvalue. With the eigenvectors corresponding to the d largest eigenvalues, the optimal projection matrix is P = [p_1, p_2, ..., p_d]. In the low-dimensional embedded space, the spatial-spectral embedding of a test sample x_t is given by y_t = P^T x_t. In summary, SSRHE compacts the samples from the same class and spatial neighborhood while separating interclass samples, so the embedding features possess stronger discriminative power, which ensures good classification performance. The steps of the proposed SSRHE method are shown in Algorithm 1.

Algorithm 1 SSRHE
Input: HSI dataset X with corresponding class label set L, tradeoff parameters α and β, weighted coefficient ϕ, spatial neighborhood size T, reduced dimensionality d.
1: Compute the sparse coefficient matrix S by ADMM;
2: Compute the weights of the intraclass and interclass hyperedges by Equations (8) and (9);
3: Compute the intraclass hyper-Laplacian matrix L_w = D^w_v − H^w W^w (D^w_e)^(−1) (H^w)^T;
4: Compute the interclass hyper-Laplacian matrix L_b = D^b_v − H^b W^b (D^b_e)^(−1) (H^b)^T;
5: Compute the local spatial neighborhood preserving scatter matrix;
6: Compute the extended total scatter matrix;
7: Solve the generalized eigenvalue problem of Equation (32);
8: Obtain the optimal projection matrix P = [p_1, p_2, ..., p_d] from the eigenvectors corresponding to the d largest eigenvalues.
Output: Low-dimensional spatial-spectral embedding features of test samples: Y = P^T X.

Experimental Results and Discussion
Some experiments were performed on two real HSI datasets to verify the effectiveness of the proposed method, and several state-of-the-art DR methods were compared with SSRHE.

Experimental Setup
In the experiments, each HSI dataset was randomly divided into training and test sets. The training samples were utilized to learn a feature extraction model and obtain low-dimensional embedding features. After that, a classifier was applied to obtain the class labels of the test samples. The overall classification accuracy (OA), the average classification accuracy of each class (AA), and the kappa coefficient (KC) were employed to evaluate the classification results.
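The three evaluation metrics can be computed from a confusion matrix in a few lines; the toy label vectors below are purely illustrative:

```python
import numpy as np

def classification_scores(y_true, y_pred, n_classes):
    """Overall accuracy (OA), average per-class accuracy (AA),
    and the kappa coefficient (KC) from predicted labels."""
    C = np.zeros((n_classes, n_classes))         # confusion matrix: rows=true, cols=pred
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    n = C.sum()
    oa = np.trace(C) / n                         # fraction of correctly labeled pixels
    aa = np.mean(np.diag(C) / C.sum(axis=1))     # mean of per-class recalls
    pe = np.sum(C.sum(axis=0) * C.sum(axis=1)) / n ** 2  # chance agreement
    kc = (oa - pe) / (1 - pe)                    # Cohen's kappa
    return oa, aa, kc

y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2, 2, 0])
oa, aa, kc = classification_scores(y_true, y_pred, 3)
print(round(oa, 3), round(aa, 3))  # 0.75 0.778
```

Note that OA weights classes by their sample counts, while AA treats every class equally, which is why both are reported on imbalanced scenes such as Indian Pines.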
To demonstrate the effectiveness of the proposed method, we compared SSRHE with four spectral-based approaches, LPP, LDA, MFA, and RLDE; two spatial-spectral combined approaches, LPNPE and SSCE; and a hypergraph method, DHLP. For all approaches, we tuned the parameters by cross-validation to achieve good results. In LPNPE, the size of the spatial-spectral window was set to 15 for the PaviaU data and 11 for the Indian Pines data. In SSCE, the window scale and the local neighbor size were both set to 5 on both datasets. For RLDE and MFA, the intraclass neighbor size was set to 3 and 8, while the interclass neighbor size was set to 5 and 60, respectively. The number of nearest neighbors in DHLP was set to 9.
The nearest neighbor (1-NN) classifier was applied for classification. Each experiment was randomly repeated ten times. All experiments were performed on a personal computer with an i5-4590 central processing unit, 8 GB of memory, and 64-bit Windows 7, using MATLAB 2014a.

Parameters Selection
The proposed SSRHE algorithm has four important parameters: the spatial neighborhood size T, the weighted coefficient ϕ, and the tradeoff parameters α and β. To select the optimal parameters, we performed parameter selection experiments on the Indian Pines and PaviaU datasets. In each experiment, we randomly selected five samples from each class as the training set, and the remaining samples were used for testing.
To explore the effect of the tradeoff parameters α and β, we tuned them over the set {0, 0.01, 0.05, 0.1, 0.2, ..., 0.9, 1}. Figures 6 and 7 show the classification accuracies with respect to α and β on the two datasets. As shown in Figure 6, increasing α produced only subtle changes in OA under a fixed β. On the Indian Pines dataset, the OAs first varied slightly and then declined significantly as α increased, since a too-large α causes the algorithm to lose the spatial-domain properties of HSI, so the obtained features may fail to represent the intrinsic structure of the hyperspectral image. For the PaviaU dataset, the OAs first increased with increasing α and β, and then tended to decline. To balance the influence of spectral and spatial information for classification, we set α = 0.3 and β = 0.7 for the Indian Pines dataset, and α = 0.2 and β = 0.5 for the PaviaU dataset, based on the results presented in Figures 6 and 7.

The parameters T and ϕ have a significant influence on the discrimination of SSRHE. The former determines the size of the spatial neighborhood, and the latter adjusts the compactness of the intraclass neighbors. To analyze the influence of these two parameters on SSRHE, we performed two experiments with respect to T and ϕ; the corresponding experimental results on the two HSI datasets are shown in Figure 8. In Figure 8a, it is clear that the OAs first increased quickly with the growth of T on both datasets, because a larger spatial neighborhood is beneficial for preserving more useful spatial-domain information in HSI. However, if T is too large, pixels from other classes may be brought into the training model, and the computational complexity greatly increases. According to the results in Figure 8a, the spatial neighborhood size T was set to 7 for the Indian Pines dataset and 15 for the PaviaU dataset. As shown in Figure 8b, with increasing ϕ, the OAs improved until reaching a peak value, owing to the enhanced compactness of the intraclass samples in the embedding space. As ϕ kept increasing, the OAs declined, since a too-large ϕ weakens the ability to extract between-class features. Therefore, we set ϕ = 50 for the Indian Pines dataset and ϕ = 60 for the PaviaU dataset.

Investigation of Embedding Dimension
To explore how the embedding dimension d affects the performance of the proposed SSRHE method, thirty samples were randomly selected from each class for training, and d was tuned from 5 to 50 with an interval of 5. Figure 9 shows the classification accuracy of SSRHE and the other DR methods with different embedding dimensions on the Indian Pines and PaviaU datasets. According to Figure 9, the OAs first rose rapidly and then tended to stabilize with increasing d, because higher-dimensional embedding features contain more useful information for training, while a too-large dimension may lead to information saturation. The classification accuracies of most DR methods were better than those of the RAW method, since these methods remove the redundant information in HSI. The proposed SSRHE achieved the best classification performance in comparison with the other methods on both HSI datasets, because SSRHE discovers the complex relationships between samples with the hypergraph model and fuses useful spatial information to extract discriminant features for classification.

Investigation of Classification Performance
To evaluate the performance of the proposed method under different training conditions, we selected n_i (n_i = 5, 20, 50, 100, 200) samples per class for training, and the 1-NN classifier was utilized to classify the remaining samples. Each experiment was repeated ten times; the averaged OA with standard deviations (STDs) and the kappa coefficient (KC) of all DR methods on the Indian Pines and PaviaU datasets are shown in Tables 1 and 2. As shown in Tables 1 and 2, for all methods, the OA and KC greatly improved and the STDs decreased as the number of training samples increased, because a large number of training samples usually provides more useful information for training. The spatial-spectral combined methods, SSCE, LPNPE, and SSRHE, produced better classification results than the spectral-based methods, because they exploit both the spatial and spectral information in HSI to improve the representation ability of the extracted features. Among the spectral-based methods, DHLP presented better classification performance under most training conditions, since it applies the hypergraph learning model to discover the intrinsic complex relationships between samples. Compared with all the other methods, SSRHE showed better discriminant performance with different proportions of training samples on both HSI datasets, especially with a small training set. SSRHE utilizes the hypergraph framework to discover the complex multivariate relationships between interclass and intraclass samples, and computes two spatial neighborhood scatters to reveal the spatial correlation between pixels in HSI, which further enhances the discriminating power of the low-dimensional features.
To explore the classification accuracy of the proposed method for each class, we randomly selected a percentage of samples from each class as the training set, while the remaining samples were used for testing. In the experiments, we set the training percentage to 3% for the Indian Pines dataset and 5% for the PaviaU dataset. Tables 3 and 4 show the classification results of different DR methods on the Indian Pines and PaviaU datasets, and the corresponding classification maps are displayed in Figures 10 and 11. According to Tables 3 and 4, SSRHE obtained better classification results than the other methods in most classes, and it achieved the best OA, AA, and KC on both HSI datasets. This indicates that SSRHE possesses a stronger ability to reveal the intrinsic complex geometric relations between samples and extract more useful spatial-spectral combined features for classification. As shown in Figures 10 and 11, the spatial-spectral methods generally produced smoother classification maps than the traditional spectral-based methods, which demonstrates that the spatial information in HSI is effective for improving classification performance. Moreover, the classification maps of SSRHE possess fewer misclassified pixels, especially in the Grass/Pasture, Wheat, and Stone-steel towers areas of the Indian Pines dataset and the Asphalt, Meadows, and Shadows areas of the PaviaU dataset. In terms of running time, SSRHE costs more time than the traditional spectral-based methods because it needs to build models in both the spatial and spectral domains of HSI data. Compared with the other spectral-spatial algorithms, the running time of SSRHE did not increase significantly and was far less than that of SSCE. Thus, SSRHE was more effective than the other DR methods at weakening the influence of noisy points and developing the discriminative power of the embedding features.

Conclusions
In this paper, a new dimensionality reduction method, SSRHE, is proposed based on sparse hypergraph learning and spatial-spectral information for HSI. SSRHE computes two spatial neighborhood scatters to reveal the spatial correlation between pixels in HSI, and it also constructs two discriminant hypergraphs by sparse representation to discover the complex multivariate relationships between interclass and intraclass samples, respectively. Based on the spatial neighborhood scatters and the Laplacian scatters of the hypergraphs, a spatial-spectral combined objective function is designed to obtain an optimal projection matrix mapping the original HSI data to a low-dimensional embedding space in which the intrinsic spatial and spectral properties are well preserved. Experiments were conducted on the Indian Pines and PaviaU datasets to demonstrate that the proposed method achieves better performance than some existing state-of-the-art DR methods. In the future, we will optimize the proposed algorithm to reduce its running time and extend it to related fields such as multi-spectral images and very-high-resolution images; the sparse hypergraph model in this paper can also be used in application domains with high-dimensional data such as face images, gene expression, and radiomics features. Furthermore, it would be interesting to study the integration of hypergraph learning and heterogeneous information networks when exploring high-dimensional data with multiple adjacency relationships.

Figure 3. A simple two-class classification model to explain the construction of the sparse hypergraph.



The computational complexity of SSRHE can be analyzed using big-O notation, with the spatial neighborhood size and the number of sparse iterations denoted as T and t, respectively. The sparse coefficient matrix S is computed at a cost of O(Nt). Computing the weights of the intraclass and interclass hyperedges each takes O(N^2). The intraclass incidence matrix H^w and the interclass incidence matrix H^b are each calculated in O(N^2). The diagonal matrices W^w, W^b, D^w_v, D^b_v, D^w_e, and D^b_e together cost O(N). The intraclass and interclass hyper-Laplacian matrices L^w and L^b each cost O(N^3). The local spatial neighborhood preserving scatter matrix takes O(DNT^2), and the extended total scatter matrix costs O(DN). The matrices M_w and M_b each cost O(DN^2), and XX^T costs O(DN). Solving the generalized eigenvalue problem of Equation (32) takes O(D^3). The total time complexity of SSRHE is O(D^3 + DN^2 + N^3).

HSI Datasets

Indian Pines Dataset: This hyperspectral image is a scene of northwest Indiana collected by an airborne imaging spectrometer sensor in 1992. It consists of 145 × 145 pixels and 220 spectral bands covering wavelengths from 400 to 2450 nm. After removing the noise and water absorption bands, 200 spectral bands remained for use in the experiments. The image contains 16 land cover types in total. The dataset in false color and its corresponding ground truth map are shown in Figure 4.

Figure 4. Indian Pines hyperspectral image: (a) HSI in false color; (b) ground truth; and (c) spectral curves. (Note that the number of samples for each class is shown in brackets.)

University of Pavia (PaviaU) Dataset: This hyperspectral image, covering the area of Pavia University in northern Italy, was collected by the ROSIS sensor in 2002. The spatial size of the image is 640 × 340 pixels, and the number of spectral bands is 115. Owing to atmospheric effects, 12 bands were discarded and the remaining 103 spectral bands were used in the experiments. The false color image and ground truth map for this scene are shown in Figure 5.

Figure 5. PaviaU hyperspectral image: (a) HSI in false color; (b) ground truth; and (c) spectral curves. (Note that the number of samples for each class is shown in brackets.)

Figure 6. The OAs of SSRHE with different α and β on the Indian Pines dataset.

Figure 7. The OAs of SSRHE with different α and β on the PaviaU dataset.

Figure 8. OAs with respect to parameters T and ϕ on different datasets: (a) the OAs of T; and (b) the OAs of ϕ.

Figure 10. Classification maps of different DR methods with 1-NN on the Indian Pines dataset. (Note that the time cost of each DR algorithm is marked in brackets.)

Figure 11. Classification maps of different DR methods with 1-NN on the PaviaU dataset. (Note that the time cost of each DR algorithm is marked in brackets.)
Note that the best results in each row are marked in bold.

Table 3. Classification results of different DR methods on the Indian Pines dataset.

Table 4. Classification results of different DR methods on the PaviaU dataset.