Superpixel Nonlocal Weighting Joint Sparse Representation for Hyperspectral Image Classification

Joint sparse representation classification (JSRC) is a representative spectral–spatial classifier for hyperspectral images (HSIs). However, JSRC is ill-suited to highly heterogeneous areas because the spatial information is extracted from a fixed-sized neighborhood block, which is often unable to conform to the naturally irregular structure of land cover. To address this problem, a superpixel-based JSRC with nonlocal weighting, i.e., superpixel-based nonlocal weighted JSRC (SNLW-JSRC), is proposed in this paper. In SNLW-JSRC, the superpixel representation of an HSI is first constructed based on an entropy rate segmentation method. This strategy forms homogeneous neighborhoods with naturally irregular structures and alleviates the inclusion of pixels from different classes in the process of spatial information extraction. Afterwards, the superpixel-based nonlocal weighting (SNLW) scheme is built to weigh the superpixel based on its structural and spectral information. In this way, the weight of one specific neighboring pixel is determined by the local structural similarity between the neighboring pixel and the central test pixel. Then, the obtained local weights are used to generate the weighted mean data for each superpixel. Finally, JSRC is used to produce the superpixel-level classification. This speeds up the sparse representation and makes the spatial content more centralized and compact. To verify the proposed SNLW-JSRC method, we conducted experiments on four benchmark hyperspectral datasets, namely Indian Pines, Pavia University, Salinas, and DFC2013. The experimental results suggest that SNLW-JSRC can achieve better classification results than the other four SRC-based algorithms and the classical support vector machine algorithm. Moreover, SNLW-JSRC can also outperform the other SRC-based algorithms, even with a small number of training samples.


Introduction
Hyperspectral imaging collects the spectral response of the Earth's surface from the visible to the infrared spectrum with a high spectral resolution, which enables the dis-
As the spatial information of fixed-sized blocks is degraded by heterogeneous and noisy pixels, some methods increase the contribution of the central pixel whilst decreasing the influence of noisy pixels in a block. For instance, Tu et al. [46] used correlation coefficients between the central pixel and samples to enhance classification decisions. A weighted joint nearest-neighbor method has also been applied to improve the reliability of the classification performance [47]. These methods, however, are highly dependent on the training samples. Additionally, neighborhood weighting strategies have been used to suppress heterogeneous pixels within the fixed-sized block. For example, Qiao et al. [48] proposed a weighting scheme based on spectral similarity, where the weights rely on the implicit assumption that the center pixels of blocks are noise-free, which is hardly satisfied. Zhang et al. [49] proposed a nonlocal weighting scheme (NLW) based on the local self-similarity of images. NLW can preserve pixels with local self-similarity in a smooth region.
In summary, neither the adaptive neighborhood nor weighting-based methods can fully solve the aforementioned two drawbacks in JSRC, i.e., the structure diversity of land cover and noisy pixels. To this end, in this paper, we propose a superpixel-based nonlocal weighted JSRC (SNLW-JSRC) for HSI classification. By combining nonlocal weighting and the adaptive neighborhood together, the two drawbacks faced by JSRC can be solved simultaneously. Specifically, the superpixel-based weighting scheme (SNLW) is conducted to select pixels within superpixels according to their associated structural and spectral similarity measurements.
The main objectives of this paper can be summarized as follows: (1) to simultaneously and adaptively extract land cover structures while removing the effects of noise and outliers; (2) to fully explore the advantages of the superpixel and nonlocal weighting scheme for spectral-spatial feature extraction in HSI; (3) to outperform several classical SRC approaches and achieve improved HSI classification results.
The remainder of this paper is organized as follows. Section 2 introduces the traditional SRC and nonlocal weighted SRC. In Section 3, the detailed introduction of the proposed SNLW-JSRC is presented. The experimental results and analysis are given in Section 4. Finally, Section 5 provides some concluding remarks.

Nonlocal Weighted Sparse Representation for HSI Classification
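For an HSI, pixels from the same category lie in a low-dimensional subspace; thus, these pixels can be represented linearly by a small number of pixels from the same class [35]. This forms the theoretical basis for SR classification (SRC) of HSI. Denote a pixel of HSI as a vector $\mathbf{x} \in \mathbb{R}^{B}$, where $B$ is the number of spectral bands, and let the pixels belong to $C$ classes. We select $N_i$ training samples from the $i$-th class to form an overcomplete dictionary $\mathbf{D}_i \in \mathbb{R}^{B \times N_i}$, and a pixel $\mathbf{x}$ of the $i$-th class can be reconstructed by [35]:
$$\mathbf{x} = \mathbf{D}_i \mathbf{a}_i \tag{1}$$
where $\mathbf{a}_i$ is the coefficient vector of $\mathbf{x}$ with respect to $\mathbf{D}_i$. As the class of $\mathbf{x}$ is unknown before classification, we need to build a dictionary $\mathbf{D}$ that contains all the classes, i.e., $\mathbf{D} = [\mathbf{D}_1, \mathbf{D}_2, \ldots, \mathbf{D}_C]$. Accordingly, $\mathbf{x}$ can be reconstructed by [35]:
$$\mathbf{x} = \mathbf{D}\mathbf{a} = \sum_{i=1}^{C} \mathbf{D}_i \mathbf{a}_i \tag{2}$$
where the sparse coefficient vector $\mathbf{a}$ is obtained by solving
$$\hat{\mathbf{a}} = \arg\min_{\mathbf{a}} \|\mathbf{x} - \mathbf{D}\mathbf{a}\|_2 \quad \text{s.t.} \quad \|\mathbf{a}\|_0 \le K_0 \tag{3}$$
where $\|\cdot\|_0$ represents the number of non-zero elements of $\mathbf{a}$ and $K_0$ is the sparsity level. This is an NP-hard problem and can be solved approximately by orthogonal matching pursuit (OMP) [50]. After determining $\hat{\mathbf{a}}$, the class of $\mathbf{x}$ can be determined as follows [35]:
$$\text{class}(\mathbf{x}) = \arg\min_{i=1,\ldots,C} \|\mathbf{x} - \mathbf{D}_i \hat{\mathbf{a}}_i\|_2 \tag{4}$$
Since SRC is based on the spectral characteristics of a single pixel, the spatial information of the pixel is ignored. As a result, it may lead to limited accuracy or sensitivity to noise [51]. To tackle this problem, joint sparse representation classification (JSRC), which considers the spatial information of the pixel, has been used to incorporate spectral-spatial information [35]. For a pixel $\mathbf{x}$, its spatial neighborhood is denoted as $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_K]$, where $K$ denotes the number of pixels in $\mathbf{X}$. The joint sparse representation of $\mathbf{X}$ can be derived as [47]:
$$\mathbf{X} = \mathbf{D}\mathbf{A} = \sum_{i=1}^{C} \mathbf{D}_i \mathbf{A}_i \tag{5}$$
where $\mathbf{A}$ represents the sparse coefficient of $\mathbf{X}$ with respect to $\mathbf{D}$, and $\mathbf{A}_i$ denotes the sparse coefficient of $\mathbf{X}$ with respect to $\mathbf{D}_i$. Specifically, each column in $\mathbf{A}$ shares the same sparse elements; hence, the spatial information of the land cover can be jointly utilized. In order to derive a solution of $\mathbf{A}$, we need to solve the following objective function [47]:
$$\hat{\mathbf{A}} = \arg\min_{\mathbf{A}} \|\mathbf{X} - \mathbf{D}\mathbf{A}\|_F \quad \text{s.t.} \quad \|\mathbf{A}\|_{\text{row},0} \le K_0 \tag{6}$$
where $\|\cdot\|_{\text{row},0}$ represents the number of non-zero rows.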
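The sparse recovery step and the minimum-residual decision rule described above can be illustrated with a minimal NumPy sketch. The function names `omp` and `src_classify`, the `atom_labels` array, and the toy dictionary are assumptions made for this example and are not taken from [35] or [50]; the default sparsity of 3 mirrors the sparse level used later in the experiments.

```python
import numpy as np

def omp(D, x, sparsity):
    """Greedy OMP: approximate argmin ||x - D a||_2 with at most `sparsity` non-zeros."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = -np.inf  # never select an atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    a = np.zeros(D.shape[1])
    a[support] = coef
    return a

def src_classify(D, atom_labels, x, sparsity=3):
    """Assign x to the class whose atoms give the smallest reconstruction residual."""
    a = omp(D, x, sparsity)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(x - D[:, atom_labels == c] @ a[atom_labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]

# Toy usage: four atoms (columns), two per class (hypothetical data).
D_toy = np.eye(4)
labels_toy = np.array([0, 0, 1, 1])
predicted = src_classify(D_toy, labels_toy, np.array([1.0, 0.5, 0.0, 0.0]))
```

Only the coefficients that belong to class $i$ are used when computing the $i$-th residual, which is what makes the sparse code discriminative.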
Similarly, the optimization of Equation (6) is an NP-hard problem, which can be approximated by a variant of OMP called simultaneous OMP (SOMP) [52]. After obtaining $\hat{\mathbf{A}}$, the class of $\mathbf{x}$ can be determined by [47]:
$$\text{class}(\mathbf{x}) = \arg\min_{i=1,\ldots,C} \|\mathbf{X} - \mathbf{D}_i \hat{\mathbf{A}}_i\|_F \tag{7}$$
However, the spectral-spatial information extracted by JSRC is easily affected by heterogeneous pixels in the defined neighborhood region $\mathbf{X}$. In [49], a nonlocal weighting scheme (NLW) is developed to solve this problem. For a given test sample $\mathbf{x}$, a fixed-sized block $\mathbf{X}$ is obtained, centered on $\mathbf{x}$. The weight $\omega_i$ of a neighboring pixel $\mathbf{y}_i$ within $\mathbf{X}$ is determined by Equation (8) [49], where $J(\cdot)$ is a joint neighborhood definition function, and $J(\mathbf{x})$ and $J(\mathbf{y}_i)$ refer to the $\mathbf{x}$-centric and $\mathbf{y}_i$-centric HSI neighborhood blocks, respectively. The difference between $J(\mathbf{x})$ and $J(\mathbf{y}_i)$ represents the spectral-spatial difference between the two blocks, $T$ is the number of neighboring pixels, and $f(\cdot)$ denotes a Tukey weight function [49] used to weigh the spectral-spatial differences.
With the determined weights, a weighted region $\mathbf{X}_W$ centered on $\mathbf{x}$ can be obtained as below [49]:
$$\mathbf{X}_W = [\omega_1 \mathbf{y}_1, \omega_2 \mathbf{y}_2, \ldots, \omega_T \mathbf{y}_T] \tag{9}$$
where $\boldsymbol{\omega}_X = [\omega_1, \omega_2, \ldots, \omega_T]$ is a vector of the weights for the neighboring pixels in $\mathbf{X}$. Finally, JSRC is performed on $\mathbf{X}_W$, using Equations (6) and (7) to obtain the label of $\mathbf{x}$. However, the weighted results of NLW cannot completely suppress the effects of noise and heterogeneous pixels, especially at the edges of land cover. Therefore, this paper proposes the SNLW scheme, which is introduced in the following section.
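The joint model can be sketched in the same style: SOMP selects atoms shared by all neighborhood pixels, and the label follows from the class-wise Frobenius residual. This is a minimal sketch, not the implementation of [47] or [52]; scoring atoms by the ℓ2 norm of their correlations across columns is one common choice, and the names `somp`, `jsrc_classify`, and `weight_columns` are assumptions for illustration.

```python
import numpy as np

def somp(D, X, sparsity):
    """Simultaneous OMP: all columns of X share one support of at most `sparsity` atoms."""
    residual = X.astype(float).copy()
    support = []
    coef = np.zeros((0, X.shape[1]))
    for _ in range(sparsity):
        # Score each atom by its aggregate correlation with every residual column.
        scores = np.linalg.norm(D.T @ residual, axis=1)
        if support:
            scores[support] = -np.inf
        support.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(D[:, support], X, rcond=None)
        residual = X - D[:, support] @ coef
    A = np.zeros((D.shape[1], X.shape[1]))
    A[support, :] = coef
    return A

def jsrc_classify(D, atom_labels, X, sparsity=3):
    """Label the neighborhood X by the class with the smallest Frobenius residual."""
    A = somp(D, X, sparsity)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(X - D[:, atom_labels == c] @ A[atom_labels == c, :])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]

def weight_columns(X, weights):
    """Scale each neighborhood pixel (column of X) by its weight before JSRC."""
    return X * np.asarray(weights, dtype=float)[np.newaxis, :]
```

A weighted neighborhood would then be classified with `jsrc_classify(D, atom_labels, weight_columns(X, weights))`, so that down-weighted pixels contribute less to the shared support.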

Motivation
In addition to the NLW-based scheme, superpixel-based JSRC is another alternative for improving the accuracy. Figure 1 shows the different neighborhoods in JSRC; both existing neighborhood types have their own limitations. Figure 1A shows the superpixel neighborhood. A superpixel X gives a good boundary partition of the building. However, it still contains noisy pixels and outliers. For example, the two marked red points, which respectively represent red and black targets, are quite different from the building. Figure 1B shows the NLW-based weighting scheme, where X is the defined neighborhood block for the central test pixel x, and two further pixels within X are marked. The red boxes denote the local structures of the three pixels, whose weights are calculated using Equation (8). Visually, the first neighboring pixel and the test pixel have similar local structures (red boxes), so this neighbor will be assigned a large weight with respect to the test pixel. This is clearly unreasonable since the neighbor itself belongs to a different class from the test pixel. In addition, although the second neighboring pixel is of the same class as the test pixel, its weight will be small because their local structures are quite different, as shown in Figure 1B. The NLW neighborhood therefore needs further improvement.
Figure 1C shows the proposed superpixel-based nonlocal weighting (SNLW) scheme. The superpixel is the neighborhood X of the test pixel x, where X includes the same-class neighboring pixel but not the different-class one. The red boxes again illustrate the local structures of the three pixels. In the SNLW scheme, to eliminate the inclusion of pixels from different classes, the local regions are further refined to the overlap between X and the red boxes, as illustrated in Figure 1C. The neighborhood is thus defined as the overlapping region filled with blue dashed lines in the close-up view. Accordingly, the weights of pixels are calculated on these overlapping regions, which prevents the effects of external pixels. As a result, the same-class neighbor will be assigned a large weight with respect to the test pixel, and the different-class pixel is naturally excluded. This illustrates how the proposed SNLW-JSRC makes more effective use of spectral-spatial information for improved HSI classification. The block diagram of the proposed approach is given in Figure 2; it is composed of three main steps, i.e., the generation of superpixels, superpixel-based NLW, and JSRC for weighted mean superpixels. Details of these are presented in the next three subsections.

Generation of Superpixels
Superpixels can be formed by segmentation methods [42,43] for a single-band image. In the case of HSI, the conventional segmentation methods are not directly applicable since HSIs are three-dimensional tensor data. Therefore, it is generally necessary to perform dimensionality reduction first. The commonly used dimensionality reduction methods include principal component analysis (PCA) [53], two-dimensional singular spectrum analysis (2D-SSA) [15], etc.; PCA is used in this paper for its efficiency. After applying PCA on the HSI, the first PC is extracted, followed by entropy rate segmentation (ERS) [42] to segment the image. The first PC is treated as a base map $G$, and the ERS method divides $G$ into $L$ closely connected pixel groups, namely superpixels. ERS first constructs an edge set $E$ of $G$, which encodes the similarity between pairwise pixels. An edge subset $A \subseteq E$ is selected to construct the entropy rate term $H(A)$ and the balancing term $B(A)$. Finally, superpixel segmentation is obtained by solving the objective function below [42]:
$$\max_{A} \; H(A) + \lambda B(A) \quad \text{s.t.} \quad A \subseteq E \tag{10}$$
where $\lambda \ge 0$ is a parameter to balance the contributions between the entropy rate term and the balancing term.
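The dimensionality-reduction step that produces the base map can be sketched as follows. This is a minimal NumPy illustration that computes the first principal component via an SVD of the mean-centered pixels; the function name `first_principal_component` and the `(rows, cols, bands)` layout are assumptions. The ERS segmentation itself is not implemented here and would be applied to the returned map.

```python
import numpy as np

def first_principal_component(cube):
    """Project an HSI cube of shape (rows, cols, bands) onto its first principal axis."""
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)
    centered = pixels - pixels.mean(axis=0)
    # The right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    return scores.reshape(rows, cols)

# Toy usage on random data standing in for an HSI cube (hypothetical sizes).
rng = np.random.default_rng(0)
base_map = first_principal_component(rng.normal(size=(6, 5, 4)))
```

The sign of the first component is arbitrary, which does not matter for a segmentation that depends only on pairwise pixel similarity.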

Superpixel-Based Nonlocal Weighting Scheme (SNLW)
After deriving the superpixel map, the weighting process is implemented as follows. Figure 3 shows three local structures (a-c) in superpixels. To identify the similarity between local structures, for example those shown in Figure 3, the local structures in a (the green part) and b (the blue part) need to be calculated first. We measure the spectral and structural information to jointly determine the similarity. However, the local structures a and b are unequal in size, which complicates the comparison. As seen in Figure 3C, our solution is to calculate the overlapping positions (the yellow part) of the two local structures (a,b). Spectral information is obtained from the mean vector of the local structures (a,b). Specifically, with a given scale $s$, the local structure $L(\mathbf{x})$ of the test pixel $\mathbf{x}$ is extracted; for another pixel $\mathbf{y}$ in the superpixel, the local structure is $L(\mathbf{y})$, and the overlapping positions of $L(\mathbf{x})$ and $L(\mathbf{y})$ are identified. By evaluating the difference between the local structures, the weight of $\mathbf{y}$ is decided by Equation (11), where $\lambda$ is a weight item, and the differences between $\mathbf{x}$ and $\mathbf{y}$ are calculated as in [49] (Equations (12) and (13)). In Equation (11), $f(\cdot)$ represents the weighting function; after the differences between pixels are calculated, the weights are defined by Equation (14). Equation (14) is a monotonically decreasing function within [0,1]; $\alpha$ controls the degree of compression. When $\alpha$ is relatively large, only those pixels with large differences are suppressed. $\rho$ represents the decay and is set to the maximum difference value within the superpixel, ensuring that the weight between any two pixels is the same in both directions. For a superpixel, a symmetric weight matrix is therefore obtained, as shown in Figure 2, in which each row represents the weighted result for a test pixel.
Furthermore, the weight matrix is processed as in Equation (15) to better suppress heterogeneous pixels and enhance similar pixels, where OTSU is a threshold adaptively acquired by the Otsu threshold method [54], which decides whether the corresponding pixel will be adopted or discarded.
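To make the role of $\alpha$ and $\rho$ concrete, the sketch below maps a matrix of pairwise differences within one superpixel to weights. The exact form of Equation (14) is not reproduced in this text, so the code assumes a Tukey-style decay $f(d) = (1 - (d/\rho)^{\alpha})^2$, which is one function with the stated properties (decreasing, bounded in [0,1], larger $\alpha$ suppressing only large differences). The function name `snlw_weights` and the input `diff` are likewise hypothetical.

```python
import numpy as np

def snlw_weights(diff, alpha=3.0):
    """Map pairwise differences (n x n, symmetric) to weights in [0, 1].

    ASSUMPTION: f(d) = (1 - (d / rho) ** alpha) ** 2, a Tukey-style decay;
    the source's Equation (14) may differ in its exact form.
    """
    diff = np.asarray(diff, dtype=float)
    rho = diff.max()  # decay set to the largest difference in the superpixel
    if rho == 0:
        return np.ones_like(diff)  # all pixels identical: full weight everywhere
    return (1.0 - (diff / rho) ** alpha) ** 2

# Toy usage: three pixels, the third differs strongly from the second.
diff_toy = np.array([[0.0, 1.0, 2.0],
                     [1.0, 0.0, 4.0],
                     [2.0, 4.0, 0.0]])
W_toy = snlw_weights(diff_toy, alpha=3.0)
```

Because a single $\rho$ is shared by the whole superpixel, a symmetric difference matrix yields a symmetric weight matrix, with weight 1 on the diagonal and weight 0 for the most dissimilar pair.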
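Otsu's method chooses the threshold that maximizes the between-class variance of a histogram. The sketch below applies it to weights in [0, 1]. The bin count, the function names, and the decision to keep surviving weights unchanged while setting the rest to zero are assumptions, since the exact form of the processing step is not reproduced here.

```python
import numpy as np

def otsu_threshold(values, bins=64):
    """Return the histogram split in [0, 1] that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2.0
    prob = hist / hist.sum()
    w0 = np.cumsum(prob)             # probability mass of the low group
    w1 = 1.0 - w0                    # probability mass of the high group
    m0 = np.cumsum(prob * centers)   # cumulative (unnormalized) mean of the low group
    mt = m0[-1]                      # global mean
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    between = np.zeros(bins)
    between[valid] = (mt * w0[valid] - m0[valid]) ** 2 / (w0[valid] * w1[valid])
    # Upper edge of the last bin assigned to the low group.
    return edges[1:][int(np.argmax(between))]

def suppress_low_weights(W):
    """ASSUMPTION: weights below the Otsu threshold are discarded (set to 0)."""
    W = np.asarray(W, dtype=float)
    t = otsu_threshold(W.ravel())
    return np.where(W >= t, W, 0.0)
```

Applied to a weight matrix with two clear groups of values, this leaves the high-weight (similar) pixels untouched and removes the low-weight (heterogeneous) ones from the subsequent averaging.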

JSRC for Weighted Mean Superpixels
In order to speed up the sparse representation and eliminate the effect of noisy pixels, we propose to centralize the information of similar pixels, i.e., to use the weighted mean, in our superpixel-based SR. For a given superpixel, a weighted mean pixel $\mathbf{x}_{wsp}^{i}$ is generated for each pixel from the corresponding row of the weight matrix. The weighted mean of the superpixel, $\mathbf{X}_{WSP}$, is the collection of the $\mathbf{x}_{wsp}^{i}$. Finally, we assume that all pixels within a superpixel are from the same class and apply JSRC for classification, using Equations (6) and (7) to obtain the label.
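The weighted-mean step can be written compactly with a matrix product. The sketch below assumes that each weighted mean pixel is the weight-normalized average of the superpixel's spectra, using row $i$ of the weight matrix for test pixel $i$; this normalization and the name `weighted_mean_superpixel` are assumptions, as the defining equation is not reproduced in this text.

```python
import numpy as np

def weighted_mean_superpixel(pixels, weights):
    """pixels: (B, n) spectra of one superpixel; weights: (n, n), row i for test pixel i.

    Returns a (B, n) matrix whose i-th column is sum_j w_ij x_j / sum_j w_ij.
    ASSUMPTION: weights are normalized per row before averaging.
    """
    pixels = np.asarray(pixels, dtype=float)
    weights = np.asarray(weights, dtype=float)
    row_sums = weights.sum(axis=1, keepdims=True)  # positive if each pixel keeps its own weight
    return pixels @ (weights / row_sums).T

# Toy usage: one band, three pixels; the third is an outlier with zero cross-weights.
X_sp = np.array([[1.0, 3.0, 100.0]])
W_sp = np.array([[1.0, 1.0, 0.0],
                 [1.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
X_wsp = weighted_mean_superpixel(X_sp, W_sp)
```

In the toy case the outlier no longer contaminates the means of the first two pixels, which is the effect the weighting is meant to achieve before JSRC is applied.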

Experimental Results and Discussion
In the experiments, the performance of the proposed SNLW-JSRC approach is evaluated using four publicly available HSI datasets: Indian Pines, Pavia University (PaviaU), Salinas, and 2013 GRSS Data Fusion Contest (DFC2013) [55]. The proposed method was benchmarked against several classical HSI classification approaches, including pixel-wise sparse representation classification (SRC) [29], joint sparse representation classification (JSRC) [35], nonlocal weighted joint sparse representation (NLW-JSRC) [49], superpixel-based joint sparse representation (SP-JSRC, the single-scale version of the method in [33]), and SVM [4]. Among these methods, SRC and SVM are typical pixel-wise classifiers; the others are spectral-spatial classifiers. The NLW-JSRC method uses the same weighting scheme as ours, yet it is based on local self-similarity, i.e., spectral-spatial information. SP-JSRC is a superpixel-level spectral-spatial classifier. The quantitative metrics used in this study include the overall accuracy (OA), the average accuracy (AA), and the Kappa coefficient (Kappa) [32].
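For reference, the three metrics can be computed from a confusion matrix as in the following sketch, which uses the standard definitions: OA is the fraction of correctly labeled samples, AA is the mean of the per-class accuracies, and Kappa corrects OA for chance agreement. The function name `classification_metrics` is an assumption for this example.

```python
def classification_metrics(y_true, y_pred):
    """Return (OA, AA, Kappa) from two equal-length label sequences."""
    classes = sorted(set(y_true) | set(y_pred))
    index = {c: k for k, c in enumerate(classes)}
    n = len(classes)
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]
    oa = correct / total
    # Average accuracy over classes that actually occur in y_true.
    per_class = [cm[k][k] / row_sums[k] for k in range(n) if row_sums[k] > 0]
    aa = sum(per_class) / len(per_class)
    # Expected agreement by chance, then Cohen's kappa.
    pe = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

# Toy usage with hypothetical labels.
oa, aa, kappa = classification_metrics([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1])
```

AA weights every class equally, so it is more sensitive than OA to errors on small classes, which matters for datasets with strongly unbalanced class sizes.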

Datasets
The Indian Pines dataset was acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor in Northwestern Indiana, USA. The spectral range is from 400 to 2450 nm. We removed 20 water absorption bands and used the remaining 200 bands for experiments. The imaged scene had 145 × 145 pixels with a 20 m spatial resolution, among which 10,249 pixels are labeled. The total number of classes in this dataset is 16.
The PaviaU dataset was acquired in Pavia University, Italy, by the Reflective Optics System Imaging Spectrometer. The spatial resolution of the dataset is 1.3 m, while the spectral range is from 430 nm to 860 nm. After removing 12 water absorption bands, we keep 103 bands from the original 115 bands for the experiment. The imaged scene has 610 × 340 pixels, among which 42,776 pixels are labeled. The number of classes is 9.
The Salinas scene dataset was also collected by the AVIRIS sensor in Salinas Valley, California, which has a continuous spectral coverage from 400 nm to 2450 nm. The spatial resolution of the dataset is 3.7 m. There are 512 × 217 pixels, among which 54,129 pixels were labeled and used for the experiment. After removing the water absorption bands, we keep the remaining 204 bands in the experiments. The number of classes is 16.
The DFC2013 dataset is a part of the outcome of the 2013 GRSS Data Fusion Contest, and it was acquired by the NSF-funded Center for Airborne Laser Mapping over the University of Houston campus and its neighboring area in the summer of 2012. This dataset has 144 bands in the 380-1050 nm spectral region. The spatial resolution of the dataset is 2.5 m. There are 349 × 1905 pixels, and 15,029 of them were labeled as training and testing pixels. The number of classes is 15.

Comparison of Classification Results
For SVM, we use the RBF kernel with fivefold cross-validation. The parameters of SRC were tuned for the best performance. For all the SRC-based methods, the sparse level was set to 3, as used in [18]. Additionally, the scale of local blocks is 5 × 5 for JSRC and 11 × 11 for NLW-JSRC. For SP-JSRC and SNLW-JSRC, the number of superpixels was chosen from the sequence 400, 500, 600, 700, 800, 900, 1000, 1100, and 1200; we chose 500 for Indian Pines, 1100 for PaviaU, 400 for Salinas, and 1000 for DFC2013. The parameter α in Equation (14) is set to 3 in this paper.
The first experiment was on the Indian Pines, where 2.5% of samples in each class were randomly selected for training, and the remaining (97.5%) for testing. The specific numbers of training and testing samples for each class are summarized in Table 1. The quantitative results for our approach and the benchmarking ones are given in Table 2 for comparison, where the best results are highlighted in bold. Note that to reduce the impact of randomness, all the experiments were repeated for 10 runs, where the averaged results are reported. Figure 4 shows the classification maps of the last run. According to the visualization results of Figure 4, the classification map of pixel-wise SRC has serious noise, while the classification results based on the spectral-spatial information classifier are obviously superior in both quantitative and qualitative terms. Although the classification result of JSRC suppresses the influence of noise, there is obvious misclassification. For NLW-JSRC, partial misclassification of JSRC is solved, but because NLW-JSRC cannot make good use of spectral-spatial information in the weighting process, the improvement is limited. In SP-JSRC, due to the use of superpixels, good boundaries of the classification map and higher accuracy were obtained, but the noise and outliers within several superpixels brought misclassifications. For the SNLW-JSRC, the quantitative result in Table 2 is the best among the comparison methods. In terms of qualitative results, the classification map is almost immune to noise, has good boundaries, and overcomes the problem of superpixel internal noise. The second experiment was conducted on the PaviaU dataset. For each class of this dataset, 50 samples were randomly selected as training samples, and the rest of the samples were taken as testing samples. The specific numbers of training and testing samples are shown in Table 3. 
The quantitative results for the comparison methods and the proposed method are tabulated in Table 4, in which the best results are in bold. As with Indian Pines, all the results were averaged over 10 runs with different training sets. The classification maps of the last run are given in Figure 5.

Table 3. Class-based numbers of training and testing samples for PaviaU.

| Class | Name | Training | Testing |
|---|---|---|---|
| 1 | Asphalt | 50 | 6881 |
| 2 | Meadows | 50 | 18,899 |
| 3 | Gravel | 50 | 2349 |
| 4 | Trees | 50 | 3314 |
| 5 | Metal sheets | 50 | 1595 |
| 6 | Bare soil | 50 | 5279 |
| 7 | Bitumen | 50 | 1580 |
| 8 | Bricks | 50 | 3932 |
| 9 | Shadows | 50 | 1107 |
| | Total | 450 | 44,936 |

As shown in Figure 5, compared to pixel-wise classifiers and block-based classifiers, superpixel-based methods achieve better noise suppression and boundary division. However, the superpixel information used by the SP-JSRC method may contain noise and outliers, thus causing misclassifications at the superpixel level. In SNLW-JSRC, these misclassifications were largely resolved by the SNLW strategy. The quantitative results listed in Table 4 also confirm the superiority of SNLW-JSRC. In addition, the advantages of SNLW-JSRC on PaviaU are more obvious than those on Indian Pines. This may be because of the higher spatial resolution of the PaviaU dataset.

For the experiment on Salinas, we randomly selected 0.25% of the samples in each category as training samples, and the rest (99.75%) were taken as testing samples. The specific numbers of training and testing samples for each class are available in Table 5. The quantitative and qualitative results for the comparison methods and the proposed method are given in Table 6 and Figure 6, respectively. In Table 6, the best results of each row are in bold. The results shown in Table 6 were also averaged over 10 runs with different training sets, and the classification map was obtained from the last run.

Table 5. Class-based numbers of training and testing samples for Salinas.

| Class | Name | Training | Testing |
|---|---|---|---|
| 1 | Weeds_1 | 6 | 2003 |
| 2 | Weeds_2 | 10 | 3716 |
| 3 | Fallow | 5 | 1971 |
| 4 | Fallow plow | 4 | 1390 |
| 5 | Fallow smooth | 7 | 2671 |
| 6 | Stubble | 10 | 3949 |
| 7 | Celery | 9 | 3570 |
| 8 | Grapes | 29 | 11,242 |
| 9 | Soil | 16 | 6187 |
| 10 | Corn | 9 | 3269 |
| 11 | Lettuce 4 wk | 3 | 1065 |
| 12 | Lettuce 5 wk | 5 | 1922 |
| 13 | Lettuce 6 wk | 3 | 913 |
| 14 | Lettuce 7 wk | 3 | 1067 |
| 15 | Vinyard untrained | 19 | 7249 |
| 16 | Vinyard trellis | 5 | 1802 |
| | Total | 143 | 53,986 |

As shown in Figure 6, all four SRC variants integrated with spatial information have less salt-and-pepper noise than the purely spectral SVM and SRC. Moreover, the misclassification of the proposed SNLW-JSRC is the lowest. This is also confirmed by the quantitative results tabulated in Table 6. In addition, although SNLW-JSRC still produced the best OA and Kappa, its advantages on the Salinas dataset are not as remarkable as on the PaviaU dataset. This comes from the simple scene and lower spatial resolution of Salinas, which make its spatial heterogeneity lower. From Table 6, we can see that the performance of SP-JSRC and SNLW-JSRC is similar. This also indicates that SNLW-JSRC is more beneficial for HSIs with higher heterogeneity.

The last experiment was conducted on the DFC2013 dataset. In this paper, a central part of the University of Houston campus containing 336 × 420 pixels belonging to 11 classes of targets is selected as the experimental area. For each class of this dataset, we selected 1% of samples as training samples, and the rest (99%) were taken as testing samples. The specific numbers of training and testing samples for each class are shown in Table 7. The quantitative and qualitative results for the comparison methods and the proposed method are given in Table 8 and Figure 7, respectively. The results shown in Table 8 were also averaged over 10 runs with different training sets, with the best results in bold. The classification map displayed in Figure 7 was obtained from the last run.
Table 7. Class-based numbers of training and testing samples for DFC2013.

| Class | Name | Training | Testing |
|---|---|---|---|
| 1 | Healthy grass | 5 | 454 |
| 2 | Stressed grass | 3 | 211 |
| 3 | Tree | 2 | 137 |
| 4 | Soil | 2 | 153 |
| 5 | Water | 1 | 6 |
| 6 | Residential | 4 | 372 |
| 7 | Commercial | 1 | 54 |
| 8 | Road | 3 | 275 |
| 9 | Parking lot 1 | 5 | 483 |
| 10 | Parking lot 2 | 1 | 8 |
| 11 | Tennis court | 3 | 247 |
| | Total | 30 | 2400 |

From Table 8, we can conclude that for the more complicated DFC2013 dataset, SNLW-JSRC performs with obvious superiority, with OA and Kappa equal to 86.83% and 0.85, respectively. Similar to the PaviaU dataset, the spatial resolution and heterogeneity of DFC2013 are higher; this also reveals that SNLW-JSRC can not only provide adaptive neighborhood information following the irregular morphological characteristics of targets but also eliminate the outliers and noise in the neighborhood. Especially for targets with confusing spectral characteristics, such as soil, residential areas, and parking lot areas, SNLW-JSRC shows better classification performance, as highlighted in Figure 7C-H by the red circles.

To further test the computational efficiency of the proposed SNLW-JSRC, we measured the running time of each experiment. These experiments were conducted on a PC with an Intel(R) Pentium(R) CPU at 2.9 GHz, 6 GB RAM, and Matlab R2017b. The CPU times (in seconds) of the compared methods are listed in Table 9.

Table 9. CPU times of compared methods.

| Methods | Indian Pines (s) | PaviaU (s) | Salinas (s) | DFC2013 (s) |
|---|---|---|---|---|
| SVM | 7 | 31 | 13 | 10 |
| SRC | 12 | 40 | 38 | 3 |
| JSRC | 44 | 75 | 65 | 23 |
| NLW-JSRC | 248 | 532 | 467 | 39 |
| SP-JSRC | 6 | 13 | 26 | 2 |
| SNLW-JSRC | 18 | 146 | 173 | 26 |

As shown, the CPU time of the first four algorithms generally increases as they make progressively heavier use of spatial neighborhood information. By contrast, the CPU time of SP-JSRC is much lower since it performs superpixel-level sparse decomposition. Compared to SP-JSRC, the proposed SNLW-JSRC adds a more time-consuming SNLW-based weighting procedure. Thus, SNLW-JSRC consumes more computing time than SP-JSRC. Nevertheless, SNLW-JSRC is clearly more efficient than NLW-JSRC. Overall, considering both its superior classification performance and its efficiency, the proposed SNLW-JSRC is a preferable algorithm. Moreover, mixed programming with C and Matlab, as well as the use of a GPU, could further speed up the computation, so SNLW-JSRC remains a viable option.

Effect of Superpixel Numbers
The number of superpixels affects the size of the superpixels. Generally, the larger the superpixel number, the smaller the superpixel size, and vice versa. Therefore, the number of superpixels has a great influence on the quality of superpixel segmentation. Here, we set up a sequence of superpixel numbers (400, 500, 600, 700, 800, 900, 1000, 1100, and 1200) to explore the impact on SNLW-JSRC and SP-JSRC. In the experiment, the training set comprised 10% of each class for Indian Pines, 200 samples per class for PaviaU, 1% of each class for Salinas, and 2% of each class for DFC2013. The remaining parameters were the same as those in Section 4.2. The effect of the superpixel number on the Indian Pines, PaviaU, Salinas, and DFC2013 datasets is shown in Figure 8. As can be observed, for almost all the numbers of superpixels, SNLW-JSRC has an obvious improvement over SP-JSRC due to its better noise suppression. In addition, the accuracy first rises and then falls. As the number of superpixels grows, the superpixels become smaller and eventually fail to provide sufficient spatial information for proper classification. However, the decline in the accuracy of SNLW-JSRC is slower than that of SP-JSRC, indicating that noise suppression promotes the robustness of classification.
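The inverse relation between superpixel number and superpixel size can be made concrete with simple arithmetic on the scene sizes and the superpixel numbers chosen in this paper. The helper name `mean_superpixel_size` is an assumption, and the figures are averages only, since ERS produces superpixels of varying size.

```python
def mean_superpixel_size(rows, cols, n_superpixels):
    """Average number of pixels per superpixel for a rows x cols scene."""
    return rows * cols / n_superpixels

# Scene sizes and selected superpixel numbers as reported in the text.
scenes = {
    "Indian Pines": (145, 145, 500),
    "PaviaU": (610, 340, 1100),
    "Salinas": (512, 217, 400),
    "DFC2013 subset": (336, 420, 1000),
}
for name, (rows, cols, n) in scenes.items():
    print(f"{name}: about {mean_superpixel_size(rows, cols, n):.0f} pixels per superpixel")
```

For Indian Pines, for instance, moving from 400 to 1200 superpixels shrinks the average superpixel from roughly 53 to roughly 18 pixels, which illustrates why very large superpixel numbers leave little spatial context.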

Effect of the Number of Training Samples
Here, we explore the impact of the number of training samples on different methods, including JSRC, NLW-JSRC, SP-JSRC, and SNLW-JSRC, on the four datasets. We set the percentage of training samples to 1%, 2.5%, 5%, 10%, 15%, and 20% of each class for Indian Pines, select 50, 100, 200, 300, 400, and 500 samples of each class for PaviaU, and set the percentage to 0.1%, 0.25%, 0.5%, 1%, 1.5%, and 2% of each class for Salinas and DFC2013. The remaining parameters are the same as those in Section 4.2. The results are shown in Figure 9. The overall trend is that the more training samples are included, the higher the classification accuracy of each method. Beyond 10% for Indian Pines, 200 samples per class for PaviaU, 1% for Salinas, and 2% for DFC2013, the growth becomes slower. In particular, SNLW-JSRC is generally superior to the other methods, especially for the more complex PaviaU data, indicating that the proposed method is good at handling complex data. When the training set is small, SNLW-JSRC achieves a larger improvement since it suppresses noise well and makes the classification more robust to the choice of samples.
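The per-class sampling protocol (a percentage of each class, or a fixed count per class) can be sketched as below. The function name `split_per_class` and the `amount` convention are assumptions. Rounding fractional counts up is also an assumption, although it agrees with the tabulated counts, e.g., a class with 6 + 2003 = 2009 labeled samples at 0.25% yields 6 training samples.

```python
import math
import random
from collections import defaultdict

def split_per_class(labels, amount, seed=0):
    """Randomly split sample indices into train/test sets, class by class.

    amount < 1  : fraction of each class used for training (rounded up).
    amount >= 1 : fixed number of training samples per class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        if amount < 1:
            # Small epsilon guards against floating-point overshoot before ceil.
            k = math.ceil(amount * len(idxs) - 1e-9)
        else:
            k = int(amount)
        k = max(1, min(k, len(idxs)))  # at least one training sample per class
        train.extend(idxs[:k])
        test.extend(idxs[k:])
    return train, test

# Toy usage: two classes with hypothetical sizes, 0.25% of each for training.
toy_labels = [0] * 2009 + [1] * 3726
toy_train, toy_test = split_per_class(toy_labels, 0.0025)
```

Repeating the split with different `seed` values corresponds to the repeated runs with different training sets over which the reported results are averaged.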

Conclusions
In this paper, we proposed superpixel-based nonlocal weighting joint sparse representation classification (SNLW-JSRC) for hyperspectral image classification. Firstly, superpixels help to obtain a relatively spectral-consistent neighborhood. The nonlocal weighting is used to further purify the spatial neighborhood, and finally, JSRC enables superpixel-level classification. The results on four benchmark datasets show that the proposed method is superior to the comparative methods in terms of improved classification accuracy, comparable computing time, and robustness to small numbers of training samples. The analysis of the classification results also shows that the proposed method can simultaneously solve the two problems of block neighborhoods in JSRC, which not only provides adaptive neighborhood information but also eliminates the outliers and noise in the neighborhood. However, the results of this paper are still limited by the results of segmented superpixels; thus, serious over-segmentation will also lead to a lack of spatial information. This will form the basis of our future investigation.