1. Introduction
Hyperspectral remote sensing is an advanced imaging technology capable of capturing both spectral and spatial characteristics of ground targets in a single acquisition [1]. Hyperspectral images (HSI) leverage high-dimensional spectral data to enable precise discrimination and characterization of materials, with diverse applications across mineral exploration [2], agricultural monitoring [3], ecological assessment [4], submarine geomorphology mapping [5], and military surveillance [6,7]. However, the inherent limitations of HSI acquisition systems—such as restricted spatial resolution and atmospheric scattering effects—often result in mixed pixels, where each pixel spectrally integrates multiple materials. Consequently, accurately disentangling endmembers and quantifying their fractional abundances remains a critical challenge in HSI interpretation, constituting the fundamental task of spectral unmixing [8].
Classical spectral unmixing relies on a mathematical formalism of spectral mixing mechanisms, typically implemented through a mixing model. Among these, the Linear Mixing Model (LMM) has gained prominence owing to its computational tractability and broad applicability [9]. Under the LMM assumption, mixed pixels can be represented as a weighted sum of endmembers, with the weights corresponding to their fractional abundances [10]. While the LMM offers simplicity and ease of implementation, its effectiveness is often limited in scenarios with intricate spectral–spatial interactions [11]. Furthermore, conventional unmixing approaches decouple endmember extraction from abundance estimation—a sequential workflow prone to error propagation. Prominent endmember extraction algorithms, e.g., N-FINDR [12], Pixel Purity Index (PPI) [13], and Vertex Component Analysis (VCA) [14], inherently rely on the pure pixel assumption, which may not hold under real-world imaging conditions, imposing inherent limitations on unmixing fidelity.
To mitigate these limitations, contemporary spectral unmixing approaches increasingly rely on predefined spectral libraries. Among these, Multiple Endmember Spectral Mixture Analysis (MESMA) and Sparse Unmixing (SU) employ comprehensive or overcomplete spectral libraries to estimate per-pixel abundances. MESMA operates by iteratively selecting optimal endmember subsets from the library to reconstruct the observed spectrum with minimal residual error [15,16,17]. Inspired by the use of standard spectral libraries, together with advances in sparse representation [18,19], sparse unmixing has rapidly become a typical semi-supervised unmixing method [20]. By imposing sparsity priors on abundance vectors, SU capitalizes on the inherent sparsity of real-world scenes, where typically only 3–5 endmembers appear within a pixel [21,22]. Sparse unmixing jointly enables spectral fitting and endmember selection, circumventing the need for explicit pure pixel assumptions.
This sparsity prior motivates the development of sparsity-constrained unmixing algorithms that explicitly exploit structural patterns within the abundance matrix. For example, SUnSAL [23] imposes pixel-wise sparsity through an ℓ1-norm regularizer, whereas the Collaborative SUnSAL (CLSUnSAL) [24] adopts a group-sparsity paradigm via the ℓ2,1-norm, enforcing joint sparsity across all pixels under the assumption of homogeneous endmember activation. Critically, CLSUnSAL presupposes that all pixels share identical active endmembers—a condition rarely satisfied in practice, as endmember distributions are typically localized within spatially contiguous regions rather than globally pervasive. To overcome this limitation, the Local Collaborative Sparse Unmixing (LCSU) framework [21] introduces spatially adaptive collaboration, restricting sparsity constraints to homogeneous superpixel neighborhoods to reconcile global spectral coherence with local spatial consistency.
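The ℓ2,1 penalty and its proximal operator, a row-wise soft threshold, are the computational core of CLSUnSAL-style solvers. A small self-contained sketch of both, independent of any specific solver:

```python
import numpy as np

def l21_norm(X):
    """Sum of the l2 norms of the rows of X (the l_{2,1} group-sparsity penalty)."""
    return np.linalg.norm(X, axis=1).sum()

def prox_l21(X, tau):
    """Row-wise soft threshold: the proximal operator of tau * ||X||_{2,1}.
    Rows whose l2 norm falls below tau are zeroed out, enforcing joint sparsity."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * X

X = np.array([[3.0, 4.0],    # row norm 5  -> shrunk
              [0.1, 0.1],    # row norm ~0.14 -> zeroed
              [0.0, 2.0]])   # row norm 2  -> shrunk
Y = prox_l21(X, 1.0)
```

Because whole rows are shrunk or zeroed together, all pixels (columns) agree on which endmembers (rows) are active, which is exactly the collaborative-sparsity behavior described above.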
Hyperspectral imagery inherently encapsulates not only discriminative spectral signatures but also multiscale spatial patterns. Due to the limitations of spectral-only unmixing, spatially aware methods have emerged as a dominant approach, leveraging spatial-contextual information to mitigate spectral ambiguity. For instance, SUnSAL–TV [25] extends the baseline sparse unmixing model by imposing Total Variation (TV) regularization, enforcing piecewise smoothness in abundance maps through ℓ1 penalties on gradient magnitudes. However, TV-based methods primarily capture local smoothness while neglecting spatially heterogeneous structures, prompting the integration of non-local self-similarity priors [26] to preserve both local continuity and global texture coherence. Building upon this concept, Dual Regularized Sparse Unmixing (DRSU) [27] incorporates dual sparsity-inducing weights—simultaneously constraining endmember activation sparsity and abundance spatial consistency. Its variant DRSU–TV [28] further couples TV regularization with weighted sparse constraints, enabling adaptive exploration of spatial–spectral correlations. Unifying these advancements, the Spectral–Spatial Weighted Sparse Unmixing (S²WSU) [29] achieves enhanced abundance estimation accuracy through joint optimization of spectral fidelity, structured sparsity, and multi-scale spatial constraints, demonstrating superior performance in complex scenes.
Superpixel segmentation, as a pivotal strategy in spatially aware hyperspectral unmixing, effectively balances spatial coherence and computational efficiency [30,31]. For instance, the Multi-Scale Sparse Unmixing Algorithm (MUA-SLIC) [32] leverages SLIC [33] to partition the image into homogeneous superpixels, enabling multi-scale spectral approximation through hierarchical transformation. By decoupling the spatially regularized unmixing problem into computationally tractable sub-problems, MUA–SLIC achieves efficient integration of spectral–spatial context while maintaining low algorithmic complexity. Building upon this paradigm, SUSRLR–TV [34] synthesizes TV regularization with low-rank matrix recovery, simultaneously enforcing abundance smoothness and intra-superpixel correlation. Further advancing spatial constraints, SBGLSU [35] incorporates a graph Laplacian operator to model intrinsic geometric relationships within superpixels, demonstrating statistically significant improvements in abundance estimation accuracy. At the frontier of this field, Spatial–Spectral Multiscale Sparse Unmixing [36] proposes a unified regularization architecture for multi-resolution analysis, holistically exploiting cross-scale spectral variability and spatial texture persistence. Collectively, these superpixel-driven approaches establish a systematic methodology to address the spectral–spatial heterogeneity inherent to real-world hyperspectral scenes.
Conventional Total Variation (TV) regularization struggles to distinguish intrinsic material transitions from noise-induced variations in local regions. In contrast, Relative Total Variation (RTV) [37] quantifies both absolute intensity gradients and relative structural coherence, enabling robust extraction of spatially coherent features. Inspired by RTV’s discriminative capability, GLDWSU [38] was proposed to encode adaptive spatial dependencies through learned graph topologies. Specifically, the graph Laplacian matrix derived from spectral–spatial affinity learning adaptively captures multi-scale structural patterns, serving as a data-driven spatial regularizer. However, traditional graph Laplacian operators primarily enforce second-order smoothness (i.e., penalizing curvature) [39], which inadequately preserves first-order continuity (gradient-level consistency) across neighboring pixels. To address this limitation, FoGTF-HU [40] directly constrains first-order differences via a trend filtering operator, effectively modeling piecewise linear abundance transitions while suppressing the staircase artifacts common in TV-based methods.
In recent years, deep learning has been increasingly applied to sparse unmixing tasks. For instance, SUnCNN [41] formulates the unmixing problem as an optimization over neural network parameters, using a convolutional encoder–decoder to generate abundance maps from a predefined spectral library. Abundance constraints are enforced through softmax activation, and spatial information is implicitly captured via convolutional operations. Another notable approach is EGU-Net [42], an unsupervised two-stream Siamese network designed for hyperspectral unmixing. It incorporates a secondary network that learns from pure or nearly pure endmember spectra to guide the main unmixing branch, with both branches sharing parameters and enforcing physically meaningful constraints such as nonnegativity and sum-to-one. EGU-Net further supports spatial–spectral unmixing by integrating convolutional operations, enabling it to preserve spatial coherence while producing accurate and interpretable abundance maps without requiring ground-truth labels. Despite these advancements, deep learning-based methods often require large-scale labeled data and substantial training time, and involve many tunable parameters. Additionally, their lack of transparency can hinder the integration of domain knowledge such as physical priors or spatial structures.
Given these limitations, deep learning–based methods may not always be practical for real hyperspectral unmixing tasks. In contrast, traditional sparse unmixing frameworks remain appealing because they are physically interpretable, unsupervised, and computationally efficient. However, existing sparse approaches such as SUnSAL and its spatial extensions typically rely on pixel-wise sparsity assumptions or second-order Laplacian regularization. While these strategies can be effective in certain cases, they tend to overlook local spectral similarity within homogeneous regions and often oversmooth structural boundaries, thereby limiting their ability to accurately model the complex spatial heterogeneity of real hyperspectral images. Motivated by these observations, this study proposes a novel hyperspectral unmixing framework termed SCSU–GDO, which synergistically integrates superpixel collaborative sparse regression with graph differential operator spatial regularization. More details are illustrated in Figure 1. The methodology begins by segmenting hyperspectral images into spatially contiguous and spectrally homogeneous superpixels using SLIC, establishing adaptive neighborhoods as the foundation for localized collaboration. Leveraging the inherent property that mixed pixels within each superpixel share similar endmember subsets and exhibit smoothly varying abundances, a locally weighted collaborative sparse regression model is designed. This model jointly optimizes endmember activation patterns and abundance smoothness within homogeneous regions through group sparsity constraints (the ℓ2,1-norm), effectively addressing the limitations of global sparsity assumptions in handling spatial heterogeneity. Furthermore, to overcome the second-order smoothness bias of traditional graph Laplacian operators, a first-order graph differential operator is constructed via spectral decomposition of the learned Laplacian matrix. This operator enforces gradient-level continuity of abundances among intra-class pixels while preserving inter-class structural boundaries, balancing sparsity promotion with spatial fidelity. In addition, ADMM [43] is used for optimization, ensuring coordinated convergence of spectral reconstruction error, local collaborative sparsity, and differential spatial regularization. Unlike existing sparse or graph-based unmixing methods that rely solely on global sparsity assumptions or conventional Laplacian regularization, the proposed framework introduces several key innovations, summarized as follows:
- (1) Superpixel-based local weighted collaborative sparse regression is proposed to exploit local correlation for unmixing. This strategy applies superpixel segmentation to hyperspectral images and then performs local weighted collaborative sparse unmixing, which provides improved sparse constraints on the abundance matrix and yields better unmixing results.
- (2) A first-order graph differential operator is adopted as a spatial regularizer, which directly models gradient-level variations and better preserves structural boundaries, in contrast to traditional Laplacian operators that primarily enforce second-order smoothness.
- (3) Experimental results confirm the effectiveness of the proposed method: SCSU–GDO obtains better performance than other state-of-the-art (SOTA) algorithms.
3. SCSU–GDO
Traditional sparse unmixing methods, such as SUnSAL and CLSUnSAL, typically assume a globally consistent sparse structure across the entire scene. This global sparsity assumption makes them less effective in capturing the spatial heterogeneity of real hyperspectral images, where material compositions vary locally. Similarly, conventional spatial regularization techniques—such as total variation and graph Laplacian-based models—enforce second-order smoothness, which often leads to over-smoothing and the loss of critical boundary information.
The proposed SCSU–GDO framework partitions the image into spatially coherent and spectrally homogeneous superpixels, within which group-sparse regression is applied to exploit local similarity and improve abundance estimation. To further enhance spatial consistency while preserving meaningful transitions, a first-order graph differential operator is constructed from a learned Laplacian matrix, allowing for gradient-level regularization that better aligns with object boundaries.
Compared with existing sparse or graph-based methods, the proposed model provides a more adaptive and structure-aware approach to hyperspectral unmixing, capable of handling spatial variability and preserving fine-scale details. The overall model is formulated as a constrained optimization problem and solved using the ADMM algorithm, which enables efficient coordination between spectral reconstruction, local sparsity modeling, and spatial regularization.
The following subsections present the detailed mathematical formulation and optimization strategy.
3.1. Superpixel-Based Local Collaborative Sparse Regression
3.1.1. Superpixel Segmentation-Based Uniform Region Extraction
Pixels in uniform regions exhibit similar spectral reflectance, which typically indicates that these pixels share similar endmembers as well as fractional abundances. Recognizing this characteristic, we use the SLIC algorithm, a typical superpixel segmentation algorithm, to segment the image into regions. Compared to standard K-means clustering, SLIC adopts a spatially localized comparison strategy by restricting centroid updates to a fixed spatial neighborhood. This spatial constraint allows SLIC to better capture local spectral–spatial homogeneity in hyperspectral images while maintaining computational efficiency. After this process, the uniform regions can be obtained, as shown in Figure 2.
3.1.2. Local Collaborative Sparse Regression
Inspired by CLSUnSAL, and acknowledging that mixed pixels with similar endmembers and fractional abundances often cluster within spatially uniform regions rather than being scattered across the entire image, we employ Local Collaborative Sparse Unmixing (LCSU), formulated as

$$\min_{\mathbf{X} \ge 0} \ \frac{1}{2}\left\| \mathbf{A}\mathbf{X} - \mathbf{Y} \right\|_F^2 + \lambda \sum_{i=1}^{n} \left\| \mathbf{X}_{\mathcal{N}(i)} \right\|_{2,1}$$

where $\mathcal{N}(i)$ is the neighborhood of pixel $i$, and $\lambda$ is a regularization parameter controlling the degree of sparseness. Compared to CLSUnSAL, LCSU assumes only that adjacent pixels share the same support. This assumption fits better, as an endmember is more likely to appear consistently within spatially uniform regions.
3.1.3. Sparse Regularization
In this paper, uniform regions obtained through superpixel segmentation are utilized to achieve local collaborative sparsity. Additionally, a reweighting matrix is introduced to enhance row sparsity, and a hyperparameter, referred to as the superpixel homogeneity index, adjusts the sparsity constraint. The resulting regularization term is

$$\lambda \sum_{k=1}^{K} \left\| \left( \mathbf{W} \odot \mathbf{X} \right)_{\Omega_k} \right\|_{2,1}$$

where $\mathbf{W}$ represents the weight matrix, $\varepsilon$ is a small positive constant used in the weight update, $K$ denotes the number of superpixels, $\Omega_k$ indexes the pixels of the $k$-th superpixel, and $\odot$ denotes the Hadamard product.
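One common way to build such a reweighting matrix is the classical iteratively reweighted scheme, in which rows with small ℓ2 norm receive large weights and are pushed further toward zero; the paper's exact weight definition may differ from this sketch.

```python
import numpy as np

def update_weights(X, eps=1e-3):
    """Reweighting matrix for iteratively reweighted row sparsity.

    Rows of the abundance matrix X with small l2 norm get large weights,
    strengthening the row-sparsity penalty on them in the next iteration.
    This mirrors the classical reweighted-l1 idea; the paper's exact
    weight definition may differ.
    """
    row_norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Broadcast one weight per row across all pixels (columns).
    return (1.0 / (row_norms + eps)) * np.ones((1, X.shape[1]))

X = np.array([[1.0, 1.0],    # active endmember row -> small weight
              [0.0, 0.0]])   # inactive row -> large weight
W = update_weights(X)
```

The small constant `eps` prevents division by zero for fully inactive rows, matching the role of the small positive constant in the text.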
3.2. Graph Differential Operator-Based Graph Learning
3.2.1. Graph Learning for Laplacian Matrix
In SBGLSU, SLIC is first applied to extract spatially uniform regions, and a weighted map is constructed for each superpixel to obtain graph Laplacian regularization. Since the graph Laplacian matrix (GLM) in this algorithm is predefined and not adaptively derived from image content, it limits the model’s applicability to other tasks. In contrast, the GLM in GLDWSU is constructed directly from the image using RTV, which more effectively captures local spatial structures and preserves meaningful edge information. This data-driven construction enhances the adaptability and relevance of the resulting graph to the input scene.
According to [45], assuming each pixel $y_i$ of the HSI Y is a vertex of a graph, the graph Laplacian can be calculated by

$$\mathbf{L} = \mathbf{D}_x^{\top} \mathbf{U}_x \mathbf{D}_x + \mathbf{D}_y^{\top} \mathbf{U}_y \mathbf{D}_y$$

where $\mathbf{U}_x$ and $\mathbf{U}_y$ are the diagonal weight matrices, and $\mathbf{D}_x$ and $\mathbf{D}_y$ are the discrete (forward-difference) derivative operators in the row and column directions of the image Y.
In the row direction, the weight matrix $\mathbf{U}_x$ is defined as

$$\mathbf{U}_x = \operatorname{diag}\!\left( \frac{1}{\left| g_\sigma * \left( \mathbf{D}_x \mathbf{Y} \right) \right| \odot \left| \mathbf{D}_x \mathbf{Y} \right| + \varepsilon} \right)$$

where $g_\sigma$ denotes the Gaussian filter, $*$ represents the convolution operator, $\odot$ represents the element-wise multiplication operator, and $\varepsilon$ is a small positive constant. In the column direction, the weight matrix $\mathbf{U}_y$ is defined similarly.
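The weighted structure described above can be assembled with sparse forward-difference operators. In the sketch below the RTV-derived weights are replaced by identity matrices as placeholders, in which case the construction reduces to the standard 4-neighbor grid Laplacian; the grid size is illustrative.

```python
import numpy as np
import scipy.sparse as sp

def diff_matrix(m):
    """(m-1) x m forward-difference matrix: (D v)_i = v_{i+1} - v_i."""
    return sp.diags([-np.ones(m - 1), np.ones(m - 1)], [0, 1], shape=(m - 1, m))

h, w = 4, 5                                # toy image size, row-major flattening
Dx = sp.kron(sp.eye(h), diff_matrix(w))    # differences along each row
Dy = sp.kron(diff_matrix(h), sp.eye(w))    # differences along each column

# Identity weights stand in for the RTV-based diagonal weight matrices.
Ux = sp.eye(Dx.shape[0])
Uy = sp.eye(Dy.shape[0])

L = (Dx.T @ Ux @ Dx + Dy.T @ Uy @ Dy).tocsr()
```

By construction `L` is symmetric positive semidefinite and annihilates constant images, the defining properties of a graph Laplacian; data-driven weights only rescale the individual edge penalties.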
Since the original image is usually contaminated by noise, a denoising step can be applied before graph learning so that an accurate graph Laplacian matrix is obtained. The specific process is as follows:

$$\hat{\mathbf{y}} = \arg\min_{\mathbf{y}} \ \left\| \mathbf{y} - \mathbf{s} \right\|_2^2 + \eta\, \mathbf{y}^{\top} \mathbf{L}(\mathbf{y})\, \mathbf{y}$$

where y is the ideal clean image, s denotes the observed data, $\eta$ is a regularization parameter, and L(·) is the GLM.
3.2.2. Graph Differential Operator
The Laplacian matrix is well-suited for capturing second-order smoothness, but it insufficiently represents the first-order smoothness of spatial structures in hyperspectral images [40]. To address this limitation, a first-order graph differential operator is utilized to effectively represent the spatial information of HSI in the proposed method. Since the Laplacian matrix L can be expressed as $\mathbf{L} = \mathbf{P}^{\top}\mathbf{P}$, and from Equation (9), we can calculate P as

$$\mathbf{P} = \mathbf{U}^{1/2} \mathbf{D}$$

where $\mathbf{U}$ and $\mathbf{D}$ stack the diagonal weights and the discrete derivative operators of both directions. Compared to Equation (10), $\mathbf{U}$ and $\mathbf{D}$ in the above formula are not distinguished in the row and column directions. Thus, the differential operator $\mathbf{P}$ imposes spatial constraints on the abundance matrix by modeling gradient-level variations across neighboring pixels. This enables the regularization term $\left\| \mathbf{P} \mathbf{X}^{\top} \right\|_1$ to suppress noise while preserving sharp abundance transitions.
This operator plays a role similar to Total Variation (TV) in image processing, where first-order differences promote piecewise smoothness rather than enforcing global smoothing. By applying this operator over a learned graph topology, the model adaptively enforces spatial coherence across pixels while being sensitive to boundaries between materials.
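When only the learned Laplacian itself is available, a factor P with PᵀP = L can also be obtained from its spectral decomposition, as mentioned in the Introduction. A toy check on the Laplacian of a 3-node path graph:

```python
import numpy as np

# Laplacian of a 3-node path graph (symmetric, positive semidefinite).
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])

# Spectral decomposition L = V diag(s) V^T; since L is PSD, s >= 0,
# so P = diag(sqrt(s)) V^T satisfies P^T P = L.
s, V = np.linalg.eigh(L)
s = np.clip(s, 0.0, None)          # guard against tiny negative round-off
P = np.sqrt(s)[:, None] * V.T

assert np.allclose(P.T @ P, L)
```

Any such factor is a valid first-order operator for the quadratic form, since xᵀLx = ‖Px‖² measures weighted first differences over the graph edges.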
3.3. Formulation of Proposed SCGDO Model
The proposed model employs local collaborative weighted sparse regression based on superpixel segmentation as the sparse constraint term, and utilizes a first-order graph differential operator to further enhance the model’s sparsity. It is represented as

$$\min_{\mathbf{X} \ge 0} \ \frac{1}{2}\left\| \mathbf{A}\mathbf{X} - \mathbf{Y} \right\|_F^2 + \lambda \sum_{k=1}^{K} \left\| \left( \mathbf{W} \odot \mathbf{X} \right)_{\Omega_k} \right\|_{2,1} + \beta \left\| \mathbf{P} \mathbf{X}^{\top} \right\|_1$$

where $\lambda$ and $\beta$ are regularization parameters. The second term represents the sparse constraint. The final term involves spatial structure regularization, utilizing the graph differential operator P to promote similar piecewise smoothness between the image Y and the fractional abundance image X. Specifically, $\lambda$ controls the local sparsity within superpixels, and $\beta$ enforces spatial smoothness on the abundance maps via the graph differential operator.
To solve the optimization problem in (13), ADMM is employed, transforming the original problem (13) into the equivalent constrained form (14), in which the indicator function encodes the nonnegativity constraint on the abundances. Introducing the auxiliary variables D, E, and F, the augmented Lagrangian function for (14) is then formulated, where $\mu$ is the penalty coefficient.
During the iterations, ADMM solves the sub-problem of each variable in turn. Since sub-problem (19) is a convex quadratic program, it admits a closed-form solution that can be computed directly. The remaining sub-problems are solved similarly.
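As an illustration of such a closed-form step, the generic quadratic ADMM sub-problem min_X ½‖AX − Y‖²_F + (μ/2)‖X − V + D‖²_F is solved by a single linear system; the sizes below are illustrative and the paper's actual splitting may involve additional auxiliary terms.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, m, n = 50, 30, 40          # bands, library size, pixels (illustrative)
A = rng.random((n_bands, m))
Y = rng.random((n_bands, n))
mu = 1.0
V = rng.random((m, n))              # auxiliary variable from the previous iteration
D = rng.random((m, n))              # scaled dual variable

# Setting the gradient A^T(AX - Y) + mu (X - V + D) to zero gives the
# closed-form update below (a sketch of the quadratic step in the text).
X = np.linalg.solve(A.T @ A + mu * np.eye(m), A.T @ Y + mu * (V - D))
```

Since AᵀA + μI is fixed across iterations, its factorization can be cached, which is one reason ADMM iterations for this model stay cheap.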
Algorithm 1 displays the entire algorithmic process, which involves the soft-threshold function used in the sparsity-related updates.
Algorithm 1: Pseudocode of SCSU–GDO
Input: spectral library A, observed image Y, and regularization parameters.
1. Initialization.
2. repeat
3.–10. Update each variable by solving its ADMM sub-problem (the superpixel-wise updates run in a for loop over superpixels).
11. until the termination condition is met.
Output: the estimated abundance matrix X.
4. Experiment and Analysis
To assess the effectiveness of the proposed algorithm, both synthetic hyperspectral datasets and real hyperspectral remote sensing images were used. For performance comparison, five advanced spatial–spectral sparse unmixing methods were selected as benchmarks: CLSUnSAL [24], SUnSAL–TV [25], S²WSU [29], SBGLSU [30], and FoGTF-HU [40]. Two quantitative indicators, the signal-to-reconstruction error (SRE) and the root mean square error (RMSE), were used. The SRE is defined as

$$\mathrm{SRE} = 10\log_{10}\!\left( \frac{\mathbb{E}\left[ \left\| \mathbf{x} \right\|_2^2 \right]}{\mathbb{E}\left[ \left\| \mathbf{x} - \hat{\mathbf{x}} \right\|_2^2 \right]} \right)$$

where $\mathbb{E}[\cdot]$ represents the expectation operator, $\hat{\mathbf{x}}$ is the estimated abundance, and $\mathbf{x}$ denotes the true abundance. The RMSE is determined by the difference between $\mathbf{x}$ and $\hat{\mathbf{x}}$, and is defined as follows:

$$\mathrm{RMSE} = \sqrt{ \frac{1}{mn} \sum_{i=1}^{n} \left\| \mathbf{x}_i - \hat{\mathbf{x}}_i \right\|_2^2 }$$

where m is the number of endmembers and n is the number of pixels.
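These two indicators can be sketched directly from their definitions; here they are evaluated on a toy abundance vector with a uniform estimation error.

```python
import numpy as np

def sre_db(x_true, x_est):
    """Signal-to-reconstruction error in dB: 10*log10(||x||^2 / ||x - x_hat||^2)."""
    return 10.0 * np.log10(np.sum(x_true**2) / np.sum((x_true - x_est)**2))

def rmse(x_true, x_est):
    """Root mean square error over all abundance entries."""
    return np.sqrt(np.mean((x_true - x_est)**2))

x = np.ones(100)
x_hat = x + 0.1          # uniform error of 0.1 per entry: SRE ~ 20 dB, RMSE ~ 0.1
```

Higher SRE and lower RMSE both indicate more accurate abundance estimation.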
4.1. Simulated Datasets
The splib06 dataset, released in September 2007, was used to build the spectral library: 240 endmembers were randomly selected from the U.S. Geological Survey (USGS) spectral library to form the library A. The selection ensured that the spectral angle between any two endmembers was greater than 4 degrees. The simulated data consist of reflectance values uniformly distributed over 224 spectral bands in the range of 0.4 to 2.5 μm. Several endmembers were randomly chosen from library A, and simulated data cubes (DC) were generated by linearly combining them with specific abundance matrices satisfying the ASC and ANC. DC1 selected five spectral signatures from A, resulting in a cube with dimensions of 75 × 75 pixels. DC2, containing nine endmembers and 100 × 100 pixels, was generated similarly, with Gaussian white noise added at SNRs of 20 dB, 30 dB, and 40 dB. The spectra of DC1 and DC2 are displayed in Figure 3. The abundance maps of DC1 and DC2 are shown in Figure 4 and Figure 5, respectively.
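The data-cube generation described above can be sketched as follows; the random library and the flat Dirichlet draw are stand-ins for the paper's actual signatures and structured abundance maps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins: a 224-band, 240-signature library; 5 endmembers; 75x75 pixels.
A = rng.random((224, 240))
idx = rng.choice(240, size=5, replace=False)
E = A[:, idx]                                    # chosen endmember signatures

# Random abundances satisfying ANC (nonnegativity) and ASC (sum-to-one)
# via a flat Dirichlet draw over the 5 endmembers per pixel.
n_pix = 75 * 75
X = rng.dirichlet(np.ones(5), size=n_pix).T      # 5 x n_pix

Y_clean = E @ X                                  # linear mixing

# Add white Gaussian noise at a target SNR of 30 dB.
snr_db = 30.0
noise_pow = np.mean(Y_clean**2) / 10**(snr_db / 10)
Y = Y_clean + rng.normal(0.0, np.sqrt(noise_pow), Y_clean.shape)
```

Repeating the last two lines with different `snr_db` values reproduces the 20/30/40 dB noise settings used in the experiments.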
The proposed SCSU–GDO framework, along with five advanced sparse unmixing methods, was tested and compared on the two simulated datasets. The comparative results are visually presented in Figure 6 and Figure 7. Figure 6 shows the abundance maps of DC1 under 30 dB SNR. Compared to the other algorithms, CLSUnSAL, SUnSAL–TV, and S²WSU exhibit significantly poorer performance in terms of spatial consistency and abundance accuracy. Notably, the estimated abundances for Endmember 3 from these three methods approach zero, indicating substantial deviations from the ground truth. Furthermore, the abundance results obtained by S²WSU lack spatial smoothness. In contrast, SBGLSU demonstrates improved alignment with the true abundances for Endmembers 2 and 4 but retains residual errors in Endmember 3. Although SBGLSU’s performance is comparable to FoGTF-HU in Figure 6, its abundance maps contain prominent noise artifacts, particularly under high noise levels (as shown in Table 1).
Similar trends are observed in Figure 7. FoGTF-HU and SCSU–GDO preserve most density information from the reference maps, whereas SUnSAL–TV and CLSUnSAL produce overly smooth distributions. SBGLSU, despite partial improvements, introduces localized false densities. In contrast, the improved performance of SCSU–GDO can be attributed to its algorithmic design. Specifically, the superpixel-based collaborative sparse regression adaptively enforces sparsity within spatially homogeneous regions, enhancing local coherence and suppressing noise. Meanwhile, the graph differential operator imposes structure-aware regularization that promotes spatial smoothness while preserving sharp transitions at material boundaries. This complementary combination enables the model to produce abundance maps that are both accurate and spatially consistent, achieving improved unmixing accuracy over FoGTF-HU.
Quantitative results (Table 1 and Table 2) further validate these findings. For DC1 (30 dB SNR), SCSU–GDO achieves an SRE of 44.6 dB, surpassing SBGLSU by 17.33 dB (a 63.55% improvement) and FoGTF-HU by 12 dB (36.52%). On DC2, SCSU–GDO attains SRE values 3.64 dB (15%) and 2.52 dB (9.9%) higher than SBGLSU and FoGTF-HU, respectively. Additionally, SCSU–GDO reports consistently lower RMSE values across both datasets, indicating a clear advantage in abundance estimation accuracy.
Mechanistically, DC1’s well-defined boundaries and localized spectral homogeneity allow superpixel segmentation to aggregate similar pixels. Subsequent Laplacian regularization within superpixels drives convergence of endmember abundances, explaining SBGLSU’s moderate advantages over CLSUnSAL and SUnSAL–TV. Meanwhile, FoGTF-HU’s first-order graph differential operator captures pixel-level smoothness in high-dimensional HSIs, and SCSU–GDO integrates this operator with superpixel-based weighted sparse regularization. This integration enables SCSU–GDO to balance local detail preservation with global structural coherence, ensuring robust performance across noise levels and image complexities.
In summary, the proposed SCSU–GDO algorithm demonstrates consistently improved performance in both visual results and quantitative metrics (e.g., RMSE, SRE) compared to the selected spatial–spectral sparse unmixing algorithms.
To investigate the influence of the regularization parameters, after extensive experiments we tested $\lambda$ over the range {1 × 10^−4, 5 × 10^−4, 1 × 10^−3, 5 × 10^−3, 1 × 10^−2, 5 × 10^−2} and $\beta$ over the range {1 × 10^−6, 1 × 10^−5, 5 × 10^−5, 1 × 10^−4, 1 × 10^−3, 1 × 10^−2}. Figure 8 illustrates the variation in SRE values across these parameter settings for DC1 and DC2 under an SNR of 30 dB.
4.2. Real Hyperspectral Remote Sensing Image
Here, a widely used real hyperspectral remote sensing image, the Cuprite dataset, was utilized for comparative analysis. A sub-image of 250 × 191 pixels with 224 spectral bands was selected for evaluation. To reduce the impact of noise, bands 1–2, 105–115, 150–170, and 223–224 were excluded, leaving 188 bands for analysis. Additionally, 498 spectral signatures were selected from the USGS library to construct an over-complete spectral library (also called a spectral dictionary), denoted as A (188 bands).
Given the absence of reference abundance maps, qualitative comparisons were performed between the USGS Tricorder classification map and the abundance estimates generated by CLSUnSAL, SUnSAL–TV, S²WSU, SBGLSU, FoGTF-HU, and the proposed SCSU–GDO algorithm. Three typical minerals—Alunite, Buddingtonite, and Chalcedony—were selected for visual assessment, as illustrated in Figure 9. CLSUnSAL, which assumes a globally consistent active set of endmembers across all pixels, produced less accurate abundance maps with spatial inconsistencies. SUnSAL–TV generated smoother abundance distributions but failed to preserve fine-scale details. Although S²WSU and SBGLSU achieved relatively accurate unmixing results, their Buddingtonite abundance maps exhibited residual noise artifacts compared to FoGTF-HU, which leveraged spatial structure optimization to extract more precise spatial information for the three minerals, aligning its results more closely with the reference classification map.
In contrast, the proposed SCSU–GDO accounted for the spatial coherence of endmembers, which predominantly clustered in localized regions rather than being uniformly distributed across the scene. By employing a graph differential operator to capture spatial features, SCSU–GDO generated abundance maps that retained finer details and exhibited superior alignment with the Tricorder classification map (Figure 9). These results demonstrate the effectiveness of the proposed algorithm for real hyperspectral images.
To further evaluate the computational efficiency of the proposed algorithms, we report the average runtime (in seconds) of each algorithm on the Cuprite dataset in
Table 3. As expected, methods with simpler regularization schemes tend to execute faster, while algorithms incorporating spatial or graph-based constraints typically require more computation. The proposed SCSU–GDO algorithm exhibits a moderate runtime relative to other structured unmixing methods, reflecting the additional overhead from superpixel segmentation, collaborative sparse modeling, and graph differential regularization. In terms of computational complexity, the main components of SCSU–GDO include (1) superpixel segmentation via SLIC, which is an efficient clustering algorithm with linear time complexity relative to the number of pixels; (2) localized sparse regression within each superpixel, which significantly reduces the problem size and enables efficient parallel or block-wise optimization; and (3) graph-based regularization using a sparse differential operator, which relies on sparse matrix computations and thus remains tractable. The entire optimization process is solved using the ADMM framework, which decomposes the model into sub-problems with closed-form solutions, further improving computational stability. These design choices allow SCSU–GDO to balance accuracy and efficiency. While slightly more expensive than baseline methods, its runtime remains acceptable, and the improvements in unmixing performance justify the added complexity. This demonstrates the effectiveness of the proposed algorithm in practical hyperspectral applications.
In addition to the Cuprite dataset, we also evaluated the proposed algorithms on the Urban hyperspectral image [46]. This dataset was acquired by the HYDICE sensor, and a subregion of size 307 × 307 pixels with 210 spectral bands was selected for analysis. To reduce the impact of water absorption, several noisy bands were removed, resulting in 162 effective bands. The spectral library was constructed using the publicly available endmember signatures of six typical urban materials. These endmembers serve as the reference basis for sparse unmixing and comparison.
For the Urban dataset, six representative land-cover classes—Asphalt, Grass, Tree, Roof, Metal, and Dirt—were selected for evaluation based on the provided ground truth label map. Four typical classes—Asphalt, Grass, Roof, and Dirt—were selected for visual assessment.
Figure 10 presents the corresponding abundance maps estimated by CLSUnSAL, SUnSAL–TV, S²WSU, SBGLSU, FoGTF-HU, and the proposed SCSU–GDO method. The hyperspectral image was cropped to a 120 × 120 region to focus on a spatially complex urban area. It is worth noting that, based on both the original image and the provided ground truth, the right-slanted road segment in the cropped region is not asphalt but rather an unpaved dirt road, as evidenced by its spectral characteristics and spatial context in the reference data. As illustrated in Figure 10, the proposed SCSU–GDO method produced abundance maps most consistent with this ground-truth interpretation. In particular, for the Dirt endmember, the method accurately recovered the right-hand unpaved road segment without misclassifying it as asphalt, unlike several competing methods. For the Grass endmember, the central region was reconstructed with spectral abundances closely matching the ground-truth distribution, effectively capturing subtle within-class variations. Similar improvements can be observed for the Roof endmember, where sharp building boundaries were preserved without sacrificing intra-class homogeneity. These results highlight the ability of SCSU–GDO to leverage superpixel-based collaborative sparsity and first-order graph differential regularization to model local spectral variability while preserving structural boundaries in complex urban scenes.