Article

A Graph-Regularized Double-Path Interactive Spectral Super-Resolution Network for Hyperspectral Image Reconstruction

1
Beijing Key Laboratory of Computational Intelligence and Intelligent System, School of Information Science and Technology, Beijing University of Technology, Beijing 100124, China
2
Space Star Technology Co., Ltd., Beijing 100089, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(6), 875; https://doi.org/10.3390/rs18060875
Submission received: 21 January 2026 / Revised: 2 March 2026 / Accepted: 9 March 2026 / Published: 12 March 2026
(This article belongs to the Section Remote Sensing Image Processing)

Highlights

What are the main findings?
  • A graph-regularized double-path spectral super-resolution network is proposed to enhance spectral fidelity in MSI-to-HSI reconstruction.
  • Interactive and enhanced residual modules enable effective spectral–spatial feature interaction and fusion.
What are the implications of the main finding?
  • Spectral graph regularization alleviates the ill-posedness of spectral super-resolution in remote sensing.
  • The proposed framework achieves robust spectral enhancement on both simulated and real datasets.

Abstract

Deep learning has demonstrated outstanding potential for the spectral super-resolution (S2R) reconstruction of multispectral images (MSIs). However, alleviating spectral distortion during S2R reconstruction remains a challenge. Given the ability of graphs to characterize spectral continuity, a graph-regularized double-path interactive S2R network (GDIS2Net) consisting of two parallel branches is proposed to reconstruct hyperspectral images (HSIs) from MSIs. An interactive residual module is designed as the backbone of the S2R network to facilitate the feature interaction between the two branches, while an enhanced residual module is constructed for further feature fusion. In addition, a new loss function that accounts for spectral continuity is proposed to optimize GDIS2Net. Experimental analyses show that the proposed GDIS2Net outperforms state-of-the-art methods on both simulated and real datasets.

1. Introduction

Hyperspectral images (HSIs) with numerous narrow spectral bands can effectively distinguish materials with subtle differences, making them widely used in remote sensing [1,2,3], target classification [4,5,6], agriculture [7,8,9], and geology [10,11,12]. Due to the trade-offs among spatial resolution, spectral resolution, and signal-to-noise ratio (SNR) [13,14], HSIs are often low-resolution, while multispectral images (MSIs) are usually high-resolution (in this paper, high-resolution and low-resolution are only used to describe the spatial resolution of an image) [3,15]. In order to generate high-resolution HSIs (H2SIs), a promising and cost-effective technology, i.e., spectral super-resolution (S2R) for MSIs [16], has been developed recently, aiming to recover high spectral detail from MSIs.
Existing S2R methods can be broadly categorized into sparse representation-based [17,18,19,20] and deep learning (DL)-based methods [21,22,23]. S2R is inherently a highly ill-posed problem [24], as MSIs contain far fewer spectral bands than H2SI reconstruction requires, which makes the reconstruction underdetermined and ambiguous [18,25]. To alleviate this ill-posedness, sparse representation-based S2R methods [19,24] have been proposed, which assume that H2SIs share the same set of sparse coefficients and the same high-spectral dictionaries with the corresponding high-resolution MSIs (HMSIs) and low-resolution HSIs (LHSIs), respectively. Thus, H2SIs can be produced by integrating sparse coefficients into the high-spectral dictionaries learned from pairs of HMSIs/LHSIs, where the coefficients are inferred from HMSIs via the learned dictionaries. These sparse representation-based methods have succeeded in stabilizing the solution by introducing effective prior constraints, such as low rankness [20] and sparsity [24,25,26]. However, the spatial–spectral continuity inherent in spectral images was ignored, as 3D images were usually unfolded into 2D matrices. Moreover, the reconstruction process often relies on computationally expensive iterative optimization.
In recent years, DL has been introduced into the S2R task, which usually executes S2R by learning the reconstruction mapping from sufficient pairs of HMSIs/H2SIs in an end-to-end manner. Compared to sparse representation-based methods, DL approaches generally consume less time because the computational burden is shifted to the training phase. In addition, DL has been shown to deliver superior S2R performance because of its outstanding abilities of feature extraction and non-linear expression [27,28,29,30,31]. Considering the difficulty of obtaining high-resolution ground truth (H2SI) in real-world scenarios, Chen et al. [32] proposed a spectral–spatial residual network (SSRN). More specifically, the mapping learned from observed LHSIs and simulated low-resolution MSIs (LMSIs) can be generalized to real HMSIs. However, achieving high spectral fidelity when performing the S2R task is still a challenge.
To achieve better spectral enhancement for real MSIs, a graph-regularized double-path interactive spectral super-resolution network (GDIS2Net) is proposed. GDIS2Net is designed by connecting an S2R subnet and a spectral graph reconstruction (SGR) subnet in parallel, where the latter is used to optimize the S2R performance, because graph signal processing has shown superior characterization ability of the spectral continuity in high-dimensional images [33]. An interactive residual module (IRM) is designed to bridge the two subnets for level-by-level feature interaction. In addition, an enhanced residual module (ERM) is proposed for the final feature fusion. Under our spectral graph-assisted strategy, the proposed GDIS2Net is optimized using hyperspectral graph supervision in addition to the hybrid supervision of LHSIs and LMSIs. Correspondingly, a loss function regularized by the spectral graph reconstruction is used to optimize our network.
The main contributions of this article can be summarized as follows:
(1) A graph-regularized double-path interactive S2R network that attempts to improve spectral enhancement by reconstructing a hyperspectral graph from a blurry one is proposed. Furthermore, a spectral graph reconstruction is integrated as a regularization when designing the loss for network training. Through graph signal processing, the spectral continuity inherent in HSIs is explicitly captured, and spectral distortion is effectively mitigated.
(2) An interactive residual module is designed to achieve level-by-level feature interaction, where the spatial attention mechanism is used to guide feature interaction in a cross way.
(3) An enhanced residual module is developed to refine the final feature fusion, which executes feature extraction across different receptive fields in a double-path manner, along with spatial attention-based critical feature emphasizing.
The subsequent sections are organized as follows. Section 2 presents the related studies. The proposed GDIS2Net is described in detail in Section 3. In Section 4, the experimental results and analyses are presented. Finally, the conclusion is summarized in Section 5.

2. Related Work

2.1. Deep S2R Methods

Since Galliani et al. [21] first applied a deep neural network for S2R, DL-based spectral resolution enhancement has recently gained increasing attention. For instance, a unified deep learning framework named HSCNN was published [22] to reconstruct H2SIs from spectral undersampled projections. Subsequently, a multi-scale deep convolutional neural network (MSCNN) was established on the symmetrical downsampling and upsampling operations, further enhancing the reconstruction performance [23]. To ensure the texture preservation and spectral fidelity as well as improve the robustness and generalization of the S2R algorithms, a variety of different network architectures have been proposed [34,35], such as a deep convolutional neural network (CNN) composed of numerous residual blocks (HSCNN+) [27], a pixel-aware deep function-mixture network (FMNet) dynamically adjusting receptive fields and mapping functions [28], and a four-level hierarchical regression network (HRNet) executing inter-level feature interactions via pixel shuffle [29]. Given that these remarkable S2R deep models involve substantial parameters and require high computational costs during the training phase, some lightweight nets have been developed. A lightweight residue-dense attention network for efficient spectral feature prediction was constructed [30], which utilized dual-path feature extraction and coordination convolutional blocks to markedly reduce model complexity. Exploiting spectral-wise multi-head self-attention mechanism to model the long-range dependencies across different spectral bands, Cai et al. [31] proposed a multi-level transformer-based architecture (MST++). In addition, a reparameterizing coordinate-preserving proximity spectral interaction network (RepCPSI) [36] was reported for lightweight S2R, where the spatial–spectral context features were explored via a multi-channel polymorphic residual context restructuring module.
Obviously, existing studies have demonstrated the effectiveness and superiority of DL when performing spectral reconstruction. However, it remains a significant challenge to apply the above deep S2R networks in real-world scenarios, as the necessary H2SI labels are actually unavailable. Fortunately, a hybrid supervision strategy was published in [32], enabling S2R to be performed without H2SI labels. Based on this, a spectral graph-aided spectral reconstruction network is proposed to ensure a better spectral fidelity by introducing the graph signal theory, for the first time, into spectrum prediction.

2.2. Attention Mechanisms in S2R

The human brain is able to automatically focus on salient regions while processing complex information [37], which inspired the attention mechanism. The attention mechanism has since been shown to significantly benefit feature representation, particularly in image recognition and restoration tasks [38,39,40]. Based on the squeeze-and-excitation module [38] that explicitly modeled the interdependencies among feature channels, a hybrid residual attention module [41] was developed to enhance the spectral reconstruction effect. Following this first success of the attention mechanism in the S2R task, various attention-based image spectral enhancement methods have been proposed. For instance, the channel attention module was introduced to characterize the spectral differences between different bands [42]. However, Zheng et al. [16] found that most attention mechanisms focus on the contextual features of one certain dimension, such as channel or space [43]. Consequently, complementary information from other dimensions is ignored, which may compromise spatial–spectral consistency. To address this, they designed a spatial–spectral residual attention network (SSRAN), which adopted 1D convolution to generate attention weights that enhance the correlation between neighboring spectral bands. Then, a coordinate-preserving attention module [36] was further designed to selectively emphasize discriminative spatial–spectral features while suppressing redundant information.
Motivated by the benefits of the attention mechanism in improving spatial and spectral feature representations, two new modules, i.e., IRM and ERM, established on the spatial attention (SA) mechanism, are designed to promote interaction and fusion between two-subnet features in this paper.

3. Proposed Method

3.1. Overview

Referring to Wald’s protocol applied in [32], the GDIS2Net is proposed to achieve spectral enhancement for LMSIs via reconstruction mapping from LMSI to LHSI.
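Let H2SI, HMSI, and LHSI be successively denoted as $\mathbf{Z} \in \mathbb{R}^{H \times W \times C}$, $\mathbf{X} \in \mathbb{R}^{H \times W \times c}$, and $\mathbf{Y} \in \mathbb{R}^{h \times w \times C}$, where $H > h$, $W > w$, and $C > c$ are the rows, columns, and bands of the image, respectively. Then, LMSI $\mathbf{X}_{\downarrow} \in \mathbb{R}^{h \times w \times c}$ is generated by spatially degrading HMSI $\mathbf{X}$ via the point spread function (PSF):
$$\mathbf{X}_{\downarrow} = \Theta(\mathbf{X} * \mathbf{P}), \tag{1}$$
where $*$ denotes spatial convolution, $\Theta(\cdot)$ is the downsampling operator of interval $r$, $r = H/h = W/w$, and $\mathbf{P} \in \mathbb{R}^{r \times r}$ represents the PSF. It is theoretically suggested that the S2R mapping learned from $\mathbf{X}_{\downarrow} \rightarrow \mathbf{Y}$ can be generalized to $\mathbf{X} \rightarrow \mathbf{Z}$.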
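The spatial degradation from HMSI to LMSI can be illustrated with a minimal NumPy sketch. It assumes a Gaussian PSF with a hand-picked width and implements blur plus downsampling as a weighted sum over non-overlapping r × r blocks; the function names `gaussian_psf` and `degrade` and the default `sigma` are hypothetical choices for illustration, whereas the article estimates its PSFs rather than fixing them.

```python
import numpy as np

def gaussian_psf(r, sigma=None):
    """Return a normalized r x r Gaussian kernel (a hypothetical PSF)."""
    if sigma is None:
        sigma = r / 2.0  # assumed width, not a value taken from the article
    ax = np.arange(r) - (r - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    psf = np.outer(g, g)
    return psf / psf.sum()

def degrade(hmsi, psf):
    """Blur each band with the PSF and keep one sample per r x r block."""
    H, W, c = hmsi.shape
    r = psf.shape[0]
    h, w = H // r, W // r
    # Split the image into non-overlapping r x r blocks: (h, r, w, r, c).
    blocks = hmsi[:h * r, :w * r].reshape(h, r, w, r, c)
    # A PSF-weighted sum over each block equals the blurred image sampled
    # with interval r (the kernel is symmetric, so flipping is unnecessary).
    return np.einsum("iajbc,ab->ijc", blocks, psf)

# Usage: a 16 x 16 x 3 HMSI degraded with r = 8 becomes a 2 x 2 x 3 LMSI.
hmsi = np.ones((16, 16, 3))
lmsi = degrade(hmsi, gaussian_psf(8))
```

Because the kernel is normalized to sum to one, a constant image stays constant after degradation, which is a quick sanity check for any PSF plugged into this sketch.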
Attracted by the outstanding ability of graphs to characterize spectral continuity, an SGR subnet (see details in Section 3.2) is further designed to optimize the S2R process, utilizing the spectral graph as an auxiliary reconstruction target. Specifically, a pre-estimated LHSI $\tilde{\mathbf{Y}} \in \mathbb{R}^{h \times w \times C}$ is obtained by interpolating the LMSI $\mathbf{X}_{\downarrow}$, and its spectral graph $\tilde{\mathbf{Y}}_{SG}$ is computed to assist the spectral reconstruction. That is,
$$(\hat{\mathbf{Y}}, \hat{\mathbf{Y}}_{SG}) = \mathcal{F}(\mathbf{X}_{\downarrow}, \tilde{\mathbf{Y}}_{SG}), \tag{2}$$
where $\mathcal{F}(\cdot)$ describes the trained GDIS2Net, and $\hat{\mathbf{Y}}$ and $\hat{\mathbf{Y}}_{SG}$ denote the reconstructed LHSI and spectral graph, respectively. Such reconstruction is co-supervised via the $(\mathbf{Y}, \mathbf{Y}_{SG})$ pair, where $\mathbf{Y}_{SG}$ is the spectral graph corresponding to $\mathbf{Y}$. In order to preserve spatial information, the S2R process is further self-supervised by the input LMSI $\mathbf{X}_{\downarrow}$. Therefore, a reconstructed LMSI $\hat{\mathbf{X}}_{\downarrow}$ is computed from $\hat{\mathbf{Y}}$:
$$\hat{\mathbf{X}}_{\downarrow} = \hat{\mathbf{Y}} \times_3 \mathbf{R}^{T}, \tag{3}$$
where $\mathbf{R} \in \mathbb{R}^{C \times c}$ is the spectral response function (SRF). To achieve the above hybrid supervision procedure, a spatial–spectral fidelity loss function is constructed as follows:
$$L = \|\mathbf{Y} - \hat{\mathbf{Y}}\|_1 + \lambda \|\mathbf{X}_{\downarrow} - \hat{\mathbf{X}}_{\downarrow}\|_1 + \mu \|\mathbf{Y}_{SG} - \hat{\mathbf{Y}}_{SG}\|_1, \tag{4}$$
where $\lambda$ and $\mu$ are the regularization parameters, and $\|\cdot\|_1$ is the L1-norm. $\|\mathbf{Y} - \hat{\mathbf{Y}}\|_1$ is the spectral fidelity term to ensure spectral enhancement, $\|\mathbf{X}_{\downarrow} - \hat{\mathbf{X}}_{\downarrow}\|_1$ is the spatial fidelity term to ensure spatial preservation, and $\|\mathbf{Y}_{SG} - \hat{\mathbf{Y}}_{SG}\|_1$ is the spectral graph reconstruction term to ensure spectral continuity. Here, the L1-norm but not the common L2-norm is applied, since the former is more sensitive to small errors, so that it has a good edge-preserving ability [44], while the latter usually pays much attention to outliers and then ignores the overall structure [45,46].
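The three-term hybrid loss can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' training code: the function names are hypothetical, the L1-norm is taken as a plain sum of absolute differences (deep learning frameworks often average instead), and the default weights 1.0 and 0.5 are the values reported later in Section 4.2.1.

```python
import numpy as np

def mode3_product(tensor, matrix):
    """Mode-3 product: apply `matrix` (J x C) along the band axis of an h x w x C tensor."""
    return np.einsum("hwc,jc->hwj", tensor, matrix)

def l1(a, b):
    """Entry-wise L1 distance (sum of absolute differences; an assumed reduction)."""
    return float(np.abs(a - b).sum())

def hybrid_loss(y, y_hat, x_lr, y_sg, y_sg_hat, srf, lam=1.0, mu=0.5):
    """Spectral fidelity + lam * spatial fidelity + mu * spectral graph term."""
    # srf is C x c, so srf.T (c x C) maps the C reconstructed bands to c MSI bands.
    x_hat = mode3_product(y_hat, srf.T)
    return l1(y, y_hat) + lam * l1(x_lr, x_hat) + mu * l1(y_sg, y_sg_hat)

# Usage with random stand-in data (h = w = 4, C = 6, c = 3).
rng = np.random.default_rng(0)
y = rng.random((4, 4, 6))            # LHSI label
srf = rng.random((6, 3))             # hypothetical SRF
x_lr = mode3_product(y, srf.T)       # LMSI consistent with the label
perfect = hybrid_loss(y, y, x_lr, y, y, srf)
worse = hybrid_loss(y, y + 0.1, x_lr, y, y, srf)
```

A perfect reconstruction drives all three terms to zero, while any spectral offset in the reconstructed LHSI raises both the spectral term and, through the SRF projection, the spatial term.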
Once the GDIS2Net is optimized via the loss function $L$, a corresponding H2SI $\hat{\mathbf{Z}} \in \mathbb{R}^{H \times W \times C}$ can be restored by inputting any HMSI $\mathbf{X}$ and its corresponding spectral graph $\tilde{\mathbf{Z}}_{SG}$ into the trained GDIS2Net. The spectral reconstruction can be formulated as follows:
$$\hat{\mathbf{Z}} = \mathcal{F}(\mathbf{X}, \tilde{\mathbf{Z}}_{SG}). \tag{5}$$

3.2. Spectral Graph

Graphs are widely used in high-dimensional signal processing due to their ability to capture inherent spatial and spectral continuity within signals [47,48]. A weighted undirected graph is defined as $G = (V, E, \mathbf{W})$, where $V$ and $E$ denote the vertices and edges, and $\mathbf{W}$ is the symmetric adjacency matrix with element $w_{ij}$ being the weight of the edge $e_{ij}$ between the $i$th and $j$th vertices $v_i$ and $v_j$. There is a diagonal degree matrix $\mathbf{D}$, where its $i$th diagonal entry $d_{ii}$ equals the sum of weights connected to vertex $v_i$. The graph Laplacian is then defined as follows:
$$\mathbf{L} = \mathbf{D} - \mathbf{W}, \tag{6}$$
which is a symmetric positive semi-definite matrix.
For a given third-order tensor $\mathbf{A} \in \mathbb{R}^{H \times W \times C}$, a corresponding undirected graph $G = (V, E, \mathbf{W})$ can be constructed by taking each of its bands as a vertex. Then, the spectral continuity of $\mathbf{A}$ can be enforced by the graph Laplacian regularization $J = \mathrm{tr}(\mathbf{A}_{(3)} \mathbf{L} \mathbf{A}_{(3)}^{T})$, where $\mathbf{A}_{(3)}$ denotes the mode-3 unfolding of $\mathbf{A}$ [47]. A smaller value of this term generally indicates better continuity among adjacent vertices; namely, $\mathbf{A} \times_3 \mathbf{L}$ can describe spectral continuity in $\mathbf{A}$. High spectral continuity has already been revealed in HSI [49]. Therefore, in addition to spectral enhancement for LMSI, an SGR is also considered in this paper. Specifically, the spectral graph $\mathbf{A}_{SG}$ of a third-order tensor $\mathbf{A}$ is defined as follows:
$$\mathbf{A}_{SG} = \mathbf{A} \times_3 \mathbf{L}. \tag{7}$$
Following this definition, the spectral graphs of the pre-estimated LHSI $\tilde{\mathbf{Y}}$ and LHSI label $\mathbf{Y}$ are individually calculated as follows:
$$\tilde{\mathbf{Y}}_{SG} = \tilde{\mathbf{Y}} \times_3 \mathbf{L}, \tag{8}$$
$$\mathbf{Y}_{SG} = \mathbf{Y} \times_3 \mathbf{L}, \tag{9}$$
where $\tilde{\mathbf{Y}}_{SG}$ and $\mathbf{Y}_{SG}$ are the spectral graphs of $\tilde{\mathbf{Y}}$ and $\mathbf{Y}$, respectively; the shared $\mathbf{L}$ means that the graphs corresponding to $\tilde{\mathbf{Y}}$ and $\mathbf{Y}$ share the same adjacency matrix. Following the common strategy [47], $w_{ii}$ is set to 0, while $w_{ij}$ is set to 1, where $j$ generally denotes the $n$ adjacent bands to the $i$th band, and $n$ is a tunable parameter. As stated in Section 3.1, under the supervision of $\mathbf{Y}_{SG}$, $\tilde{\mathbf{Y}}_{SG}$ is reconstructed as $\hat{\mathbf{Y}}_{SG}$.
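The construction of the band graph and the resulting spectral graph can be sketched as below. The sketch assumes that "n adjacent bands" means all bands within n positions on either side of the ith band, which is one possible reading of the text; the function names are hypothetical, and the penalty sums the per-pixel quadratic form over all pixels.

```python
import numpy as np

def band_laplacian(num_bands, n):
    """Graph Laplacian D - W with w_ij = 1 when 0 < |i - j| <= n (assumed neighborhood)."""
    idx = np.arange(num_bands)
    dist = np.abs(idx[:, None] - idx[None, :])
    W = ((dist > 0) & (dist <= n)).astype(float)   # w_ii = 0 by construction
    D = np.diag(W.sum(axis=1))
    return D - W

def spectral_graph(tensor, lap):
    """Mode-3 product of an h x w x C tensor with the C x C Laplacian."""
    return np.einsum("hwc,jc->hwj", tensor, lap)

def continuity_penalty(tensor, lap):
    """Sum over pixels of a^T L a; small when neighboring bands are similar."""
    return float(np.sum(tensor * spectral_graph(tensor, lap)))

# Usage: 31 bands (as in Harvard and CAVE) with n = 3.
lap = band_laplacian(31, 3)
flat = np.full((4, 4, 31), 0.5)                      # spectrally constant cube
rough = np.random.default_rng(0).random((4, 4, 31))  # spectrally noisy cube
flat_sg = spectral_graph(flat, lap)
```

A spectrally constant cube yields an all-zero spectral graph and zero penalty, whereas a noisy cube yields a strictly positive penalty, which matches the interpretation that smaller values indicate better spectral continuity.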
As shown in Figure 1, features within SGR interact with the spectral enhancement process in a level-by-level way, and the final SGR and S2R features are fused to improve the spectral fidelity of the super-resolution result. It is worth noting that our approach uses the spectral graph as the reconstruction objective to explicitly facilitate S2R, rather than simply imposing a graph constraint on HSI reconstruction, as existing methodologies [50] have done.

3.3. Graph-Regularized Double-Path Interactive Spectral Super-Resolution Network

To reduce the S2R difficulty, the degraded LMSI X is first interpolated to a coarse LHSI Y ˜ via the cubic interpolation algorithm. Next, the S2R and SGR subnets in our GDIS2Net focus on reconstructing the image and graph residuals between the interpolated result and LHSI labels, respectively. Therefore, the backbone accordingly adopts the residual architecture.
To promote level-by-level feature interaction between the two subnets, several IRMs, of which the detailed structure is shown in the light red box of Figure 1, are designed to consist of two cascaded dense blocks, two element-wise addition operations, one element-wise multiply operator, and one SA unit. Each dense block contains one 3 × 3 convolutional layer, one batch normalization (BN) layer, and one rectified linear unit (ReLU), which is used for feature extraction. Its subsequent addition operation is used for high- and low-level feature merging. The SA mechanism [36,39] is introduced into the two subnets to help extract the limited and important spectral features, since the number of linearly independent spectra is usually much less than the space size of an image [51,52]. As illustrated in Figure 1, let $\mathbf{F}_{S}^{k}$ and $\mathbf{F}_{G}^{k}$ denote the feature maps extracted from the $k$th IRM in the S2R and SGR subnets, respectively. Then, the level-by-level feature interaction is accomplished as follows:
$$\mathbf{F}_{S}^{\prime k} = \mathbf{M}_{G}^{k} \odot \mathbf{F}_{S}^{k}, \tag{10}$$
$$\mathbf{F}_{G}^{\prime k} = \mathbf{M}_{S}^{k} \odot \mathbf{F}_{G}^{k}, \tag{11}$$
where $\mathbf{F}_{S}^{\prime k}$ and $\mathbf{F}_{G}^{\prime k}$ denote the interacted results in the S2R and SGR subnets, respectively; $\odot$ represents element-wise multiplication; $\mathbf{M}_{S}^{k}$ and $\mathbf{M}_{G}^{k}$ are the attention maps separately generated from $\mathbf{F}_{S}^{k}$ and $\mathbf{F}_{G}^{k}$ via an SA unit. Specifically, $\mathbf{M}_{p} = \sigma(W_{h}(\mathrm{Avg}_{h}(\mathbf{F}_{p}))) \odot \sigma(W_{w}(\mathrm{Avg}_{w}(\mathbf{F}_{p})))$, where $p \in \{S, G\}$; $\mathrm{Avg}_{h}(\cdot)$ and $\mathrm{Avg}_{w}(\cdot)$ denote the along-height and -width pooling operations, respectively; $W_{h}(\cdot)$ and $W_{w}(\cdot)$ are the corresponding 1D convolutions with kernel sizes of 5 × 1 and 1 × 5, respectively; $\sigma(\cdot)$ denotes the Sigmoid function. During this process, the spectral features from one subnet are injected into the other one as a modulation factor, enabling bidirectional information exchange. Specifically, the attention maps $\mathbf{M}_{S}$ and $\mathbf{M}_{G}$ attempt to act as the modulation factors via element-wise multiplication to dynamically scale features so that informative components are emphasized while the redundant and noisy ones are suppressed. Subsequently, in both subnets, an inner shortcut formed by one 1 × 1 convolutional layer and the ReLU function is connected behind several residual modules to execute efficient fusion for the high- and low-level features.
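The cross-wise modulation between the two subnets can be illustrated with a small NumPy sketch. Several details here are assumptions, not the authors' implementation: features are stored as (channels, height, width); the height-direction descriptor is obtained by averaging over the width and vice versa, so that a length-5 kernel runs along the retained axis; one kernel is shared by all channels; and the two directional maps are combined by broadcasting multiplication. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_same(x, kernel, axis):
    """Zero-padded 1D correlation along `axis`; assumes the axis length is >= len(kernel)."""
    flipped = kernel[::-1]  # np.convolve flips the kernel, so pre-flip to correlate
    return np.apply_along_axis(lambda v: np.convolve(v, flipped, mode="same"), axis, x)

def attention_map(feat, k_h, k_w):
    """Build a (C, H, W) attention map from two directional descriptors."""
    desc_h = feat.mean(axis=2, keepdims=True)   # (C, H, 1): one value per row
    desc_w = feat.mean(axis=1, keepdims=True)   # (C, 1, W): one value per column
    a_h = sigmoid(conv1d_same(desc_h, k_h, axis=1))
    a_w = sigmoid(conv1d_same(desc_w, k_w, axis=2))
    return a_h * a_w                            # broadcasts to (C, H, W)

def cross_interact(f_s, f_g, k_h, k_w):
    """Each branch is scaled by the attention map computed from the other branch."""
    m_s = attention_map(f_s, k_h, k_w)
    m_g = attention_map(f_g, k_h, k_w)
    return m_g * f_s, m_s * f_g

# Usage with random stand-in features and hypothetical length-5 kernels.
rng = np.random.default_rng(0)
f_s, f_g = rng.random((4, 8, 8)), rng.random((4, 8, 8))
k_h, k_w = rng.normal(size=5), rng.normal(size=5)
out_s, out_g = cross_interact(f_s, f_g, k_h, k_w)
```

Since every attention value lies strictly between 0 and 1, the interaction can only attenuate features; in the network, the surrounding residual additions are what allow emphasized components to dominate after modulation.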
A simple element-wise addition is used to initially fuse the features of the two subnets; then, an ERM presented in the light-green box of Figure 1 is designed to refine the fused features and project them into the image domain. First, the coarse fused result is unified via a 1 × 1 convolutional layer. Next, three attention-based fusion blocks are linked with residual connections [53,54] to promote sufficient fusion. Each fusion block contains double-path feature extraction, a concatenation operation, an SA unit, and a ReLU function. The double-path feature extraction, which consists of two parallel convolutional layers of size 3 × 3 and 1 × 1, respectively, is used to extract features in different receptive fields, as local information could be lost when adopting a single receptive field. Although other multi-path feature extraction strategies, for example, the multi-branch network in [55], have been published, our ERM is specifically designed to bridge the feature gap between the heterogeneous domains of the image and the spectral graph for a refined feature fusion.
After aggregating the double-path features, an SA unit is used to weaken the unimportant parts and strengthen the important ones. Then, the ReLU function enhances the information representation of the attention result by introducing non-linearity. The fusion output is further refined by a 1 × 1 convolutional layer and an SA unit. Here, the SA unit is used as the dynamic spatial filter to suppress potential noise or artifacts introduced during the cross-branch fusion process. As further verified in Section 4.5, even if slight noise is mixed during feature fusion, our S2R method maintains stable reconstruction performance. Finally, the refined features are converted into the desired image domain through a dense block cascaded by two convolutional layers with kernel sizes of 3 × 3 and 1 × 1 , respectively, and a Tanh function. The stride, padding, and channel size of each convolutional layer are set to 1, 0, and C, respectively.

4. Experiments and Discussion

4.1. Datasets and Metrics

Some simulation experiments on the public Harvard [56], CAVE [57], and Chikusei [58] datasets, as well as some verification experiments on the real GaoFen-5 dataset, are performed to demonstrate the effectiveness, superiority, and practicality of the proposed GDIS2Net.
The Harvard dataset, covering the range of 420–720 nm at 10 nm intervals, contains 50 HSIs of size 1392 × 1040 × 31. Subimages of size 512 × 512 × 31 are cut from the original HSIs to simulate the ground truth, i.e., H2SIs. These simulated H2SIs are degraded through the SRF of the Nikon D700 camera to generate HMSIs. Subsequently, LHSIs are produced by applying a Gaussian PSF of r = 8 to spatially blur the H2SIs. In total, 40 pairs of HMSI/LHSI are randomly selected for training, with the remaining pairs being used for testing.
To evaluate the robustness of S2R methods across different data distributions, the CAVE dataset is further utilized here. The dataset comprises 32 HSIs, which consist of 512 × 512 pixels and 31 spectral bands ranging from 400 nm to 700 nm. Similarly, the original HSIs are used as H2SIs, while the corresponding HMSIs and LHSIs are simulated via the Nikon D700 SRF and a Gaussian PSF of r = 8 . Seven HMSI/LHSI pairs are randomly chosen as the testing set, while the rest serve as the training set.
To examine the influence of different degradation conditions on S2R methods, the SRF of the IKONOS sensor and a Gaussian PSF of r = 16 are employed to perform linear degradation operations on the Chikusei dataset, synthesizing HMSIs and LHSIs. The Chikusei dataset is an HSI of size 2517 × 2335 × 128 , which was acquired over Chikusei, Japan, in the wavelength range from 363 nm to 1018 nm. The Chikusei image is divided into 34 simulated H2SIs of size 512 × 512 × 128 . Correspondingly, 34 pairs of HMSI/LHSI are synthesized, which are randomly split into the training and testing dataset, including 25 and 9 pairs, respectively.
Experiments on the real GaoFen-5 dataset are also conducted to verify the applicability of the considered super-resolution methods. The GaoFen-5 dataset, which was captured by the hyperspectral sensor onboard the GaoFen-5 satellite with the wavelength range of [390, 2513] nm, contains an HMSI of size 3555 × 4026 × 4 and an LHSI of size 1185 × 1342 × 285. The HMSI and LHSI are cropped into 72 subimages of size 441 × 441 × 4 and 147 × 147 × 285, respectively, where 57 pairs of multispectral and hyperspectral subimages are randomly chosen for training, and the remaining pairs are allocated for testing. All PSFs and SRFs, which are necessary when training S2R networks, whether on the real or the simulated datasets, are estimated via our previous DiriNet [59].
Five common metrics, including the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), spectral angle mapper (SAM), spectral information divergence (SID), and mean relative absolute error (MRAE), are used to quantitatively evaluate the performance of all considered S2R methods. PSNR and MRAE are estimates of the overall numerical error, SAM and SID are measures of spectral differences, and SSIM is an evaluation of spatial structure restoration. Generally, the higher the PSNR and SSIM, and the lower the SAM, SID, and MRAE, the better the super-resolution performance.
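Three of these metrics can be sketched in NumPy using their commonly used definitions, which may differ in detail (data range, angle units, averaging) from the exact implementation behind the reported tables. The sketch assumes images scaled to [0, 1], h × w × C arrays with bands on the last axis, and SAM reported in radians; the small `eps` guards against division by zero and is an added assumption.

```python
import numpy as np

def psnr(ref, rec, data_range=1.0):
    """Peak signal-to-noise ratio in dB over the whole cube."""
    mse = np.mean((ref - rec) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def sam(ref, rec, eps=1e-12):
    """Mean spectral angle (radians) between per-pixel spectra."""
    dot = np.sum(ref * rec, axis=-1)
    norms = np.linalg.norm(ref, axis=-1) * np.linalg.norm(rec, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def mrae(ref, rec, eps=1e-12):
    """Mean relative absolute error with respect to the reference."""
    return float(np.mean(np.abs(ref - rec) / (ref + eps)))

# Usage: a reconstruction that is uniformly 10% too dark.
ref = np.ones((4, 4, 31))
rec = 0.9 * ref
scores = psnr(ref, rec), sam(ref, rec), mrae(ref, rec)
```

The example highlights how the metrics complement each other: a uniform brightness error lowers PSNR and raises MRAE, yet leaves SAM near zero because the spectral shape is unchanged.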

4.2. Parameter Settings of GDIS2Net

4.2.1. Regularization Parameters

The regularization parameters λ and μ in our loss function, Equation (4), are determined empirically via a grid search, as shown in Table 1.
It can be observed that the best trade-offs between spatial and spectral reconstruction are achieved when λ and μ are set to 1.0 and 0.5 , respectively. Accordingly, λ = 1 and μ = 0.5 are adopted for all subsequent experiments.

4.2.2. Network Hyperparameters

The proposed GDIS2Net is optimized using an ADAM optimizer [60] with $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\varepsilon = 10^{-9}$. Two hyperparameters are critical to our super-resolution performance: the number of adjacent bands $n$ considered when constructing the spectral graph, and the number of IRMs $k$ forming the backbone of the proposed net. Because a certain continuity exists in hyperspectral data, too few adjacent bands cannot reflect this property well, while too many might introduce unexpected redundancy. As for the IRM, the higher its number $k$, the better the non-linearity of the GDIS2Net in general; however, the computational cost also grows with the number of IRMs. Hence, settings for the two hyperparameters are discussed on the Harvard dataset, as listed in Table 2. Here, $n$ is manually tuned by fixing $k$ at 9, and $k$ is tuned by fixing $n$ at 3. Experimentally, the numbers of adjacent bands and IRMs are set to 3 and 9, respectively.

4.3. Experiments on Simulated Datasets

Seven state-of-the-art S2R networks, including MSCNN [23], HSCNN+ [27], FMNet [28], HRNet [29], MST++ [31], RepCPSI [36], and SSRAN [16], are compared with the proposed GDIS2Net to reveal its effectiveness. These compared nets are originally trained with H2SI labels. For a fair and comprehensive comparison, the three most advanced nets, i.e., HRNet, RepCPSI, and SSRAN, are also trained via pairs of LMSI/LHSI without changing their architectures, where the LMSI is produced from the HMSI via the PSF. The three resulting approaches with changed labels are named HR-MS, Rep-MS, and SSR-MS, respectively.
The quantitative results of all considered super-resolution methods on the Harvard, CAVE, and Chikusei datasets are listed in Table 3, Table 4 and Table 5. As these tables show, the methods fall into two classes: those supervised by H2SI and those supervised by LHSI. For a convenient analysis, the best and second-best results in both classes are highlighted in bold and underlined, respectively. It is worth noting that SID sometimes reveals opposite performance compared with other metrics, especially in Table 5. This might be because the logarithmic calculation in SID heavily penalizes minute absolute numerical fluctuations in low-intensity or low-reflectance bands. Therefore, performance discussions are conducted from a comprehensive perspective rather than from one specific metric. Among all methods that are supervised via ground truth, the results on the Harvard and CAVE datasets show that HRNet achieves the best spectral restoration on almost all metrics. On the Chikusei dataset, however, HRNet is inferior to RepCPSI and SSRAN. HRNet might therefore be insufficiently robust to the resolution difference, because the resolution reduction on Chikusei is stronger than on the Harvard and CAVE datasets. Nevertheless, results on the three datasets all show that our GDIS2Net achieves performance comparable to or even better than that of the three strongest H2SI-supervised methods, namely, HRNet, RepCPSI, and SSRAN. For example, GDIS2Net attains the highest PSNR and SSIM values on the Harvard and Chikusei datasets, along with the best SAM results on all datasets, indicating its superior spectral reconstruction accuracy. That is, our super-resolution method achieves competitive and advanced spectral reconstruction when supervised by the accessible LHSI. Moreover, our GDIS2Net appears to have stable super-resolution performance, meaning that it is robust to data distribution and degradation conditions.
When SSR-MS, HR-MS, and Rep-MS are separately compared to SSRAN, HRNet, and RepCPSI, their reconstruction results are found to be only slightly weaker, which suggests that the adopted linear degradation-based supervision is effective. Compared with SSR-MS, HR-MS, and Rep-MS, the proposed GDIS2Net executes the best spectral reconstruction. Hence, the proposed graph-regularized double-path interactive net is superior to state-of-the-art networks geared towards spectral resolution enhancement.
Visual results produced by all of the considered methods on the Harvard, CAVE, and Chikusei datasets are illustrated in Figure 2, Figure 3 and Figure 4, containing single bands, false-color images, and error heat maps. Error heat maps display the spatial distribution of the average error between the reconstructed and reference H2SI, where a darker blue color represents lower reconstruction errors. The full-sized error maps in Figure 4 share the same color range, so all reconstructions appear similar to the reference; this is actually caused by the much worse result of FMNet. Therefore, accurate error analyses for Figure 4 should refer to the enlarged maps. As can be seen from Figure 2, Figure 3 and Figure 4, the proposed GDIS2Net produces sharper super-resolution images and darker error maps overall, which indicates its better spatial preservation ability.
In addition, the average spectral curves over the randomly selected 50 × 50 cubes (as shown in the red square of Figure 2, Figure 3 and Figure 4) from the above reconstruction results are shown in Figure 5. Spectral curves exhibit the spatially averaged difference between the reconstructed and reference H2SI cubes, where a curve that fits the red reference curve more closely indicates better spectral fidelity. Compared to the others, the proposed S2R method generates results closely aligned to the reference curves across the entire spectrum. The curves produced by our GDIS2Net exhibit smaller fluctuations, especially in regions with rapid spectral variations, which indicates effective preservation of both spectral shape and amplitude. In contrast, the competing methods produce spectral curves with stronger fluctuations, particularly on the Chikusei dataset with its larger spatial–spectral resolution difference. Thus, the proposed GDIS2Net could offer better spectral fidelity and stronger robustness under worse resolution degradation conditions.
Overall, the proposed GDIS2Net produces an effective and superior spectral enhancement effect, and shows strong robustness and generalization to different datasets and degradation conditions.

4.4. Experiments on Real Dataset

To further assess the practical applicability of GDIS2Net, experiments are conducted on the real-world GaoFen-5 dataset. As this dataset lacks ground truth (H2SI), existing super-resolution methods that rely on H2SI labels cannot be directly applied. Therefore, in addition to GDIS2Net, three hybrid supervised methods (SSR-MS, HR-MS, and Rep-MS) are evaluated for a fair comparison.
For quantitative evaluation, the spectral distortion index ($D_\lambda$), the spatial distortion index ($D_s$), and the quality with no reference (QNR) index are calculated under the Wald protocol, as in existing studies [61]. Specifically, the reconstructed H2SIs are degraded using the pre-estimated PSF to generate simulated HSIs. The degraded results are then compared with the observed HSI to compute $D_\lambda$ and $D_s$, and the overall QNR is obtained accordingly.
The quantitative results are reported in Table 6. GDIS2Net achieves the best performance on all three metrics. In addition, the single-band and false-color results in Figure 6 illustrate that GDIS2Net achieves better visual texture restoration. These results validate the network design and the practical potential of our method.
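For readers unfamiliar with QNR, a minimal sketch of how the two distortion indices are commonly combined is given below. It assumes the standard definition QNR = (1 − D_λ)^α (1 − D_s)^β with α = β = 1; the exponents used in this study are not stated in this section, so they are an assumption, and the function name `qnr` is hypothetical.

```python
def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """Quality with no reference from spectral and spatial distortion indices.

    Assumes the common form (1 - D_lambda)**alpha * (1 - D_s)**beta.
    """
    if not (0.0 <= d_lambda <= 1.0 and 0.0 <= d_s <= 1.0):
        raise ValueError("distortion indices are expected to lie in [0, 1]")
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta

# Distortion indices reported for HR-MS in Table 6 (reported QNR: 0.5714).
hr_ms_qnr = qnr(0.2603, 0.2277)
```

Under these assumed exponents the product is approximately 0.571, close to the tabulated value. Lower distortion indices therefore translate directly into a higher QNR.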

4.5. Noise Robustness Analysis

To evaluate the noise robustness of the proposed GDIS2Net, it and several comparative S2R networks are tested on a noisy version of the Harvard dataset, obtained by adding Gaussian noise at different signal-to-noise ratios (SNR = 20, 25, and 30 dB) to the input MSI.
The quantitative results in Table 7 show that the reconstruction performance of all considered methods degrades as the noise intensity increases. Nevertheless, GDIS2Net achieves the best SAM at every noise level and the best PSNR at SNR = 25 and 30 dB, while its PSNR at SNR = 20 dB is slightly lower than those of SSRAN, HRNet, and RepCPSI. This stability could result from the filtering of task-irrelevant disturbances by the incorporated spectral graph regularization and dual-branch interaction mechanism. Visual comparisons in Figure 7 further confirm that GDIS2Net effectively suppresses noise while preserving structural details.
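A minimal sketch of how such noisy inputs can be generated is given below. It assumes that the SNR is defined from the mean signal power of the whole image and that the noise is zero-mean and independent per element; the per-band or per-image convention actually used in the experiments is not specified here, and the function name `add_gaussian_noise` is hypothetical.

```python
import numpy as np

def add_gaussian_noise(msi, snr_db, rng=None):
    """Add zero-mean Gaussian noise so that the result has the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    msi = np.asarray(msi, dtype=np.float64)
    signal_power = np.mean(msi ** 2)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=msi.shape)
    return msi + noise

# Toy MSI standing in for a real input (hypothetical data).
rng = np.random.default_rng(0)
clean = rng.random((128, 128, 3))
noisy = add_gaussian_noise(clean, snr_db=20, rng=rng)

# Empirical SNR of the generated sample, which should be close to 20 dB.
measured_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

A lower SNR value corresponds to stronger noise, so SNR = 20 dB is the most severe of the three tested settings.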

4.6. Ablation Studies

A series of ablation studies was conducted on the Harvard dataset to thoroughly evaluate the contribution of each component of the proposed GDIS2Net. The effectiveness of training super-resolution networks with LHSI has already been analyzed in the above experiments and is therefore not discussed further in this section. The ablation experiments were designed by progressively adding spectral graph restoration, the spectral continuity loss, IRM, and ERM to a baseline model.
Specifically, four network variants are defined. (1) SGLC-Net: GDIS2Net without ERM. (2) SGL-Net: SGLC-Net without the feature interaction of IRM, i.e., the two SA units and the cross-feature multiplication operations are removed. (3) SG-Net: SGL-Net without the spectral continuity loss. (4) Base: SG-Net without the SGR subnet but cascaded with SSRAN. Note that Base would provide insufficient reconstruction without SSRAN; SSRAN is attached for feature-to-image reconstruction because of its simple architecture.
All quantitative results are summarized in Table 8. SG-Net outperforms Base, and SGL-Net achieves further improvements over SG-Net. Thus, both the SGR subnet and the spectral continuity constraint are effective; that is, the proposed spectral graph restoration procedure works well. SGLC-Net surpasses SGL-Net, particularly in SAM, indicating that attention-based feature interactions are beneficial for precise spectral recovery. The complete GDIS2Net delivers the best performance, which confirms the further refinement provided by ERM. In conclusion, the proposed spectral graph restoration and the two spatial attention-based residual modules contribute to the superior super-resolution effect of the whole GDIS2Net.

4.7. Computational Efficiency Analysis

To comprehensively evaluate the computational efficiency of the proposed GDIS2Net, its number of parameters (Params), floating-point operations (FLOPs), model size, and inference time are compared with those of other state-of-the-art methods on the Chikusei dataset. The inference time is averaged over 100 forward passes for a 128 × 128 input on an NVIDIA RTX 3090 GPU.
As reported in Table 9, the model size and parameter count of our method are well controlled and significantly smaller than those of MSCNN and HRNet. Owing to the dual-branch structure and multi-level interactive residual modules, our method inevitably incurs the highest FLOPs (92.655 G) and the longest inference time (14.085 ms) among the compared methods. However, this processing speed remains practical for most real-world requirements.
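The averaging procedure for the inference time can be sketched with a small framework-agnostic timing harness. The warm-up runs, the helper name `average_latency_ms`, and the stand-in workload `dummy_forward` are assumptions for illustration and are not taken from the experimental code.

```python
import time

def average_latency_ms(forward, n_runs=100, n_warmup=10):
    """Average wall-clock time of one call to `forward`, in milliseconds."""
    # Warm-up calls (an assumption here) keep one-off start-up costs out of the average.
    for _ in range(n_warmup):
        forward()
    start = time.perf_counter()
    for _ in range(n_runs):
        forward()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / n_runs

def dummy_forward():
    """Stand-in workload; a real test would run the network on a 128 x 128 input."""
    return sum(i * i for i in range(1000))

latency_ms = average_latency_ms(dummy_forward, n_runs=100)
```

If the model runs on a GPU through a framework with asynchronous kernel launches (for example PyTorch), the device should be synchronized before each clock reading, e.g. with `torch.cuda.synchronize()`; otherwise the measured time can underestimate the true latency.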

5. Conclusions

In this paper, a novel residual S2R network aided by spectral graph reconstruction, termed GDIS2Net, was proposed for MSI spectral enhancement. Combining the supervision of the hyperspectral image and its spectral graph with the self-supervision of the multispectral input, the proposed GDIS2Net co-restored the spectrum and the spectral graph of the MSI to the corresponding high-spectral-resolution level. Such hybrid supervision was achieved by a tailored spatial–spectral fidelity loss, which considered both spatial–spectral information protection and spectral continuity. Furthermore, an interactive residual module and an enhanced residual module were designed to facilitate cross-feature interaction and multi-scale feature fusion, respectively. Experimental results on both simulated and real datasets demonstrated that GDIS2Net achieved competitive or superior performance compared with state-of-the-art S2R networks. In future studies, we will explore a data-driven Laplacian matrix learning method to improve spectral reconstruction.

Author Contributions

Conceptualization, S.W. and T.H.; methodology, S.W. and T.H.; validation, S.W.; investigation, S.W., S.C., Z.L. and Z.S.; writing, S.W. and T.H.; supervision, T.H., Z.L., Z.S., K.J. and J.F. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (No. 62301012).

Data Availability Statement

The Harvard dataset mentioned in this paper is openly and freely available at http://vision.seas.harvard.edu/hyperspec/ (accessed on 21 January 2026). The CAVE dataset used in this study is openly available at https://cave.cs.columbia.edu/repository/Multispectral (accessed on 21 January 2026). The Chikusei dataset used in this paper is freely available at http://park.itc.u-tokyo.ac.jp/sal/hyperdata (accessed on 21 January 2026). The GaoFen-5 dataset mentioned in this paper was obtained from a third-party data provider and is available at http://www.sasclouds.com/chinese/home/ (accessed on 21 January 2026).

Conflicts of Interest

Author Siyuan Cheng was employed by the company Space Star Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Akhtar, N.; Mian, A. Nonparametric Coupled Bayesian Dictionary and Classifier Learning for Hyperspectral Classification. IEEE Trans. Neural Networks Learn. Syst. 2018, 29, 4038–4050.
  2. Sun, X.; Qu, Y.; Gao, L.; Sun, X.; Qi, H.; Zhang, B.; Shen, T. Ensemble-Based Information Retrieval with Mass Estimation for Hyperspectral Target Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5508123.
  3. Lu, R.; Chen, B.; Cheng, Z.; Wang, P. RAFnet: Recurrent attention fusion network of hyperspectral and multispectral images. Signal Process. 2020, 177, 107737.
  4. Li, Y.; Wang, J.; Liu, X.; Xian, N.; Xie, C. DIM Moving Target Detection using Spatio-Temporal Anomaly Detection for Hyperspectral Image Sequences. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium; IEEE: New York, NY, USA, 2018; pp. 7086–7089.
  5. Xing, C.; Wang, M.; Dong, C.; Duan, C.; Wang, Z. Joint sparse-collaborative representation to fuse hyperspectral and multispectral images. Signal Process. 2020, 173, 107585.
  6. Zhao, J.; Wu, K.; Zhang, L.; Huang, W.; Ruan, C.; Huang, L. Patch-based hierarchical residual spectral-spatial convolutional network for hyperspectral image classification. Signal Process. 2025, 230, 109850.
  7. Munipalle, V.K.; Nelakuditi, U.R.; Nidamanuri, R.R. Agricultural Crop Hyperspectral Image Classification using Transfer Learning. In Proceedings of the International Conference on Machine Intelligence for GeoAnalytics and Remote Sensing (MIGARS); IEEE: Piscataway, NJ, USA, 2023; Volume 1, pp. 1–4.
  8. Zhu, K.; Sun, Z.; Zhao, F.; Yang, T.; Tian, Z.; Lai, J.; Long, B.; Li, S. Remotely sensed canopy resistance model for analyzing the stomatal behavior of environmentally-stressed winter wheat. ISPRS J. Photogramm. Remote Sens. 2020, 168, 197–207.
  9. Wang, D.; Hu, M.; Jin, Y.; Miao, Y.; Yang, J.; Xu, Y.; Qin, X.; Ma, J.; Sun, L.; Li, C.; et al. HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 6427–6444.
  10. Wang, P.; Zhang, G.; Wang, L.; Leung, H.; Bi, H. Subpixel Land Cover Mapping Based on Dual Processing Paths for Hyperspectral Image. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 1835–1848.
  11. Liu, R.; Liang, J.; Yang, J.; Hu, M.; He, J.; Zhu, P.; Zhang, L. DHSNet: Dual Classification Head Self-Training Network for Cross-Scene Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5534515.
  12. Yang, J.; Wu, C.; Du, B.; Zhang, L. Enhanced Multiscale Feature Fusion Network for HSI Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 10328–10347.
  13. Jia, J.; Chen, J.; Zheng, X.; Wang, Y.; Guo, S.; Sun, H.; Jiang, C.; Karjalainen, M.; Karila, K.; Duan, Z.; et al. Tradeoffs in the Spatial and Spectral Resolution of Airborne Hyperspectral Imaging Systems: A Crop Identification Case Study. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5510918.
  14. Wang, L.; Xiong, Z.; Gao, D.; Shi, G.; Zeng, W.; Wu, F. High-speed hyperspectral video acquisition with a dual-camera architecture. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2015; pp. 4942–4950.
  15. Yan, J.; Zhang, K.; Zhang, F.; Ge, C.; Wan, W.; Sun, J. Multispectral and hyperspectral image fusion based on low-rank unfolding network. Signal Process. 2023, 213, 109223.
  16. Zheng, X.; Chen, W.; Lu, X. Spectral Super-Resolution of Multispectral Images Using Spatial–Spectral Residual Attention Network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5404114.
  17. Arad, B.; Shahar, O. Sparse Recovery of Hyperspectral Signal from Natural RGB Images. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; Volume 9911, pp. 19–34.
  18. Yokoya, N.; Heiden, U.; Bachmann, M. Spectral Enhancement of Multispectral Imagery Using Partially Overlapped Hyperspectral Data and Sparse Signal Representation. In Proceedings of the Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), Amsterdam, The Netherlands, 23–26 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6.
  19. Yi, C.; Zhao, Y.Q.; Chan, J.C.W. Spectral Super-Resolution for Multispectral Image Based on Spectral Improvement Strategy and Spatial Preservation Strategy. IEEE Trans. Geosci. Remote Sens. 2019, 57, 9010–9024.
  20. Gao, L.; Hong, D.; Yao, J.; Zhang, B.; Gamba, P.; Chanussot, J. Spectral Superresolution of Multispectral Imagery with Joint Sparse and Low-Rank Learning. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2269–2280.
  21. Galliani, S.; Lanaras, C.; Marmanis, D.; Baltsavias, E.; Schindler, K. Learned Spectral Super-Resolution. arXiv 2017, arXiv:1703.09470.
  22. Xiong, Z.; Shi, Z.; Li, H.; Wang, L.; Liu, D.; Wu, F. HSCNN: CNN-Based Hyperspectral Image Recovery from Spectrally Undersampled Projections. In Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW); IEEE: New York, NY, USA, 2017; pp. 518–525.
  23. Yan, Y.; Zhang, L.; Li, J.; Wei, W.; Zhang, Y. Accurate Spectral Super-resolution from Single RGB Image Using Multi-scale CNN. arXiv 2018, arXiv:1806.03575.
  24. Han, X.; Yu, J.; Luo, J.; Sun, W. Reconstruction From Multispectral to Hyperspectral Image Using Spectral Library-Based Dictionary Learning. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1325–1335.
  25. Fotiadou, K.; Tsagkatakis, G.; Tsakalides, P. Spectral Super Resolution of Hyperspectral Images via Coupled Dictionary Learning. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2777–2797.
  26. Barman, T.; Deka, B.; Prasad, A. GPU-Accelerated Adaptive Dictionary Learning and Sparse Representations for Multispectral Image Super-resolution. In Proceedings of the 2021 IEEE 18th India Council International Conference (INDICON); IEEE: New York, NY, USA, 2021; pp. 1–7.
  27. Shi, Z.; Chen, C.; Xiong, Z.; Liu, D.; Wu, F. HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); IEEE: New York, NY, USA, 2018; pp. 1052–1060.
  28. Zhang, L.; Lang, Z.; Wang, P.; Wei, W.; Liao, S.; Shao, L.; Zhang, Y. Pixel-aware Deep Function-mixture Network for Spectral Super-Resolution. arXiv 2019, arXiv:1903.10501.
  29. Zhao, Y.; Po, L.M.; Yan, Q.; Liu, W.; Lin, T. Hierarchical Regression Network for Spectral Reconstruction from RGB Images. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); IEEE: New York, NY, USA, 2020; pp. 1695–1704.
  30. Nathan, D.S.; Uma, K.; Vinothini, D.S.; Bama, B.S.; Roomi, S.M.M.M. Light Weight Residual Dense Attention Net for Spectral Reconstruction from RGB Images. arXiv 2020, arXiv:2004.06930.
  31. Cai, Y.; Lin, J.; Lin, Z.; Wang, H.; Zhang, Y.; Pfister, H.; Timofte, R.; Gool, L.V. MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); IEEE: New York, NY, USA, 2022; pp. 744–754.
  32. Chen, W.; Zheng, X.; Lu, X. Hyperspectral Image Super-Resolution with Self-Supervised Spectral-Spatial Residual Network. Remote Sens. 2021, 13, 1260.
  33. Ma, X.; Liu, W.; Li, S.; Tao, D.; Zhou, Y. Hypergraph p-Laplacian Regularization for Remotely Sensed Image Recognition. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1585–1595.
  34. Arad, B.; Ben-Shahar, O.; Timofte, R.; Van Gool, L.; Zhang, L.; Yang, M.H. NTIRE 2018 Challenge on Spectral Reconstruction from RGB Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); IEEE: New York, NY, USA, 2018; pp. 1042–1051.
  35. Arad, B.; Timofte, R.; Ben-Shahar, O.; Lin, Y.T.; Finlayson, G.; Givati, S. NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); IEEE: New York, NY, USA, 2020; pp. 1806–1822.
  36. Wu, C.; Li, J.; Song, R.; Li, Y.; Du, Q. RepCPSI: Coordinate-Preserving Proximity Spectral Interaction Network with Reparameterization for Lightweight Spectral Super-Resolution. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5508313.
  37. Chun, M.; Golomb, J.; Turk-Browne, N. A Taxonomy of External and Internal Attention. Annu. Rev. Psychol. 2009, 62, 73–101.
  38. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2018; pp. 7132–7141.
  39. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the Computer Vision–ECCV 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–19.
  40. Cao, Y.; Xu, J.; Lin, S.; Wei, F.; Hu, H. GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW); IEEE: New York, NY, USA, 2019; pp. 1971–1980.
  41. Li, J.; Wu, C.; Song, R.; Xie, W.; Ge, C.; Li, B.; Li, Y. Hybrid 2-D–3-D Deep Residual Attentional Network with Structure Tensor Constraints for Spectral Super-Resolution of RGB Images. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2321–2335.
  42. He, J.; Li, J.; Yuan, Q.; Shen, H.; Zhang, L. Spectral Response Function-Guided Deep Optimization-Driven Network for Spectral Super-Resolution. IEEE Trans. Neural Networks Learn. Syst. 2022, 33, 4213–4227.
  43. Liu, S.; Liu, S.; Zhang, S.; Li, B.; Hu, W.; Zhang, Y.D. SSAU-Net: A Spectral–Spatial Attention-Based U-Net for Hyperspectral Image Fusion. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5542116.
  44. Zhao, H.; Gallo, O.; Frosio, I.; Kautz, J. Loss Functions for Image Restoration with Neural Networks. IEEE Trans. Comput. Imaging 2017, 3, 47–57.
  45. Dong, W.; Qu, J.; Zhang, T.; Li, Y.; Du, Q. Context-Aware Guided Attention Based Cross-Feedback Dense Network for Hyperspectral Image Super-Resolution. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5530814.
  46. Zhu, Q.; Zhang, M.; Chen, Y.; Zheng, G.; Luo, J. Spectral Correlation-Based Fusion Network for Hyperspectral Image Super-Resolution. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5500314.
  47. Liu, N.; Li, W.; Tao, R.; Du, Q.; Chanussot, J. Multigraph-Based Low-Rank Tensor Approximation for Hyperspectral Image Restoration. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5530314.
  48. Su, X.; Zhang, Z.; Yang, F. Fast Hyperspectral Image Denoising and Destriping Method Based on Graph Laplacian Regularization. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5511214.
  49. Zhu, Z.Y.; Huang, T.Z.; Huang, J.; Wu, L. Tensor singular value decomposition and low-rank representation for hyperspectral image unmixing. Signal Process. 2025, 230, 109799.
  50. Wang, H.; Xu, Y.; Wu, Z.; Wei, Z. Unsupervised Hyperspectral and Multispectral Image Blind Fusion Based on Deep Tucker Decomposition Network with Spatial–Spectral Manifold Learning. IEEE Trans. Neural Networks Learn. Syst. 2025, 36, 12721–12735.
  51. Qu, Y.; Qi, H.; Kwan, C. Unsupervised Sparse Dirichlet-Net for Hyperspectral Image Super-Resolution. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2018; pp. 2511–2520.
  52. Wang, L.; Zhou, J.; Li, Z.; Zhao, X.; Wu, C.; Xu, M. Adversarial MixUp with implicit semantic preservation for semi-supervised hyperspectral image classification. Signal Process. 2023, 211, 109116.
  53. Li, Y.; Fu, M.; Zhang, H.; Xu, H.; Zhang, Q. Hyperspectral Image Fusion Algorithm Based on Improved Deep Residual Network. Signal Process. 2023, 210, 109058.
  54. Chhapariya, K.; Ientilucci, E.J.; Buddhiraju, K.M.; Kumar, A. A Spectral-Spatial Classification Network for Hyperspectral Images using a Residual Attention Network. In Proceedings of the 2023 13th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS), Athens, Greece, 31 October–2 November 2023; IEEE: New York, NY, USA, 2023; pp. 1–5.
  55. Yang, J.; Du, B.; Xu, Y.; Zhang, L. Can Spectral Information Work While Extracting Spatial Distribution?—An Online Spectral Information Compensation Network for HSI Classification. IEEE Trans. Image Process. 2023, 32, 2360–2373.
  56. Chakrabarti, A.; Zickler, T. Statistics of Real-World Hyperspectral Images. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2011; pp. 193–200.
  57. Yasuma, F.; Mitsunaga, T.; Iso, D.; Nayar, S.K. Generalized Assorted Pixel Camera: Postcapture Control of Resolution, Dynamic Range, and Spectrum. IEEE Trans. Image Process. 2010, 19, 2241–2253.
  58. Yokoya, N.; Iwasaki, A. Airborne Hyperspectral Data over Chikusei; Technical Report; The University of Tokyo: Tokyo, Japan, 2016.
  59. Hu, T.; Cheng, S.; Liu, C. DiriNet: An Estimation Network for Spectral Response Function and Point Spread Function. J. Beijing Inst. Technol. 2024, 33, 287–297.
  60. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the International Conference on Learning Representations, Banff, AB, Canada, 14–16 April 2014; pp. 1–15.
  61. He, W.; Cai, Y.; Ren, Q.; Ruze, A.; Jia, S. Adaptive Expert Learning for Hyperspectral and Multispectral Image Fusion. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5528615.
Figure 1. Detailed architecture of our GDIS2Net. SG and SA are the abbreviations for spectral graph and spatial attention, respectively. Our S2R procedure is as follows. The degraded LMSI is interpolated into a coarse LHSI; then, the spectral graph is generated by the SG module. Next, reconstructions for $X$ and $\tilde{Y}_{SG}$ are executed via the S2R and SGR subnets, respectively, where the feature interaction and fusion between the two subnets are achieved via IRM in a level-by-level way. Finally, ERM further refines the fusion result and generates the S2R target.
Figure 2. Spectral enhancement results on the Harvard dataset. The 13th band, false-color images (R:29, G:14, B:3), and error heat maps of different methods are placed from top to bottom in sequence. The red squares indicate the zoomed-in regions.
Figure 3. Spectral enhancement results on the CAVE dataset. The 13th band, false-color images (R:31, G:16, B:5), and error heat maps of different methods are placed from top to bottom in sequence. The red squares indicate the zoomed-in regions.
Figure 4. Spectral enhancement results on the Chikusei dataset. The 13th band, false-color images (R:67, G:37, B:15), and error heat maps of different methods are placed from top to bottom in sequence. The red squares indicate the zoomed-in regions.
Figure 5. The average spectral curves over the randomly selected 50 × 50 cubes. (a,b) Harvard. (c,d) CAVE. (e,f) Chikusei.
Figure 6. Original images and spectral enhancement results on GaoFen-5. The second band and false-color images (R:50, G:35, B:25 for LHSI and S2R results; R:3, G:2, B:1 for HMSI) are placed from top to bottom in sequence. The red squares indicate the zoomed-in regions.
Figure 7. Noise robustness experiments on the Harvard dataset. The 2nd band of the MSI is placed in the 1st column, and the 25th band of the H2SI is placed in columns 2–9. The red squares indicate the zoomed-in regions.
Table 1. Discussion for λ and μ (PSNR/SAM).
| λ / μ | 0.1 | 0.5 | 0.8 | 1.0 |
|---|---|---|---|---|
| 0.5 | 36.1865/10.3691 | 37.6313/6.5482 | 37.5263/7.2616 | 35.4165/9.8110 |
| 0.8 | 36.9631/8.6031 | 37.7579/6.3947 | 37.1598/7.7920 | 33.4017/15.6594 |
| 1.0 | 37.4418/7.1760 | 38.8669/5.0310 | 37.4843/5.6842 | 38.3544/6.1862 |
| 1.2 | 36.7791/6.9892 | 36.8550/6.0217 | 35.5214/11.2271 | 36.8120/10.1811 |
Table 2. Discussion for parameters n and k.
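| Parameter | n = 2 | n = 3 | n = 4 | n = 5 | k = 7 | k = 8 | k = 9 | k = 10 |
|---|---|---|---|---|---|---|---|---|
| PSNR | 37.7178 | 38.8669 | 37.7532 | 37.2626 | 37.9261 | 37.0221 | 38.8669 | 36.8870 |
| SAM | 6.4463 | 5.0310 | 5.6554 | 6.3619 | 6.1728 | 6.3580 | 5.0310 | 6.3815 |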
Table 3. Quantitative comparisons of all the considered methods on the Harvard dataset. The best results are shown in bold, and the second-best results are underlined.
| Label | Methods | PSNR | SSIM | SAM | SID | MRAE |
|---|---|---|---|---|---|---|
| H2SI | MST++ | 35.1999 | 0.9520 | 7.4706 | 0.4126 | 0.1640 |
| H2SI | MSCNN | 37.4220 | 0.9642 | 7.1530 | 0.3031 | 0.1801 |
| H2SI | FMNet | 38.2679 | 0.9717 | 7.5996 | 0.2206 | 0.1432 |
| H2SI | HSCNN+ | 38.0642 | 0.9760 | 6.3277 | 0.2005 | 0.1157 |
| H2SI | RepCPSI | 38.1547 | 0.9802 | 6.5840 | 0.1807 | 0.1224 |
| H2SI | HRNet | 38.2461 | 0.9799 | 6.1814 | 0.1777 | 0.1140 |
| H2SI | SSRAN | 38.2695 | 0.9773 | 6.8105 | 0.2068 | 0.1280 |
| LHSI | SSR-MS | 36.3486 | 0.9714 | 6.8502 | 0.2467 | 0.1293 |
| LHSI | HR-MS | 36.3341 | 0.9732 | 6.4564 | 0.2257 | 0.1241 |
| LHSI | Rep-MS | 37.5639 | 0.9781 | 6.2561 | 0.1901 | 0.1211 |
| LHSI | Ours | 38.8669 | 0.9858 | 5.0310 | 0.1272 | 0.0962 |
Table 4. Quantitative comparisons of all the considered methods on the CAVE dataset. The best results are shown in bold, and the second-best results are underlined.
| Label | Methods | PSNR | SSIM | SAM | SID | MRAE |
|---|---|---|---|---|---|---|
| H2SI | MST++ | 35.2603 | 0.9417 | 14.3400 | 0.4561 | 0.3188 |
| H2SI | MSCNN | 34.8797 | 0.9261 | 17.4841 | 0.5493 | 0.3468 |
| H2SI | FMNet | 35.2888 | 0.9562 | 15.6942 | 0.3478 | 0.2073 |
| H2SI | HSCNN+ | 34.9749 | 0.9663 | 19.5772 | 0.3785 | 0.1709 |
| H2SI | RepCPSI | 34.5375 | 0.9430 | 16.6376 | 0.4611 | 0.2721 |
| H2SI | HRNet | 35.5354 | 0.9783 | 12.0905 | 0.2660 | 0.1409 |
| H2SI | SSRAN | 35.3151 | 0.9674 | 14.4301 | 0.3258 | 0.1961 |
| LHSI | SSR-MS | 34.1259 | 0.9576 | 15.3093 | 0.3914 | 0.2099 |
| LHSI | HR-MS | 33.1590 | 0.9433 | 17.2188 | 0.5737 | 0.2513 |
| LHSI | Rep-MS | 34.5491 | 0.9627 | 14.6407 | 0.3715 | 0.1933 |
| LHSI | Ours | 34.0388 | 0.9710 | 11.1062 | 0.3291 | 0.1498 |
Table 5. Quantitative comparisons of all the considered methods on the Chikusei dataset. The best results are shown in bold, and the second-best results are underlined.
| Label | Methods | PSNR | SSIM | SAM | SID | MRAE |
|---|---|---|---|---|---|---|
| H2SI | MST++ | 36.2656 | 0.9611 | 7.3473 | 0.4576 | 0.2069 |
| H2SI | MSCNN | 32.8050 | 0.9069 | 6.4380 | 1.0099 | 0.2348 |
| H2SI | FMNet | 37.1867 | 0.7419 | 32.6707 | 1.6767 | 0.8571 |
| H2SI | HSCNN+ | 32.2415 | 0.8976 | 7.2431 | 0.8723 | 0.2438 |
| H2SI | RepCPSI | 38.7702 | 0.9518 | 7.3694 | 0.5280 | 0.2210 |
| H2SI | HRNet | 37.5858 | 0.9343 | 8.3180 | 0.5670 | 0.2455 |
| H2SI | SSRAN | 39.6178 | 0.9595 | 6.7386 | 0.5718 | 0.1906 |
| LHSI | SSR-MS | 37.4364 | 0.9509 | 6.6482 | 0.5023 | 0.1819 |
| LHSI | HR-MS | 38.2603 | 0.9217 | 9.5394 | 0.7827 | 0.2699 |
| LHSI | Rep-MS | 37.6962 | 0.9336 | 8.1593 | 0.6481 | 0.2326 |
| LHSI | Ours | 39.7644 | 0.9684 | 5.8301 | 0.9079 | 0.1752 |
Table 6. Quantitative comparisons on GaoFen-5. The best results are shown in bold, and the second-best results are underlined.
| Method | $D_\lambda$ | $D_s$ | QNR ↑ |
|---|---|---|---|
| HR-MS | 0.2603 | 0.2277 | 0.5714 |
| Rep-MS | 0.2561 | 0.2174 | 0.5824 |
| SSR-MS | 0.2434 | 0.2096 | 0.5981 |
| Ours | 0.2281 | 0.20469 | 0.6101 |
Table 7. Quantitative noise robustness comparisons of different methods on the Harvard dataset. The best results are shown in bold, and the second-best results are underlined.
| Method/Input | SNR = 20 | SNR = 25 | SNR = 30 | Clean |
|---|---|---|---|---|
| SSRAN | 37.5296/6.9622 | 38.0124/6.8786 | 38.1857/6.8403 | 38.2695/6.8105 |
| HRNet | 37.7757/6.3401 | 38.0928/6.2435 | 38.1937/6.2100 | 38.2461/6.1814 |
| RepCPSI | 37.5995/7.1618 | 37.9656/6.8096 | 38.0960/6.6666 | 38.1547/6.5840 |
| SSR-MS | 35.6851/6.9763 | 35.9821/6.7738 | 36.0922/6.7029 | 36.3486/6.8502 |
| HR-MS | 35.9552/7.0509 | 36.1980/6.6627 | 36.2832/6.5182 | 36.3341/6.4564 |
| Rep-MS | 36.2196/7.6345 | 37.0318/6.8146 | 37.3926/6.4623 | 37.5639/6.2561 |
| Ours | 37.3804/6.1379 | 38.3008/5.5506 | 38.6633/5.2461 | 38.8669/5.0310 |
Table 8. Ablation results.
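| Description | LSS | SGR | SG loss | IRM | ERM | PSNR | SSIM | SAM | SID |
|---|---|---|---|---|---|---|---|---|---|
| GDIS2Net | ✓ | ✓ | ✓ | ✓ | ✓ | 38.8669 | 0.9858 | 5.0310 | 0.1272 |
| SGLC-Net | ✓ | ✓ | ✓ | ✓ | × | 38.3362 | 0.9854 | 5.4019 | 0.1445 |
| SGL-Net | ✓ | ✓ | ✓ | × | × | 38.0772 | 0.9804 | 5.7055 | 0.1829 |
| SG-Net | ✓ | ✓ | × | × | × | 37.7825 | 0.9797 | 6.7427 | 0.1929 |
| Base | × | × | × | × | × | 35.4598 | 0.9668 | 7.3318 | 0.2827 |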
Table 9. Computational efficiency of different methods on the Chikusei dataset. The best results are shown in bold, and the second-best results are underlined.
| Method | Params (M) | FLOPs (G) | Model Size (MB) | Inference Time (ms) |
|---|---|---|---|---|
| MST++ | 1.761 | 6.808 | 6.721 | 7.806 |
| MSCNN | 26.693 | 10.472 | 101.873 | 2.078 |
| FMNet | 2.987 | 48.920 | 11.396 | 6.482 |
| HSCNN+ | 1.188 | 19.466 | 4.532 | 3.557 |
| RepCPSI | 2.148 | 35.123 | 8.196 | 6.291 |
| HRNet | 8.382 | 10.946 | 31.976 | 4.397 |
| SSRAN | 1.462 | 23.910 | 5.575 | 3.633 |
| Ours | 5.638 | 92.655 | 26.056 | 14.085 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Wang, S.; Hu, T.; Cheng, S.; Li, Z.; Sun, Z.; Jia, K.; Feng, J. A Graph-Regularized Double-Path Interactive Spectral Super-Resolution Network for Hyperspectral Image Reconstruction. Remote Sens. 2026, 18, 875. https://doi.org/10.3390/rs18060875
