4.1. Datasets and Metrics
Simulation experiments on the public Harvard [56], CAVE [57], and Chikusei [58] datasets, as well as verification experiments on the real GaoFen-5 dataset, are performed to demonstrate the effectiveness, superiority, and practicality of the proposed GDIS2Net.
The Harvard dataset, covering the range of 420–720 nm at 10 nm intervals, contains 50 HSIs of size . Subimages of size are cut from the original HSIs to serve as the simulated ground truth, i.e., H2SIs. These simulated H2SIs are degraded through the SRF of the Nikon D700 camera to generate HMSIs. Subsequently, LHSIs are produced by applying a Gaussian PSF of to spatially blur the H2SIs. In total, 40 HMSI/LHSI pairs are randomly selected for training, and the remaining pairs are used for testing.
To evaluate the robustness of S2R methods across different data distributions, the CAVE dataset is further utilized here. The dataset comprises 32 HSIs, which consist of pixels and 31 spectral bands ranging from 400 nm to 700 nm. Similarly, the original HSIs are used as H2SIs, while the corresponding HMSIs and LHSIs are simulated via the Nikon D700 SRF and a Gaussian PSF of . Seven HMSI/LHSI pairs are randomly chosen as the testing set, while the rest serve as the training set.
To examine the influence of different degradation conditions on S2R methods, the SRF of the IKONOS sensor and a Gaussian PSF of are employed to perform linear degradation operations on the Chikusei dataset, synthesizing HMSIs and LHSIs. The Chikusei dataset is an HSI of size , which was acquired over Chikusei, Japan, in the wavelength range from 363 nm to 1018 nm. The Chikusei image is divided into 34 simulated H2SIs of size . Correspondingly, 34 HMSI/LHSI pairs are synthesized, which are randomly split into a training set of 25 pairs and a testing set of 9 pairs.
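The linear degradation used above to synthesize the HMSI/LHSI pairs (spectral projection through an SRF, spatial blurring through a Gaussian PSF) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the SRF matrix is a random placeholder rather than the real Nikon D700 or IKONOS response, the kernel width, standard deviation, and downsampling ratio are hypothetical values, and decimation after blurring is itself an assumption.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Return a normalized 2-D Gaussian kernel of odd width `size`."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def spectral_degrade(h2si, srf):
    """Project an (H, W, B) cube onto b bands with a (b, B) SRF matrix."""
    return h2si @ srf.T

def spatial_degrade(h2si, kernel, ratio):
    """Blur every band with `kernel` (odd width), then keep every `ratio`-th pixel."""
    k = kernel.shape[0]
    pad = k // 2
    height, width, _ = h2si.shape
    padded = np.pad(h2si, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    blurred = np.zeros(h2si.shape, dtype=np.float64)
    # Accumulate the weighted, shifted copies of the image (kernel is symmetric).
    for i in range(k):
        for j in range(k):
            blurred += kernel[i, j] * padded[i:i + height, j:j + width, :]
    return blurred[::ratio, ::ratio, :]

# Synthetic stand-in for one H2SI subimage with 31 bands (hypothetical size).
rng = np.random.default_rng(0)
h2si = rng.random((48, 48, 31))

# Placeholder SRF: 3 output bands, each row normalized to sum to one.
srf = rng.random((3, 31))
srf /= srf.sum(axis=1, keepdims=True)

hmsi = spectral_degrade(h2si, srf)                               # high spatial, low spectral
lhsi = spatial_degrade(h2si, gaussian_psf(7, 2.0), ratio=4)      # low spatial, high spectral
print(hmsi.shape, lhsi.shape)
```

Because the SRF rows and the PSF are both normalized, a constant input cube stays constant after either degradation, which is a convenient sanity check.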
Experiments on the real GaoFen-5 dataset are also conducted to verify the applicability of the considered super-resolution methods. The GaoFen-5 dataset, which was captured by the hyperspectral sensor onboard the GaoFen-5 satellite with the wavelength range of [390, 2513] nm, contains an HMSI of size
and an LHSI of size 1185 × 1342 × 285. Both the HMSI and LHSI are cropped into 72 subimages of size 147 × 147 × 4 and 441 × 441 × 285, where 57 pairs of multispectral and hyperspectral subimages are randomly chosen for training, and the remaining pairs are allocated for testing. All PSFs and SRFs, which are necessary when training S2R networks on either the real or the simulated datasets, are estimated via our previous DiriNet [59].
Five common metrics, including the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), spectral angle mapper (SAM), spectral information divergence (SID), and mean relative absolute error (MRAE), are used to quantitatively evaluate the performance of all considered S2R methods. PSNR and MRAE are estimates of the overall numerical error, SAM and SID are measures of spectral differences, and SSIM is an evaluation of spatial structure restoration. Generally, the higher the PSNR and SSIM, and the lower the SAM, SID, and MRAE, the better the super-resolution performance.
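For concreteness, four of the five metrics can be written in a few lines of NumPy. The sketch below uses common textbook definitions and is not taken from the authors' code: the peak value of 1.0, the small stabilizing constant `EPS`, reporting SAM in degrees, and averaging over all pixels are assumptions. SSIM is omitted because it is usually taken from an existing library, for example `structural_similarity` in scikit-image.

```python
import numpy as np

EPS = 1e-8  # assumed stabilizer to avoid division by zero and log(0)

def psnr(rec, ref, peak=1.0):
    """Peak signal-to-noise ratio in dB over the whole cube."""
    mse = np.mean((rec - ref) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def sam(rec, ref):
    """Mean spectral angle (degrees) between per-pixel spectra."""
    x = rec.reshape(-1, rec.shape[-1])
    y = ref.reshape(-1, ref.shape[-1])
    denom = np.maximum(np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1), EPS)
    cos = np.clip(np.sum(x * y, axis=1) / denom, -1.0, 1.0)
    return float(np.degrees(np.mean(np.arccos(cos))))

def sid(rec, ref):
    """Mean spectral information divergence (symmetric KL of normalized spectra)."""
    x = rec.reshape(-1, rec.shape[-1]) + EPS
    y = ref.reshape(-1, ref.shape[-1]) + EPS
    p = x / x.sum(axis=1, keepdims=True)
    q = y / y.sum(axis=1, keepdims=True)
    return float(np.mean(np.sum((p - q) * np.log(p / q), axis=1)))

def mrae(rec, ref):
    """Mean relative absolute error with respect to the reference."""
    return float(np.mean(np.abs(rec - ref) / (ref + EPS)))

# Synthetic example: a reference cube and a slightly noisy reconstruction.
rng = np.random.default_rng(0)
reference = rng.uniform(0.1, 1.0, size=(32, 32, 31))
reconstruction = np.clip(reference + 0.01 * rng.standard_normal(reference.shape), 0.0, 1.0)
for name, fn in [("PSNR", psnr), ("SAM", sam), ("SID", sid), ("MRAE", mrae)]:
    print(name, fn(reconstruction, reference))
```

Higher is better for PSNR, while SAM, SID, and MRAE all approach zero for a perfect reconstruction.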
4.3. Experiments on Simulated Datasets
Seven state-of-the-art S2R networks, including MSCNN [23], HSCNN+ [27], FMNet [28], HRNet [29], MST++ [31], RepCPSI [36], and SSRAN [16], are compared with the proposed GDIS2Net to reveal its effectiveness. These compared networks are originally trained with H2SI labels. For a fair and comprehensive comparison, the three most advanced networks, i.e., HRNet, RepCPSI, and SSRAN, are also trained on LMSI/LHSI pairs without changing their architectures, where the LMSI is produced from the HMSI via the PSF. The three resulting approaches with changed labels are named HR-MS, Rep-MS, and SSR-MS, respectively.
The quantitative results of all considered super-resolution methods on the Harvard, CAVE, and Chikusei datasets are listed in Table 3, Table 4 and Table 5. As these tables show, the methods can be divided into two classes: those supervised by H2SI and those supervised by LHSI. For convenient analysis, the best and second-best results in each class are highlighted in bold and underlined, respectively. It is worth noting that SID sometimes indicates the opposite trend to the other metrics, especially in Table 5. This might be because its logarithmic calculation heavily penalizes minute absolute numerical fluctuations in low-intensity or low-reflectance bands. Therefore, performance is discussed from a comprehensive perspective rather than on the basis of one specific metric. Among the methods supervised by the ground truth, the results on the Harvard and CAVE datasets show that HRNet achieves nearly the best spectral restoration. On the Chikusei dataset, however, HRNet is inferior to RepCPSI and SSRAN. HRNet might therefore be insufficiently robust to the resolution difference, because the degree of resolution reduction on Chikusei is stronger than on the Harvard and CAVE datasets. Nevertheless, the results on all three datasets show that our GDIS2Net achieves performance comparable to, or even better than, the three strongest methods with H2SI supervision, namely HRNet, RepCPSI, and SSRAN. For example, GDIS2Net attains the highest PSNR and SSIM values on the Harvard and Chikusei datasets, along with the best SAM results on all datasets, indicating its superior spectral reconstruction accuracy. That is, our method achieves competitive spectral reconstruction even though it is supervised only by the accessible LHSI. Moreover, GDIS2Net shows stable super-resolution performance, suggesting that it is robust to data distribution and degradation conditions. When SSR-MS, HR-MS, and Rep-MS are compared with SSRAN, HRNet, and RepCPSI, respectively, their reconstruction quality is only slightly weaker. This illustrates that the adopted linear degradation-based supervision is effective. Compared with SSR-MS, HR-MS, and Rep-MS, the proposed GDIS2Net delivers the best spectral reconstruction. Hence, the proposed graph-regularized double-path interactive network is superior to state-of-the-art networks geared towards spectral resolution enhancement.
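The sensitivity of SID to low-intensity bands noted above can be illustrated with a toy spectrum. The sketch below is an illustration with made-up numbers, not data from the tables: the same absolute perturbation of 0.01 is applied once to a dark band and once to a bright band, so the mean squared error is identical while SID differs by roughly two orders of magnitude.

```python
import math

def sid(x, y):
    """Spectral information divergence between two positive spectra."""
    sx, sy = sum(x), sum(y)
    p = [v / sx for v in x]
    q = [v / sy for v in y]
    # Symmetric KL divergence: sum p*log(p/q) + sum q*log(q/p).
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def mse(x, y):
    """Mean squared error between two spectra."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

# Hypothetical 4-band reference spectrum with one very dark band.
ref = [0.5, 0.5, 0.5, 0.001]
rec_dark = [0.5, 0.5, 0.5, 0.011]    # +0.01 in the dark band
rec_bright = [0.51, 0.5, 0.5, 0.001]  # +0.01 in a bright band

print("MSE dark/bright:", mse(rec_dark, ref), mse(rec_bright, ref))
print("SID dark/bright:", sid(rec_dark, ref), sid(rec_bright, ref))
```

Metrics driven by absolute error, such as PSNR, cannot distinguish the two cases, which is consistent with SID occasionally ranking methods differently from the other metrics.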
Visual results produced by all of the considered methods on the Harvard, CAVE, and Chikusei datasets are illustrated in Figure 2, Figure 3 and Figure 4, which contain single bands, false-color images, and error heat maps. The error heat maps display the spatial distribution of the average error between the reconstructed and reference H2SI, where a darker blue color represents a lower reconstruction error. The full-sized error maps in Figure 4, which share the same color range, seem to suggest that all reconstructions are similarly close to the reference; this impression is actually caused by the much worse result of FMNet, which dominates the shared color range. Therefore, accurate error analyses for Figure 4 should refer to the enlarged maps. As can be seen from Figure 2, Figure 3 and Figure 4, the proposed GDIS2Net produces sharper super-resolution images and darker error maps overall, which indicates its better spatial preservation ability.
In addition, the average spectral curves over randomly selected 50 × 50 cubes (marked by the red squares in Figure 2, Figure 3 and Figure 4) from the above reconstruction results are shown in Figure 5. The spectral curves exhibit the spatially averaged difference between the reconstructed and reference H2SI cubes, where a method whose curve fits the red reference curve more closely has better spectral fidelity. Compared to the others, the proposed S2R method generates results closely aligned with the reference curves across the entire spectrum. The curves produced by our GDIS2Net exhibit smaller fluctuations, especially in regions with rapid spectral variations, which indicates effective preservation of both spectral shape and amplitude. In contrast, the competing methods produce spectral curves with stronger fluctuations, particularly on the Chikusei dataset, which has a larger spatial–spectral resolution difference. Thus, the proposed GDIS2Net could offer better spectral fidelity and stronger robustness under more severe resolution degradation conditions.
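The average spectral curve of a 50 × 50 cube is simply the per-band mean over the spatial dimensions of the selected patch. A minimal sketch is given below; the cube size, the patch position, and the noise level are hypothetical, and synthetic data stand in for the actual reconstructions.

```python
import numpy as np

def mean_spectrum(cube, top, left, size=50):
    """Average an (H, W, B) cube over a size x size spatial patch, one value per band."""
    patch = cube[top:top + size, left:left + size, :]
    return patch.mean(axis=(0, 1))

# Synthetic reference and a noisy "reconstruction" (hypothetical data).
rng = np.random.default_rng(0)
reference = rng.random((128, 128, 31))
reconstruction = reference + 0.01 * rng.standard_normal(reference.shape)

# Patch position chosen arbitrarily for illustration.
ref_curve = mean_spectrum(reference, top=40, left=60)
rec_curve = mean_spectrum(reconstruction, top=40, left=60)

# Largest per-band gap between the two averaged curves.
max_gap = float(np.max(np.abs(ref_curve - rec_curve)))
print(ref_curve.shape, max_gap)
```

Plotting `rec_curve` against `ref_curve` over the band wavelengths gives the kind of comparison shown in the spectral curve figure; spatial averaging suppresses pixel-level noise, so the remaining gap reflects systematic spectral bias.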
Overall, the proposed GDIS2Net produces an effective and superior spectral enhancement effect and shows strong robustness and generalization across different datasets and degradation conditions.
4.4. Experiments on Real Dataset
To further assess the practical applicability of GDIS2Net, experiments are conducted on the real-world GaoFen-5 dataset. As this dataset lacks ground truth (H2SI), existing super-resolution methods that rely on the H2SI label cannot be directly applied. Therefore, in addition to GDIS2Net, three hybrid supervised methods (SSR-MS, HR-MS, and Rep-MS) are evaluated for a fair comparison.
For a quantitative evaluation, the spectral distortion index (D_λ), the spatial distortion index (D_s), and the quality with no reference (QNR) index are calculated under the Wald protocol, following existing studies [61]. Specifically, the reconstructed H2SIs are degraded using the pre-estimated PSF to generate simulated HSIs. The degraded results are then compared with the observed HSI to compute D_λ and D_s, and the overall QNR is obtained accordingly.
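In its commonly used form, QNR combines the two distortion indices multiplicatively, so that it equals 1 only when both distortions vanish. The sketch below shows this combination step only; the exponents of 1 are the usual default and are an assumption here, since the exact settings are not stated in this section, and the computation of the two distortion indices themselves is not reproduced.

```python
def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """Combine spectral (d_lambda) and spatial (d_s) distortion into a QNR score.

    Both distortions are expected in [0, 1]; lower distortion gives higher QNR.
    """
    if not (0.0 <= d_lambda <= 1.0 and 0.0 <= d_s <= 1.0):
        raise ValueError("distortion indices must lie in [0, 1]")
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta

# Hypothetical distortion values, for illustration only.
score = qnr(0.1, 0.2)
print(score)
```

Lower values of the two distortion indices and a higher QNR therefore indicate a better reconstruction.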
The quantitative results are reported in Table 6. GDIS2Net achieves superior performance across all three metrics. In addition, the single-band and false-color results in Figure 6 illustrate that GDIS2Net achieves better visual texture restoration. These results validate the network design and the practical potential of our method.
4.6. Ablation Studies
A series of ablation studies was conducted on the Harvard dataset to evaluate the contribution of each component of the proposed GDIS2Net. The effectiveness of training super-resolution networks with LHSI has already been analyzed in the above experiments and is therefore not discussed further in this section. The ablation experiments were designed by progressively adding the components of spectral graph restoration, spectral continuity loss, IRM, and ERM to a baseline model.
Specifically, four network variants are defined. (1) SGLC-Net: GDIS2Net without ERM. (2) SGL-Net: SGLC-Net without the feature interaction of IRM, i.e., the two SA units and the cross-feature multiplication operations are removed. (3) SG-Net: SGL-Net without the spectral continuity loss. (4) Base: SG-Net without the SGR subnet but cascaded with SSRAN. Note that Base would provide insufficient reconstruction without SSRAN; SSRAN is attached for feature-to-image reconstruction because of its simple architecture.
All quantitative results are summarized in Table 8. SG-Net outperforms Base, and SGL-Net achieves further improvements over SG-Net. Thus, both the SGR subnet and the spectral continuity constraint work well; that is, the proposed spectral graph restoration procedure is effective. SGLC-Net surpasses SGL-Net, particularly in terms of SAM, indicating that attention-based feature interactions are beneficial for precise spectral recovery. The complete GDIS2Net delivers the best performance, which confirms the further refinement provided by the ERM. In conclusion, the proposed spectral graph restoration and the two spatial attention-based residual modules contribute to the superior super-resolution effect of the whole GDIS2Net.