1. Introduction
High spatial resolution hyperspectral images are of great significance in agriculture [1], military applications [2,3,4], image processing [5,6], and remote sensing [7,8,9,10] because they carry rich spectral and spatial information at the same time. However, due to limitations of the physical components, a low-resolution hyperspectral image (LR-HSI) offers high spectral resolution but low spatial resolution, whereas a high-resolution multispectral image (HR-MSI) of the same scene offers high spatial resolution but low spectral resolution. The most common economical way to obtain a high-resolution hyperspectral image (HR-HSI) is, therefore, to fuse the LR-HSI and the HR-MSI.
As we know, HR-MSI, LR-HSI, and HR-HSI are images of the same scene with different degrees of spatial and spectral information. In most fusion methods, HR-MSI is considered to contain most of the spatial information of HR-HSI, while LR-HSI contains most of the spectral information of HR-HSI. For example, Dian et al. [11] assumed that LR-HSI contains a large amount of the spectral information of HR-HSI and obtained the spectral basis from LR-HSI by TMSVD. Long et al. [12] found a significant correlation between the singular values of HR-HSI and LR-HSI. In both methods, a TMSVD operation is carried out on LR-HSI, and the resulting factor matrices are used as prior information. We assume that HR-MSI should have properties similar to those of LR-HSI and HR-HSI.
As mentioned, these works [11,12] have shown that the first two TMSVD factor matrices of HR-HSI and LR-HSI are strongly similar. A feature of singular value decomposition (SVD) is that most of the image information can be preserved by keeping only the first few terms. Inspired by this, we can reasonably assume that HR-HSI can be represented as the product of three TMSVD factor matrices. The last TMSVD factor matrix of HR-HSI contains a lot of spatial information, and it is natural to take it from HR-MSI. The relationship between the second TMSVD factor matrices of LR-HSI and HR-HSI has been clearly demonstrated experimentally in [12], but the relationship between the first TMSVD factor matrices of LR-HSI and HR-HSI was not clearly shown. We verify the relationship between the first factor matrices in the experimental results that follow.
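The claim that a few SVD terms preserve most of the information can be checked numerically. The following is a minimal sketch on synthetic data, not on the data sets used in this paper; the matrix sizes, the noise level, and the helper name `retained_energy` are assumptions made for illustration only.

```python
import numpy as np

def retained_energy(matrix, q):
    """Fraction of the squared Frobenius norm kept by the first q SVD terms."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(s[:q] ** 2) / np.sum(s ** 2))

rng = np.random.default_rng(0)
# Synthetic 31-band "image" with 400 pixels: a rank-4 signal plus weak noise.
signal = rng.random((31, 4)) @ rng.random((4, 400))
noisy = signal + 0.01 * rng.standard_normal(signal.shape)

# Keep only the q largest singular triplets.
q = 4
u, s, vt = np.linalg.svd(noisy, full_matrices=False)
u, s, vt = u[:, :q], s[:q], vt[:q, :]
approx = (u * s) @ vt                      # U diag(s) V^T

energy = retained_energy(noisy, q)
rel_err = np.linalg.norm(noisy - approx) / np.linalg.norm(noisy)
print(f"energy kept by {q} terms: {energy:.6f}")
print(f"relative reconstruction error: {rel_err:.6f}")
```

The squared relative error equals one minus the retained energy, so the truncation value directly controls how much of the matrix is discarded.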
As shown in Figure 1, we found a strong correlation between HR-HSI and HR-MSI: the entries of the third TMSVD factor matrices of HR-MSI and HR-HSI are similar, or opposite in sign, at the same positions. As Figure 2 shows, the entries of the first TMSVD factor matrices of HR-HSI and LR-HSI are also approximately equal at the same positions. This matrix similarity explains why the spectral basis can be taken from LR-HSI, for which there was previously no tangible evidence. We estimate the three SVD factor matrices of HR-HSI from LR-HSI and HR-MSI using TMSVD and then use them to construct the HR-HSI, which differs from estimating the spectral basis and the corresponding spectral coefficient matrix.
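The similarity between the first factor matrices of HR-HSI and LR-HSI can be illustrated on an idealized scene. The sketch below is not the experiment behind Figures 1 and 2: it assumes a hypothetical piecewise-constant HR cube in which every LR pixel is replicated over a d x d block, so that the LR cube is exactly its block average. Under this assumption the two first factor matrices coincide up to the sign of each column, and the singular values differ by the factor d. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
h, w, L, d, q = 8, 8, 20, 4, 3   # LR size, bands, scale factor, signatures

spectra = rng.random((L, q))                 # q spectral signatures
abund = rng.random((h, w, q))                # LR abundance maps
lr_cube = abund @ spectra.T                  # (h, w, L)
# Replicate every LR pixel over a d x d block to build the HR cube.
hr_cube = np.repeat(np.repeat(lr_cube, d, axis=0), d, axis=1)

X = lr_cube.reshape(-1, L).T                 # bands x LR pixels
Z = hr_cube.reshape(-1, L).T                 # bands x HR pixels

Ux, sx, _ = np.linalg.svd(X, full_matrices=False)
Uz, sz, _ = np.linalg.svd(Z, full_matrices=False)

# Singular vectors are defined only up to sign, hence the absolute value.
col_similarity = np.abs(np.sum(Ux[:, :q] * Uz[:, :q], axis=0))
sv_ratio = sz[:q] / sx[:q]
print("per-column |cosine| of first factor matrices:", col_similarity)
print("singular value ratio HR / LR:", sv_ratio)
```

Real images are only approximately block-constant, so in practice the match is approximate rather than exact.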
However, compared with SVD, TMSVD is limited by a truncated value. It only saves part of the original matrix information, which will cause the loss of some critical information. To further improve the fusion effect, we assume that the first TMSVD factor matrix of HR-MSI also contains some information about HR-HSI. Therefore, we introduce the first TMSVD factor matrix of HR-MSI into the fusion process.
We integrate the prior information obtained by TMSVD from HR-MSI and LR-HSI to reconstruct HR-HSI. We first obtain the factor matrices of LR-HSI and HR-MSI by TMSVD, then recombine them into a rough HR-HSI, and finally solve the optimization problem through a multiplicative iterative process.
The main contributions of this paper are as follows:
We explain why the spectral basis can be taken from LR-HSI from the perspective of matrix similarity after truncated singular value decomposition. No such quantitative analysis has been conducted prior to our work.
We found a strong correlation between HR-MSI and HR-HSI. All prior information obtained by TMSVD is integrated into a newly proposed fusion model that does not use the spectral response function (SRF) and is based only on the TMSVD factor matrices of LR-HSI and HR-MSI.
We propose a new idea for the hyperspectral fusion method. We reconstruct the HR-HSI by estimating the three SVD factor matrices of HR-HSI from LR-HSI and HR-MSI.
We test our proposed method on two simulated data sets, Pavia University and CAVE, and on a real data set in which the remote sensing images are generated by two different spectral cameras, Sentinel 2 and Hyperion. Compared with the non-blind methods, our proposed method achieves a more effective fusion result while reducing the fusion time to less than 1% of that of such methods. Compared with the blind methods on the simulated data sets, our proposed method improves the PSNR value by up to 16 dB. Moreover, our proposed method demonstrates better performance on the real data set, which validates its practicality.
The rest of this paper is organized as follows. Section 2 presents some background knowledge and the representative hyperspectral image super-resolution literature. The FTMSVD method is presented in Section 3. The experimental details are outlined in Section 4. Section 5 presents our experimental results and analysis, and Section 6 concludes the paper.
2. Related Work
In recent years, the exploration of hyperspectral image (HSI) super-resolution methods has been gradually increasing. These methods can be broadly divided into three categories: matrix factorization-based methods, tensor factorization-based methods, and other methods.
In the matrix factorization-based methods, it is assumed that HR-HSI is a matrix formed by multiplying the spectral basis by the corresponding spectral coefficient matrix. The fusion problem is thus transformed into the problem of estimating the spectral basis and the coefficient matrix. There are many ways to estimate them, such as non-negative dictionary learning [13], K-SVD [14], and online dictionary learning [15]. Wycoff et al. [16] made full use of the non-negativity and sparsity priors of HR-HSI and conducted non-negative sparse matrix factorization on HR-MSI and LR-HSI to approximate a non-negative spectral basis and a sparse coefficient matrix; the alternating direction method of multipliers (ADMM) was then used in their optimization to obtain the HR-HSI. Akhtar et al. [17] estimated a non-negative dictionary based on the principle of local similarity of images. In [18], the spectral basis and coefficient matrix were learned from LR-HSI and HR-MSI under some prior information. Huang et al. [19] proposed learning the spectral basis by applying K-SVD to LR-HSI. Han et al. [20] clustered image blocks and assumed that similar image blocks can linearly represent a given block; the local similarity of HR-HSI was learned on the basis of image segmentation. The method we propose in this paper is also a matrix factorization-based method. However, our FTMSVD method differs in two aspects. Firstly, information is obtained directly from LR-HSI and HR-MSI by TMSVD, without the need for a complex operation such as dictionary learning. Secondly, we assume that HR-HSI is composed of three SVD factor matrices, and we estimate these three factor matrices from LR-HSI and HR-MSI rather than the spectral basis and coefficient matrix.
Tensor factorization-based methods preserve the 3D characteristics of hyperspectral images to a great extent. That is, hyperspectral images are represented as tensors, so the structural information of the image is better preserved. The fusion problem then becomes how to estimate the tensors. Dian et al. [21] used Tucker decomposition to decompose HR-HSI into three dictionaries, one per dimension, and used a core tensor to describe the relationship between the dictionaries. Li et al. [22] proposed a method based on coupled sparse tensor factorization (CSTF), in which HR-HSI is regarded as a three-dimensional tensor that can be approximated as a core tensor multiplied by three subtensors. In [11], subspace representation and low-rank tensor representation are combined: the spectral subspace is approximated by the singular value decomposition of LR-HSI, and the coefficients are estimated by the low-rank tensor. Prévost et al. [23] obtained the dictionaries of the three modes by TMSVD and solved a generalized Sylvester equation to obtain the core tensor.
Other methods mainly include Bayesian-based and deep convolutional neural network (CNN)-based methods. In the Bayesian-based methods, a prior distribution is used to solve the fusion problem; a representative example is the method proposed in [24]. CNN-based methods also play an important role in hyperspectral image fusion. Dong et al. [25,26] constructed a deep CNN for single-image super-resolution and achieved excellent performance. Liu et al. [27] proposed a deep CNN named SSAU-Net, which introduces a spectral-spatial attention module to extract shallow and deep feature information from LR-HSI and HR-HSI. The work in [28] combined subspace representation with a CNN denoiser that only needs to be trained on gray images, which alleviates the difficulty of training the CNN model.
These methods can also be roughly divided into blind and non-blind fusion methods, depending on whether the point spread function (PSF) and the SRF are known during the fusion process. In non-blind methods, both PSF and SRF are known; in blind methods, they are unknown. HySure (blind version) [29] and CNMF [18] are blind methods in which the fusion is achieved by estimating both SRF and PSF from HR-MSI and LR-HSI. The method in [30] is also a blind method because it only estimates the SRF and does not use the PSF during the fusion process. In either case, our proposed method demonstrates good performance without using the SRF. When the PSF is known, we use it directly. If the PSF cannot be directly obtained, we use a PSF that we set ourselves. Our experiments demonstrate that our method achieves good results in both cases.
3. Fusion Model
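We denote the target HR-HSI as a three-dimensional tensor $\mathcal{Z} \in \mathbb{R}^{W \times H \times L}$, which has $W \times H$ pixels and $L$ bands. The LR-HSI is regarded as $\mathcal{X} \in \mathbb{R}^{w \times h \times L}$, which has $w \times h$ pixels and $L$ bands. $\mathcal{Y} \in \mathbb{R}^{W \times H \times l}$ represents the observed HR-MSI, with $W \times H$ pixels and $l$ bands. It is observed that $w < W$, $h < H$, and $l < L$. $\mathcal{X}$ can be seen as a spatially downsampled version of $\mathcal{Z}$, i.e.,
$$\mathbf{X} = \mathbf{Z}\mathbf{B}\mathbf{S}, \qquad (1)$$
where $\mathbf{X} \in \mathbb{R}^{L \times wh}$ and $\mathbf{Z} \in \mathbb{R}^{L \times WH}$ represent the mode-3 matricization matrices of $\mathcal{X}$ and $\mathcal{Z}$, respectively. The matrices $\mathbf{B}$ and $\mathbf{S}$ represent the convolutional blur and the downsampling matrix, respectively. In a practical situation, $\mathbf{B}$ represents the PSF, which is uncertain, and $\mathbf{S}$ can be obtained from the ratio of the spatial dimensions of LR-HSI and HR-MSI.
$\mathcal{Y}$ can also be seen as the spectrally downsampled version of $\mathcal{Z}$, i.e.,
$$\mathbf{Y} = \mathbf{R}\mathbf{Z}, \qquad (2)$$
where $\mathbf{Y} \in \mathbb{R}^{l \times WH}$ and $\mathbf{R}$ represent the mode-3 matricization matrix of $\mathcal{Y}$ and the spectral response matrix, respectively. $\mathbf{R}$ represents the SRF, which is uncertain in most cases.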
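The two degradation relations, spatial blurring plus downsampling for the LR-HSI and spectral downsampling for the HR-MSI, can be simulated as follows. This is a minimal sketch under stated assumptions rather than the simulation protocol of this paper: the function names are hypothetical, the blur is a Gaussian kernel of assumed size 7 x 7 with standard deviation 1 applied with circular boundary conditions, and the spectral response is a simple band-averaging matrix chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.0):
    """Normalized 2-D Gaussian PSF; the 7 x 7 size is an assumption."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def spatial_degrade(hr_cube, kernel, d):
    """Blur each band (circular convolution), then keep every d-th pixel."""
    H, W, _ = hr_cube.shape
    kh, kw = kernel.shape
    pad = np.zeros((H, W))
    pad[:kh, :kw] = kernel
    # Shift the kernel so that its center sits at the origin.
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    spectrum = np.fft.fft2(hr_cube, axes=(0, 1)) * np.fft.fft2(pad)[:, :, None]
    blurred = np.real(np.fft.ifft2(spectrum, axes=(0, 1)))
    return blurred[::d, ::d, :]

def spectral_degrade(hr_cube, srf):
    """Apply an (l x L) spectral response matrix along the band axis."""
    return hr_cube @ srf.T

rng = np.random.default_rng(2)
H, W, L, l, d = 32, 32, 16, 4, 4
Z = rng.random((H, W, L))                                 # stand-in HR-HSI
R = np.kron(np.eye(l), np.ones((1, L // l))) / (L // l)   # band-averaging SRF
X = spatial_degrade(Z, gaussian_kernel(), d)              # simulated LR-HSI
Y = spectral_degrade(Z, R)                                # simulated HR-MSI
print(X.shape, Y.shape)
```

The cubes are stored band-last here, so reshaping a cube to (pixels, bands) and transposing gives the bands-by-pixels matricization used in the text.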
In [31], SVD was used for image compression because it can denoise an image and represent most of its information with a few elements. Therefore, if we can obtain the three SVD factor matrices of the matricized HR-HSI $\mathbf{Z}$, we can recover the original HR-HSI, i.e.,
$$\mathbf{Z} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{T}, \qquad (3)$$
where $\mathbf{U}$, $\boldsymbol{\Sigma}$, and $\mathbf{V}$ are the corresponding factor matrices of $\mathbf{Z}$ by SVD. However, it is difficult to directly obtain the SVD factor matrices of $\mathbf{Z}$. To approximate them, we estimate the SVD factor matrices of $\mathbf{Z}$ from the LR-HSI $\mathbf{X}$ and the HR-MSI $\mathbf{Y}$ by using TMSVD, i.e.,
where
,
, and
are the corresponding estimation factor matrices obtained by TMSVD, and
is the truncated value. Therefore, Equations (
1) and (
2) can be converted to the following equation.
The fusion problem turns into the problem of estimating three factor matrices
,
, and
from HR-MSI and LR-HSI. To distinguish it from the target matrix
, we denote
as the fusion matrix, i.e.,
where
,
and
are the corresponding factor matrices of fusion result
by TMSVD. Since
, the truncated singular value
is constrained below
l, and we set
l as the truncated singular value for the consistency of TMSVD factor matrix dimensions. Through performing the TMSVD operation on
and
with truncated value
, we can obtain the TMSVD factor matrices of
and
, respectively, i.e.,
where
,
and
, and
,
and
. They are the corresponding TMSVD factor matrices of X and Y.
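The TMSVD step on the two observed images can be sketched as follows, mainly to make the factor matrix dimensions explicit. This is an illustrative sketch with random stand-in data; the helper names `unfold_spectral` and `tmsvd` and the variable names are assumptions, and the truncation value is set to the number of HR-MSI bands as described above.

```python
import numpy as np

def unfold_spectral(cube):
    """Matricize an (H, W, bands) cube into a (bands, H*W) matrix."""
    return cube.reshape(-1, cube.shape[2]).T

def tmsvd(matrix, q):
    """Truncated SVD returning U (m x q), Sigma (q x q) and V (n x q)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :q], np.diag(s[:q]), vt[:q, :].T

rng = np.random.default_rng(3)
lr_hsi = rng.random((8, 8, 16))      # stand-in LR-HSI: 8 x 8 pixels, 16 bands
hr_msi = rng.random((32, 32, 4))     # stand-in HR-MSI: 32 x 32 pixels, 4 bands

q = hr_msi.shape[2]                  # truncation value equal to the MSI band count
Ux, Sx, Vx = tmsvd(unfold_spectral(lr_hsi), q)
Uy, Sy, Vy = tmsvd(unfold_spectral(hr_msi), q)

for name, m in [("Ux", Ux), ("Sx", Sx), ("Vx", Vx),
                ("Uy", Uy), ("Sy", Sy), ("Vy", Vy)]:
    print(name, m.shape)
```

With a common truncation value, the first factor of the LR-HSI has one row per hyperspectral band and the third factor of the HR-MSI has one row per high-resolution pixel, so a product of the form `Ux @ Sx @ Vy.T` has the shape of the matricized HR-HSI.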
As shown in
Figure 1 and
Figure 2,
and
, and it was found in [
12] that
, where
is the downsampling factor. Hence, we can obtain the following equations.
However, constrained by the spectral dimension of HR-MSI, the
value is limited in
l. Therefore, if we use
,
and
to reconstruct HR-HSI, much important information about HR-HSI will be lost, and the fusion effect is unsatisfactory. Due to the features of SVD,
also obtains information of
, and
contains most of the spatial information of
, so we introduce it in the reconstruction of
to improve the fusion performance, i.e.,
The detailed process for estimating the SVD factor matrices of HR-HSI is presented in Algorithm 1.
Based on Equations (
1) and (
9), we can further optimize the quality of
as the following equation, i.e.,
where
means the Frobenius norm of
in this paper.
Algorithm 1 Estimate three rough SVD factor matrices of HR-HSI
Require: LR-HSI X and HR-MSI Y
1: Obtain the downsampling factor and the truncated value q from X and Y
2: Matricize X and Y
3: Perform the truncated singular value decomposition on the matricized X and Y with truncated value q
4: return the three rough SVD factor matrices of HR-HSI
Because of the correlation in
Figure 1 and [
12], we assume that
and
are approximately equal to
and
, which means
and
are fixed during the fusion process. We therefore simplify Equation (
14), i.e.,
We set
as constant throughout the fusion process. The meaning of
here is similar to the coefficient matrix in the work [
11], except that we get it directly by the connection between HSI and MSI, without any additional operations. Note that there are two cases of
. If PSF is known in practice, we use it directly. If PSF is unknown, we use the one we set, which is a
Gaussian blur (standard deviation 1) in this paper, and solve the uncertainty of
in this way. Finally, a multiplicative iterative process as in [
18] is introduced to optimize
in Equation (
15), i.e.,
After several iterative rounds of optimization, we obtain a more accurate
. We reconstruct it using
and
to obtain a satisfactory
. The whole process of our method is shown in Algorithm 2.
Algorithm 2 FTMSVD Algorithm
Require: Obtain the via TMSVD with a truncated singular value q in Algorithm 1 if PSF is known then Use it directly else [PSF is unknown] Set PSF as a Gaussian blur (standard deviation 1) end if Set Use a multiplicative iterative process to optimize the for k = 1:K do Update by Equation (16) end for return
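To give a feel for what a multiplicative iterative process looks like, the sketch below applies the classical multiplicative update rule from non-negative matrix factorization to a generic least-squares problem with one factor held fixed. It is not Equation (16) of this paper, whose exact form is not reproduced here. The function name, the non-negativity of all quantities, and the problem sizes are assumptions made for illustration.

```python
import numpy as np

def multiplicative_update(M, A, H, n_iter=50, eps=1e-12):
    """Multiplicative updates of H for min ||M - A H||_F with A fixed.

    M, A and H must be non-negative. Returns the updated H and the
    Frobenius error recorded before the first and after every iteration.
    """
    errors = [np.linalg.norm(M - A @ H)]
    for _ in range(n_iter):
        # Element-wise rescaling keeps H non-negative at every step.
        H = H * (A.T @ M) / (A.T @ A @ H + eps)
        errors.append(np.linalg.norm(M - A @ H))
    return H, errors

rng = np.random.default_rng(4)
A = rng.random((16, 4))              # fixed non-negative factor
H_true = rng.random((4, 100))
M = A @ H_true                       # observation to be explained
H0 = rng.random((4, 100))            # rough initial estimate
H, errors = multiplicative_update(M, A, H0)
print(f"error before: {errors[0]:.4f}, after: {errors[-1]:.4f}")
```

A rule of this kind needs no step size, and the residual does not increase from one iteration to the next, which is why a rough initial estimate can be refined with a small, fixed number of iterations.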