Structure Tensor-Based Algorithm for Hyperspectral and Panchromatic Images Fusion

Restricted by technical and budget constraints, hyperspectral (HS) image which contains abundant spectral information generally has low spatial resolution. Fusion of hyperspectral and panchromatic (PAN) images can merge spectral information of the former and spatial information of the latter. In this paper, a new hyperspectral image fusion algorithm using structure tensor is proposed. An image enhancement approach is utilized to sharpen the spatial information of the PAN image, and the spatial details of the HS image is obtained by an adaptive weighted method. Since structure tensor represents structure and spatial information, a structure tensor is introduced to extract spatial details of the enhanced PAN image. Seeing that the HS and PAN images contain different and complementary spatial information for a same scene, a weighted fusion method is presented to integrate the extracted spatial information of the two images. To avoid artifacts at the boundaries, a guided filter is applied to the integrated spatial information image. The injection matrix is finally constructed to reduce spectral and spatial distortion, and the fused image is generated by injecting the complete spatial information. Comparative analyses validate the proposed method outperforms the state-of-art fusion methods, and provides more spatial details while preserving the spectral information.


Introduction
Hyperspectral (HS) remote sensing is an emerging discipline.Traditional remote sensing sensors obtain the image in a few discrete bands, and lose a large amount of useful information.A hyperspectral remote sensing sensor is capable of acquiring numerous contiguous narrow bands in a certain wavelength range [1].As a result, the HS imagery has very high spectral resolution, and is a three-dimensional data cube, of which two spatial dimensions contain the space information, and one spectral dimension at each pixel includes the high-dimensional reflectance vectors [2,3].Such HS image with abundant spectral information has been widely utilized in many domains, such as military surveillance [4], environmental monitoring [5], mineral exploration [6,7], and agriculture [8,9].However, due to the constraints of technical difficulties and budget, the HS image usually has low spatial resolution.Although the high spectral resolution is crucial for identifying the materials, high spatial resolution is also important for locating the objects with high accuracy.There are various techniques to improve the spatial resolution of the HS image.Hyperspectral image fusion is one of the important spatial resolution enhancement approaches.Panchromatic (PAN) sensors can provide the PAN imagery with high spatial resolution.Fusion of an HS image and a PAN image is able to obtain a fused HS image with high spectral and spatial resolution by integrating the spectral information of the HS image and the spatial information of the PAN image.
A large number of hyperspectral image fusion methods have been proposed, and can be roughly divided into five families [10].The first family is component substitution (CS), which first separates the spatial and spectral information of an HS image.The separated spatial component is then substituted by the PAN image, and a fused HS image can be obtained by applying the inverse transformation [11].The CS includes algorithms such as intensity-hue-saturation (IHS) [12][13][14], principal component analysis (PCA) [15][16][17], Gram-Schmidt (GS) [18], adaptive GS (GSA) [19], Brovey transform (BT) [20], and partial replacement adaptive CS (PRACS) [21].These CS based methods work well from a spatial aspect [19], and have fast and simple implementation [13].However, they may suffer from serious spectral distortion, cause by the difference between the PAN image and the substituted spatial component [22].The second family is multiresolution analysis (MRA) which aims to extract the spatial details of a PAN image through the multiscale decomposition or spatial filtering.The extracted spatial details are then injected into an HS image.Some well-known examples in the MRA family are smoothing filter based intensity modulation (SFIM) [23], decimated wavelet transform (DWT) [24], Laplacian pyramid [25], modulation transfer function (MTF) generalized Laplacian pyramid (MTF-GLP) [26], and MTF-GLP with high pass modulation (MTF-GLP-HPM) [27].The MRA algorithms have temporal coherence [28], and good spectral preservation performance.On the negative side, these MRA algorithms have heavy computational burden and complicated implementation when compared to CS-based algorithms [28].The CS and MRA approaches are the traditional fusion methods, and have been also extended from multispectral (MS) pansharpening to hyperspectral pansharpening.
The other three families, Bayesian methods, matrix factorization based methods, and hybrid methods, have been proposed recently.Bayesian methods transform a fusion problem into an explicit probabilistic framework, and then define a suitable prior distribution of interest to regularize the optimization model [29].Bayesian sparsity promoted Gaussian prior (Bayesian sparse) [30], Bayesian HySure [31], and Bayesian naive Gaussian prior (Bayesian naive) [32] belong to this class of hyperspectral pansharpening.Matrix factorization based methods employ the nonnegative matrix factorization (NMF) model [33], and utilize the estimated solution of the NMF model to generate the fused HS image.The matrix factorization family contains algorithms such as nonnegative sparse coding (NNSC) [34], and constrained nonnegative matrix factorization (CNMF) [35].Bayesian approaches and matrix factorization approaches perform well in terms of the preservation of spectral information.However, they have high computational cost.Hybrid methods combine algorithms from different families, for example, CS and MRA families, to form a new algorithm.Such obtained new algorithms generally take advantages of algorithms in both families [36].Examples include the curvelet and ICA fusion method [37], the guided filter PCA (GFPCA) method [38], and the non-linear PCA (NLPCA) and indusion method [39].
The key to hyperspectral pansharpening is to provide more spatial information while preserving the spectral information of the original HS image.In order to accomplish this goal, this paper presents a new hyperspectral image fusion algorithm based on structure tensor.In this work, the structure tensor which describes the geometry structure and spatial details is applied to the fusion of HS and PAN images for the first time.Traditional methods extract the spatial details only from the PAN image without considering the structure information of the HS image, and thus, cause spectral distortion or deficient spatial enhancement.The proposed method considers the spatial details of the HS and PAN images simultaneously.The spatial details of the PAN image are extracted by calculating and analyzing the structure tensor and its eigenvalues.The spatial details of the HS image are synchronously generated by the adaptive weighted method.In order to consider the HS and PAN images simultaneously and obtain the complete spatial details, an appropriate weighted fusion strategy is introduced to merge the extracted spatial information from the PAN image with the spatial information obtained from the HS image.To avert artifacts at the boundaries, a guided filter which is an edge-preserving filter is applied to the obtained merged spatial information image.Consequently, we can effectively provide spatial information and accomplish sufficient spatial enhancement.In order to maintain the spectral information, an injection gains matrix is constructed.This gains matrix can also further reduce the spatial distortion by a defined tradeoff parameter.After a desired gains matrix is constructed, the fused HS image is obtained by adding spatial details to the interpolated HS image.Extensive experiments have been conducted on both simulated and real hyperspectral remote sensing datasets to verify the excellent fusion performance in spatial and spectral aspects.
The rest of this paper is organized as follows.Section 2 briefly introduces the basic theory of structure tensor.The proposed hyperspectral image fusion method is described in detail in Section 3. In Section 4, the experimental results and analysis for different datasets are presented.Conclusions are drawn in Section 5.

Related Work
A structure tensor can represent the structure and spatial information of images and has been shown to be an important tool in the field of image analysis [40,41].The structure tensor has been successfully applied to many image processing problems, such as texture analysis [42], anisotropic filtering [43], and motion detection [44].
For a gray image I(x, y), the change generated by a shift (∆x, ∆y) can be described as where (∆x, ∆y) includes {(0, 1), (1, 0), (1, 1), (−1, 1)}, and w is a smooth window, such as a Gaussian window [40].Then, by using the first-order Taylor series I(x + ∆x, y + ∆y) = I(x, y) + I x ∆x + I y ∆y + O(∆x 2 , ∆y 2 ), the change r are described as where I x = ∂I ∂x and I y = ∂I ∂y are the horizontal and vertical components of the gradient vector.For the small shift, the change r can be simplified as where a matrix T is the structure tensor, defined as This structure tensor T is a semi-definite matrix and can be decomposed as where µ 1 and µ 2 are the nonnegative eigenvalues, and e 1 and e 2 are the eigenvectors corresponding to the two eigenvalues.The two nonnegative eigenvalues describe the structure information of an image.When µ 1 ≈ µ 2 ≈ 0, the windowed image region is the flat area.If µ 1 > µ 2 ≈ 0, the area belongs to the edge region.When µ 1 ≥ µ 2 > 0, this indicates a corner.The trace is the sum of the eigenvalues and the determinant is the product of the eigenvalues, and a thresholding is used to classify and detect a edge or a corner [40].For one pixel of an image, structure tensor matrix J is defined as where ∇I = [I x , I y ] T is the gradient operator, and • is matrix product.

Proposed Hyperspectral Image Fusion Algorithm
Figure 1 shows a diagram of the proposed method, which consists of the following steps.First, the spatial information of an HS image is obtained by using an adaptive weighted method.Then, an image enhancement approach is applied to the PAN image to sharpen the spatial information.This is followed up by a structure tensor which is introduced to extract the spatial details of the enhanced PAN image.Subsequently, the extracted spatial information of the HS and PAN images is merged via a matching weighted fusion method, and a guided filter is performed on the merged spatial information to prevent artifacts.Finally, an injection gains matrix is constructed to avoid the spectral and spatial distortion, and a fused image is produced through injecting the integrated spatial details into each band of the interpolated HS image. is the sum of the eigenvalues and the determinant is the product of the eigenvalues, and a thresholding is used to classify and detect a edge or a corner [40].For one pixel of an image, structure tensor matrix J is defined as x y I I I is the gradient operator, and  is matrix product.

Proposed Hyperspectral Image Fusion Algorithm
Figure 1 shows a diagram of the proposed method, which consists of the following steps.First, the spatial information of an HS image is obtained by using an adaptive weighted method.Then, an image enhancement approach is applied to the PAN image to sharpen the spatial information.This is followed up by a structure tensor which is introduced to extract the spatial details of the enhanced PAN image.Subsequently, the extracted spatial information of the HS and PAN images is merged via a matching weighted fusion method, and a guided filter is performed on the merged spatial information to prevent artifacts.Finally, an injection gains matrix is constructed to avoid the spectral and spatial distortion, and a fused image is produced through injecting the integrated spatial details into each band of the interpolated HS image.

Upsamping and Adaptive Weighted for the HS Image
For the same scene, let

X
. Here, m and M ( < m M) denote the image height of the HS and PAN images, respectively.n and N ( < n N ) denote the image width of these two images, and d denotes the number of the HS image bands.The low spatial resolution HS image is upsampled to the scale of the PAN image by the finite impulse response (FIR) filter interpolation method.The FIR filter interpolation method first performs zero interpolation on  Y l H =↑ Y l H (7) for l = 1, 2, . . ., d, where ↑ is the upsampling operation, Y H ∈ R M×N×d is the interpolated HS image, Y l H ∈ R M×N is the lth band of the interpolated HS image, and Y l H ∈ R m×n is the lth band of the original HS image.
For the purpose of extracting the spatial information of the HS image, an adaptive weighted method [19] is applied to the interpolated HS image.
where S H ∈ R M×N is the spatial information of the HS image, and [ω 1 , ω 2 , . . ., ω d ] T is the weight vector.To obtain the weights {ω l } l=1,...,d , the PAN image is first reduced to the same spatial scale of the low spatial resolution HS image.Let us denote this reduced PAN image as Y P .Then, let us assume that ŜH = ∑ d l=1 ω l Y l H .The optimal set of weights {ω l } l=1,...,d can be calculated by linear ridge regression to minimize the mse between Y P and ŜH .We obtain the closed-form solution of the weight ω l as follows for l = 1, 2, . . ., d, where () T is the transpose operation, and () −1 is the matrix inversion.In Equation ( 9), Y l H ∈ R m×n and ŶP ∈ R m×n are converted to the mn × 1 dimensional form to calculate the solution.

Image Enhancement and Structure Tensor Processing for the PAN Image
To sharpen the spatial structure information of the PAN image, image enhancement processing is applied to the PAN image.The spatial filtering method is adopted to sharpen the PAN image.Compared with the Laplace algorithm, Laplacian of Gaussian (LOG) image enhancement algorithm can improve the robustness to noise and discrete points.We choose the LOG enhancement algorithm to sharpen the PAN image.The LOG algorithm first reduces noise by Gaussian convolution filtering.Subsequently, Laplace operator is utilized to enhance the spatial details.The Laplacian filtered image is finally combined with the PAN image to obtain the enhanced PAN image.This LOG enhancement procedure can be described as where ŶP ∈ R M×N denotes the enhanced PAN image, f LOG (x, y) denotes the kernel function of LOG operator, * denotes the convolution operator, and c is a constant.If the central coefficient of the kernel In this work, the size of the kernel is set to 15 × 15.The central coefficient of the kernel is a negative value, and c is −1.Based on the principle of the LOG operator, the kernel function f LOG (x, y) is defined as where f G (x, y) is the Gaussian convolution function, defined as where σ is the standard deviation, and σ is set to 0.43.Thus, the kernel function f LOG (x, y) is calculated by In order to extract the spatial details of the enhanced PAN image, the structure tensor processing is introduced.Based on Equation ( 6), structure tensor matrix of the enhanced PAN image at pixel i is defined by where ŶPx = ∂ ŶP ∂x and ŶPy = ∂ ŶP ∂y are the horizontal and vertical components of the gradient vector on the enhanced PAN image, and T i ∈ R 2×2 is the structure tensor matrix at pixel i on the enhanced PAN image.To include the local spatial structure information, a Gaussian kernel function is convoluted with the above structure tensor. Px,i where g r represents a Gaussian kernel with standard deviation r, the kernel size and the standard deviation of the Gaussian kernel are set to 1 × 2 and 0.5, T i represents the resulting structure tensor matrix at pixel i, and P 11 P 12 P 12 P 22 simply represents the tensor T i .According to the content of related work, the structure tensor T i can be decomposed as the form shown in Equation (5).Two nonnegative eigenvalues which represent the spatial structure information are calculated using the following formula where ξ 1 and ξ 2 are the nonnegative eigenvalues for the structure tensor matrix shown in Equation (15).The values of the two nonnegative eigenvalues divide the structure information into three types, i.e., flat area, edge area, and corner.If ξ 1 and ξ 2 are near zero, the area of this pixel is the flat area.When ξ 1 > ξ 2 ≈ 0 and ξ 1 ≥ ξ 2 > 0, the area of this pixel belongs to the edge area and corner, respectively.
For an image, we consider that the effective spatial information includes edge region and corner.
The trace of a matrix, denoted by R, is the sum of the eigenvalues, and is also the sum of P 11 and P 22 .The determinant denoted by D, is the product of the eigenvalues.We test on a large number of enhanced PAN images to study the trace and determinant at each pixel.Figure 2 shows the trace and determinant at each pixel on two of the enhanced PAN images.For a pixel, if R is near zero, the two eigenvalues are all near zero, and the area is the flat area.If R is larger than zero, at least one of the nonnegative eigenvalues is greater than zero, and the area at this pixel belongs to edge region or corner.Similarly, when D is near zero, at least one of the eigenvalues is near zero, and the area is flat area or edge region.when D is larger than zero, the two eigenvalues at this pixel are all larger than zero, and this pixel is a corner.The edge regions and corners are important spatial information.Based on the analysis and study on numerous enhanced PAN images, we suggest the following guidelines.When R > 1 * 10 (−5) , the pixel is identified as edge region or corner, and the area of this pixel is the effective spatial information.Thus, the value of this pixel should be retained.Otherwise, the area of this pixel is classified as the flat area, and the value of this pixel is not retained.This procedure of extracting the spatial details of the enhanced PAN image can be described as where S P ∈ R M×N is the spatial information of the enhanced PAN image, S P,i is the value of the spatial information at pixel i, and ŶP,i is the value of the enhanced PAN image at pixel i. where is the spatial information of the enhanced PAN image, P ,i S is the value of the spatial information at pixel i , and  P,i Y is the value of the enhanced PAN image at pixel i .Figure 3 shows the spatial information obtained by the gradient methods and the structure tensor method.Figure 3a shows a PAN image.Figure 3b shows the enhanced PAN image which is sharpened by using the LOG image enhancement algorithm.Figure 3c,d show the spatial information extracted by the horizontal gradient processing and the vertical gradient processing, respectively.According to Equation (17), the spatial information extracted by the structure tensor method is obtained and shown in Figure 3e.As shown in Figure 3, the extracted spatial information of the horizontal gradient method and the vertical gradient method only retain part of edge information of the original image.By contrast, the spatial information obtained by the structure tensor method contains most of the edge and structure information.This illustrates the structure tensor processing method in this subsection can effectively extract the spatial details of the enhanced PAN image.Figure 3 shows the spatial information obtained by the gradient methods and the structure tensor method.Figure 3a shows a PAN image.Figure 3b shows the enhanced PAN image which is sharpened by using the LOG image enhancement algorithm.Figure 3c,d show the spatial information extracted by the horizontal gradient processing and the vertical gradient processing, respectively.According to Equation (17), the spatial information extracted by the structure tensor method is obtained and shown in Figure 3e.As shown in Figure 3, the extracted spatial information of the horizontal gradient method and the vertical gradient method only retain part of edge information of the original image.By contrast, the spatial information obtained by the structure tensor method contains most of the edge and structure information.This illustrates the structure tensor processing method in this subsection can effectively extract the spatial details of the enhanced PAN image.

Weighted Fusion of Spatial Details
As shown in Figure 1, H S contains the spatial information of the HS image, and P S retains the spatial details of the enhanced PAN image.For a same scene, the HS image and the PAN image all include spatial information, and the spatial information of the two images is different and complementary.The PAN image has more spatial information, but may not include the details about the spatial structure of the HS image.Most conventional approaches only extract the spatial

Weighted Fusion of Spatial Details
As shown in Figure 1, S H contains the spatial information of the HS image, and S P retains the spatial details of the enhanced PAN image.For a same scene, the HS image and the PAN image all include spatial information, and the spatial information of the two images is different and complementary.The PAN image has more spatial information, but may not include the details about the spatial structure of the HS image.Most conventional approaches only extract the spatial information from the PAN image, and not consider the spatial structure of the HS image.They may lead to spectral distortion or inadequate spatial enhancement.To obtain the complete spatial details and consider the spatial information of the HS and PAN images simultaneously, a weighted fusion method is presented to integrate the spatial information of the HS image with the spatial information of the PAN image.
where λ 1 and λ 2 are weight coefficients, S F ∈ R M×N is the complete spatial details, S F,i is the value of S F at pixel i, S P and S H are the spatial information of the enhanced PAN image and the HS image, respectively, S P,i and S H,i are the values of S P and S H at pixel i.Since the PAN image contains more spatial details compared with the HS image, λ 1 and λ 2 are set to 0.9 and 0.1, respectively.Subsequently, to avoid artifacts at the boundaries, a guided filter is applied to the obtained fused spatial information image.The guided filter is an edge-preserving filter.It can smooth the input image while transferring the structure information from the guidance image to the output image [45].The fused spatial information S F is served as both the guidance image and the input image.The filtered image which is the continuous and smooth result of the input image has the spatial structure information of the guidance image.Thus, the filtered output image which is continuous can avert artifacts at the boundaries, and preserves the spatial details of the fused spatial information S F , simultaneously.
According to the principle of the guided filter, the output image is a local linear transformation of the guidance image.This procedure are described as where S ∈ R M×N is the output image, v k is a local square window centered at pixel k, the local window size is set to 40, a k and b k are linear coefficients assumed to be constant, a i and b i are the average coefficients of all windows overlapping i, and |s| is the number of pixels in v k .The linear coefficients a k and b k are computed by minimizing the difference between the input image S F and the output image S while maintaining the linear transformation in the window v k .The solution of a k and b k is obtained by calculating the linear ridge regression model.
where θ k and χ 2 k are the mean and variance of the guidance image S F in v k , S F,k is the mean of the input image S F in v k , ε is a regularization parameter, and parameter ε is set to 10 −4 .

Constructing Gains Matrix and Injecting Spatial Details
Before including the integrated continuous spatial details into the interpolated HS image, a injection gains matrix is constructed to control the spectral and spatial distortion.To reduce the spectral distortion, the ratios between each pair of the HS bands should preserve unchanged.It is significant for maintaining the spectral information to preserve such ratios.It is depicted as for l = 1, 2, . . ., d, where G ∈ R M×N×d denotes the injection gains matrix, and G l ∈ R M×N denotes the lth band of the gains matrix.For the sake of ensuring the spatial quality, we define a following tradeoff parameter to regulate the amount of the injected spatial details.
for l = 1, 2, . . ., d, where τ is the defined tradeoff parameter.The influence and setting of the tradeoff parameter τ have been expounded in the experimental part.Then, the spatial details are injected into the interpolated HS image to generate the fused HS image for each band.
where • is element-wise multiplication.

Experimental Results and Discussion
In this section, we design the experimental setup, and analyze the setting of the tradeoff parameter.To evaluate the fusion performance of the proposed method, four hyperspectral remote sensing datasets are used for experiments.

Experimental Setup
The proposed STF method is tested on four public hyperspectral datasets, which are shown in Table 1.Table 1 summarizes their characteristic.Pavia University dataset, Moffett field dataset, and Washington DC dataset are semi-synthetic dataset.Given a reference high spatial resolution HS image, the simulated low spatial resolution HS image and the simulated PAN image are generated.The simulated PAN image is generated by averaging the bands of the visible range of the reference image.According to the Wald's protocol [46], the low spatial resolution HS image is simulated by applying a 9 × 9 Gaussian kernel blurring and downsampling to the reference HS image, and the downsampling factor is 5. Hyperion dataset is a real dataset to evaluate the capability of the proposed method in real hyperspectral remote sensing image.
The proposed method is compared with six hyperspectral pansharpening methods, namely MTF-GLP with High Pass Modulation (MTF-GLP-HPM) [27], Bayesian sparsity promoted Gaussian prior (Bayesian Sparse) [30], constrained nonnegative matrix factorization (CNMF) [35], guided filter PCA (GFPCA) [38], Brovey transform (BT) [20] and principal component analysis (PCA) [15].MTF-GLP-HPM (abbreviated as MGH) belongs to multiresolution analysis (MRA) class.Bayesian Sparse fusion method (abbreviated as BSF) is one of the Bayesian methods.The CNMF algorithm and the GFPCA fusion approach belong to matrix factorization based methods and hybrid methods, respectively.These four methods which give the state-of-the-art fusion performance were all presented in recent years.The BT and PCA method which are the simple and classical fusion methods belong to component substitution (CS) family.These compared methods cover the recent effective works and the existing five categories which have been described in introduction section.In the experiments, the number of endmembers is set to 20 for the CNMF approach.For the GFPCA algorithm, the window size and the blur degree of the guided filter are set to 17 and 10 −6 respectively.The pixel values of every test image are normalized to the range of 0-1.0 to reduce the amount of calculation.
To assess the capability of the proposed fusion method, several widely used evaluation indices are adopted, i.e., cross correlation (CC) [47], spectral angle mapper (SAM) [47], root mean squared error (RMSE), and erreur relative global adimensionnelle de synthse (ERGAS) [48].CC is a spatial index and the best value is 1.SAM measures the degree of spectral similarity.The RMSE and ERGAS indices show the global quality of the fused image.The optimal value of SAM, RMSE, and ERGAS are 0. The experiments for the four datasets were all performed using MATLAB R2015b, and tested on a PC with an Intel Core i5-7300HQ CPU @ 2.50 GHz and 8 GB memory.

Tradeoff Parameter Setting
In the proposed method, the complete spatial details are finally included into the interpolated HS image.In order to reduce the spatial distortion, we define the tradeoff parameter τ to control the amount of the injected spatial details.The setting of the tradeoff parameter τ has an important impact on the spatial quality.Since the tradeoff parameter regulates the spatial distortion, the best value of τ can be chosen via the spatial index.Thus, for the sake of concluding the influence of τ, the proposed approach is tested on the Moffett field dataset and the Washington DC dataset to observe the CC values with different τ settings.Figure 4 shows the CC index values with different tradeoff parameter settings.When the tradeoff parameter τ is set to 0.1, the proposed method acquires the optimal CC values.We have also performed on numerous hyperspectral remote sensing images, and discovered that τ = 0.1 also provides the largest CC values.Therefore, for the proposed method, the tradeoff parameter τ is set as 0.1.
HS image.In order to reduce the spatial distortion, we define the tradeoff parameter τ to control the amount of the injected spatial details.The setting of the tradeoff parameter τ has an important impact on the spatial quality.Since the tradeoff parameter regulates the spatial distortion, the best value of τ can be chosen via the spatial index.Thus, for the sake of concluding the influence of τ , the proposed approach is tested on the Moffett field dataset and the Washington DC dataset to observe the CC values with different τ settings.Figure 4 shows the CC index values with different tradeoff parameter settings.When the tradeoff parameter τ is set to 0.1, the proposed method acquires the optimal CC values.We have also performed on numerous hyperspectral remote sensing images, and discovered that τ = 0.1 also provides the largest CC values.Therefore, for the proposed method, the tradeoff parameter τ is set as 0.1.

Experiments on Simulated Hyperspectral Remote Sensing Datasets
In this part, the experiments are performed on three simulated hyperspectral remote sensing datasets to evaluate the fusion performance of the proposed method.Three datasets are Pavia University dataset, Moffett field dataset, and Washington DC dataset, respectively.

Pavia University Dataset
Figure 5a shows the reference high resolution HS image of Pavia University dataset.Figure 5dj shows the fused HS images of each method for the Pavia University dataset.By comparing the fused images with the reference HS image visually, it can be observed that the GFPCA method looks blurry.This is because the GFPCA method utilizes the guided filter to transfer the spatial details from the PAN image to the HS image, but the spatial details are injected insufficiently.The BT approach provides enough spatial information, but the fused image obtained by the BT approach has spectral distortion in some areas, such as the trees and roads.Although the CNMF method has good fidelity of the spectral information, the CNMF method has deficient improvement of the spatial quality in some marginal areas, such as the edges of the trees and roofs.By contrast, we find that the PCA, BSF, MGH, and proposed STF method have the satisfactory fusion performance, and the MGH and STF methods achieve the better capability in preserving the spectral information compared with the PCA and BSF methods.In order to further compare the fusion performance, Figure 6 shows the error images (absolute values) of the competing methods for Pavia University dataset.Yellow means large

Experiments on Simulated Hyperspectral Remote Sensing Datasets
In this part, the experiments are performed on three simulated hyperspectral remote sensing datasets to evaluate the fusion performance of the proposed method.Three datasets are Pavia University dataset, Moffett field dataset, and Washington DC dataset, respectively.

Pavia University Dataset
Figure 5a shows the reference high resolution HS image of Pavia University dataset.Figure 5d-j shows the fused HS images of each method for the Pavia University dataset.By comparing the fused images with the reference HS image visually, it can be observed that the GFPCA method looks blurry.This is because the GFPCA method utilizes the guided filter to transfer the spatial details from the PAN image to the HS image, but the spatial details are injected insufficiently.The BT approach provides enough spatial information, but the fused image obtained by the BT approach has spectral distortion in some areas, such as the trees and roads.Although the CNMF method has good fidelity of the spectral information, the CNMF method has deficient improvement of the spatial quality in some marginal areas, such as the edges of the trees and roofs.By contrast, we find that the PCA, BSF, MGH, and proposed STF method have the satisfactory fusion performance, and the MGH and STF methods achieve the better capability in preserving the spectral information compared with the PCA and BSF methods.In order to further compare the fusion performance, Figure 6 shows the error images (absolute values) of the competing methods for Pavia University dataset.Yellow means large differences, and blue means small differences.From Figure 6, it can be seen that the proposed SFT method shows the smallest differences between the fused HS image and the reference HS image.
Quantitative results of different fusion methods are shown in Table 2, which indicates that the proposed method achieves the best performance.The SAM, RMSE, and ERGAS values of the proposed method are the best, and the CC value of the proposed method is the second best.These results demonstrate that the proposed STF algorithm performs well in both the objective and subjective evaluations.method shows the smallest differences between the fused HS image and the reference HS image.
Quantitative results of different fusion methods are shown in Table 2, which indicates that the proposed method achieves the best performance.The SAM, RMSE, and ERGAS values of the proposed method are the best, and the CC value of the proposed method is the second best.These results demonstrate that the proposed STF algorithm performs well in both the objective and subjective evaluations.differences, and blue means small differences.From Figure 6, it can be seen that the proposed SFT method shows the smallest differences between the fused HS image and the reference HS image.
Quantitative results of different fusion methods are shown in Table 2, which indicates that the proposed method achieves the best performance.The SAM, RMSE, and ERGAS values of the proposed method are the best, and the CC value of the proposed method is the second best.These results demonstrate that the proposed STF algorithm performs well in both the objective and subjective evaluations.

Moffett Field Dataset
The fusion results obtained by each method for Moffett field dataset are displayed in Figure 7d-j.Visually, the PCA and BT methods have high fidelity in rendering the spatial details, but cause spectral distortion.This is due to the mismatching between the PAN image and the replaced spatial component.Compared with the PCA and BT approaches, the GFPCA seems to have less spectral distortion, but the spatial details are not sufficient.The fused result obtained by the CNMF method has good spectral fidelity, but the edges and spatial structures are not sharp enough, especially in the rural areas.The visual analysis shows that the BSF, MGH, and STF methods give the better fused results.The MGH, and STF algorithms are clearer, especially in the rural regions and rivers.However, the pansharpened image obtained by the MGH approach is too sharp in some areas, such as the tall buildings in urban areas.By contrast, the proposed STF method has superior performance in terms of providing the spatial information while preserving the spectral information.Table 3 reports the objective quantitative results for each method.From Table 3, we can apparently see that the proposed STF method has the largest CC value, and smallest SAM, RMSE, and ERGAS values.

Moffett Field Dataset
The fusion results obtained by each method for Moffett field dataset are displayed in Figure 7dj.Visually, the PCA and BT methods have high fidelity in rendering the spatial details, but cause spectral distortion.This is due to the mismatching between the PAN image and the replaced spatial component.Compared with the PCA and BT approaches, the GFPCA seems to have less spectral distortion, but the spatial details are not sufficient.The fused result obtained by the CNMF method has good spectral fidelity, but the edges and spatial structures are not sharp enough, especially in the rural areas.The visual analysis shows that the BSF, MGH, and STF methods give the better fused results.The MGH, and STF algorithms are clearer, especially in the rural regions and rivers.However, the pansharpened image obtained by the MGH approach is too sharp in some areas, such as the tall buildings in urban areas.By contrast, the proposed STF method has superior performance in terms of providing the spatial information while preserving the spectral information.Table 3 reports the objective quantitative results for each method.From Table 3, we can apparently see that the proposed STF method has the largest CC value, and smallest SAM, RMSE, and ERGAS values.The spectral reflectance curve difference values between the reference image and each fused image on one single pixel are compared to assess the spectral preservation performance.Figure 8 shows the spectral reflectance difference values on four pixels which are marked in yellow in Figure 7a.As shown in Figure 8, a gray dotted line is served as the benchmark.The closer the spectral reflectance difference values between the reference image and the fused image get to the dotted line, the more the spectral information is preserved.From Figure 8, it can be observed that the spectral reflectance difference values of the proposed method are most approximate to the dotted line  The spectral reflectance curve difference values between the reference image and each fused image on one single pixel are compared to assess the spectral preservation performance.Figure 8 shows the spectral reflectance difference values on four pixels which are marked in yellow in Figure 7a.As shown in Figure 8, a gray dotted line is served as the benchmark.The closer the spectral reflectance difference values between the reference image and the fused image get to the dotted line, the more the spectral information is preserved.From Figure 8, it can be observed that the spectral reflectance difference values of the proposed method are most approximate to the dotted line (benchmark line) on the whole.These results validate the proposed method has the smallest difference when compared to other fusion methods.
(benchmark line) on the whole.These results validate the proposed method has the smallest difference when compared to other fusion methods.

Washington DC Dataset
The visual experimental results obtained by each method for the Washington DC dataset are shown in Figure 9d-j.In spite of good spatial quality, the fused images produced by the PCA and BT approaches cause spectral distortion in the roads and buildings.According to visual comparison of these results, the fused image generated by the MGH method has good fidelity of the spectral information.However, the MGH method suffers from spectral distortion in some areas, such as the roof areas.Compared with the PCA and MGH methods, the GFPCA algorithm has less spectral distortion.But the result of the GFPCA method has insufficient enhancement in the spatial aspect, and the fused image is blurry.The BSF, and STF method provide more spatial details compared to the CNMF method, since the CNMF method loses a little spatial information in the edges, such as in the roads and buildings.In contrast, the BSF, and STF method enhance more spatial information while preserving the spectral information of the original HS image.

Washington DC Dataset
The visual experimental results obtained by each method for the Washington DC dataset are shown in Figure 9d-j.In spite of good spatial quality, the fused images produced by the PCA and BT approaches cause spectral distortion in the roads and buildings.According to visual comparison of these results, the fused image generated by the MGH method has good fidelity of the spectral information.However, the MGH method suffers from spectral distortion in some areas, such as the roof areas.Compared with the PCA and MGH methods, the GFPCA algorithm has less spectral distortion.But the result of the GFPCA method has insufficient enhancement in the spatial aspect, and the fused image is blurry.The BSF, and STF method provide more spatial details compared to the CNMF method, since the CNMF method loses a little spatial information in the edges, such as in the roads and buildings.In contrast, the BSF, and STF method enhance more spatial information while preserving the spectral information of the original HS image.(benchmark line) on the whole.These results validate the proposed method has the smallest difference when compared to other fusion methods.

Washington DC Dataset
The visual experimental results obtained by each method for the Washington DC dataset are shown in Figure 9d-j.In spite of good spatial quality, the fused images produced by the PCA and BT approaches cause spectral distortion in the roads and buildings.According to visual comparison of these results, the fused image generated by the MGH method has good fidelity of the spectral information.However, the MGH method suffers from spectral distortion in some areas, such as the roof areas.Compared with the PCA and MGH methods, the GFPCA algorithm has less spectral distortion.But the result of the GFPCA method has insufficient enhancement in the spatial aspect, and the fused image is blurry.The BSF, and STF method provide more spatial details compared to the CNMF method, since the CNMF method loses a little spatial information in the edges, such as in the roads and buildings.In contrast, the BSF, and STF method enhance more spatial information while preserving the spectral information of the original HS image.To further compare the fusion capability, the error images (absolute values) of different approaches for the Washington DC dataset are shown in Figure 10.Yellow means large differences, and blue means small differences.As shown in Figure 10, the SFT method shows the smallest differences in most regions, which testifies the preeminent fusion performance of the proposed method.The values of objective quality evaluation of each method for the Washington DC dataset are tabulated in Table 4.As shown in Table 4, for the proposed method, the CC, SAM, and RMSE values are the best, which prove once again that the proposed method is superior to the compared hyperspectral pansharpening methods.To further compare the fusion capability, the error images (absolute values) of different approaches for the Washington DC dataset are shown in Figure 10.Yellow means large differences, and blue means small differences.As shown in Figure 10, the SFT method shows the smallest differences in most regions, which testifies the preeminent fusion performance of the proposed method.The values of objective quality evaluation of each method for the Washington DC dataset are tabulated in Table 4.As shown in Table 4, for the proposed method, the CC, SAM, and RMSE values are the best, which prove once again that the proposed method is superior to the compared hyperspectral pansharpening methods.

Experiments on Real Hyperspectral Remote Sensing Datasets
In this part, the experiments are performed on the real hyperspectral remote sensing dataset to assess the fusion capability of the proposed method.The real HS dataset is the Hyperion dataset.Figure 11a,b show the low spatial resolution original HS image and the high spatial resolution PAN image.The fusion results of the competing methods are shown in Figure 11d-j.By a visual comparison of the pansharpened images, the PCA method has significant spectral distortion.For the GFPCA method, the spatial details are injected insufficiently, and the fused HS image looks fuzzy.The BSF method is better than the PCA method in preserving the spectral information, while the spatial details is a little less in the regard to some regions, such as the roads and grass.By contrast, the BT, CNMF, MGH, and STF method achieve the superior property.Since the low spatial resolution original HS image is unclear, the spectral information of the BT, CNMF, MGH, and STF method cannot accurately be compared.In the spatial aspect, the proposed STF method has the better performance, since it adds more spatial details.

Experiments on Real Hyperspectral Remote Sensing Datasets
In this part, the experiments are performed on the real hyperspectral remote sensing dataset to assess the fusion capability of the proposed method.The real HS dataset is the Hyperion dataset.Figure 11a,b show the low spatial resolution original HS image and the high spatial resolution PAN image.The fusion results of the competing methods are shown in Figure 11d-j.By a visual comparison of the pansharpened images, the PCA method has significant spectral distortion.For the GFPCA method, the spatial details are injected insufficiently, and the fused HS image looks fuzzy.The BSF method is better than the PCA method in preserving the spectral information, while the spatial details is a little less in the regard to some regions, such as the roads and grass.By contrast, the BT, CNMF, MGH, and STF method achieve the superior property.Since the low spatial resolution original HS image is unclear, the spectral information of the BT, CNMF, MGH, and STF method cannot accurately be compared.In the spatial aspect, the proposed STF method has the better performance, since it adds more spatial details.For the real HS dataset, a reference high spatial resolution HS image is commonly not available.The original low resolution HS image can be served as the reference image.According to the Wald's protocol [45], the available original HS image is degraded to generate a degraded HS image.The available PAN image is also degraded to obtain a degraded PAN image.The degraded HS and PAN images are fused by each method to obtain the fusion results.These fusion results are compared to the original HS image to evaluate the objective fusion performance of different methods.Table 5 reports the objective fusion results for each method.From Table 5, we can apparently see that the proposed STF has the largest CC value, and smallest SAM and ERGAS values.These results demonstrate that the proposed algorithm obtains the excellent fusion performance.

Conclusions
In this paper, a novel hyperspectral remote sensing image fusion using structure tensor approach is presented.The proposed method is believed to be the first work using the structure tensor to fuse the HS and PAN images.The PAN image is first sharpened by the LOG image enhancement method.Then, structure tensor is applied to the enhanced PAN image to extract the spatial information, while the spatial details of the HS image are obtained by an adaptive weighted method, simultaneously.To obtain the complete spatial details and accomplish spatial consistency, a suitable weighted fusion algorithm is proposed to integrate the extracted spatial details of the HS and PAN images.Experimental results from the Pavia University, Moffett field, Washington DC, and Hyperion datasets have shown that the proposed method is superior to the other fusion methods in retaining the spectral information and improving the spatial quality.In the future, we will investigate the issue of how to determine the weight coefficients λ 1 and λ 2 adaptively.For the real HS dataset, a reference high spatial resolution HS image is commonly not available.The original low resolution HS image can be served as the reference image.According to the Wald's protocol [45], the available original HS image is degraded to generate a degraded HS image.The available PAN image is also degraded to obtain a degraded PAN image.The degraded HS and PAN images are fused by each method to obtain the fusion results.These fusion results are compared to the original HS image to evaluate the objective fusion performance of different methods.Table 5 reports the objective fusion results for each method.From Table 5, we can apparently see that the proposed STF has the largest CC value, and smallest SAM and ERGAS values.These results demonstrate that the proposed algorithm obtains the excellent fusion performance.

Conclusions
In this paper, a novel hyperspectral remote sensing image fusion using structure tensor approach is presented.The proposed method is believed to be the first work using the structure tensor to fuse the HS and PAN images.The PAN image is first sharpened by the LOG image enhancement method.Then, structure tensor is applied to the enhanced PAN image to extract the spatial information, while the spatial details of the HS image are obtained by an adaptive weighted method, simultaneously.To obtain the complete spatial details and accomplish spatial consistency, a suitable weighted fusion algorithm is proposed to integrate the extracted spatial details of the HS and PAN images.Experimental results from the Pavia University, Moffett field, Washington DC, and Hyperion datasets have shown that the proposed method is superior to the other fusion methods in retaining the spectral information and improving the spatial quality.In the future, we will investigate the issue of how to determine the weight coefficients λ 1 and λ 2 adaptively.

Figure 1 .
Figure 1.Diagram of the proposed hyperspectral image fusion algorithm.( m and M ( < m M) represent the image height of the original HS and PAN images, respectively.n and N ( < n N) represent the image width of the two images, and d represents the number of the HS image bands.).

Y
represent the high spatial resolution PAN image.Fusing the HS and PAN images aims to obtain a fused high spatial resolution HS image

Figure 1 .
Figure 1.Diagram of the proposed hyperspectral image fusion algorithm.(m and M (m < M ) represent the image height of the original HS and PAN images, respectively.n and N (n < N ) represent the image width of the two images, and d represents the number of the HS image bands.).

3. 1 .
Upsamping and Adaptive Weighted for the HS Image For the same scene, let Y H ∈ R m×n×d represent the original low spatial resolution HS image, and Y P ∈ R M×N×1 represent the high spatial resolution PAN image.Fusing the HS and PAN images aims to obtain a fused high spatial resolution HS image X H ∈ R M×N×d .Here, m and M (m < M) denote the image height of the HS and PAN images, respectively.n and N (n < N) denote the image width of these two images, and d denotes the number of the HS image bands.The low spatial resolution HS image is upsampled to the scale of the PAN image by the finite impulse response (FIR) filter interpolation method.The FIR filter interpolation method first performs zero interpolation on the low spatial resolution HS image, and then carries out the FIR filter processing to obtain the interpolated image.

Figure 3 .
Figure 3. Spatial information of an enhanced PAN image extracted by the gradient methods and the structure tensor method.(a) PAN image; (b) Enhanced PAN image; (c) Horizontal gradient method; (d) Vertical gradient method; (e) Structure tensor method.

Figure 3 .
Figure 3. Spatial information of an enhanced PAN image extracted by the gradient methods and the structure tensor method.(a) PAN image; (b) Enhanced PAN image; (c) Horizontal gradient method; (d) Vertical gradient method; (e) Structure tensor method.

•
Pavia University dataset: Pavia University dataset was acquired by the Reflective Optics System Imaging (ROSIS) over Pavia, Italy.The HS image consists of 115 bands covering the spectral range 0.4-0.9µm.The dimensions of the experimental PAN image are 250 × 250 with the spatial resolution of 1.3 m.The test HS image is of size 50 × 50 pixels with the spatial resolution of 6.5 m.For Pavia University dataset, 103 bands are applied to experimentation.• Moffett field dataset: Moffett field dataset is a standard data product which has been provided by the Airborne Visible Infrared Imaging Spectrometer (AVIRIS) [29].This dataset contains 224 bands in the spectral range of 0.4-2.5 µm.The size of the PAN and HS images that are used for experimentation are 250 × 160 and 50 × 32.The spatial resolution of the experimental PAN and HS images are 20 m and 100 m, respectively.The water absorption and noise corrupted bands are removed, and 176 bands are used for experimentation.• Washington DC dataset: Washington DC dataset is an airborne hyperspectral data over the Washington DC Mall.This dataset includes 210 bands in the spectral range of 0.4-2.4µm.Bands in the opaque atmosphere region are removed from the dataset, and 191 bands are left for experimentation.The test PAN image is of size 250 × 250 pixels, and the size of the HS image is of 50 × 50 pixels.• Hyperion dataset: The EO-I spacecraft launched in 2000, and carried two primary instruments which were Advanced Land Imager (ALI) and Hyperion [29].Hyperion instrument can provide the HS image which contains 242 bands covering the spectral range of 0.4-2.5 µm.ALI instrument is capable of providing the PAN image.For Hyperion dataset, 128 bands are applied to experimentation.The size of the test PAN image is 216 × 174 with the spatial resolution of 10 m.The experimental HS image is of size 72 × 58 pixels with the spatial resolution of 30 m.

Figure 4 .
Figure 4. CC values with different tradeoff parameter settings.

Figure 4 .
Figure 4. CC values with different tradeoff parameter settings.

Figure 8 .
Figure 8. Spectral reflectance difference values comparison on four single pixels shown in Figure 7a.

Figure 8 .
Figure 8. Spectral reflectance difference values comparison on four single pixels shown in Figure 7a.
Remote Sens. 2018, x, x FOR PEER REVIEW 14 of 19

Figure 8 .
Figure 8. Spectral reflectance difference values comparison on four single pixels shown in Figure 7a.

Table 1 .
Characteristic of the four used datasets.

Table 2 .
Quantitative results of different fusion methods for Pavia University dataset.

Table 2 .
Quantitative results of different fusion methods for Pavia University dataset.

Table 2 .
Quantitative results of different fusion methods for Pavia University dataset.

Table 3 .
Quantitative results of different fusion methods for Moffett field dataset.

Table 3 .
Quantitative results of different fusion methods for Moffett field dataset.

Table 4 .
Quantitative results of different fusion methods for Washington DC dataset.

Table 4 .
Quantitative results of different fusion methods for Washington DC dataset.

Table 5 .
Quantitative results of different fusion methods for Hyperion dataset.

Table 5 .
Quantitative results of different fusion methods for Hyperion dataset.