Multiband Image Fusion via Regularization on a Riemannian Submanifold

: Multiband image fusion aims to generate high spatial resolution hyperspectral images by combining hyperspectral, multispectral or panchromatic images. However, fusing multiband images remains a challenge due to the identiﬁability and tracking of the underlying subspace across varying modalities and resolutions. In this paper, an efﬁcient multiband image fusion model is proposed to investigate the latent structures and intrinsic physical properties of a multiband image, which is characterized by the Riemannian submanifold regularization method, nonnegativity and sum-to-one constraints. An alternating minimization scheme is proposed to recover the latent structures of the subspace via the manifold alternating direction method of multipliers (MADMM). The subproblem with Riemannian submanifold regularization is tackled by the projected Riemannian trust-region method with guaranteed convergence. The effectiveness of the proposed method is demonstrated on two multiband image fusion problems: (1) hyperspectral and panchromatic image fusion and (2) hyperspectral, multispectral and panchromatic image fusion. The experimental results conﬁrm that our method demonstrates superior fusion performance with respect to competitive state-of-the-art fusion methods.


Introduction
With the development of Earth observation satellites, multiband images (e.g., hyperspectral (HS) images, multispectral (MS) images and panchromatic (PAN) images), can be obtained.These data are widely applied in various applications [1,2], such as monitoring land use [3] and ice sheets [4].However, these images have critical trade-offs between the spatial resolution and the spectral resolution due to hardware limitations.For example, an HS image has a high spectral resolution with a reduced spatial resolution.An MS image offers a moderate spatial resolution with only a few bands.PAN images provide a much better spatial resolution.
Due to the increasing availability of optical imaging systems, they are of interest to multiband image fusion, including the fusion of HS and PAN images (HS-PAN image fusion), hyperspectral and multispectral (HS-MS) image fusion or simultaneous hyperspectral, multispectral and panchromatic (HS-MS-PAN) image fusion.Specifically, the multiband image fusion problem refers to the process of recovering a 3D data cube from two or three degraded data cubes.Thus, proper modeling of these data plays an essential role in achieving these goals [5].
There has been a remarkable effort in the community toward multiband image fusion [6][7][8][9].For example, HS-PAN image fusion (also known as hyperspectral pan sharpening) [6,8,9] can be broadly divided into six classes: component substitution (CS), multiresolution analysis (MRA), Bayesian, matrix factorization, deep learning and hybrid meth-ods.For improvement of the fusion performance, jointly fusing HS, MS and PAN images has been investigated [10].However, this method did not take into account the lowdimensional structure of multiband image, which has recently gained much interest [11,12].More importantly, it is a common trend to identify and represent the intrinsic structure of a multiband image.
A promising method for multiband imaging is subspace representation [12][13][14][15][16], which assumes the latent subspace to be of a low rank.A first successful attempt was presented in [13,14] by exploiting the local low-rank subspace of the spectral signatures in each patch.In the work of Zhang et al. [15], a similar local geometry represented by the group spectral embedding regularization method was proposed for HS-MS image fusion.Furthermore, Kanatsoulis et al. [16] exploited the multi-dimensional structure of HS imaging and approximated the spectral images via canonical polyadic decomposition.It has been demonstrated that these methods give competitive fusion performance.However, there are several challenges to identifying and tracking the underlying subspace across varying modalities.It is difficult to assert the uniqueness of the latent subspace to preserve the significant structure in the data, which is sensitive to the existence of outliers.
To address the aforementioned issues, a Riemannian submanifold regularization method is proposed to identify and capture the local geometry of the estimated HS image by exploiting the geometry structure in an abundance matrix [14].Specifically, this regularization method is implicitly imposed on the linear spectral mixture model (i.e., abundance matrix).The advantages of the Riemannian submanifold regularization method are that it can represent the abundance correlation and recover the subspace embedded in the HS image.These advancements motivated us to investigate the smooth structure of the underlying subspace for integrating the spatial and spectral information.

Scope and Contributions
In this work, this paper proposes an efficient multiband image fusion method with Riemannian submanifold regularization and additional constraints.The key idea is to establish a new multiband image fusion model to estimate the endmember and the abundance simultaneously.It overcomes the two problems in the following ways.First, the problem related to the identifiability of the latent subspace is addressed by the Riemannian submanifold regularization method, nonnegativity and sum-to-one constraints.Second, the subproblem of tracking the underlying subspace is reformulated as an optimization problem on a Riemannian submanifold.
The contributions can be summarized as follows: 1.
An efficient multiband image fusion model utilizing the Riemannian submanifold regularization method is proposed.This model is characterized by rank equality constraints with matrix manifold, nonnegativity and sum-to-one constraints.This is a new problem formulation for investigating the latent structures across varying modalities and resolutions.

2.
An alternating minimization scheme is proposed to recover the latent structures of the subspace using the framework of the manifold alternating direction method of multipliers.An efficient projected Riemannian trust region method with guaranteed convergence is adopted to track the latent subspace.

3.
The proposed method is validated in two applications: (1) hyperspectral and panchromatic image fusion and (2) the fusion of hyperspectral, multispectral and panchromatic images.The experimental results show that the proposed method is more effective than the competitive state-of-the-art fusion methods.

Related Work 1.2.1. HS-PAN Image Fusion
HS-PAN image fusion aims to perform pan sharpening with HS imaging [6].It is a special case of HS-MS image fusion [17].The early HS-PAN image fusion methods involve combining an MS image with a PAN one.Traditional pan sharpening methods are extended for HS-PAN image fusion.Two representative methods include CS and MRA.Some well-known CS methods are principal components analysis (PCA) [18], Brovey transform (BT) [19] and the intensity-hue-saturation (IHS) method [20].MRA methods are characterized by a multi-resolution decomposition, such as wavelet transform methods [21,22], intensity modulation with a smoothing filter [23] and the generalized Laplacian pyramid [24].Recently, many MS-PAN image fusion methods have been extended to fusing HS images with PAN ones.Vivone et al. [25] extended and investigated MRAand CS-based methods for fusing HS images.A guided filter in the PCA domain was proposed [18] for sharpening HS images, and it provided good performance.Recently, deep learning methods within the framework of a convolutional neural network [8,26] (CNN) were investigated.These methods showed promising fusion performance because of the ability to extract high-level features.However, deep learning methods require extremely large datasets for training.

HS-MS Image Fusion
HS-MS image fusion has attracted increasing interest in recent years [7,27].Among the existing state-of-the-art HS-MS image fusion methods, three dominant frameworks are matrix factorization (MF) approaches, tensor factorization (TF) approaches and deep learning approaches.
The key idea of MF approaches is to factorize the HS image into the spectral basis and the corresponding coefficients.The advantages of MF methods derive from the spectral signature represented by sparse representation [28,29] or a low rank [5,13,30].The sparse representation methods model the HS image via sparse dictionary learning.Recently, Dong et al. [31] promoted the nonlocal self-similarities in an HS image via the promising alternative framework of nonlocal sparse representation.The low-rank methods regard the spectral signatures represented by a low-dimensional subspace.For example, Simoẽs et al. [30] made use of the total variation regularization method to promote sparsity in the gradient domain.However, these methods rely on accurate estimation of the sensor parameters, such as the spectral response function.
TF approaches have been an active topic.These methods view an HS image datacube as a three-dimensional tensor, where the HS image is factored into different lowdimensional subspaces.Dian et al. [32] proposed a sparse tensor factorization-based fusion model by utilizing the properties of nonlocal self-similarity in the spatial domain and global correlation of an HS image in the spectral domain.Coupling the sparse tensor factorization method with a core tensor regularizer [11] was proposed for the fusion of HS and MS images, which provides promising fusion performance.Furthermore, the fusion framework with the spatial spectral graph-regularized low-rank tensor decomposition method was developed for HS-MS image fusion [33].In the work of Dian et al. [12], the prior low tensor train rank of the grouped 4D tensors was exploited.However, the TF methods involve the operation of singular value decomposition (SVD), which may lead to high computational cost.
Deep learning approaches have been successfully applied to this task [34][35][36].For example, Yang et al. [34] developed a deep two-branche convolutional neural network for image fusion, which takes the spectral correlation of HS and MS images into consideration.Huang et al. [35] proposed a deep hyperspectral image fusion network (DHIF-Net) via an iterative spatio-spectral regularization scheme.Specifically, they exploited the spatiospectral regularization and physical imaging models simultaneously.Recently, Xie et al. [36] introduced an interpretable deep network, utilizing the low-rank knowledge of an HS image and the imaging models of HS and MS images.However, deep learning-based approaches still have some challenges, such as estimating the architecture and complexity of the deep network and the high inference cost.
Furthermore, most existing MF-based methods for HS-MS image fusion share two common limitations: (1) identifiability of the underlying subspace across modalities and (2) tracking the underlying subspace due to the non-uniqueness of SVD.Although, these methods construct the subspace representation for a pair of multiband images with varied modalities or resolutions [30], few studies established the relationship between the identifiability and the tracking of the latent subspace for HS-MS image fusion.Furthermore, fusing multiband images more than two times is still an open issue.Meanwhile, the HS image is assumed to be of a low rank [30,37], which indicates that it lies in a common subspace across modalities.This motivated us to propose a new multiband image fusion model for recovering and tracking the latent subspace.It provides an alternative way to investigate the underlying structure of these high-dimensional datasets.
This paper is organized as follows.In Section 2, some preliminaries about the manifold optimization method are presented.In Section 3, the fusion model and the notations are presented.In Section 4, the details of the proposed alternating minimization scheme are presented.In Section 5, the experimental results are illustrated.A detailed discussion is presented in Section 6.At last, the conclusion is provided in Section 7.

Preliminaries for Riemannian Manifold Optimization
In this section, some ingredients of Riemannian manifold optimization are provided for finding the optimal solution to the problem [38,39].The Riemannian manifold is referred to as a manifold due to its tangent spaces endowed with a smoothly varying metric.This paper focuses on the manifold of low-rank matrices of rank r embedded in R n×m (i.e., M r ), where r < m, n.Embedded manifolds have many inherited properties from R n×m .
We focus on the optimization of functions f defined on Riemannian manfolds (M r , g): where g is a Riemannian metric defined by an inner product on the tangent spaces.However, there exists a challenge in this problem.
Recently, a class of operations called retractions was introduced to deal with this problem [38].In the context of Euclidean optimization, a gradient step update x t − ∇ f (x t ) takes the model problem outside of the manifold M r at every step t of the optimization algorithm.Thus, it has to be pulled back onto the manifold.An ideal operation is called exponential mapping, which maps the tangent vector to a point along a geodesic curve.Unfortunately, it may be computationally expensive to calculate the geodesic curve.

Riemannian Gradient and Tangent Space
A Riemannian optimization algorithm typically conducts a line search or solves a model problem in a tangent space [38].Therefore, the gradient and the Hessian of an objective function on a Riemannian manifold are two basic concepts.
Let it be given that M is a smooth submanifold of a Euclidean space, and let x ∈ M. The tangent space of M at x, denoted by T x M, is a group of derivatives of all the smooth curves passing through x, and The tangent space is a vector space, and its element in T x M corresponds to a linear mapping from the set of smooth, real-valued functions in the neighborhood from x to R. T x M is equipped with an inner product (or metric) Consider a smooth function f : M → R defined on a Riemannian submanifold.The Riemannian gradient of f at x (i.e., grad f (x)) is the unique tangent vector is the directional derivative of f along the direction ξ x , specifically when M r is a Riemannian submanifold embedded in R m×n .A useful property of embedded manifolds is that their Riemannian gradients are defined as the orthogonal projection on the tangent space of the gradient of f ; in other words, we have where ∇ f (x) is the Euclidean gradient of f at x.It should be noted that P T x M is viewed as a retraction [38,40].
The Hessian of f at x (i.e., Hess f (x)) is a mapping from T x M to T x M. Furthermore, when M is a Riemannian submanifold of a Euclidean space R m×n , the Riemannian Hessian of f is expressed as follows: where Dgrad f (x)[ξ x ] denotes a classical directional derivative of grad f (x) along the direction ξ x .Similarly, an orthogonal projection is performed.

Retractions
Retractions can find a point on the manifold that is in the direction of the gradient.They can approximate the geodesic gradient flow on the manifold [38].In other words, retractions allow us to move on a manifold (i.e., moving in the direction of a tangent vector) while staying on the manifold.It provides an alternative to exponential mapping when designing optimization algorithms on manifolds.For the manifold M r , a retraction is defined as follows: Mathematically, the retractions are defined as a mapping, where R x is a smooth mapping from T x M to M such that R x (0) = x.The corresponding differential of retraction at zero is an identity mapping (i.e., d dt R x (tξ)| t=0 = ξ, ∀ξ ∈ T x M), which is referred to as the local rigidity condition [38].

Degradation Model for Multiband Imaging
Let P ∈ R n s ×n denote the matricization of the latent multiband image.We denote n s to be the number of spectral bands, while n = n x × n y is the vectorization of the HS image at the band n x .The goal is to recover an HS image of the high spatial resolution P from K observations (i.e., Y k ∈ R n s k ×n k , k = 1, • • • , K, where n s k and n k represent the number of bands and pixels, respectively).Moreover, we assume that n s k ≤ n s , and n k = n/d 2 k .d k denotes the scale factor.
Usually, the responses of the imaging sensors are treated as a number of linear transformations.The corresponding spectral and spatial degradations are denoted as the operators.The input multiband images are assumed to be geometrically registered.The degradation model can be expressed as where R k ∈ R n k ×n s is the spectral response of the imaging sensor, B k denotes the point spread function of the kth imaging sensor, S k represents a downsampling operator for the spatial dimensions subject to S T k S k = I n k and N k indicates a spectral independent perturbation matrix.In particular, Y k involves some preprocessing steps [41], such as radiometric calibration and geometric correction.
One commonly used approach for the spectral representation of an observed multiband image is the linear mixture model [42] (LMM), which reveals the relationship between mixtures and endmembers.This model assumes that every pixel of an HS image can be decomposed into a convex combination of a small number m n s of endmembers [43].This indicates that the number of endmembers is usually sparse (i.e., the number of nonzero entries is very small compared with the size of the abundances).The latent multiband image is the product of an endmember matrix E and the abundance matrix X.Thus, the LMM model can be expressed as follows: where E ∈ R n s ×m represents the spectral signatures matrix with respect to the endmembers, X ∈ R m×n is the matrix of endmember abundances stored in columns and N ∈ R n s ×n denotes the measurement noise or model error [42].It can be seen that each row of the matrix P consists of all the pixels in a given spectral band.By substituting Equation (2) into Equation (1), we can obtain a multiple multiband image formation model: where Nk denotes the kth multiband image's additive perturbation matrix, which can be is expressed as follows: Given the observation Y k and its corresponding endmember matrix E, the measurement P can be recovered from the the abundance matrix X.However, additional constraints on E or X should be used to improve the quality of the solutions.Thus, additional information is introduced (i.e., the Riemannian submanifold regularization method, nonnegativity and sum-to-one constraints).

Proposed Fusion Model
For the multi-source information modeling, naturally, the prior information about X is required.To reconstruct P, the proposed fusion model can be formulated as an optimization problem: where M r = {X ∈ M|rank(X) = r} represents the rank equality constraints with the matrix manifold, the symbol rank(•) denotes the rank of the matrix, r is a positive integer, the term α 2 X 2 F is the Riemannian submanifold regularization (i.e., the constraint set M r ), α is a positive regularization parameter, I Ω (•) denotes an indicator function and the expression Ω = {X|X ≥ 0, 1 T m X = 1 T n } indicates that all the elements of X are greater than or equal to zero and sum to one.The proposed framework for multiple multiband image fusion is presented in Figure 1.

Riemannian Submanifold Regularization
We observe that multiband image lies on a low-dimensional manifold M. Specifically, we take a different approach which exploits the problem structure of rank equality constraints with a matrix manifold (i.e., the latent structures of the abundance matrix).More specifically, the structural information of the abundance matrix is captured and represented by the geometric structure of the matrix manifold M r .This paper proposes a regularization method to extract a common subspace from multiple observations indirectly, which provides more flexibility to handling data with nonlinear structures.It can be seen that this regularization process is implemented through rank equality constraints with a matrix manifold.
Definition 1 (Riemannian submanifold regularization).Let X ∈ M r be a smooth manifold [38] with the constraint set of matrices of rank being r at most, where r < min(m, n) is a positive integer.To stabilize the fusion model in Equation ( 3), additional a priori knowledge about the abundance matrix X is defined as follows: where M r is viewed as a submanifold of a dimension (m + n − r)r embedded in the Euclidean space R m×n .Here, the constraint rank(X) = r denotes the rank of matrix X as equal to r, which can be viewed as a rank equality constraint with a matrix manifold.The Tikhonov regularization term [44] (i.e., α 2 X 2 F ) ensures the available solution.
However, it is challenging to deal with the rank equality constraints with a matrix manifold, where the underlying subspace is spanned by endmembers indirectly.The corresponding subspace of the target image P lies in a low-dimensional manifold.Moreover, these constraints are nonconvex, which may be numerically expensive.
In this paper, the manifold geometry [45] of Riemannian submanifold regularization (i.e., the set of rank equality constraints) is utilized to address this issue.Thus, both the constraint search space and the geometry of the smooth embedded submanifold should be considered.Extensive studies of the optimization method on a Riemannian manifold led to the retraction-based framework using the differential geometry structure of the underlying manifold.

Nonnegativity and Sum-to-One Constraints
In addition to the geometric structure given by Riemannian submanifold regularization, it is reasonable to assume that the coefficients should satisfy the following constraints [42,43]: m means an m × 1 vector with all ones.The combined constraints preserve the inherent characteristics of the solutions.In other words, it makes a solution more feasible.The nonnegativity and sum-to-one constraints can be rewritten as follows: After introducing the indicator function I(X), we can obtain the following formulation: ∈ Ω.

Alternating Minimization Scheme
Given the fusion problem in Equation ( 4), we propose a numerical solution using the framework of the manifold alternating direction method of multipliers.The constrained search space is equipped with the Riemannian structure of the submanifold, and the nonnegativity and sum-to-one constraints are considered.A key challenge of this problem is the geometry of the underlying manifold.
After applying the variable splitting strategy, we solve the problem in an alternating minimization way.Several auxiliary variables Λ 1,••• ,k , W, Z are introduced.We have the following the minimization problem: Then, the augmented Lagrangian formulation can be expressed as where are Lagrangian multipliers and µ ≥ 0 is a penalty parameter.Then, the problem has the following formulations: {Λ Thus, the problem (i.e., Equation ( 7)) breaks down into three subproblems.The details are summarized in Algorithm 1.The computational complexity of the proposed method is max(O(r 3 ), O(mn log(mn))), where r < min(m, n): Note that the subproblem in Equation ( 14) can be cast as an optimization problem on a Riemannian submanifold.This problem can be solved by the retraction-based optimization method on the manifold.In addition, the solution to the subproblem Z has a closed-form expression.The value of µ is updated at each iteration.The optimization of L with respect to each of the variables X, Λ is described in Appendix A. For the variables W and Z, the solutions are described in detail in Appendix B.

Convergence Analysis
This subsection studies the behavior of the convergence of the proposed method.Note that the convergence behavior and results of the ADMM were widely studied in [49].Thus, we analyzed the convergence of Algorithm A1 for the subproblem with Riemannian submanifold regularization, which makes use of the retraction-based manifold optimization theory [38]: Theorem 1.If there exists a global minimizer W * to the problem in Equation (A6), then the first-order necessary optimality condition holds: where T W M r denotes the tangent cone of the constraint set M r at W * .
Proof.Suppose that W n ∈ M r is a sequence generated by Equation (A6); in other words, we have lim Since f n is bounded and coercive, it gives Thus, this indicates that the sequence W n is uniformly bounded.Furthermore, all points belong to the set W, which is compact.Then, it follows that the sequence is bounded.
It is of interest to note that {W n } has an accumulation point W * .As the feasible set M r is closed, and f n : M r → R is continuous, therefore, we conclude that W * has a global minimizer.
Then, the first-order necessary optimality condition of the global minimizer is presented directly: It is noticeable that rank(W * ) = r.Thus, T W M r turns into a linear subspace in R m×n (i.e., the tangent space at W * ).The inequality condition can be where P T W * M r denotes the orthogonal projection onto the linear subspace T W * M r .
Theorem 2. Assume that κ is a positive constant and { n } is a sequence of nonnegative scalars.If {W n } ⊂ M r is a sequence generated by Algorithm A1, then we have If {W n s } is a convergent subsequence of {W n } with the limit point W * ∈ M r , then rank(W * ) = r, and lim s→∞ n s = 0. Finally, W * satisfies the first-order optimality condition for the problem in Equation (A6).

Proof. It can be seen that
where the conditions in Equations ( 16) and (17) indicate that lim n→∞ W n − W n+1 = 0.If {W n s } is a subsequence that converges to {W * } with a rank r, then we have rank(W n s ) = r, ∀s.
Since M r is a smooth manifold in a neighborhood of W * , the condition in Equation ( 17) indicates that for all sufficiently large values of s.The continuity of the smooth mappings (W) ∈ M r P T W M r holds true by passing s → ∞.
For studying the convergence of W n+1 , we have the following direct lemma (Lemma 1).In particular, it can be concluded that Lemma 1.Let {W n } ⊂ M r be a sequence generated by Algorithm A2.Then, the following statements hold true: Proof.It can be seen that the statement is the first part of the proof for Theorem 2.

Experimental Settings
We conducted two experiments to assess the performance of the proposed method: (1) hyperspectral and panchromatic image fusion and (2) hyperspectral, multispectral and panchromatic image fusion.All experiments were performed on a laptop with Windows 10 64 bit and implemented and tested in MATLAB 2019b.
For HS-MS-PAN image fusion, the proposed method was compared with some stateof-art methods [5,30,54,59,60].These methods contain two procedures.First, two popular methods for pan sharpening were used separately (i.e., the MT method and band-dependent spatial detail (BD)) [59].Second, RF-, CF-and HY-based methods were used for HS-PAN image fusion.In addition, the endmembers presented in the HS image were extracted with a vertex component analysis (VCA) algorithm [46].Meanwhile, sparse unmixing with the variable splitting and augmented Lagrangian (SUnSAL) method was utilized to extract the abundance matrix [47].

Datasets and Quality Measures
Four real-world datasets were used [61]: the Botswana, Washington DC Mall, Indian Pines and Kennedy Space Center datasets.The tests in our experiments were performed in a semi-synthetic way.The intensities of these images were arranged from 0 to 1.The observed three input images (PAN, MS and HS images) from the referred HS image (ground truth) were generated according to Wald's protocol [62].Gaussian white noise with zero mean was added to each band of the three images such that the band-specific signal-tonoise ratio (SNR) was 30 dB for the MS and HS images and 35 dB for the PAN image.The bands related to the noisy measurements were removed from the reference image.A brief description of the datasets is provided: 1.
Botswana dataset: The HS image was collected by a Hyperion sensor over Okavango Delta, Botswana in 2001-2004.The number of bands in our experiment was 145.The spatial resolution is 400 × 240.The observed scene contains the land cover type information.

2.
Indian Pines dataset: The imaging sensor is the airborne visible infrared image spectrometer (AVIRS) airborne hyperspectral instrument.Images were captured over northern and western Indiana in the USA.The number of bands was 200.The dimensions of the HS images are 400 × 400.The scenery of this dataset includes housing, built structures, and forests.

3.
Kennedy Space Center dataset: This dataset was captured at Kennedy Space Center in Florida, United States by an AVIRIS.This dataset comprises 176 bands with an image size of 500 × 400.The content of the HS image contains various land cover types.4.
Washington DC Mall dataset: This was collected by the hyperspectral digital image collection experiment (HYDICE) over the Washington DC Mall in the United States.
A portion of the original data was used.The resolution of the HS image is 400 × 300.The number of spectral bands was 191.
Four widely used quality measures [27] were used in our experiments.These measures included the relative dimensionless global error in synthesis (ERGAS), spectral angle mapper (SAM) and universal image quality index (UIQI).

Hyperspectral and Panchromatic Image Fusion
Given an HS image, the goal of HS-PAN image fusion is to fuse a PAN image with the same spectral information while keeping some innovative content.All the methods were applied to the three datasets (i.e., Botswana, Indian Pines and Washington DC Mall).
The blurring kernel was Gaussian with a size of 3 × 3 and standard deviation of 0.8.The scaling factor was four.
Table 1 gives a quantitative evaluation of all the competing methods.The best fusion results are marked in bold.We can see that the proposed method delivered better ERGAS, SAM and UIQI scores than its competitors in most cases.Meanwhile, the SFIM-and MT-HPM-based methods introduced significant spectral distortions.Remarkably, the proposed method performed relatively satisfactory on all three datasets, demonstrating the effectiveness of the proposed method.
The visual comparison results are illustrated in Figure 2. The ground truth image is provided in Figure 2a.The visual results of the referred fusion methods are demonstrated.The selected part is enlarged and provided in the bottom right corner.The PCA-GF method seems to be insufficient because the fused image lost some local details.The visual result of the proposed method is presented in Figure 2p.As can be seen from Figure 2, the proposed method gave better fused images than the other methods in terms of smooth areas and texture regions.In other words, the proposed method provided the fused results closest to the ground truth in Figure 2a.We conducted the experiments to jointly fuse HS, MS and PAN images.In our experiments, these images were fused in four ways: PAN + HS, (MS + HS) + PAN, (PAN + MS) + HS and PAN + MS + HS.We compared the proposed method with two stateof-art fusion methods: the RF method [54], and HY method [30].The experiments were performed under the same experimental setting with 100 iterations.
The quantitative results of all the referred methods are presented in Table 2.The best results are indicated in bold.The referenced fusion methods were combined for fusing HS, MS and PAN images.Note that the proposed approach outperformed the other methods in almost all the quality measures.Table 2 also presents the computational time of the fusion methods across the complete datasets.We can see that the proposed method generally demonstrated competitive performance.These results suggest that our method could recover the detailed structures of the underlying HS image well.The visual fusion results are illustrated in Figures 3 and 4. Examples of two datasets (i.e., the Indian Pines and Washington DC Mall datasets) are shown separately in Figures 3a and 4a, respectively.The visual fusion results of our method are presented individually in Figures 3b and 4b.The reconstructed images of the HY method are presented in Figures 3c and 4c one by one.In Figures 3d and 4d, the per pixel residuals of these two datasets are presented visually, where all the pixels of each dataset were sorted by magnitude and plotted.We can observe that our method was better than the other methods in terms of the normalized root mean square error (NRMSE).
The reconstruction errors of the referred methods with respect to the original data are illustrated in Figure 3e and Figure 3f for the Indian Pines dataset and Figure 4e and Figure 4f for the Washington DC Mall dataset, respectively.These figures indicate that our method can achieve better fusion performance compared with the competing methods, which supports the effectiveness of the proposed method.We have shown that the underlying subspace was captured and tracked by the proposed method effectively.The numerical experiments show that the proposed fusion model with hybrid constraints can reconstruct more local details in comparison with the referred fusion methods.The outcomes of the proposed method (e.g., the quality measures in Tables 1 and 2 and visual comparisons from Figures 2-4 provide evidence that the involvement of a Riemannian submanifold regularization method led to the improvement in fusion performance.The fusion experiments indicate that the use of a modeling strategy and the alternating minimization scheme is critical for the success of the proposed method.

Discussion
We showed that the proposed fusion model and its alternating minimization scheme can track the underlying subspace effectively.This difference might be related to the method used.It is possible that the geometric representation of the latent subspace across modalities provides an efficient way to describe the geometry of the subspace.One limitation of the proposed method is that the input multiband images should be registered spatially.
In summary, the proposed method with the Riemannian submanifold regularization method is useful for estimating high-fidelity hyperspectral images from multiband images.Although the local structure of the abundance matrix was selectively exploited, the geometry of the constraint manifold provided a more intuitive understanding of the geometric structure of the underlying subspace.The proposed framework provides new insights into existing multiband image fusion methods.

Conclusions
In this paper, an efficient multiband image fusion method was proposed to obtain a high-fidelity fused image using Riemannian submanifold regularization, nonnegativity and sum-to-one constraints.Furthermore, the identifiability and tracking of the underlying subspace was completed by an alternating minimization scheme.To exploit the internal structures of the abundance matrix, the constrained search subspace was reformulated as an optimization problem on a smooth Riemannian submanifold.An efficient projected Riemannian trust region method was developed.The experiments for two multiband image fusion problems showed that the proposed method outperformed the state-of-the-art competing fusion methods.It should be noted that there also remains room for further study.For example, some developed image fusion models account for spectral variability.a comprehensive introduction to the Riemannian manifold optimization, the reader is referred to [38].For simplicity, the subproblem in Equation ( 14) is rewritten as follows: This problem has a closed-form solution.It can be viewed as the projection of 1 α+µ (X n − G n−1 ) onto manifold M r .The tangent space is denoted by T W M r , which contains all tangent vectors to M r at W. The Riemannian gradient of f n W (W) at W on M r is denoted by grad f n W (W). Every point W ∈ M r corresponds to a tangent space T M r .An element ξ ∈ T W M r is a tangent vector at W. Each tangent space is associated with an inner product, which we denote by •, • , and the corresponding norm, denoted by • .For a tangent vector in the tangent space T W M r , we have where D f n W [ξ] denotes the directional derivative of f n W at W along the direction ξ.The Riemannian gradient with respect to W is derived as follows: Based on the definitions of the tangent vector ξ and tangent space T W M r , the Hessian of f n W at W ∈ M r is denoted by Hess f n W (W), which is a linear operator from T W M r to T W M r , and Hess f n W [ξ] = ∇ ξ grad f n W (W), ∀ξ ∈ T W M r .After considering a factorizationbased second-order retraction, the Riemannian Hessian of f n W (W) with respect to W is derived as follows: where W = UΣV T represents the compact SVD of V, consisting of a diagonal matrix Σ ∈ R r×r and two orthogonal matrices U ∈ R m×r and V ∈ R n×r .In this paper, we assume that grad f n W (W n ) = 0. Inspired by the Riemannian trust region method [64,65], a quadratic function is defined for approximating f n W around W n in the tangent space T W M r .For some ∆ k ≥ 0 in each iteration k, it is very important to obtain a search direction ξ k .To evaluate how well the model (Equation (A7)) approximates in the neighborhood of 0 W n .This can be calculated by Meanwhile, a predefined bound is defined (i.e., ∆ > 0).Moreover, there are different ways to solve the trust region subproblem in Equation (A7).One solution to this subproblem is the truncated conjugate gradient method of Steihaug and Toint [65].

Figure 1 .
Figure 1.Framework of the proposed multiple multiband image fusion method.

Figure 3 .Figure 4 .
Figure 3. Visual fusion results of different algorithms on Indian Pines dataset with color image composites of the bands (30, 24 and 5): (a) ground truth; (b) proposed method and (c) HY method.(d) NRMSE curves from Indian Pines dataset.(e) Reconstruction errors of our method.(f) Reconstruction errors of HY method.

Figure 4 .
Figure 4. Visual fusion results for different algorithms on Washington DC Mall dataset and color image composites of the bands (30, 24 and 5): (a) ground truth; (b) the proposed method and (c) HY method.(d) NRMSE curves from Washington DC Mall dataset.(e) Reconstruction errors of our method.(f) Reconstruction errors of HY method.

Table 1 .
Numerical results of hyperspectral and panchromatic image fusion for three datasets.

Table 2 .
Quantitative evaluation of fusing hyperspectral, multispectral and panchromatic images on four datasets.