Weighted Group Sparsity-Constrained Tensor Factorization for Hyperspectral Unmixing

: Recently, unmixing methods based on nonnegative tensor factorization have played an important role in the decomposition of hyperspectral mixed pixels. According to the spatial prior knowledge, there are many regularizations designed to improve the performance of unmixing algorithms, such as the total variation (TV) regularization. However, these methods mostly ignore the similar characteristics among different spectral bands. To solve this problem, this paper proposes a group sparse regularization that uses the weighted constraint of the L 2,1 norm, which can not only explore the similar characteristics of the hyperspectral image in the spectral dimension, but also keep the data smooth characteristics in the spatial dimension. In summary, a non-negative tensor factorization framework based on weighted group sparsity constraint is proposed for hyperspectral images. In addition, an effective alternating direction method of multipliers (ADMM) algorithm is used to solve the algorithm proposed in this paper. Compared with the existing popular methods, experiments conducted on three real datasets fully demonstrate the effectiveness and advancement of the proposed method. used to improve the sparsity. The ﬁnal experiments prove the superiority of the proposed WSCTF compared with other state-of-the-art methods.


Introduction
Hyperspectral remote sensing images have a wide range of applications in the field of earth observation because of the rich spectral information [1][2][3]. However, the increase in spectral resolution usually results in a decrease in spatial resolution due to the influence of spectral splitting [4]. As a result, hyperspectral images generally have low spatial resolution, which leads to the inclusion of multiple substances in the image pixels. Therefore, mixed pixels are widely present in the hyperspectral images (HSI). The purpose of hyperspectral unmixing is to extract all pure materials (called endmembers) in the HSI and the percentage of endmembers in each pixel (called abundances) [5].
To solve the above-mentioned unmixing problem, numerous models have been proposed, the most popular of which is the linear mixed model (LMM). Under the LMM, each of the mixed pixels is regarded as the linear expression form of the endmembers, and the expression coefficients of endmembers constitute the abundance matrix [6] of the image. Among them, nonnegative matrix factorization (NMF) has been favored by researchers for its clear physical meaning and simple optimization process. The purpose of NMF is to find a base matrix of the original data and the expression coefficients under this base matrix [7]. In unmixing task, the optimized base matrix is the endmember matrix while the expression coefficient matrix is the abundance matrix. However, the solution of the simple NMF model is a non-convex problem, which leads to unstable unmixing results [8]. Therefore, to improve the performance of unmixing, many improved NMF-based unmixing methods have been proposed, for instance, sparse-based and spatial-based.
According to the prior of the distribution of ground objects, the endmembers in the mixed pixels in HSI are much smaller than the endmember spectral library [9]. Therefore, the abundance matrix is reasonably sparse. As a result, the sparsity constraint of abundance matrix becomes one of the most popular constraints. The L 0 regularization is the most ideal sparse regularizer [10]. Unfortunately, the NMF-based unmixing algorithm with L 0 sparsity constraint is an NP-hard problem [11]. As an improvement, the L p norm is widely used in the unmixing frameworks based on NMF to improve the stability and effectiveness of the algorithm. Guo et al. [12] uses the L 1 regularization to relax the L 0 regularization while [13] uses the L 2 regularization to constrain the abundance matrix. L 1/2 has also been proven to make the abundance matrix get a good sparse result [14,15]. Some methods also use the L 2,1 regularization to simultaneously explore the sparse structure in the abundance map [16], which has been proven to have good performance in the unmixing algorithm.
Besides, the spatial distribution of ground objects has the characteristics of local similarity and global similarity [17,18]. That is, there is a certain spatial similarity structure in HSI. Therefore, spatial similarity constraints are also really popular technics to improve the effectiveness of NMF-based methods. Lu et al. [19] propose a manifold regularization, which learns the global similarity structure of data to constrain the abundance matrix and force the abundance vectors of similar pixels to be similar. Manifold constraints have also proved by [20,21] to be a really effective way to explore the spatial structures. In [22], it is considered that the abundance of a pixel is the consistence as the average value of its neighboring pixels to maintain the smooth characteristics of HSI. In addition to these, the total variation regularization [23] and its variation [24] are also applied to the NMFbased unmixing algorithms, which can effectively retain the smoothing relationship of the image in the abundance matrix. In [25], the clustering algorithm is used to explore the global similarity relationship of pixels in the HSI, and it is combined with the NMF framework to improve the performance of unmixing.
The NMF-based methods mentioned above have significantly improved the performance of unmixing. Unfortunately, this kind of algorithms has a great flaw. The NMF-based framework reshapes the original 3-D HSI into a 2-D matrix before unmixing. This method of image reshaping severely destroys the 3-D cube structure of HSI, and it is difficult to completely explore the missing spatial information even if spatial regularization constraints are subsequently used [26]. To slow down the loss of information and make better use of all the spatial and spectral information in HSI, many researchers have begun to use nonnegative tensor factorization (NTF) instead of NMF framework to carry out unmixing tasks. The NTF framework directly treats HSI as a 3-D tensor, and regards the image as a whole during the optimization process, which can completely transfer all the spatial relationships of the original image to the abundance tensor.
Zhang et al. [27] innovatively applies tensor factorization (TF) to unmixing for the first time, which obtains performance comparable to NMF framework. The canonical polyadic decomposition-based (CPD) method is used to extract endmember matrix in [28]. However, this kind of methods decomposes the image into a finite number of rank-1 tensors, the number of which is generally much larger than the actual number of endmembers. Another TF method, tucker factorization, is also applied to the unmixing task in [29]. Regrettably, tucker factorization often fails to correspond the unmixing result with the actual distribution of objects. To make the result of TF method has a clear physical meaning, Qian et al. [26] propose a matrix-vector-based NTF (MV-NTF) method, which decomposes the HSI into multiple component tensors, then each component tensor is expressed by the outer product of a matrix and a vector. This method greatly promotes the application of NTF framework in unmixing task. In addition, many researchers add some effective regularizations to improve the effectiveness of the MV-NTF method. For instance, a spatial low-rank tensor regularization is proposed by [30,31] to maintain the sparsity of the abundance tensor. Sun et al. [32] uses a local spatial similarity constraint to explore the local smooth features of the data. A regularization for exploring the low-rank spatial structure of data is also introduced in [33]. The purpose is to preserve the HSI properties completely in the abun-dance tensor to improve the performance of unmixing. In addition, TV regularization has also been added to the NTF framework, such as [23,34].
Although the above methods have good performance in hyperspectral unmixing, the NTF-based models still have room for improvement in combining spatial and spectral information. According to [24], the difference image obtained after the image is differentiated along the horizontal and vertical directions is also sparse, which not only has zero elements, but also has zero connected regions in space. Therefore, a weighted group sparsity-constrained tensor factorization (WSCTF) method is proposed in this paper. Specifically, the HSI is differentiated along the horizontal and vertical directions at the same time to obtain two three-dimensional difference images firstly, which can explore the sparse structure of the HSI. Subsequently, the difference images are folded along the mode-1 and mode-2 to obtain two sparse matrices. Then, the L 2,1 norm is used to constrain the sparse structure of difference matrix, which can not only maintain the sparseness of the difference image, but also constrain the zero connected region. To enhance the sparsity, the proposed method WSCTF also provides a weight for the sparsity constraint in vertical and horizontal directions. In addition, the L 2,1 regularization is used to constrain the sparsity of the abundance tensor. In summary, the contributions of this paper can be concluded as: • A sparse constrained regularization is proposed to explore the sparse structure of the HSI differential image in the horizontal and vertical directions. In addition, weight coefficients are used to enhance sparsity. • The L 2,1 norm is utilized to constrain the sparse regularizer, and the L 2,1 norm is embedded in the NTF framework to maintain the sparsity of the abundance tensor. • The proposed algorithm WSCTF uses the alternating direction method of multipliers to iteratively optimize. Experiments on three real data prove the effectiveness of our algorithm.
The paper is arranged as follows. Section 2 introduces the concepts related to tensor and the unmixing framework based on NTF. The proposed method WSCTF and its optimization are described in detail in Section 3. The synthetic and real experimental results are analyzed in Section 4. Section 5 is the conclusion of this paper.

Notation
According to experiences, a matrix is a two-dimensional array, and a tensor is an extension of the matrix. In this paper, an italic letter is used to represent a scalar, e.g., y and I. A vector is expressed by a regular lowercase letter, e.g., y ∈ R I×1 . A matrix is given using a bold captial letter, e.g., M ∈ R I 1 ×I 2 , and a tensor is defined as a calligraphic letter, e.g., Y ∈ R I 1 ×I 2 ×...×I N .
Tensor Mode: Each dimension of a tensor is called a mode. For instance, the tensor Y ∈ R I 1 ×I 2 ×I 3 has 3 modes. Mode-n Unfolding: It can reshape a tensor to a matrix form. For example, given the tensor Y ∈ R I×J×K , the mode-n unfolding of Y is represented as: in which Y <1> is the mode-1 unfolding of tensor Y. Y <2> and Y <3> are the mode-2 and mode-3 unfolding, respectively.
Frobenius Norm: The Frobenius Norm of tensor Y ∈ R I×J×K is given by: Mode-n Product: Given a tensor Y, it can be expressed as Y = A ×n M, in which A ×n denotes the mode-n unfolding of tensor A.

NTF Unmixing Method
According to [35], the linear spectral mixing model can be represented as a linear combination by the endmember matrix and abundance maps. Specifically, given hyperspectral data Y ∈ R I×J×K , then it can be written as: where A ∈ R I×J×P is the abundance tensor. A× 3 denotes the mode-3 unfolding of tensor A. And M ∈ R I×J and N ∈ R I×J×K are the endmember matrix and noise tensor existing in HSI, respectively. Besides, I, J and K are the dimensions of hyperspectral data, and P is the number of endmembers. As a result, the unmixing model based on tensor factorization can be given by in which || · || F denotes the Frobenius norm. Since hyperspectral unmixing is a practical image processing process, there must be non-negative constraint here. In addition, the sum of the abundance coefficients inside each pixel should be one. These two properties create two constraints in the function (4). Specifically, the objective function (4) can be updated as: in which A× 1 is mode-1 unfolding of the abundance tensor A. Unfortunately, the above function (5) is a non-convex problem, which needs to be optimized for both the abundance A and the endmember matrix M. To improve the performance of the algorithm, Ref. [36] adds the L 2,1 sparsity constraint to the NTF unmixing model. Therefore, it can be given as: In addition, some spatial constraints are also added to the NTF unmixing model to explore the spatial information of HSI, especially TV regularization [34]. Specifically, the NTF unmixing model based on the TV regularization can be described as follows: where A TV is the TV regularization, and λ TV denotes the weight parameter. Since the TV regularization has achieved excellent results in the spatial relationships of data, especially in maintaining the smoothness of the data, it has achieved comparable performance on the hyperspectral unmixing, which has been proved in [37,38]. However, the existing methods are still unable to fully explore some important characteristics of data. For instance, few work consider continuity in the direction of the spectral dimension. In addition, according to [39], the three-dimensional difference images obtained after also have group sparsity. Based on these ideas, this paper explores the three-dimensional spatial structure information of the image, including the spectral similarity of the difference images and the sparsity of the spatial group.

Proposed Method
In response to the questions mentioned above, a new regularization is designed to explore the spectral continuity and group sparsity of difference images of HSI. As a result, a weighted group sparsity constraint non-negative tensor factorization (WSCTF) method for hyperspectral unmixing is proposed. The proposed algorithm and its optimization method are described in detail in this section.

WSCTF Model
According to the existing experience, the TV regularization is widely used to explore the spatial smooth structure in HSI. Specifically, the TV regularizer can be expressed as follows: where D denotes the difference operator. D 1 and D 2 are vertical and horizontal difference, respectively. || · || 1 is the l 1 norm. In detail, given the spatial coordinate of a abundance tensor (i, j, p), then the difference results in the vertical and horizontal directions can be described as: Based on the above analysis, it can be concluded that the difference images in the horizontal and vertical directions are also three-dimensional tensors. According to the prior, the difference image has continuity in the spectral dimension. In addition, the difference image has extremely high sparsity. In the difference image, not only many elements are zeros, but also some connected regions with zeros elements. Therefore, a new weighted sparsity regularization is designed to explore these properties. Specifically, the regularization can be given as: in which W 1 and W 2 are aiming at enhancing the sparsity of difference images. A ×1 and A ×2 are the mode-1 and mode-2 unfolding of tensor A. || · || 2,1 denotes the l 2,1 norm, which can be used to explore the group sparsity of difference image. Therefore, the objective function (7) can be improved by: where λ is regularization parameter. In addition, to be more prepared to explore the sparse structure of the abundance tensor A, a weight constraint is also added to the sparse regularization. Finally, the objective function of WSCTF proposed in this paper can be expressed as: in which W s is the weight to promote the sparsity of abundance tensor. In the proposed method, not only the spatial neighborhood smoothing information is considered, but also the sparsity characteristics of the difference image, especially the group sparsity, are described. The optimization method of objective function (12) is explained in detail in the next subsection.

Optimization
The Formula (12) is a non-convex problem, which is directly difficult to solve. According to the exist experience, the ADMM algorithm is used to update each variable, which can be solved by decomposing the original problem into multiple sub-problems.
(1) Update M: For the function (12), it first considers the update of the endmember matrix M. At this time, the abundance tensor A is fixed, so the objective function can be expressed as: It can directly obtain the updated rule of endmember M as: in which Y ×3 and A ×3 are the mode-3 unfolding of tensors. . * and ./ are multiplication and division operators. In addition, by introducing six intermediate variables G 1 , G 2 , G 3 , G 4 , G 5 and G 6 , the Formula (12) can be rewritten as: In the above Formula (15), G 5 and G 6 are used to meet the abundance nonnegative constraint (ANC) and abundance sum-to-one constraint (ASC). They are defined as: in which 1 denotes a matrix with all ones. Hence, the Lagrangian function of objective function (15) is: (2) Update A: In problem (17), the subproblem of A can become: In problem (18), it can be directly solved by: where where C is a matrix with a locking loop structure. It can be easily concluded that M T M is a symmetric matrix. According to [39], 2-D fast Fourier transformation and SVD is employed to solve C and M T M, respectively. Then it can be obtained that C = F T 2 Ψ 2 F 2 and M T M = U 2 ∑ U T 2 , in which F 2 is a 2-D discrete Fourier transformation matrix. Then the update rule of A is in which It can be easily concluded that the update formula of G 1 According to [40], the update rules related to G 2 can be expressed as: Observing the problem (19), it can be seen the intermediate variables G 2 , G 3 , and G 4 have similar forms. Hence, their solving rules are also similar, then the closed solutions of G 3 and G 4 can be directly obtained as: It is worth noting that both W s , W 1 and W 2 are weight matrixes, which are used to increase the degree of the sparsity. Therefore, their update rules can be described as: in which ε is a small positive number.
Besides, G 5 and G 6 are aiming at ensuring ANC and ASC of abundance. Therefore, the update rule of them can be written as: G 6 ← (A − H 6 ) + repmat(N, I(1), 1)), (4) Update the Lagrange multipliers H i : In general, the framework of the algorithm WSCTF proposed in this paper is summarized in Algorithm 1.

Input:
Y ∈ R I×J×K -the mixing hyperspectral data; P-the number of endmembers; Parameter λ S , and λ.
Update the variables G 1 by (24)

Output:
A-the abundance tensor; M-the endmember matrix.

Experiments
To verify the effectiveness of the WSCTF algorithm proposed in this paper, this section conducts related experiments on two sythetic data and real data sets of Cuprite, Jasper Ridge and Samon, and discusses the experimental results in detail. Since the proposed WSCTF is based on a non-negative tensor decomposition algorithm, the five comparison algorithms used here are: ULTRA-V [30], SULRSR-TV [24], MV-NTF [26], NMF-QMV [41], NL-TSUn [33]. It is worth noting that ULTRA-V, MV-NTF and NL-TSUn are all based on NTF, while SULRSR-TV and NMF-QMV are based on NMF. In addition, the SULRSR-TV algorithm also uses the TV regularization to constrain the local smooth structure of the abundance. The purpose of setting up these comparative experiments is to comprehensively verify the effectiveness of our algorithm.
Spectral angle distance (SAD) and root-mean-square error (RMSE) are the most widely used evaluation indicators in unmixing tasks. Among them, SAD is used to describe the spectral angular distance between the estimated endmember and the ground truth. Correspondingly, the distance of the estimated abundance and the ground truth is described by RMSE. They can be expressed in mathematical formulas as: in whichÂ andM are the ground truth of abundances and endmembers, respectively. Besides, to ensure the fairness of comparison, all experiments use the same experimental platform and use the vertex component analysis (VCA) and fully constrained least squares (FCLS) algorithm to initialize the endmembers and abundances. In addition, it is worth noting that the average and variance of the results of all experiments have been presented after 20 runs, which can illustrate the stability of algorithms.

Sythetic Data
Firstly, experiments are conducted on two synthetic data. All pure endmember spectra in the data are randomly sampled from the USGS spectral library [42].
(1) Dataset 1 (DS1): Set the number of endmembers p = 5, the size of the HSI is 75 × 75 × 224, where 224 is the number of spectral band. The specific data synthesis method is similar to the paper [34]. It is worth noting that each pixel here should comply with the LMM and meet the abundance ANC and ASC at the same time. Figure 1a shows the false color image of DS1. (2) Dataset 2 (DS2): DS2 can form a control group with DS1. Here, the method described in [34] is used to synthesize DS2. Set the image size of DS2 to 90 × 90 × 224, where 224 is the spectral band. Figure 1b shows the false color image of DS2. According to the above, two clean synthetic data can be obtained. Subsequently, 10 dB, 20 dB and 30 dB Gaussian noise are added to DS1 and DS2 respectively to form six data interfered by different noises. All experiments are all performed on these six data, which can prove the effectiveness of algorithms under different image sizes, different noise conditions, and different numbers of endmembers.
The SAD and RMSE results of the proposed WSCTF and five comparison algorithms on these six data are shown in Tables 1 and 2. According to the results of Tables 1 and 2, we can analyze the results: (1) On the DS1 and DS2 data, as the noise continues to increase, the performance of all algorithms, including the proposed WSCTF, is declining. However, the performance of WSCTF method changes less under the influence of different noises and the performance is more stable than other methods, since it fully considers the spatial and spectral similarity. (2) MV-NTF performs poorly on both data, because MV-NTF is based on the NTF basic algorithm, and other algorithms are mostly improved methods on it. (3) Since DS2 is more complicated than DS1 and the number of endmembers is more, all algorithms perform slightly worse on DS2. The ULTRA-V algorithm is mainly proposed for the variability with endmembers, but the absence of variable endmembers in the settings of DS1 and DS2 leads to poor unmixing performance. (4) The lack of pure pixels in the DS2 is the main reason for the low performance of the SULRSR-TV method. SULRSR-TV assumed that there must be pure pixels in the image when unmixing. In summary, compared with other comparison algorithms, the proposed WSCTF has certain advantages under the conditions of different noises, different image sizes and different numbers of endmembers. The above analysis can also be verified in the visualization results of abundance. As shown in Figure 2, the abundance and groundtruth obtained by all algorithms are displayed on DS1 with 20 dB. The abundance map estimated by the WSCTF method is obviously better than other comparison algorithms. Due to the large number of endmembers in DS2 with 20 dB, only five abundances are shown in Figure 3 here.

Real Datasets
Secondly, experiments are conducted on three real datasets.
(1) Jasper Ridge Data: Jasper Ridge data is a data set widely used to test the performance of unmixing algorithms, covering the Jasper Ridge Natural Reserve in California, USA. The data contains 224 spectral bands from 300 nm to 2500 nm, and its spectral resolution is 9.46 nm. Figure 1c is a pseudo-color image of Jasper Ridge data. Here we use an image with a size of 100 × 100 pixels to verify the performance of all algorithms. Because some bands are severely interfered by moisture and noise during the acquisition process, 188 bands are finally used for testing after removing these bands. According to a prior, there are four end members in this area, including soil, water, road and trees. Table 3 shows the SAD results of the endmembers extracted by the proposed WSCTF compared with ground truth. The bold font indicates the best result. It can be concluded that the WSCTF method proposed in this paper has great advantages in the extraction of two endmembers, and the final SAD result is also optimal. In addition, the endmember spectra extracted by all algorithms are shown in Figure 4. Compared with the reference spectrum, it can be seen that the proposed algorithm has considerable advantages in endmember extraction compared with other methods. Figure 5 is the comparison result of all abundances, which can be concluded that the WSCTF algorithm is the closest to the reference. In addition, compared with the distribution of ground objects in the original image in Figure 1c, the accuracy of the proposed algorithm can also be inferred.   (2) Samon Data: According to [43], the samon data set is collected by the Florida Environment Research Institute through the Samon sensor. The data contains 156 bands and the spectral resolution is 3.13 nm. Here we are experimenting on an image with a size of 100 × 100 pixels. There are three types of ground features in this area, including water, soil and trees. Figure 1d is a pseudo-color image of Samon data. The SAD and its variance results extracted by the six methods on the Samon data are introduced in Table 4. Obviously, it can be concluded that all unmixing algorithms, including the WSCTF proposed in this paper, perform slightly worse on the endmember of water. Since the edge of water when it is mixed with the land is not easy to determine, the abundance map of the water body endmember is greatly disturbed. Similar conclusions can also be drawn from the visualization results in Figures 6 and 7. However, it is gratifying that the method proposed in this paper has a great advantage in the separation of the endmember of water compared to other algorithms, which proves the superiority and stability of WSCTF proposed in this paper.   (3) Cuprite Data: This data set is also widely used in unmixing task. Its coverage area is a mining area, so there are many highly mixed pixels. Here, a subgraph with a size of 200 × 150 × 188 is intercepted to test the effectiveness of the algorithm, where 188 is the number of bands. Figure 1e is a pseudo-color image of Cuprite data. According to a prior, there are 12 pure mineral endmembers in this area. Unlike the Samon and Jasper Ridge data, the Cuprite data does not provide true values, so the ground truth of endmembers here are set using the method in [34].
Cuprite data mainly contains 12 minerals, namely Alunite, Kaolinite #1, Kaolinite #2, Sphene, Andradite, Nontronite, Muscovite, Pyrope, Chalcedony, Dumortierite, Montmorillonite, and Buddingtonite. Table 3 shows the SAD comparison results of the endmembers extracted by the six algorithms and the reference endmembers. As shown in Table 5, the algorithm proposed in this paper has obvious advantages in each endmember and average endmember results, and there are 7 optimal results. In addition, Figures 8 and 9 are the visualized results of endmembers and abundance obtained after unmixing, respectively. In Figure 8, compared with the other five methods, it is obvious that the endmember spectrum obtained by the WSCTF algorithm is closer to the reference spectrum. The abundance maps are shown in Figure 9. Compared with [34], it can be seen that the proposed WSCTF can also obtain more accurate abundance results.

Complexity Analysis
Here, three real datasets are taken as examples to analyze the time complexity of the proposed WSCTF and other comparison algorithms. All experiments in this paper are performed on Intel (R) Xeon (R) CPU E5-2697 v3 at 2.60 GHz and 64G RAM platform, which can guarantee the fairness of comparison. The running time of all algorithms is presented in Table 7. It can be seen from the table that the running time of MV-NTF will explode as the image size and the number of endmembers increase. The running time of the WSCTF algorithm proposed in this paper is very short, because it does not involve complex tensor operations. In summary, WSCTF also has certain advantages in terms of time complexity compared to other algorithms.

Conclusions
In this paper, a regularization for exploring the group sparse structure in difference images of HSI is proposed, which not only maintains the structural features of the raw data space in the abundance tensor, but also keeps the continuity in the spectral direction. First, the raw image can be constrained by the TV regularization to obtain the difference images, which are 3-D tensors. Then, the difference images with group sparse structures are separately reshaped in two spatial dimensions. In particular, the L 2,1 norm is used to constrain the sparsity, which can mine the global sparse connected structure. In addition, weights are also used to improve the sparsity. The final experiments prove the superiority of the proposed WSCTF compared with other state-of-the-art methods.