Hyperspectral Unmixing via Double Abundance Characteristics Constraints Based NMF

Abstract: Hyperspectral unmixing aims to obtain the hidden constituent materials and the corresponding fractional abundances from mixed pixels, and is an important technique for hyperspectral image (HSI) analysis. In this paper, two characteristics of the abundance variables, namely, the local spatial structural feature and the statistical distribution, are incorporated into nonnegative matrix factorization (NMF) to alleviate the non-convex problem of NMF and enhance the hyperspectral unmixing accuracy. An adaptive local neighborhood weight constraint is proposed for the abundance matrix by taking advantage of the spatial-spectral information of the HSI. The spectral information is utilized to calculate the similarities between pixels, which are taken as the measurement of the smoothness levels. Furthermore, because abrupt changes may appear in transition areas or outliers may exist in spatially neighboring regions, any inappropriate smoothness constraint on these pixels is removed, which can better express the local smoothness characteristic of the abundance variables. In addition, a separation constraint is used to prevent the result from over-smoothing, preserving the inner diversity of the same kind of material. Extensive experiments were carried out on both simulated and real HSIs, confirming the effectiveness of the proposed approach.


Introduction
Past decades have witnessed the great success of hyperspectral imaging in a wide range of applications, due to its capacity to synchronously acquire both spatial and spectral information [1,2].In hyperspectral images (HSIs), the spectral vector of each pixel contains hundreds or even thousands of elements, which provides rich spectral information to efficiently identify and distinguish different types of land cover [3].However, due to the limited spatial resolution and the complexity of surface features [4], mixed pixels are common in HSIs.The existence of numerous mixed pixels conflicts with the demands for accurate recognition and interpretation of the material properties of the pixels.Hyperspectral unmixing (HU), which decomposes the mixed pixels into a set of constituent materials called "endmembers", as well as the corresponding mixture coefficients called "abundances", was developed to alleviate the mixed pixels problem [5].HU makes it possible to reveal the material properties of pixels, so that the recognition and interpretation of pixels can be carried out at the sub-pixel scale, such as sub-pixel mapping [6] and sub-pixel target detection [7].
The linear mixing model (LMM) and the nonlinear mixing model (NLMM) are the two basic models used in HU, depending on the mixing degree of each type of material [4].The LMM assumes that the mixed pixel spectrum is a linear combination of the pure material signatures weighted by the corresponding abundance fractions, and has been more widely applied than the NLMM in the past decade, on account of its simplicity and suitability, as well as its clear physical meaning.It has therefore attracted a lot of attention in remote sensing fields [8][9][10][11].Based on the LMM, three general approaches for unmixing have been developed: (1) geometric theory based approaches; (2) sparse regression based approaches; and (3) statistical theory based approaches [2].
Considering the relationship between geometric theory and the LMM, it can be found that a hyperspectral dataset actually lies in a simplex whose vertices correspond to the endmembers.Typical geometric theory based methods include the pixel purity index (PPI) [12], N-FINDR [13], the simplex growing algorithm (SGA) [14], vertex component analysis (VCA) [15], and so on.These methods all assume that there is at least one pure pixel per endmember.The practicability of the above methods is limited if the pure pixel assumption is violated, since they would not be able to find the exact endmembers in the image.The minimum volume based methods work whether the pure-pixel assumption is fulfilled or not.They aim at determining the simplex to enclose the observed data with the minimum volume.The endmembers are generated instead of selected in the image.The minimum volume enclosing simplex (MVES) [16], minimum volume simplex analysis (MVSA) [11], and the simplex identification via split augmented Lagrangian (SISAL) algorithm [17] are typical methods in this class.Sparse regression based approaches are relatively new developments, which conduct HU in a semi-supervised fashion by assuming that a spectral library of materials in the scene is available and that the observed data can be expressed in the form of linear combinations of a certain amount of signatures from the spectral library [18,19].Sparse theory based methods have been developed to find the optimal combinations from the library since the number of materials in one pixel is far less than the number of signatures in the library [20][21][22], and the low-rank constraint based on the utilization of spatial correlation has been used to enhance the unmixing result [23,24].The statistical theory based approaches can work under a highly mixed situation [8], and they generally obtain the endmembers and the corresponding abundances simultaneously.Independent component analysis (ICA) [25] and nonnegative matrix factorization (NMF) [26] are typical statistical approaches, both of which are blind source separation (BSS) approaches since they estimate the signals from the observations in the absence of prior knowledge.ICA is able to separate the original source signals under the assumption that the sources are statistically independent.However, this assumption is invalid under the LMM of HU since the sum of abundance fractions within each pixel is constant, which compromises the performance of the ICA algorithm in hyperspectral unmixing [27].A number of methods [28][29][30] have also been proposed to improve the performance by means of adding auxiliary constraints to the ICA.
NMF approximately factorizes a nonnegative matrix into the product of two nonnegative matrices by adopting a multiplicative algorithm [26].NMF has shown great potential for solving HU since it can obtain nonnegative results with physical significance [31].Unfortunately, the algorithm may fall into many local minima due to the non-convexity of the objective function, and may not produce an accurate result [32].Additional constraints on the NMF model according to the properties of the HSIs are needed.A variety of methods based on constrained NMF have been developed in either (or both) of two ways: imposing constraints on the spectral matrix or on the abundance matrix.In terms of the geometric features of HSIs, the minimum volume constrained NMF (MVCNMF) [33] method incorporates a minimum volume constraint into the NMF formulation to force the endmembers to enclose the data cloud.Considering the properties of the spectra, Wang et al. [34] proposed the endmember dissimilarity constrained NMF (EDCNMF) method by assuming the endmember spectra should be smooth and different from each other.Another method named abundance separation and smoothness constrained NMF (ASSNMF) [35] introduces two constraints, namely, the abundance separation constraint and abundance smoothness constraint, into the basic NMF.The ASSNMF method is based on two properties of HSIs: the correlation between different endmembers is weak, and ground objects usually vary slowly.The abundance matrix is generally supposed to be sparse since most pixels are mixed by only a few endmembers [9].Accordingly, this feature has been widely exploited in HU methods.Among the various sparsity-constrained methods, L 1{2 sparsity-constrained NMF (L 1{2 ´N MF) [36] is a very popular approach.The objective of the L q p0 ď q ď 1q regularizer is to minimize the L q p0 ď q ď 1q norm of the abundance matrix and the definition of the L q p0 ď q ď 1q norm can refer to [37].L 1{2 ´N MF is superior to the L 1 regularization method since the L 1 regularizer cannot enforce further sparsity when the full additivity constraint of the material abundance is used [38].Furthermore, the sparsity-constrained method has been generalized to L q ´N MF for 0 ă q ă 1, and the sparsity imposed by the regularizers upon the unmixing task has been investigated [36,37], which demonstrated the superiority of the L 1{2 regularizer over the L 1 regularizer.
In recent years, researchers have made attempts to use the spatial information between different pixels as prior knowledge to enhance HU [39][40][41][42].To further improve the performance of the sparse NMF algorithm, the graph-regularized L 1{2 ´N MF (GLNMF) [42] method was proposed.This method utilizes the latent manifold structure of the data during the decomposition by incorporating an additional manifold regularization term into L 1{2 ´N MF, which can keep the close link between the original image and the material abundance maps.Although the existing abundance characteristic based NMF methods have achieved good performances, we still believe that there is room for improvement.
Firstly, the smoothness levels may not be correctly described.Take Figure 1 as an example.Figure 1a shows that the ground objects between adjacent pixels in homogeneous areas vary slowly; that is to say, HSIs are spatially smooth.The abundances also possess the same smoothness feature since each abundance map characterizes the distribution of a certain kind of ground object.The method in [35] exploits the smoothness feature by assuming that two pixels are more similar if they are spatially closer, and assigns the same smoothness weight to the pixels which are the same spatial distance from the observation pixel.However, in some cases, as shown in the close-ups of Figure 1a, it is not always true that the smoothness levels of two-pixel pairs (a pixel pair is composed of the observation pixel and one of its spatially neighboring pixels in the surrounding local window) are the same if the spatial distances between them are the same.A more reliable measurement of smoothness levels should be used to reflect this difference.is a very popular approach.The objective of the 0 1 regularizer is to minimize the 0 1 norm of the abundance matrix and the definition of the 0 1 norm can refer to [37]./ − is superior to the regularization method since the regularizer cannot enforce further sparsity when the full additivity constraint of the material abundance is used [38].Furthermore, the sparsity-constrained method has been generalized to − for 0 1, and the sparsity imposed by the regularizers upon the unmixing task has been investigated [36,37], which demonstrated the superiority of the / regularizer over the regularizer.In recent years, researchers have made attempts to use the spatial information between different pixels as prior knowledge to enhance HU [39][40][41][42].To further improve the performance of the sparse NMF algorithm, the graph-regularized / − (GLNMF) [42] method was proposed.This method utilizes the latent manifold structure of the data during the decomposition by incorporating an additional manifold regularization term into / − , which can keep the close link between the original image and the material abundance maps.Although the existing abundance characteristic based NMF methods have achieved good performances, we still believe that there is room for improvement.
Firstly, the smoothness levels may not be correctly described.Take Figure 1 as an example.Figure 1a shows that the ground objects between adjacent pixels in homogeneous areas vary slowly; that is to say, HSIs are spatially smooth.The abundances also possess the same smoothness feature since each abundance map characterizes the distribution of a certain kind of ground object.The method in [35] exploits the smoothness feature by assuming that two pixels are more similar if they are spatially closer, and assigns the same smoothness weight to the pixels which are the same spatial distance from the observation pixel.However, in some cases, as shown in the close-ups of Figure 1a, it is not always true that the smoothness levels of two-pixel pairs (a pixel pair is composed of the observation pixel and one of its spatially neighboring pixels in the surrounding local window) are the same if the spatial distances between them are the same.A more reliable measurement of smoothness levels should be used to reflect this difference.Secondly, the spatial structure information in a local area may not be fully explored.The use of the smoothness constraint is based on the precondition that the pixels being constrained are similar.Secondly, the spatial structure information in a local area may not be fully explored.The use of the smoothness constraint is based on the precondition that the pixels being constrained are similar.
However, this precondition is violated when abrupt changes appear in transition areas or outliers exist in the spatially neighboring regions, as shown in the close-ups of Figure 2a.Methods such as those proposed in [35,39] impose a smoothness constraint on all the spatially neighboring pixels in the local window.The spatial structure information is not fully explored by these methods since they lose sight of the pixels that are inappropriate for the smoothness constraint.In this condition, the smoothness constraint may be imposed on pixels which are actually dissimilar to the observation pixel, leading to extra error in the unmixing result.However, this precondition is violated when abrupt changes appear in transition areas or outliers exist in the spatially neighboring regions, as shown in the close-ups of Figure 2a.Methods such as those proposed in [35,39] impose a smoothness constraint on all the spatially neighboring pixels in the local window.The spatial structure information is not fully explored by these methods since they lose sight of the pixels that are inappropriate for the smoothness constraint.In this condition, the smoothness constraint may be imposed on pixels which are actually dissimilar to the observation pixel, leading to extra error in the unmixing result.Finally, the dispersed characteristic of the abundance variables is not fully taken into consideration.Actually, the dispersed characteristic of the abundance variables describes the statistical characteristics of the abundances, as reflected in Figure 1b,c.The 3-D scatter plots of the abundance samples show that the abundance variables dominated by different ground objects (an abundance variable dominated by a kind of ground object means that the pixel corresponding to this abundance variable is mainly composed of the ground object) have their own dominant regions and they are dispersed in a convex, which indicates that the abundance variables corresponding to different kinds of materials should be separate and only have a faint correlation.The method proposed in [39] only adds the smoothness constraint to NMF, which may lead to over-smooth results with the increase in iteration times.A constraint to exploit the dispersed characteristic can prevent these undesirable results by pulling the variables toward their own dominant region through minimizing the correlation between any two of the abundance variables.
Based on the above problems, we propose a novel double abundance characteristics constrained NMF (DAC 2 NMF) method, taking both the spatial structure information and the statistical distribution of the abundances into consideration.Our contributions can be summarized as: (1) The smoothness levels of each pixel pair are measured according to the similarities between them by taking advantage of the spectral information of the HSIs.In this way, more similar pixels are given a higher smoothness weight, as shown in Figure 2b,c, which is closer to the reality than a smoothness level determined by spatial distance.(2) Incorrect smoothness constraints are avoided by assigning a zero smoothness level to the pixels that are dissimilar to the observation pixel.The dissimilar pixels are excluded from the neighborhood pixels in the local window.The schematic diagrams in Figure 2b,c express this idea.(3) A separation constraint is used to prevent an over-smooth result by utilizing the dispersed characteristic of the abundance variables.A more stable and desirable result can be obtained in the interaction of these two constraints.Finally, the dispersed characteristic of the abundance variables is not fully taken into consideration.Actually, the dispersed characteristic of the abundance variables describes the statistical characteristics of the abundances, as reflected in Figure 1b,c.The 3-D scatter plots of the abundance samples show that the abundance variables dominated by different ground objects (an abundance variable dominated by a kind of ground object means that the pixel corresponding to this abundance variable is mainly composed of the ground object) have their own dominant regions and they are dispersed in a convex, which indicates that the abundance variables corresponding to different kinds of materials should be separate and only have a faint correlation.The method proposed in [39] only adds the smoothness constraint to NMF, which may lead to over-smooth results with the increase in iteration times.A constraint to exploit the dispersed characteristic can prevent these undesirable results by pulling the variables toward their own dominant region through minimizing the correlation between any two of the abundance variables.
Based on the above problems, we propose a novel double abundance characteristics constrained NMF (DAC 2 NMF) method, taking both the spatial structure information and the statistical distribution of the abundances into consideration.Our contributions can be summarized as: (1) The smoothness levels of each pixel pair are measured according to the similarities between them by taking advantage of the spectral information of the HSIs.In this way, more similar pixels are given a higher smoothness weight, as shown in Figure 2b,c, which is closer to the reality than a smoothness level determined by spatial distance.(2) Incorrect smoothness constraints are avoided by assigning a zero smoothness level to the pixels that are dissimilar to the observation pixel.The dissimilar pixels are excluded from the neighborhood pixels in the local window.The schematic diagrams in Figure 2b,c express this idea.(3) A separation constraint is used to prevent an over-smooth result by utilizing the dispersed characteristic of the abundance variables.A more stable and desirable result can be obtained in the interaction of these two constraints.
The remainder of this paper is organized as follows.Section 2 briefly presents the LMM and the basic NMF model.A detailed description of the proposed method is given in Section 3. Sections 4 and 5 evaluate the proposed algorithm using experiments on both synthetic and real hyperspectral data.Section 6 concludes the paper.

The Linear Mixing Model (LMM)
As mentioned in Section 1, the LMM assumes that a pixel in an HSI can be expressed as a linear combination of a set of endmembers and the corresponding abundance fractions.Assuming that the image scene is dominated by P kinds of distinct materials with L bands, mathematically, an observation x P R Lˆ1 can be written as: where x " rx 1 , x 2 , . . ., x L s T is the L ˆ1 obtained pixel vector, A " ra 1 , a 2 , . . ., a P s is an L ˆP matrix, with each column being an endmember signature vector.s " ps 1 , s 2 , . . ., s P q T is a P-dimensional column vector composed of abundance coefficients of each endmember at the observation pixel, and e represents the L ˆ1 additive observation noise and error vector.Assuming that there are altogether N observations in the image, the LMM for all the pixels can be expressed by the matrix notation: where X " rx 1 , x 2 , . . ., x N s, S " rs 1 , s 2 , . . ., s N s, and E " re 1 , e 2 , . . ., e N s.Clearly, the l th column of matrix S is the abundance coefficients of the l th column of matrix X.To be physically meaningful, the LMM is subject to two constraints on the entries of S, namely, the abundance nonnegative constraint (ANC) and the abundance sum-to-one constraint (ASC), which can be explicitly given by s i ě 0, i " 1, 2, . . ., P and 1 T s " 1.

Nonnegative Matrix Factorization (NMF)
NMF approximates a high-dimensional nonnegative matrix with the product of two low-dimensional nonnegative matrices.The non-negativity constraints of NMF lead to a part-based representation because they allow only additive combinations [43].This part-based property makes the NMF method well suited to many applications, such as face recognition [44] and document clustering [45].Using the preceding notations, given a nonnegative matrix X P R LˆN , NMF aims to find the nonnegative matrix factors A P R LˆP and S P R PˆN , such that: Comparing the basic NMF model with the LMM model under a noise-free scenario, it can be found that they can both be seen as seeking linear combinations of a set of basis vectors, and their combination coefficients are both nonnegative.These similarities between the two models make NMF a suitable algorithm for HU. Lee and Seung [26] developed two simple multiplicative algorithms to solve the factorization problem of Equation ( 3), for which the square of the Euclidean distance between X and AS is a commonly used cost function.The objective function can be written as: where the operator ||¨|| F represents the Frobenius norm.Although the minimization problem Equation ( 4) is separately convex in A and S, it is not simultaneously convex in both matrices.The widely used multiplicative algorithm presented in [26] is simple to implement and performs well, and can be generated from the traditional gradient descent algorithm.The gradient of Equation ( 4) can be written as: where p¨q T denotes the transpose of the matrix.The update rules can be given by: where u A and u S are the step sizes.They are set as u A " A.{pASSq and u S " S.{pAASq to meet the nonnegative constraints.Substituting them into Equations ( 7) and ( 8), the update rules can be obtained: S Ð S. ˚AT X.{A T AS (10) where .˚and.{represent the element-wise multiplication and division, respectively.The initialization of A and S should be nonnegative to ensure their non-negativity during the iteration under rules presented by Equations ( 5) and ( 6).The cost function Equation ( 4) is non-increasing after each iteration under the update rules, and it will be convergent to a stationary point.

The Double Abundance Characteristics Constrained NMF Method
Although NMF is quite appealing for HU, there are some challenges to be faced.One of the challenges is the lack of a unique solution to Equation (4) due to the aforementioned non-convexity in both A and S [34].If one solution of A and S is obtained, then for any nonnegative invertible matrix D whose inverse D ´1 is also nonnegative, AD and D ´1S are also a pair of solutions.In order to narrow the solution space and draw the decomposition toward the correct result, constraints or penalty terms are generally used to provide desirable results by considering the different properties of the HSIs.In this paper, we propose a constrained NMF method by considering two characteristics of the abundance variables, which is described in the following parts.

Smoothness Feature of the Abundances
The low spatial resolution of HSIs means that they lack tiny details; that is to say, the ground objects vary smoothly and abrupt changes rarely occur.It can be seen from the close-ups in Figure 1a that in the homogeneous areas, the spatially neighboring pixels are similar to each other and change very little, and sudden changes are found only in transition areas or where anomalies exist.This is an important spatial structure property of HSIs, which can be introduced to guide the unmixing.Remove the noise and error item in Equation (1), and each column of X can be written as: It is easy to see from Equation ( 7) that each pixel in the image can be described as a combination of endmember set A and the corresponding abundance coefficients.Taking A as a set of basis vectors in a space, then S can be regarded as the presentation of the L-dimension vector x in the P-dimension space.Therefore, a kind of projection link can be established between the original hyperspectral dataset and the abundance vectors.More specifically, similar pixels in the original image are expected to have similar abundance fractions after unmixing under this linear projection mode.In this sense, the smoothness feature of pixels is also appropriate for describing the relationship of the corresponding abundances.Based on the above analysis, the smoothness characteristic of the abundances between neighboring pixels in a local window is introduced as a constraint to the objective function of the basic NMF model.
The following two factors are taken into consideration when we design the smoothness constraint: (1) different ground objects may have diverse smoothness levels; and (2) the smoothness feature is violated in some places, such as transition areas.In the proposed method, the similarities between pixel pairs (the central pixel and each of its surroundings pixels in the local window) are utilized as prior knowledge to measure the smoothness levels of their abundances.The diverse smoothness levels are expressed by the assigned smoothness weights according to the different similarities between different pixel pairs.As mentioned before, some spatially neighboring pixel pairs may show sudden changes.In fact, these pairs should be abandoned when imposing the smoothness constraint on the abundances, because this kind of constraint on these pairs is not consistent with reality and will produce extra error instead of generating better results.As a result, we must identify these pixels and exclude them from the smoothness constraint to better describe the local spatial structure information of the image.The high spectral resolution of HSIs provides the observations with hundreds of spectral bands, which makes it possible for each pixel to characterize a certain kind of ground object with an almost continuous spectral curve.This valuable spectral information contained in each pixel vector can be used to account for pixel variability, similarity, and discrimination [46].Inspired by the spectral angle mapper [47] the similarity between x i and x j is calculated by: where a larger β ij stands for a higher similarity degree between x i and x j .The size of the local window is empirically chosen as 5 ˆ5.After the calculation of the similarity values of all the pixel pairs (the central pixel and each of its surrounding pixels in the local window), an ascending order operator is conducted on them, and only the corresponding pixel pairs of the first 45% of the values participate in the abundance smoothness constraint, based on experimental investigation.
Based on the above process, the neighboring candidate pixels of each central pixel in the local window are determined.Given a pixel vector x i , if pixel vector x j is in the neighboring candidate pixel set of x i , then we define j P N piq.It is clear that different pixel pairs may have different smoothness levels, and one may naturally hope that there should be a strategy to describe this kind of spatial structure feature.Here, the diverse smoothness levels are reflected with local neighborhood weights, which are defined as: where Equation ( 13) is known as the heat kernel [42], and σis a scaling parameter of the heat kernel weighting.The selection of σis usually done manually.Considering that the same value of σmay fail to capture the data structure when the data contain multiple scales [48], the adaptive value of σcan be defined as: σ " where k is the number of selected neighboring pixels of pixel x i .It is apparent that the more similar x i and x j are, the bigger W ij is.Through the previous analysis, it is known that the abundance of a given pixel x i is similar to the abundance of the pixel in its neighbor N piq.To achieve the abundance smoothness constraint with consideration of the selection of neighboring pixels and the calculation of different similarity levels, the selected local neighborhood weight regularization term for the abundance smoothness is defined as: minimize J 1 pSq "

Dispersed Characteristic of the Abundance Variables
We can roughly distinguish different classes of objects from an image scene such as Figure 1 by visual judgment because, in the real world, every kind of ground object has its own dominant region, which leads to the dispersed distribution of the abundance variables in HSIs.This distribution feature reveals that the abundance variables corresponding to each endmember seem to be independent of each other.However, the independent assumption is violated by the ASC in the LMM, which indicates that there is some correlation between the abundances of different objects due to mixed pixels."Weak correlation" is more suitable for the relationship between different abundance variables.Mutual information is appropriate to express the above statistical information of the abundance variables, which is a commonly used measurement to describe the independence degree of variables.The mutual information function is given by the K-L divergence of the probability density of the abundance distributions [49].By minimizing the mutual information function to a proper level, the dispersed characteristic of the abundance variables can be properly expressed.However, the distributions of the signals need to be estimated because they are usually not known in advance.One way used in [30] expresses the probability density function by a Gaussian distribution, and has achieved favorable unmixing results.In fact, by minimizing the abundance information divergence (AID) [33] of the abundance variables, the minimization of the mutual information can also be achieved, and, what is more, it needs no prior knowledge about the abundance distributions.AID is derived from the K-L divergence.Given two probability distributions of two discrete random signals P " rP 1 , . . ., P n , . . ., P N s T and Q " rQ 1 , . . ., Q n , . . ., Q N s T , the definition of K-L divergence of Q from P is: where Based on the K-L divergence, the AID of two abundance distributions s i " pS i1 , S i2 , . . ., S iN q T and s j " `Sj1 , S j2 , . . ., S jN ˘T is defined as: where p " s i { ř N n"1 S in and q " s j { ř N n"1 S jn are the normalizations of s i and s j , respectively.Clearly, AID is symmetric and always nonnegative.Due to the fact that the value range of its gradient is p´8, `8q, which may cause divergence during iteration of the gradient descent algorithm, Liu et al. [35] improved the AID to be more suitable and stable for the iterative algorithm, and they named the improved version the "separation function".Attracted by its suitability and effectiveness, we chose this function to describe the dispersed characteristic of the abundance variables and to minimize the mutual information of the abundance variables.This regularization term is defined as: where Q jk " S jk { ř N n"1 S jn ; that is to say, Q is the normalized S. f pxq is a replacement for the logarithm function used in the K-L divergence and AID, which is defined as: The objective function J 2 is always nonnegative, and a larger value represents a lower correlation between abundances, so maximization of J 2 is equal to minimization of the mutual information of the abundances.

Abundance Sum-to-One Constraint
To satisfy the physical meaning of the LMM, the abundance should be subject to the ANC and ASC constraints.The ANC constraint is naturally satisfied by NMF.To satisfy the ASC, we adopt the widely used approach in [50].In the iteration, the original dataset X and the endmember matrix A are augmented as: where 1 T is a vector with all the elements being 1, and δis a parameter to adjust the impact of the ASC.In the implementation, a larger δwill result in a better performance, but the convergence rate is decreased.A appropriate value of δ " 20 is selected to balance the tradeoff between accuracy and efficiency in this paper.

Objective Function and Update Rules of the Proposed Method
By integrating the auxiliary constraints presented in part A and part B into the basic NMF model, the objective function of the constrained NMF method is formed as follows: where u 1 and u 2 are regularization parameters, which balance the tradeoff among the three terms.To minimize Equation ( 21), a gradient descent algorithm based on the optimization of the basic NMF is developed.According to Equations ( 7) and ( 8), the update rules for DAC 2 NMF can be formulated as follows: where the step sizes u A and u S are defined as u A " A.{pASSq and u S " S.{pAASq, respectively.Substituting u A and u S into Equations ( 22) and ( 23), the multiplicative update rules can be written as: The derivatives of J 1 pSq and J 2 pSq with respect to each element in S are given as follows: Remote Sens. 2016, 8, 464 10 of 23

Initialization
The number of endmembers is an essential parameter for most of the unmixing methods when no a priori knowledge is offered, and the accuracy of the number is crucial for the following unmixing result.The determination of the endmember number is a challenging task due to the interference of noise and anomalies.Among the numerous related studies, virtual dimensionality (VD) [51] and hyperspectral signal subspace identification by minimum error (HySime) [52] are two commonly used approaches for estimating the endmember number.Although these methods are effective, they cannot guarantee 100% accuracy.Since our work is focused on the unmixing stage, and the endmember number estimation is another independent topic, further discussion about the endmember number estimation is not included in this paper.Instead of relying on the endmember number estimation methods, we determine the endmember number by comprehensive visual interpretation, VD estimation, and by referring to the endmember numbers used in previous research.
Apart from the endmember number, the initializations of the endmember matrix A and abundance matrix S are also significant with regard to the NMF-based methods.It is impractical to obtain a global minima of the non-convex objective function through iterative optimization, since different initializations will result in various results for the same method.In this study, we used two different initialization methods for the synthetic experiments and the real data experiments, respectively.The simple approach adopted in [34], which utilizes the spectral information divergence (SID) [53], was employed in the synthetic experiments to initialize A. Since the real image scene was more complex than the synthetic data, the VCA approach was used to initialize A for the real data.
The initialization of abundance matrix S is achieved by the maximum likelihood estimation from A, which is defined as: Considering that the initializations of A and S should both be nonnegative to ensure their non-negativity during the optimization, all of the elements in the initialized S are checked and the negative values are forced to be 0 through the following operator:

Stopping Condition
The stopping criterion, which can terminate the procedure when a stationary point is reached, is essential for NMF-based methods.The algorithm is considered to be converged if: where τ is a specified error tolerance.In addition, the maximum number of iterations is also predefined.The procedure should be stopped when either of the stopping criteria is reached.

The Procedure of DAC 2 NMF
The flowchart of DAC 2 NMF is summarized as follows: 1.
Determine the endmember number P; initialize the endmember matrix A by the SID-based algorithm for the synthetic experiments, and VCA algorithm for the real experiments; initialize the abundance matrix S according to Equations ( 28) and (29); 2.
replace matrices A and X with matrices A C and X C according to Equation (20); 4. update S by Equations ( 25)-( 27); 5.
replace matrices A C and X C with matrices A and X; 6.
repeat step 2-step 5 until reaching the maximum number of iterations or Equation ( 30) is satisfied;

Computational Complexity Analysis
Here, the computational complexity of the proposed DAC 2 NMF method is analyzed.There are two additional auxiliary constraints based on NMF.For the smoothness constraint, it needs p24NLq to build the local neighborhood weights.The floating-point calculation times needed for calculating the gradients of the smoothness and separation constraints are 4PNk and 10P 2 N, respectively, where k is the number of selected neighboring pixels.In addition, the multiplicative update of the matrices needs 2 " P 2 pL `Nq `PLN ‰ times.If the procedure is stopped after m times of iterations, then the total cost is O " m `P2 p2L `12Nq `PN p2L `4kq ˘`24NL ‰ .

Synthetic Image Experiments
We conducted a series of experiments to test the performance of the proposed DAC 2 NMF method with synthetic images.To verify the performance, the proposed method was compared with four related methods: ASSNMF, L 1{2 ´N MF, GLNMF, and MVCNMF.

Performance Metrics
Two metrics were used to assess the quantitative accuracy of the estimated endmembers and their abundances: the spectral angle distance (SAD) [34] and the root-mean-square error (RMSE) [42].The SAD was used to evaluate the shape similarity between the estimated endmember signature a and the true endmember signature â, and is defined as: SAD pa, âq " arccos ˆaT â ||a||||â|| ˙(31) Since the SAD value describes the spectral angle distance between two endmember signatures, a smaller value indicates a better estimation result.The similarity between the estimated abundance d and the corresponding reference abundance d was measured by the RMSE, which is defined as: RMSE ´d, d¯" where d stands for a row vector of the estimated abundance matrix S. Similar to SAD, a smaller value of RMSE represents a better estimation result for the abundance map.

Generation of Synthetic Images
The simulated synthetic images were of a size of 64 ˆ64, with 182 bands, except for the image used to investigate the algorithm sensitivity to the image size.Ten spectral signatures were chosen from the U.S. Geological Survey (USGS) digital spectral library to create the synthetic data.Apart from the image used for testing the algorithm sensitivity to the number of endmembers in the image, seven spectra were used to generate the synthetic images.The seven spectral signatures are shown in Figure 3a.The generation of the abundance fractions was similar to the method used in [33], and can be described as follows: (1) an image of size 64 ˆ64 was divided into units of 8 ˆ8 blocks; (2) each block was randomly covered by one of the endmember classes; (3) a spatial low-pass filter of size 9 ˆ9 was utilized to generate mixed pixels; (4) according to the required mixing degree, some unsatisfactory abundance fractions were replaced by 1{P.To be more explicit, if the desired mixing degree was 0.8, then we replaced all the pixels whose abundance was larger than 0.8 with a mixture composed of all the endmembers, with abundances of 1{P.One example of abundance maps with a mixing degree of 0.8 is shown in Figure 3b, as well as the first band of the synthetic image.
seven spectra were used to generate the synthetic images.The seven spectral signatures are shown in Figure 3a.The generation of the abundance fractions was similar to the method used in [33], and can be described as follows: (1) an image of size 64 × 64 was divided into units of 8 × 8 blocks; (2) each block was randomly covered by one of the endmember classes; (3) a spatial low-pass filter of size 9 × 9 was utilized to generate mixed pixels; (4) according to the required mixing degree, some unsatisfactory abundance fractions were replaced by 1/ .To be more explicit, if the desired mixing degree was 0.8, then we replaced all the pixels whose abundance was larger than 0.8 with a mixture composed of all the endmembers, with abundances of 1/ .One example of abundance maps with a mixing degree of 0.8 is shown in Figure 3b, as well as the first band of the synthetic image.By the above procedure, synthetic images without pure pixels were generated.In reality, images are generally corrupted by noise or possible errors.To evaluate the algorithm performance under the existence of noise, zero-mean white Gaussian noise was added into the synthetic data, and different signal-to-noise ratio (SNR) values were used to obtain the noisy synthetic data.The SNR is defined as [33]: where and represent a pixel vector and the noise on it, respectively.
• denotes the expectation operator.

Performance Evaluation
To evaluate the performance of the proposed method, five synthetic image experiments were designed.In the first experiment, the selections of the two regularization parameters and the convergence of the proposed method are analyzed.In the second experiment, the algorithms' robustness to noise corruption was studied by adding different levels of noise to the image.The third experiment was aimed at performing a sensitivity analysis of the proposed method to different mixing degrees.The fourth experiment was designed to investigate the algorithm's performance under different numbers of endmembers.In the final experiment, the algorithms were evaluated by different numbers of pixels in the HSI dataset in order to illustrate the impact of data quantity on the algorithm performance.To fairly evaluate the performance of the algorithms, for all the experiments, the algorithms used the same initial values and the same maximum number of iterations, which was set to 1000 in all the simulated experiments.The SAD values and RMSE values were obtained by an By the above procedure, synthetic images without pure pixels were generated.In reality, images are generally corrupted by noise or possible errors.To evaluate the algorithm performance under the existence of noise, zero-mean white Gaussian noise was added into the synthetic data, and different signal-to-noise ratio (SNR) values were used to obtain the noisy synthetic data.The SNR is defined as [33]: where x and e represent a pixel vector and the noise on it, respectively.E r¨s denotes the expectation operator.

Performance Evaluation
To evaluate the performance of the proposed method, five synthetic image experiments were designed.In the first experiment, the selections of the two regularization parameters and the convergence of the proposed method are analyzed.In the second experiment, the algorithms' robustness to noise corruption was studied by adding different levels of noise to the image.The third experiment was aimed at performing a sensitivity analysis of the proposed method to different mixing degrees.The fourth experiment was designed to investigate the algorithm's performance under different numbers of endmembers.In the final experiment, the algorithms were evaluated by different numbers of pixels in the HSI dataset in order to illustrate the impact of data quantity on the algorithm performance.To fairly evaluate the performance of the algorithms, for all the experiments, the algorithms used the same initial values and the same maximum number of iterations, which was set to 1000 in all the simulated experiments.The SAD values and RMSE values were obtained by an average of 10 individual implementations for each method.Furthermore, the number of endmembers was regarded as being known a priori for all the synthetic experiments.

Parameters Selection and Convergence Analysis
In this experiment, the selections of the two regularization parameters u 1 and u 2 are considered when SNR = 30 dB, and purity = 0.8.We change parameter u 2 P " 1, 1 ˆ10 3 ‰ with u 1 P r0, 0.4s and plot the performance of DAC 2 NMF in terms of SAD and RMSE values, setting u 1 as the x coordinate as shown in Figure 4. We can see that, when we fix parameter u 2 , the algorithm achieves the best performance with u 1 " 0.1 in terms of the SAD values, and with u 1 falling in the range r0.1, 0.2s in terms of the RMSE values.When we fix parameter u 1 , a larger u 2 will lead to better results in terms of the SAD values; while a larger u 2 will lead to worse results in terms of RMSE when u 1 is set in the range r0, 0.12s.From the parameter analysis, the parameters of DAC 2 NMF were set as: u 1 " 0.1, u 2 " 600 for both synthetic and real experiments.Furthermore, the error tolerance τis set to 0.01.The parameters for the other methods were chosen according to the experiments and analyses in the corresponding references for both synthetic and real experiments.
considered when SNR = 30 dB, and purity = 0.8.We change parameter ∈ 1,1 × 10 with ∈ 0,0.4 and plot the performance of DAC 2 NMF in terms of SAD and RMSE values, setting as the coordinate as shown in Figure 4. We can see that, when we fix parameter , the algorithm achieves the best performance with = 0.1 in terms of the SAD values, and with falling in the range 0.1,0.2 in terms of the RMSE values.When we fix parameter , a larger will lead to better results in terms of the SAD values; while a larger will lead to worse results in terms of RMSE when is set in the range 0,0.12 .From the parameter analysis, the parameters of DAC 2 NMF were set as: = 0.1, = 600 for both synthetic and real experiments.Furthermore, the error tolerance τis set to 0.01.The parameters for the other methods were chosen according to the experiments and analyses in the corresponding references for both synthetic and real experiments.
Under the condition of the former parameter analysis, the approximation error of DAC 2 NMF with iterations is plotted in Figure 5.The approximation error is calculated according to Equation ( 4), and the red solid line represents the real approximation error.We can see that the approximation error curve of DAC 2 NMF can converge to the reality for about 800 iterations.Under the condition of the former parameter analysis, the approximation error of DAC 2 NMF with iterations is plotted in Figure 5.The approximation error is calculated according to Equation ( 4), and the red solid line represents the real approximation error.We can see that the approximation error curve of DAC 2 NMF can converge to the reality for about 800 iterations.average of 10 individual implementations for each method.Furthermore, the number of endmembers was regarded as being known a priori for all the synthetic experiments.

Parameters Selection and Convergence Analysis
In this experiment, the selections of the two regularization parameters and are considered when SNR = 30 dB, and purity = 0.8.We change parameter ∈ 1,1 × 10 with ∈ 0,0.4 and plot the performance of DAC 2 NMF in terms of SAD and RMSE values, setting as the coordinate as shown in Figure 4. We can see that, when we fix parameter , the algorithm achieves the best performance with = 0.1 in terms of the SAD values, and with falling in the range 0.1,0.2 in terms of the RMSE values.When we fix parameter , a larger will lead to better results in terms of the SAD values; while a larger will lead to worse results in terms of RMSE when is set in the range 0,0.12 .From the parameter analysis, the parameters of DAC 2 NMF were set as: = 0.1, = 600 for both synthetic and real experiments.Furthermore, the error tolerance τis set to 0.01.The parameters for the other methods were chosen according to the experiments and analyses in the corresponding references for both synthetic and real experiments.
Under the condition of the former parameter analysis, the approximation error of DAC 2 NMF with iterations is plotted in Figure 5.The approximation error is calculated according to Equation ( 4), and the red solid line represents the real approximation error.We can see that the approximation error curve of DAC 2 NMF can converge to the reality for about 800 iterations.

Noise Robustness Analysis
In this experiment, the SNR was changed from 15 dB to 35 dB with a step size of 5 dB.The purity of the image was fixed at 0.8 and the image size was 64 ˆ64.There were seven endmembers in the image.Figure 6 shows the results, where the bar and error bar represent the mean and standard deviation, respectively.It is clear that DAC 2 NMF leads to the best performance in terms of both SAD values and RMSE values.The superiority of DAC 2 NMF to the other algorithms is particularly evident when the SNR is equal to 15 dB.In terms of SAD, ASSNMF gives the second-best performance, followed by GLNMF.GLNMF outperforms L 1{2 ´N MF with regard to SAD.Although MVCNMF gives the worst performance in terms of SAD, it performs better in the estimation of abundances.
Overall, it can be concluded that the proposed algorithm is robust with respect to different levels of noise.
In this experiment, the SNR was changed from 15 dB to 35 dB with a step size of 5 dB.The purity of the image was fixed at 0.8 and the image size was 64 × 64.There were seven endmembers in the image.Figure 6 shows the results, where the bar and error bar represent the mean and standard deviation, respectively.It is clear that DAC 2 NMF leads to the best performance in terms of both SAD values and RMSE values.The superiority of DAC 2 NMF to the other algorithms is particularly evident when the SNR is equal to 15 dB.In terms of SAD, ASSNMF gives the second-best performance, followed by GLNMF.GLNMF outperforms / − with regard to SAD.Although MVCNMF gives the worst performance in terms of SAD, it performs better in the estimation of abundances.Overall, it can be concluded that the proposed algorithm is robust with respect to different levels of noise.

Robustness Analysis to Degree of Mixing
The mixing degree of different real hyperspectral data may change with different spatial resolutions and the diverse complexity of ground objects.This experiment was aimed at evaluating the unmixing performance of the algorithm under various degree of mixing.The purity of the synthetic image was changed from 0.6 to 1 with a step size of 0.1.The image size, number of endmembers, and noise level were assigned as 64 × 64, 7, and 20 dB, respectively.It is clear in Figure 7a that ASSNMF slightly outperforms / − , GLNMF, and MVCNMF for the SAD values, and DAC 2 NMF still results in the smallest SAD values under different degrees of mixing.Particularly when the purity equals 0.9 and 1, the SAD values of DAC 2 NMF are clearly superior to the other methods.GLNMF obtains a smaller SAD value but a larger RMSE value than / − when the purity varies from 0.8 to 1.In terms of RMSE, DAC 2 NMF and ASSNMF are comparative when the purity equals 0.6 and 0.8, as shown in Figure 7b, and DAC 2 NMF outperforms ASSNMF under the other conditions.MVCNMF gives the smallest RMSE when the purity equals 0.6, followed by / − and GLNMF; DAC 2 NMF and ASSNMF can be seen to have larger RMSE values than the other methods under this condition, while they give the best performance with regard to SAD.Overall, the results reveal that DAC 2 NMF can contribute to a better estimation of endmember signatures, and the abundance estimation is better when the purity ≥ 0.7.

Robustness Analysis to Degree of Mixing
The mixing degree of different real hyperspectral data may change with different spatial resolutions and the diverse complexity of ground objects.This experiment was aimed at evaluating the unmixing performance of the algorithm under various degree of mixing.The purity of the synthetic image was changed from 0.6 to 1 with a step size of 0.1.The image size, number of endmembers, and noise level were assigned as 64 ˆ64, 7, and 20 dB, respectively.It is clear in Figure 7a that ASSNMF slightly outperforms L 1{2 ´N MF, GLNMF, and MVCNMF for the SAD values, and DAC 2 NMF still results in the smallest SAD values under different degrees of mixing.Particularly when the purity equals 0.9 and 1, the SAD values of DAC 2 NMF are clearly superior to the other methods.GLNMF obtains a smaller SAD value but a larger RMSE value than L 1{2 ´N MF when the purity varies from 0.8 to 1.In terms of RMSE, DAC 2 NMF and ASSNMF are comparative when the purity equals 0.6 and 0.8, as shown in Figure 7b, and DAC 2 NMF outperforms ASSNMF under the other conditions.MVCNMF gives the smallest RMSE when the purity equals 0.6, followed by L 1{2 ´N MF and GLNMF; DAC 2 NMF and ASSNMF can be seen to have larger RMSE values than the other methods under this condition, while they give the best performance with regard to SAD.Overall, the results reveal that DAC 2 NMF can contribute to a better estimation of endmember signatures, and the abundance estimation is better when the purity ě 0.7.

Robustness Analysis to the Number of Endmembers
This experiment evaluated the performance of the algorithm when the synthetic image contained different number of endmembers, and the number of endmembers was changed from 3 to 10 with a step size of 1.The other conditions for the image were assigned as: SNR equal to 20 dB, purity equal to 0.8, and the image size was 64 ˆ64.The average values and standard deviations of SAD and RMSE are shown in Figure 8a,b, respectively.It can be seen from the figure that ASSNMF results in the smallest SAD value but the largest RMSE value when there are three endmembers in the image, and DAC 2 NMF performs the best in terms of both SAD and RMSE, except for the next-best SAD value when P = 3.The performance of ASSNMF is slightly better than L 1{2 ´N MF, GLNMF, and MVCNMF when P varies from 5 to 10. MVCNMF obtains the largest SAD values but smaller RMSE values than L 1{2 ´N MF.Overall, the performances are weakened when the endmember number increases.

Robustness Analysis to the Image Size
In this experiment, the issue of algorithm sensitivity to different image sizes was studied.The synthetic images were designed with three different sizes: 64 × 64, 96 × 96, and 128 × 128.The other conditions were the same: the images were corrupted by an SNR of 20 dB; there were seven endmembers in the images; and the purity was equal to 0.8.The results are shown in Figure 9.We can again see that the proposed DAC 2 NMF outperforms the other algorithms for the three image scenes, and ASSNMF is the next best./ − and GLNMF are comparative, and MVCNMF

Robustness Analysis to the Image Size
In this experiment, the issue of algorithm sensitivity to different image sizes was studied.The synthetic images were designed with three different sizes: 64 ˆ64, 96 ˆ96, and 128 ˆ128.The other conditions were the same: the images were corrupted by an SNR of 20 dB; there were seven endmembers in the images; and the purity was equal to 0.8.The results are shown in Figure 9.We can again see that the proposed DAC 2 NMF outperforms the other algorithms for the three image scenes, and ASSNMF is the next best.L 1{2 ´N MF and GLNMF are comparative, and MVCNMF results in the largest SAD values, while it produces a better estimation of the abundances than L 1{2 ´N MF and GLNMF.As the number of pixels increases, the performances of all the algorithms slightly decrease in terms of SAD.As for RMSE, the performances are comparative under the size of 96 ˆ96 and 128 ˆ128.Overall, it is concluded that the proposed algorithm is suitable for different sizes of images.
Finally, we analyze the computational complexity of the algorithms.Table 1 shows the computational complexity of the algorithms calculated with the updated rules in one iteration, where N denotes the number of pixels in the image, L denotes the number of bands, P denotes the number of endmembers, k denotes the number of selected neighboring pixels of each pixel, and r and c denote the number of rows and columns of the image, respectively.From Table 1, we can see that the computational complexity of all the NMF-based algorithms increases rapidly with the increasing number of pixels, especially for L 1{2 ´N MF and GLNMF.The computational complexity of MVCNMF increases faster than other algorithms with the increasing of endmember number.
results in the largest SAD values, while it produces a better estimation of the abundances than / − and GLNMF.As the number of pixels increases, the performances of all the algorithms slightly decrease in terms of SAD.As for RMSE, the performances are comparative under the size of 96 × 96 and 128 × 128.Overall, it is concluded that the proposed algorithm is suitable for different sizes of images.Finally, we analyze the computational complexity of the algorithms.Table 1 shows the computational complexity of the algorithms calculated with the updated rules in one iteration, where N denotes the number of pixels in the image, L denotes the number of bands, P denotes the number of endmembers, k denotes the number of selected neighboring pixels of each pixel, and r and c denote the number of rows and columns of the image, respectively.From Table 1, we can see that the computational complexity of all the NMF-based algorithms increases rapidly with the increasing number of pixels, especially for / − and GLNMF.The computational complexity of MVCNMF increases faster than other algorithms with the increasing of endmember number.

Algorithms
Computational Complexity

HYDICE Dataset
We also applied the DAC 2 NMF method to two real hyperspectral datasets.The first image was collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor, covering an area in Washington DC.The image has 210 spectral channels, with a spectral resolution of 10 nm, covering the wavelength range of 0.4-2.4um.A subset of 150 ˆ150 was extracted from the original image for use in the experiment.The false-color image is shown in Figure 10.

HYDICE Dataset
We also applied the DAC 2 NMF method to two real hyperspectral datasets.The first image was collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor, covering an area in Washington DC.The image has 210 spectral channels, with a spectral resolution of 10 nm, covering the wavelength range of 0.4-2.4um.A subset of 150 × 150 was extracted from the original image for use in the experiment.The false-color image is shown in Figure 10.The image in Figure 10 is composed of six main kinds of ground objects: roof, grass, water, tree, path, and street, so the number of endmembers was defined as 6.The parameters were the same as those in the synthetic experiments, and VCA was utilized for initialization of the endmember matrix.The SAD values for DAC 2 NMF, ASSNMF, / − , GLNMF, and MVCNMF are provided in Table 2, and the best performance of each endmember is denoted in bold.It can be seen from Table 2 that DAC 2 NMF has the maximum number of best-performance cases, and ASSNMF, GLNMF, and MVCNMF each have one best-performance case, respectively.Figure 11 plots the mean SAD values The image in Figure 10 is composed of six main kinds of ground objects: roof, grass, water, tree, path, and street, so the number of endmembers was defined as 6.The parameters were the same as those in the synthetic experiments, and VCA was utilized for initialization of the endmember matrix.
The SAD values for DAC 2 NMF, ASSNMF, L 1{2 ´N MF, GLNMF, and MVCNMF are provided in Table 2, and the best performance of each endmember is denoted in bold.It can be seen from Table 2 that DAC 2 NMF has the maximum number of best-performance cases, and ASSNMF, GLNMF, and MVCNMF each have one best-performance case, respectively.Figure 11 plots the mean SAD values for all the algorithms, and it shows that DAC 2 NMF performs the best, successively followed by GLNMF, L 1{2 ´N MF, and ASSNMF.MVCNMF has the highest average SAD value.Figure 12 displays the abundance maps of the six types of materials.The abundance maps are grayscale maps, where a brighter pixel indicates a larger value of abundance fraction.We can clearly see the distribution of the six materials in the abundance maps.The abundance maps show that most of the materials cluster together or present a pattern of distribution in their own dominant area, which is consistent with the strategies designed in DAC 2 NMF.The smoothness constraint in DAC 2 NMF can facilitate the abundances of the same kind of material to be similar, and the separation constraint is used to prevent the abundance variables from over-smoothing and to ensure that each kind of material is distributed in its own dominant area.ASSNMF has no consideration for the pixels that are inappropriate for the smoothness constraint, which may bring error into the results.The performance of GLNMF is better than L 1{2 ´N MF due to the strategy of integrating the latent manifold structure regularization term into L 1{2 ´N MF.

AVIRIS Dataset
The second image was collected by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over Cuprite, Nevada.There are 224 bands in the image, covering the wavelength range of 0.37-2.48um, with a spectral resolution of 10 nm.For our experiment, a block with the size of 250 × 190 was cut from the original data.The noisy bands (1-3 and 221-224) and water absorption bands (104-115 and 148-170) were removed, leaving 182 bands.The false-color image is shown in Figure 13.

AVIRIS Dataset
The second image was collected by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over Cuprite, Nevada.There are 224 bands in the image, covering the wavelength range of 0.37-2.48um, with a spectral resolution of 10 nm.For our experiment, a block with the size of 250 ˆ190 was cut from the original data.The noisy bands (1-3 and 221-224) and water absorption bands (104-115 and 148-170) were removed, leaving 182 bands.The false-color image is shown in Figure 13.The number of endmembers of this image is difficult to determine due to the fact that: (1) the existence of abundant rare minerals in this area means that the minerals are highly mixed, which leads to difficulty in interpreting the dataset; (2) the same kind of mineral may have different spectra with different chemical compositions.Rogge et al. [54] revealed that there are about 12 endmembers in this image, and this number was adopted in our experiment.The SAD values between the reference endmembers and the estimated endmembers by the algorithms are shown in Table 3, in The number of endmembers of this image is difficult to determine due to the fact that: (1) the existence of abundant rare minerals in this area means that the minerals are highly mixed, which leads to difficulty in interpreting the dataset; (2) the same kind of mineral may have different spectra with different chemical compositions.Rogge et al. [54] revealed that there are about 12 endmembers in this image, and this number was adopted in our experiment.The SAD values between the reference endmembers and the estimated endmembers by the algorithms are shown in Table 3, in which the bold font represents the best result.The corresponding abundance maps are displayed in Figure 14.In the results, nontronite, montmorillonite, and kaolinite are divided into two endmembers, respectively, which is a result of the signature variability.Figure 15 plots the average SAD values.The proposed DAC 2 NMF method outperforms the other methods in terms of both numbers of best-performance cases and average SAD values.The other methods all obtain two best-performance cases, while MVCNMF shows the worst performance in terms of average SAD value.It is clear that the distributions of the ground objects in this dataset are more complicated than for the Washington DC dataset.There are more materials in the dataset, and the distributions are more scattered, leading to an increase in the number of pixels located in transition areas.DAC 2 NMF again achieves the best results for this dataset due to the strategy of removing the dissimilar pixels from the smoothness constraint, and the separation constraint ensures the results are not over-smooth.ASSNMF imposes a smoothness constraint on all of the neighboring pixels, which is not appropriate for the pixels in transition areas.The performance of GLNMF is better than L 1{2 ´N MF for this dataset due to the use of the manifold structure information.

Discussion
In this study, we investigated the validity of the abundance smoothness and dispersed characteristics for the NMF-based hyperspectral unmixing method.The endmember spectra and abundances estimated by the proposed DAC 2 NMF method fit well with the references.However, some issues still need to be resolved or improved for further research.
First, all of the NMF-based unmixing methods have regularization parameters to be set before the unmixing task.This is inevitable because different images may have different feature

Discussion
In this study, we investigated the validity of the abundance smoothness and dispersed characteristics for the NMF-based hyperspectral unmixing method.The endmember spectra and abundances estimated by the proposed DAC 2 NMF method fit well with the references.However, some issues still need to be resolved or improved for further research.
First, all of the NMF-based unmixing methods have regularization parameters to be set before the unmixing task.This is inevitable because different images may have different feature distribution characteristics, and the optimal parameter may differ.Although the NMF methods with constraints are more suitable and effective for unmixing than the original NMF method, the complexity is also increased.Therefore, how to make a trade-off between efficiency and performance is a problem that can be further considered.
Second, the number of endmembers is a necessary parameter for the ummixing method, but the NMF-based method is not able to adaptively determine the number, which cannot meet the real-time unmixing task.In the future, it is preferred to establish an unmixing system that also contains the estimation of the number of endmembers.
Third, the NMF-based methods did not take the intraclass spectral variability into consideration, which determines that they are not applicable to high-demand tasks that need to distinguish endmembers within class.The advantage of the multiple endmember analysis lies on that it uses various spectra of each endmember class to unmix each pixel.It is more accurate when there are plenty of spectral differences within each class.While the advantage of the NMF-based methods is that they can estimate the endmember spectral even if the hyperspectral is highly mixed and the endmember spectral is not present in the image.Therefore, finding a way to integrate the two advantages may be a direction for further research.

Conclusions
In this paper, a new algorithm, double abundance characteristics constrained NMF (DAC 2 NMF), which utilizes both the abundance smoothness and the dispersed characteristic of the abundance variables, has been proposed.The local spatial smoothness structure of the abundances can be more accurately expressed by virtue of the adaptively selected local neighborhood weight regularization strategy.By removing the dissimilar pixel pairs from the abundance smoothness constraint, extra errors can be avoided, thus enhancing the unmixing result.Synthetic and real HSI experiments were analyzed to study the performance of the DAC 2 NMF method and other state-of-the-art algorithms.Experimental results show that in most cases, DAC 2 NMF method is superior to the other competing algorithms, which indicates that the adopted abundance characteristics in DAC 2 NMF are effective.

Figure 1 .
Figure 1.Three observations from the figure: the adjacent pixels vary slowly; pixels of a closer spatial distance are not always more similar; the abundance variables dominated by different ground objects have their own dominant regions and they are dispersed in a convex.(a) An HSI and close-ups of two patches in it; (b) 3-D scatter plot of the abundance samples of zoom 1 in (a); (c) 3-D scatter plot of the abundance samples of zoom 2 in (a).

Figure 1 .
Figure 1.Three observations from the figure: the adjacent pixels vary slowly; pixels of a closer spatial distance are not always more similar; the abundance variables dominated by different ground objects have their own dominant regions and they are dispersed in a convex.(a) An HSI and close-ups of two patches in it; (b) 3-D scatter plot of the abundance samples of zoom 1 in (a); (c) 3-D scatter plot of the abundance samples of zoom 2 in (a).

Figure 2 .
Figure 2. The concept of the selected local neighborhood weight method.(a) A hyperspectral image and close-ups of two patches in it; (b) An example of local neighborhood weights for , which locates in the boundary of two classes; (c).An example of local neighborhood weights for , which is spatially adjacent to outliers; In (b) and (c), the different colors represent different materials, and different shades of the same color represent the diversity of the same material; the dots denote a zero value of the assigned weight; the stars denote an assigned weight that is greater than zero, and a larger star represents a larger weight for more similar pixels.

Figure 2 .
Figure 2. The concept of the selected local neighborhood weight method.(a) A hyperspectral image and close-ups of two patches in it; (b) An example of local neighborhood weights for x i , which locates in the boundary of two classes; (c).An example of local neighborhood weights for x j , which is spatially adjacent to outliers; In (b) and (c), the different colors represent different materials, and different shades of the same color represent the diversity of the same material; the dots denote a zero value of the assigned weight; the stars denote an assigned weight that is greater than zero, and a larger star represents a larger weight for more similar pixels.

Figure 3 .
Figure 3. Example of the synthetic data.(a) Endmember spectra; (b) Abundance maps and the first band of the synthetic data.

Figure 3 .
Figure 3. Example of the synthetic data.(a) Endmember spectra; (b) Abundance maps and the first band of the synthetic data.

Figure 4 .
Figure 4. Parameter analysis of DAC 2 NMF in terms of (a) SAD and (b) RMSE.

Figure 5 .
Figure 5.The approximation error of DAC 2 NMF with iterations.

Figure 4 .
Figure 4. Parameter analysis of DAC 2 NMF in terms of (a) SAD and (b) RMSE.

Figure 4 .
Figure 4. Parameter analysis of DAC 2 NMF in terms of (a) SAD and (b) RMSE.

Figure 5 .
Figure 5.The approximation error of DAC 2 NMF with iterations.Figure 5.The approximation error of DAC 2 NMF with iterations.

Figure 5 .
Figure 5.The approximation error of DAC 2 NMF with iterations.Figure 5.The approximation error of DAC 2 NMF with iterations.

Figure 7 .
Figure 7. Experimental results with different degrees of mixing.(a) SAD; (b) RMSE.4.3.4.Robustness Analysis to the Number of Endmembers This experiment evaluated the performance of the algorithm when the synthetic image contained different number of endmembers, and the number of endmembers was changed from 3 to 10 with a step size of 1.The other conditions for the image were assigned as: SNR equal to 20 dB, purity equal to 0.8, and the image size was 64 × 64.The average values and standard deviations of SAD and RMSE are shown in Figure 8a,b, respectively.It can be seen from the figure that ASSNMF

Figure 10 .
Figure 10.Sub-scene extracted from the Washington DC dataset.

Figure 10 .
Figure 10.Sub-scene extracted from the Washington DC dataset.

Figure 11 .
Figure 11.Average SAD values for the Washington DC dataset.

Figure 11 .
Figure 11.Average SAD values for the Washington DC dataset.

Figure 11 .Figure 12 .
Figure 11.Average SAD values for the Washington DC dataset.

Figure 15 .
Figure 15.Average SAD values for the Cuprite dataset.

Figure 15 .
Figure 15.Average SAD values for the Cuprite dataset.

Table 1 .
The computational complexity of the algorithms.

Table 1 .
The computational complexity of the algorithms.

Table 2 .
SAD Values of the Estimated Endmembers and Reference Endmembers for the Washington DC Dataset, USA.

Table 3 .
SAD Values of the Estimated Endmembers and Reference Endmembers for the Cuprite dataset.