Unsupervised Nonlinear Hyperspectral Unmixing Based on Bilinear Mixture Models via Geometric Projection and Constrained Nonnegative Matrix Factorization

: Bilinear mixture model-based methods have recently shown promising capability in nonlinear spectral unmixing. However, relying on the endmembers extracted in advance, their unmixing accuracies decrease, especially when the data is highly mixed. In this paper, a strategy of geometric projection has been provided and combined with constrained nonnegative matrix factorization for unsupervised nonlinear spectral unmixing. According to the characteristics of bilinear mixture models, a set of facets are determined, each of which represents the partial nonlinearity neglecting one endmember. Then, pixels’ barycentric coordinates with respect to every endmember are calculated in several newly constructed simplices using a distance measure. In this way, pixels can be projected into their approximate linear mixture components, which reduces greatly the impact of collinearity. Different from relevant nonlinear unmixing methods in the literature, this procedure effectively facilitates a more accurate estimation of endmembers and abundances in constrained nonnegative matrix factorization. The updated endmembers are further used to reconstruct the facets and get pixels’ new projections. Finally, endmembers, abundances, and pixels’ projections are updated alternately until a satisfactory result is obtained. The superior performance of the proposed algorithm in nonlinear spectral unmixing has been validated through experiments with both synthetic and real hyperspectral data, where traditional and state-of-the-art algorithms are compared.


Introduction
In spite of the extensive applications of hyperspectral remote sensing imagery over the decades, the issue associated with mixed pixels always leads to a considerable drop in the pixel-level application accuracy. Mixed pixels exist widely because of the intrinsically low spatial resolution of current sensors, which usually makes more than one pure material appear in each pixel's corresponding observed area. These pixels are the collections of different material spectra; furthermore, a pixel is called an endmember if it only contains one material. To address this issue, spectral unmixing has been investigated, which targets pixels for decomposition into a group of spectrally pure signatures and their associated fractions over the given scenes, known as endmember extraction and abundance estimation, respectively [1,2].
Depending on the specific mixture models, physical mixing processes of distinct substances' spectra can be characterized mathematically, which provides an efficient method for the implementation of spectral unmixing methods. Typically, as the most popular model, the linear mixture model (LMM) assumes that the radiance reflected by each material is directly received without multiple interactions, resulting in that pixels are the linear combinations of endmembers according to their respective abundances. This simple assumption often makes the LMM-based unmixing algorithms computationally tractable and capable of providing sufficient accuracy in most cases, especially when the observed scenarios are mainly occupied by macroscopic mixtures [2]. For example, methods including the N-FINDR [3], vertex component analysis (VCA) [4], and orthogonal bases algorithm (OBA) [5] usually serve for the endmember extraction; and the fully constrained least squares (FCLS) [6] and geometric G-Wang FCLS method [7] could perform the abundance inversion well. However, endmembers are often wrongly extracted when there are no pure pixels in the images due to the low spatial resolution. In addition, an increased error in abundance estimation commonly stems from the use of inaccurate endmembers.
Although the LMM-based unmixing algorithms have shown good unmixing performances, unsatisfactory results may be obtained in some scenarios where the LMM fails in explaining the nonlinear mixing effect [26,27]. "Nonlinear mixing effect" often refers to the multiple scattering and absorption phenomena of light. Intimate mixtures at a microscopic scale and multilayered mixtures in vegetated and urban areas are two typical situations where the nonlinear mixing effect is prominent [28]. To cope with this issue, nonlinear mixture models with different forms have been proposed to provide an appropriate alternative for accounting for the nonlinearity of natural scenes. Formulating a nonlinear function between the reflectance and physical parameters, the Hapke model [29] is popular for the intimate mixtures. While, in the case of the multilayered mixtures, bilinear mixture models (BMMs) with much simpler mathematical expressions have shown promising abilities for nonlinear spectral unmixing [27]. The BMMs can be considered as the nonlinear extensions of the LMM, added with only the second-order scatterings between every two endmembers. Most commonly used BMMs, such as the Fan model (FM) [30], generalized bilinear model (GBM) [31], and polynomial post-nonlinear model (PPNM) [32] actually differ from each other in their nonlinear parts. Their performances have been validated by the convincing field experiments in [33,34]. For more complex scenarios such as urban areas, higher-order scatterings have been taken into account [35][36][37]. Heylen et al. [37] presented a multilinear mixing (MLM) model by introducing a probability parameter of light undergoing further interactions, and all orders of multiple scatterings can be included.
Due to the simplicity and effectiveness of the BMMs, the BMM-based nonlinear unmixing methods are gaining more attention. Besides the traditional Bayesian algorithm and gradient descent algorithm (GDA) [30][31][32], many state-of-the-art BMM-based algorithms have appeared, such as the GBM-based semi-nonnegative matrix factorization (semi-NMF) [38]; algorithms with improvements to the 1.
The negative effect of collinearity in BMM-based nonlinear unmixing methods can be addressed.
To be specific, this goal is reached by adopting a distance measure to project pixels onto their approximate linear mixture components based on the geometric characteristics of the BMMs. This procedure reduces effectively the virtual endmembers' impact on nonlinear unmixing, which is a remarkable advantage compared with other relevant unmixing methods.

2.
The issue of local minima in standard NMF can be well alleviated, and highly mixed nonlinear hyperspectral data can be unmixed accurately. To be specific, the procedure of geometric projection facilitates the direct use of NMF in nonlinear unmixing, and the incorporation of a minimum endmember distance constraint into the NMF framework enables the accurate estimation of endmembers and abundances when pixels are highly mixed.

3.
A general unsupervised nonlinear unmixing strategy is built which is simultaneously suitable for unmixing under the assumptions of three BMMs including the FM, GBM, and PPNM.
It should be noted that this paper is quite different from our previous published conference paper [67] which only gave a simple and immature description of the method. In this paper, a more detailed and comprehensive motivation is explained. In particular, an essential theory (Theorem 1 and Property 1 in Section 3) of constructing nonlinear hyperplanes and geometric projection to address the collinearity is provided with a proof. Several key issues of projection in theoretical and technical parts are also carefully discussed and further improved. Moreover, more reasonable experiments with simulated and real hyperspectral data are carried out here to validate the proposed algorithm's superiority in unsupervised nonlinear unmixing compared with other state-of-the-art methods.
In terms of unsupervised spectral unmixing methods in the literature, their main drawbacks are in three aspects: (1) LMM-based unsupervised unmixing methods fail in explaining the nonlinear mixing effect; (2) Existing BMM-based unsupervised nonlinear unmixing methods always suffer from the negative effect of collinearity; (3) The issue of local minima is commonly serious for unsupervised nonlinear unmixing algorithms via NMF which may perform worse in unmixing highly mixed data.
The precise objective of this work is to propose a BMM-based unsupervised nonlinear unmixing algorithm via geometric projection and NMF, which could overcome the problem of the collinearity, and give better unmixing results when the pixels are nonlinearly and highly mixed. The rest of this paper is organized as follows. The LMM, three BMMs, and NMF are introduced in Section 2. In Section 3, the issue of collinearity is analyzed, geometric projection for generating pixels' approximate linear components is described, and the proposed algorithm combining geometric projection with constrained NMF for nonlinear unmixing is presented. In Section 4, in comparison with traditional and state-of-the-art algorithms, the proposed algorithm is evaluated through experiments. Finally, experimental results and future research directions are discussed in Section 5, and this paper is concluded in Section 6.

Mixture Models
Generally, a hyperspectral image can be denoted as a matrix X ∈ R n×m , each column of which corresponds to a pixel vector x j ∈ R n×1 (j = 1, 2, . . . , m) with n spectral bands. m denotes the number of pixels. In this paper, three typical BMMs, including the FM [30], GBM [31], and PPNM [32], have been taken into account. It is noted that the LMM is these models' common part, which is linearly combined by endmembers [1]. The nonlinearity is denoted as the sum of the products of every two endmembers, abundances, and corresponding nonlinear parameters. Therefore, these mixture models can be easily given a general expression for simplicity: Table 1 lists the specific definitions of general parameters t 1 , t 2 and b i,k, j in Equation (1) with respect to four mixture models. If b i,k, j = 0, Equation (1) is the LMM. When t 1 = r − 1 and t 2 = i + 1, Equation (1) becomes the FM if b i,k, j = 1, while it is the GBM if b i,k, j = γ i,k,j . Equation (1) denotes the PPNM when t 1 = r, t 2 = 1 and b i,k, j = ξ j .  [30] r − 1 i + 1 1 GBM [31] r − 1 i + 1 0 ≤ γ i,k,j ≤ 1 PPNM [32] r 1 ξ j ∈ R In Equation (1), a i ∈ R n×1 is the ith endmember vector and s i,j is its abundance in the pixel x j . r is the number of endmembers. To be physically meaningful, abundances of these models should satisfy the nonnegative constraint (ANC) and sum-to-one constraint (ASC) in Equation (2).
represents the Hadamard product operation and a i a k = (a i,1 a k,1 , . . . , a i,n a k,n ) T denotes the second-order interactions between endmember a i and a k , which is a so-called virtual endmember [50][51][52]. Both γ i,k,j ∈ [0, 1] and ξ j ∈ R are bilinear parameters used for scaling the nonlinearity, which make the GBM and PPNM more flexible than the FM. ε j ∈ R n×1 is the additive noise. It can be seen that when γ i,k,j = 0 and ξ j = 0, ∀i, k, j, both the GBM and PPNM transform into the LMM, and when γ i,k,j = 1, ∀i, k, j, the GBM is equivalent to the FM. In Figure 1, it can be observed how pixels based on four models distribute in a three-dimensional feature subspace, respectively.

NMF
NMF [13] aims to approximate a nonnegative matrix  (4) is often used to quantify the approximation.

V WH
Clearly, NMF has the same structure as the LMM. In terms of the LMM-based spectral unmixing, V is the hyperspectral data, W can be regarded as the endmember matrix, and H represents the abundance matrix. Then, both the endmembers and abundances can be achieved simultaneously using the alternate update rules of NMF [13,68] and the nonnegative constraint is naturally satisfied.

Motivation for the Proposed Algorithm
Spectral unmixing is a process of multiple regression in some sense, where endmembers function as the explanatory variables and abundances are the regression coefficients [51][52][53]. The collinearity exists as a common problem for regression analysis when some explanatory variables can be largely explained by the others, causing the accretion of estimation error. In terms of the LMMbased spectral unmixing, it is often considered that true endmembers of different ground covers are usually affinely independent [2]. So sufficient accuracy can usually be provided, ignoring the effect of collinearity when executing the linear spectral unmixing.
However, this is not the case for the BMM-based nonlinear spectral unmixing, where the virtual endmembers are highly correlated with true endmembers. A virtual endmember [50][51][52] is the product of two true endmembers in each band. Although the high correlation between virtual and true endmembers (close to being linearly dependent) may make them have spectral similarity (see Figure 2), virtual endmembers have no physical meaning in terms of ground covers.
Moreover, since virtual endmembers are commonly used as the explanatory variables to estimate abundances with true endmembers, the negative impact of collinearity is inevitable, though no perfect collinearity exists [51]. Specifically, there will be a dramatic drop in the accuracy of estimated abundances, because 1 [50][51][52][53]. This situation can get worse when numerous nonlinear parameters should be estimated and true endmembers are unknown. The degree of collinearity induced by virtual endmembers will be discussed using a quantitative measure named variance inflation factors (VIFs) [51,52] in Section 4.1.

NMF
NMF [13] aims to approximate a nonnegative matrix V ∈ R n×m with the product of two unknown low rank, nonnegative matrices W ∈ R n×r and H ∈ R r×m in Equation (3). The square of the Euclidean distance between V and WH in Equation (4) is often used to quantify the approximation.
Clearly, NMF has the same structure as the LMM. In terms of the LMM-based spectral unmixing, V is the hyperspectral data, W can be regarded as the endmember matrix, and H represents the abundance matrix. Then, both the endmembers and abundances can be achieved simultaneously using the alternate update rules of NMF [13,68] and the nonnegative constraint is naturally satisfied.

Motivation for the Proposed Algorithm
Spectral unmixing is a process of multiple regression in some sense, where endmembers function as the explanatory variables and abundances are the regression coefficients [51][52][53]. The collinearity exists as a common problem for regression analysis when some explanatory variables can be largely explained by the others, causing the accretion of estimation error. In terms of the LMM-based spectral unmixing, it is often considered that true endmembers of different ground covers are usually affinely independent [2]. So sufficient accuracy can usually be provided, ignoring the effect of collinearity when executing the linear spectral unmixing.
However, this is not the case for the BMM-based nonlinear spectral unmixing, where the virtual endmembers are highly correlated with true endmembers. A virtual endmember [50][51][52] is the product of two true endmembers in each band. Although the high correlation between virtual and true endmembers (close to being linearly dependent) may make them have spectral similarity (see Figure 2), virtual endmembers have no physical meaning in terms of ground covers.
Moreover, since virtual endmembers are commonly used as the explanatory variables to estimate abundances with true endmembers, the negative impact of collinearity is inevitable, though no perfect collinearity exists [51]. Specifically, there will be a dramatic drop in the accuracy of estimated abundances, because a 1 , . . . , a r , a 1 a 2 , . . . , a t 1 a r in Equation (1) may produce a matrix approaching singularity, resulting in the results being unstable and sensitive to noise [50][51][52][53]. This situation can get worse when numerous nonlinear parameters should be estimated and true endmembers are unknown. The degree of collinearity induced by virtual endmembers will be discussed using a quantitative measure named variance inflation factors (VIFs) [51,52] in Section 4.1. One intuitive idea for nonlinear spectral unmixing which always comes into being is that if the nonlinearly-mixed pixels can be transformed into the linearly mixed ones, or their nonlinearity can be removed, the conventional linear methods can be used for unmixing. If so, this not only provides the possibility of carrying out the unsupervised unmixing via the constrained NMF, but the collinearity will also not affect the unmixing. To this end, the geometric characteristics of BMMs are exploited to achieve that goal.
Geometrically, each pixel can be considered as a point in an n-dimensional space. Under the assumption of the LMM, endmembers   in the feature subspace to meet the constraints in Equation (2) and enclose all the pixel points. Abundances are equivalent to the pixels' normalized barycentric coordinates with respect to the endmembers on 1 r  [59,61,62]. As shown in Figure 1a, the simplex 2  spanned by three endmembers is a triangle [65][66][67][68].
On the other hand, in terms of the BMMs, the contribution of the nonlinear part will explicitly lead pixel points to move out of 1 r  and distribute in a higher dimension space forming different nonlinear manifold structures in Figure 1b- to an r -dimensional affine hull along the vector corresponding to j x 's nonlinear part. Mathematically, BMM pixels belong to the linear subspace spanned by both true and virtual endmembers, whose dimension is ( 1) 2 r r  for the FM and GBM and ( 3) 2 r r  for the PPNM. However, in the r -dimensional feature subspace, BMM pixels actually lie on the same r -dimensional affine hull constructed by r true endmembers and an additional affinely independent vector.
According to the BMMs' geometric characteristics, the following Property 1 and Theorem 1 were concluded in the authors' previous work [58]. A brief introduction is provided here.  One intuitive idea for nonlinear spectral unmixing which always comes into being is that if the nonlinearly-mixed pixels can be transformed into the linearly mixed ones, or their nonlinearity can be removed, the conventional linear methods can be used for unmixing. If so, this not only provides the possibility of carrying out the unsupervised unmixing via the constrained NMF, but the collinearity will also not affect the unmixing. To this end, the geometric characteristics of BMMs are exploited to achieve that goal.
Geometrically, each pixel can be considered as a point in an n-dimensional space. Under the assumption of the LMM, endmembers {a 1 , a 2 , . . . , a r } construct a (r − 1)-dimensional simplex in the feature subspace to meet the constraints in Equation (2) and enclose all the pixel points. Abundances are equivalent to the pixels' normalized barycentric coordinates with respect to the endmembers on ∆ r−1 [59,61,62]. As shown in Figure 1a, the simplex ∆ 2 spanned by three endmembers is a triangle [65][66][67][68].
On the other hand, in terms of the BMMs, the contribution of the nonlinear part will explicitly lead pixel points to move out of ∆ r−1 and distribute in a higher dimension space forming different nonlinear manifold structures in Figure 1b-d. It is known that the linear mixture part x LMM j = ∑ r i=1 a i s i,j of a BMM-pixel is equivalent to the definition of LMM, so the x LMM j of a pixel x j belongs to the ∆ r−1 constructed by {a 1 , a 2 , . . . , a r }. As a result, a BMM pixel point x j can be simply regarded as the result by pulling the point x LMM j out from ∆ r−1 to an r-dimensional affine hull along the vector corresponding to x j 's nonlinear part. Mathematically, BMM pixels belong to the linear subspace spanned by both true and virtual endmembers, whose dimension is r(r + 1)/2 for the FM and GBM and r(r + 3)/2 for the PPNM. However, in the r-dimensional feature subspace, BMM pixels actually lie on the same r-dimensional affine hull constructed by r true endmembers and an additional affinely independent vector.
According to the BMMs' geometric characteristics, the following Property 1 and Theorem 1 were concluded in the authors' previous work [58]. A brief introduction is provided here.

Property 1.
In the r-dimensional feature subspace, a BMM pixel x j can be affinely represented on an r-dimensional affine hull C r = aff a 1 , a 2 , . . . , a r , p j = x j = ∑ r i=1 a i h i + p j h r+1 ∑ r+1 i=1 h i = 1 constructed by r true endmembers and an extra vertex p j (called the nonlinear vertex). Moreover, x j 's abundances s j and its normalized barycentric coordinates h j = h 1,j , h 2,j , . . . h r+1,j T on the C r (h r+1,j is x j 's normalized barycentric coordinate with respect to p j ) satisfy s i,j = h i,j /∑ r k=1 h k,j , when this p j also lies on the line defined by x j and its corresponding linear mixture component x LMM j . This property can be easily obtained according to the geometric theory of simplex [59][60][61][62]. As shown in Figure 3, if x j 's affine-projection y j on the ∆ r−1 is equal to x LMM j , y j 's normalized barycentric coordinates h 1,j /∑ r k=1 h k,j , h 2,j /∑ r k=1 h k,j , . . . , h r,j /∑ r k=1 h k,j , 0 T on the C r is equal to Remote Sens. 2018, 10, 801 8 of 34 s 1,j , s 2,j , . . . , s r,j , 0 T . However, it can be deduced that the vertex p j is pixel-dependent, and each pixel can have an infinite number of satisfactory nonlinear vertices as long as such a p j is not only affinely independent of {a 1 , a 2 , . . . , a r }, but also lies on the line defined by the points x LMM j and x j (see Figure 3). On the other hand, x LMM j is unknown for most pixels, and determining each x j 's own correct nonlinear vertex p j is impossible.
This property can be easily obtained according to the geometric theory of simplex [59][60][61][62]. As shown in Figure 3, if j x 's affine-projection j y on the , , , r a a a  , but also lies on the line defined by the points LMM j x and j x (see Figure 3). On the other hand, LMM j x is unknown for most pixels, and determining each j x 's own correct nonlinear vertex j p is impossible. ). The case can be extended to a high-dimensional space as well.
In order to circumvent this issue, a unique and common nonlinear vertex p was constructed in the authors' previous work [58] to project all the pixels onto the simultaneously. This single vertex p can concentrate most nonlinear components from the virtual endmembers, and BMM pixels' projections deviate slightly from their true linear mixture components [58]. Theorem 1 has illustrated the construction of such a single nonlinear vertex p based on the idea of computer-aided geometric design [63].  (b) geometric distance measure-based projection using nonlinear planes. In order to circumvent this issue, a unique and common nonlinear vertex p was constructed in the authors' previous work [58] to project all the pixels onto the ∆ r−1 simultaneously. This single vertex p can concentrate most nonlinear components from the virtual endmembers, and BMM pixels' projections deviate slightly from their true linear mixture components [58]. Theorem 1 has illustrated the construction of such a single nonlinear vertex p based on the idea of computer-aided geometric design [63]. Theorem 1. Given r endmembers in the r-dimensional feature subspace, and assuming that a vertex p is the intersection of r (r − 1)-dimensional hyperplanes H r−1 1 , . . . , H r−1 r , each of which is formed by (r − 1) endmembers and the corresponding BMM pixels constituted by them (e.g., {a 1 , a 2 } and ω 3 define the plane H 2 3 in Figure 4), p will span an r-dimensional affine hull C r = aff{a 1 , a 2 , . . . , a r , p} with r endmembers to affinely represent these pixels (see Property 1).
shown in Figure 3, if j x 's affine-projection j y on the , , , r a a a  , but also lies on the line defined by the points LMM j x and j x (see Figure 3). On the other hand, LMM j x is unknown for most pixels, and determining each j x 's own correct nonlinear vertex j p is impossible. ). The case can be extended to a high-dimensional space as well.
In order to circumvent this issue, a unique and common nonlinear vertex p was constructed in the authors' previous work [58] to project all the pixels onto the simultaneously. This single vertex p can concentrate most nonlinear components from the virtual endmembers, and BMM pixels' projections deviate slightly from their true linear mixture components [58]. Theorem 1 has illustrated the construction of such a single nonlinear vertex p based on the idea of computer-aided geometric design [63].  (b) geometric distance measure-based projection using nonlinear planes. Proof. Let a pixel in Equation (1) x j 's nonlinear part , when the endmembers' number r = 2 (see Figure 3), x N MM j = a 1 a 2 s 1,j s 2,j b 1,2,j for the FM and GBM. In this case, the linear part x LMM j = a 1 s 1,j + a 2 s 2,j lies on the ∆ 1 that is a line Remote Sens. 2018, 10, 801 9 of 34 segment defined by a 1 and a 2 . It is clear that x j and two endmembers {a 1 , a 2 } are located on the same two-dimensional plane H 2 in the three-dimensional feature subspace. The vertex p can be the intersection of two tangential (or vertical) lines (i.e., H 1 1 and H 1 2 ) passing through the endmember a 1 and a 2 , respectively. Then, a two-dimensional affine hull aff{a 1 , a 2 , p} is always obtained to represent all the pixels. It is noted that by using such a vertex p, x j can be largely projected to x LMM j on the ∆ 1 [58]. In terms of the PPNM, s i,j s k,j and the coplanarity is still reliable because a i a i is very likely to be highly dependent on a i . When r = 3 (see Figure 4a), in the three-dimensional feature subspace, two-endmember sets {a 1 , a 2 }, {a 1 , a 3 }, and {a 2 , a 3 } form three two-dimensional hyperplanes H 2 3 , H 2 2 , and H 2 1 with corresponding pixels in the same way, respectively. If noise is ignored, BMM pixels constituted by all three endmembers {a 1 , a 2 , a 3 } (i.e., affected by all virtual endmembers) must be located in the local space enclosed by H 2 1 , H 2 2 , H 2 3 . In this case, the vertex p becomes the intersection of planes H 2 1 , H 2 2 , H 2 3 , and {a 1 , a 2 , a 3 , p} span a three-dimensional affine hyperplane C 3 , including all the pixels in the feature subspace.
When r ≥ 4, the formation of a simplex that any ∆ r−1 is enclosed by r (r − 2)-dimensional simplices, and ∆ r−1 is a facet of ∆ r [9,59] can be applied for understanding. p is the intersection of r (r − 1)-dimensional hyperplanes H r−1 1 , . . . , H r−1 r in the r-dimensional feature subspace. H r−1 q is defined by (r − 1) endmembers a 1 , . . . , a q−1 , a q+1 , . . . , a r and any corresponding BMM pixels constructed by them in Equation (1). For instance, {a 1 , a 2 , a 3 , a 4 , p} span a four-dimensional affine hyperplane C 4 where the vertex p is the intersection of planes H 3 1 , H 3 2 , H 3 3 , H 3 4 and H 3 4 is actually the plane of the previous C 3 . Thus, all the pixels are affinely represented on an r-dimensional affine hull aff{a 1 , . . . , a r , p}. H r−1 1 , . . . , H r−1 r will always intersect at a single point because they intersect each other and their normal vectors are linearly independent in the r-dimensional subspace [59]. In [58], the conclusion that the system of linear equations for calculating the nonlinear vertex p always has a unique non-trivial solution also demonstrates the uniqueness.
In this paper, Theorem 1 is partly utilized to estimate the BMM-pixels' normalized barycentric coordinates h with respect to {a 1 , a 2 , . . . , a r } on the affine hull C r . One way to solve h is to calculate the nonlinear vertex p directly, and then using the traditional linear unmixing methods such as the constrained least squares [6] to get h. In fact, after the hyperplanes H r−1 1 , . . . , H r−1 r are determined, pixels' projection coordinates can be obtained by volume-or distance-based geometric linear unmixing methods (see Figure 4b) [7].
A hyperplane H r−1 q actually represents the nonlinearity not relevant to the endmember a q , which is called a nonlinear hyperplane in the rest of the paper. As discussed earlier, the abundances of the BMM pixels on H r−1 q with respect to a q should be equal to zero. Specifically, following the simplex theory [7,[59][60][61][62][63][64][65][66], the ratio of the distances (signed distance measure in [7] is used in this paper) from the pixel x j and a q to the H r−1 q is x j 's normalized barycentric coordinate h q,j with respect to a q . In this sense, h q,j is also x j 's projection coordinateŝ q,j , which is approximately equal to the abundance s q,j . Then, h j can be used to obtain a pixel's projection y j on the ∆ r−1 spanned by {a 1 , a 2 , . . . , a r }, which is its approximate linear component It allows the further use of the conventional linear unmixing methods to obtain more accurate results.
Accordingly, the virtual endmembers are absent from the abundance estimation, leading to the decrease of collinearity's negative effect to a much lower level [58]. As a result, the unmixing accuracy will get great improvement. On the other hand, it is worthy of noting that by exploiting the distance measure, projections can be achieved without determining the nonlinear vertex p or reducing the dimension of data to r. This advantage makes the later proposed method able to update the endmembers in their original space, which will reduce not only the computational burdens, but also the estimation errors of endmembers in unsupervised unmixing.
The goal of this paper is to unmix the BMM pixels in an unsupervised way. In this sense, both endmembers and abundances are unknown and, as stated in the previous section, algorithms such as VCA may wrongly extract endmembers when pure pixels do not exist in the hyperspectral data.
For example, in Figure 5, the FM-based pixels close to the true endmembers are wrongly extracted as the endmembers. Moreover, as shown in Figure 6, when endmembers are not true, pixels' projections will move out of the simplex spanned by endmembers. Nevertheless, since pixels' projections are linearly mixed, they can be substituted into the constrained NMF framework to update the endmembers and abundances. Then, updated endmembers can be further used to calculate the new projections until the endmembers' simplex ∆ r−1 can contain all the pixels' projections compactly. By doing this, more accurate endmembers can be found even if hyperspectral pixels are highly mixed, and better unmixing results will be obtained in an unsupervised way. Figure 7 depicts the main procedure of the proposed method.
The goal of this paper is to unmix the BMM pixels in an unsupervised way. In this sense, both endmembers and abundances are unknown and, as stated in the previous section, algorithms such as VCA may wrongly extract endmembers when pure pixels do not exist in the hyperspectral data.
For example, in Figure 5, the FM-based pixels close to the true endmembers are wrongly extracted as the endmembers. Moreover, as shown in Figure 6, when endmembers are not true, pixels' projections will move out of the simplex spanned by endmembers. Nevertheless, since pixels' projections are linearly mixed, they can be substituted into the constrained NMF framework to update the endmembers and abundances. Then, updated endmembers can be further used to calculate the new projections until the endmembers' simplex can contain all the pixels' projections compactly. By doing this, more accurate endmembers can be found even if hyperspectral pixels are highly mixed, and better unmixing results will be obtained in an unsupervised way. Figure  7 depicts the main procedure of the proposed method.   The goal of this paper is to unmix the BMM pixels in an unsupervised way. In this sense, both endmembers and abundances are unknown and, as stated in the previous section, algorithms such as VCA may wrongly extract endmembers when pure pixels do not exist in the hyperspectral data.
For example, in Figure 5, the FM-based pixels close to the true endmembers are wrongly extracted as the endmembers. Moreover, as shown in Figure 6, when endmembers are not true, pixels' projections will move out of the simplex spanned by endmembers. Nevertheless, since pixels' projections are linearly mixed, they can be substituted into the constrained NMF framework to update the endmembers and abundances. Then, updated endmembers can be further used to calculate the new projections until the endmembers' simplex can contain all the pixels' projections compactly. By doing this, more accurate endmembers can be found even if hyperspectral pixels are highly mixed, and better unmixing results will be obtained in an unsupervised way. Figure  7 depicts the main procedure of the proposed method.

Nonlinear Hyperplanes and Geometric Projection
In this section, the method of determining the aforementioned nonlinear hyperplanes To be specific, the idea of computer-aided geometric design [63] is used here to determine the nonlinear hyperplanes. As shown in Figure 4, r nonlinear midpoints q ω ( 1, , q r   ) are chosen as the control points [63]. They consist of every ( 1) r  endmembers with the same abundances, i.e., (1), q ω can be expressed as: where i q  and k q  , 1, , q r   . When 1 1 t r   and 2 1 t i   , Equation (5)  , , a ω a . At the same time, the selection of nonlinear midpoints   1 2 , , , r ω ω ω  as the control points also guarantees the formation of r non-degenerate r -dimensional simplices [59]. As depicted in Figure 4b,  

Nonlinear Hyperplanes and Geometric Projection
In this section, the method of determining the aforementioned nonlinear hyperplanes H r−1 1 , . . . , H r−1 r and the approach to calculate BMM pixels' normalized barycentric coordinates with respect to endmembers {a 1 , a 2 , . . . , a r } is introduced, as well as their projections on the bottom affine plane (see Figures 5 and 6) defined by {a 1 , a 2 , . . . , a r }.
Firstly, according to the previous analysis and Theorem 1, a nonlinear hyperplane H r−1 q in the r-dimensional feature subspace is defined by the (r − 1)-endmember subset a 1 , . . . , a q−1 , a q+1 , . . . , a r and corresponding BMM pixels constituted by these (r − 1) endmembers. In other words, only a proper BMM pixel (named the control point) constituted by a 1 , . . . , a q−1 , a q+1 , . . . , a r is needed to further determine the hyperplane H r−1 q . To be specific, the idea of computer-aided geometric design [63] is used here to determine the nonlinear hyperplanes. As shown in Figure 4, r nonlinear midpoints ω q (q = 1, . . . , r) are chosen as the control points [63]. They consist of every (r − 1) endmembers with the same abundances, i.e., s 1,ω q = . . . = s q−1,ω q = s q+1,ω q = . . . = s r,ω q = 1/(r − 1). Based on Equation (1), ω q can be expressed as: where i = q and k = q, q = 1, . . . , r. When t 1 = r − 1 and t 2 = i + 1, Equation (5) is the case of the FM and GBM, and when t 1 = r, t 2 = 1, it corresponds to the PPNM. Then, we use the constructed nonlinear midpoint ω q to determine the hyperplane H r−1 q with (r − 1) endmembers a 1 , . . . , a q−1 , a q+1 , . . . , a r . For instance, H 2 2 in both Figures 4 and 8b refers to the plane defined by {a 1 , ω 2 , a 3 }. At the same time, the selection of nonlinear midpoints {ω 1 , ω 2 , . . . , ω r } as the control points also guarantees the formation of r non-degenerate r-dimensional simplices [59]. As depicted in Figure 4b, {a 1 , a 2 , a 3 , ω 1 }, {a 1 , a 2 , a 3 , ω 2 }, and {a 1 , a 2 , a 3 , ω 3 } form three three-dimensional simplices ∆ 3 1 , ∆ 3 2 , and ∆ 3 3 , respectively. A facet of ∆ 3 2 is on the H 2 2 . In the feature subspace, the qth simplex ∆ r q is constructed by the point set M q = m 1,q , . . . , m l,q , . . . , m r,q , m (r+1),q = a 1 , . . . , a r , ω q , M q ∈ R n×(r+1) , l = 1, 2, . . . , r + 1. subspace, the q th simplex r q   is constructed by the point set  Assuming that an endmember matrix is given, the signed distance measure proposed in [7] is adopted to calculate pixels' projection coordinates. In line with the least squares criterion, this distance measure enables the abundance estimation to be implemented in a subspace at a high speed, without dimension reduction. In our method, the whole process of estimating BMM pixels' projection coordinates are split into r sub-steps. Specifically, the distance measure is applied for r constructed simplices 1 2 , , , , respectively. For the r q   , only pixels' projection coordinates associated with endmember q a are calculated, and the distance measure has the following form: represents the weight vector of the l th facet of r q   . The scalar , One can see that Equation (6) is definitely an extended version of the single simplex under the assumption of LMM in [7]. It should be noted that  , , a ω a to 2 2 H are equal to zero because they actually lie on 2 2 H . Therefore, when 2,2 k and 2,2 b in Equation (6)   Secondly, according to the geometric explanation of the LMM, for example, if a pixel x j is linearly mixed by four endmembers (see Figure 8a), a 2 's fractional abundance in x j is equal to the ratio (i.e., projection coordinates [7]) of distances from x j and a 2 to the plane of the bottom triangle defined by {a 1 , a 3 , a 4 }. Similarly, on the constructed r-dimensional simplex ∆ r q , the distances from each BMM pixel point to the nonlinear hyperplane H r−1 q can be used to calculate pixels' projection coordinates with respect to endmember a q . When the BMM pixels' projection coordinates are obtained, they will be approximately projected to their linear parts on the ∆ r−1 constructed by {a 1 , a 2 , . . . , a r }.
Assuming that an endmember matrix A = (a 1 , . . . , a r ) ∈ R n×r is given, the signed distance measure proposed in [7] is adopted to calculate pixels' projection coordinates. In line with the least squares criterion, this distance measure enables the abundance estimation to be implemented in a subspace at a high speed, without dimension reduction. In our method, the whole process of estimating BMM pixels' projection coordinates are split into r sub-steps. Specifically, the distance measure is applied for r constructed simplices ∆ r 1 , ∆ r 2 , . . . , ∆ r r , respectively. For the ∆ r q , only pixels' projection coordinates associated with endmember a q are calculated, and the distance measure has the following form: where k l,q ∈ R n×1 represents the weight vector of the lth facet of ∆ r q . The scalar b l,q is a threshold value. m l,q is the lth column of M q = a 1 , . . . , a r , ω q . One can see that Equation (6) is definitely an extended version of the single simplex under the assumption of LMM in [7]. It should be noted that k q,q corresponds to the weight vector of the hyperplane H r−1 q opposite to the endmember a q .
For instance, Figure 8b shows a constructed simplex ∆ 3 2 spanned by M 2 = (a 1 , a 2 , a 3 , ω 2 ), where k 2,2 is the weight vector of the hyperplane H 2 2 containing the nonlinear midpoint ω 2 . Based on Equation (6), the distance from a 2 to H 2 2 is 1, while the distances from {a 1 , ω 2 , a 3 } to H 2 2 are equal to zero because they actually lie on H 2 2 . Therefore, when k 2,2 and b 2,2 in Equation (6) are calculated, a BMM pixel x j 's projection coordinateŝ 2,j with respect to the endmember a 2 can be further produced bŷ All the column vectors of V q can be regarded as the base vectors spanning the r-dimensional subspace in which k l,q (l = 1, . . . , r + 1) must lie. k l,q can be linearly represented by r column vectors of V q , which is further written as: In Equation (7), β l,q denotes the coefficients of k l,q with respect to V q 's column vectors, and at least one nonzero coefficient exists [7]. By substituting (7) into (6), and letting K q = k 1,q , . . . , k (r+1),q ∈ R n×(r+1) , , and B q = β 1,q , . . . , β (r+1),q ∈ R r×(r+1) , Equations (8) and (9) are obtained: where 1 r+1 is a column vector of all 1s and I (r+1)×(r+1) is a unit matrix. Finally, after k q,q and b q,q are obtained, pixels' projection coordinates of the qth endmember a q can be obtained, which are equal to k T q,q X + b q,q 1 T m . Furthermore, different from [7], let z q = k q,q and c q = b q,q , and the whole projection coordinate vectorŝ j of a BMM pixel x j in our modified method is given as:ŝ where Z = (z 1 , . . . , z r ) ∈ R n×r and c = (c 1 , . . . , c r ) T ∈ R r×1 . The projection of x j is derived: It is noted that if the used endmembers are true, they will construct a compact simplex ∆ r−1 to enclose all the pixels' projections y j (see Figure 6a) [1][2][3][4][5]. However, if endmembers are unknown or wrongly extracted (see Figure 6b), we should further estimate the endmembers. Fortunately, with the use of the obtained linearly-mixed projections of pixels, the minimum-volume-constrained NMF [8][9][10][11][12] can be further adopted to update the endmembers and abundances simultaneously for highly-mixed data. The newly-updated endmembers can be applied to project pixels to their new linear mixture components (i.e., Equation (11)), alternately.
According to the discussion in Section 3.1, there is a minor bias between a BMM pixel x j 's projection y j in Equation (11) and its true linear mixture component x LMM j . This bias is mainly induced by two aspects: constructing several hyperplanes to produce a single nonlinear vertex p to replace all the virtual endmembers, and the noise interference. This bias has been carefully analyzed in [58], which further utilized two different strategies to do the de-biasing. The process of removing bias is iterative and has a larger calculation burden, and its performance relies on true endmembers already being known. Therefore, the de-biasing is not taken into account in the proposed algorithm of this paper, based on two reasons. Firstly, the endmembers are assumed to be unknown here, and should be estimated in the later unsupervised unmixing. In this sense, the de-biasing should be executed at every iteration once endmembers are updated, resulting in high computational complexity. Secondly, using the inaccurate endmembers to do the de-biasing will not only fail in removing the bias, but also transfer the error into the estimation of endmembers and abundances. On the other hand, the accurate update of endmembers and abundances can be entirely implemented by using the obtained pixels' projections due to their close proximity to the corresponding linear mixture components. In Section 4.1, with the use of true endmembers, the geometric projection coordinates (10) are quantitatively evaluated to illustrate the reasonability of omitting de-biasing in the proposed algorithm.

BMM-Based Constrained NMF
When addressing the issue of unsupervised unmixing, the noncovexity of standard NMF's bi-affine objective function makes the solution nonunique, and an unsatisfied local minimum is always produced for unmixing [11,13]. In order to obtain more accurate results, additional physical constraints on endmembers or abundances should be incorporated into the objective function besides the ANC and ASC [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25]. Consistent with the idea of using minimum-volume-constrained NMF for unmixing highly mixed data [11,12], ∆ r−1 spanned by endmembers {a 1 , a 2 , . . . , a r } should contain all the pixels' projections compactly in the proposed algorithm. This implies that the reconstruction error in Equation (4) should be as small as possible, and the volume of ∆ r−1 should decrease at the same time. Therefore, the endmember distance (EMD) in [12] is added into the standard NMF objective function (i.e., Equation (4)) as an auxiliary constraint, which is convex and represents the sum of distances from each endmember to their common centroid. Different from the optimization problem of constrained NMF for traditional linear unmixing, each pixel's projection y j instead of pixel x j itself is adopted in the following situation: where S = (s 1 , s 2 , . . . , s m ) ∈ R r×m is the abundance matrix and Y = (y 1 , . . . , y m ) ∈ R n×m represents the matrix of pixels' projections. The purpose of the endmember distance constraint EMD defined in (13) is to keep the simplex ∆ r−1 of endmembers as compact as possible so as to limit the solution space.
If the reconstruction error Y − AS 2 F becomes smaller, ∆ r−1 's volume will increase to include all the projections, while, if EMD decreases, ∆ r−1 's volume will shrink. λ is a regularization parameter to balance the tradeoff between two terms in (12) to provide accurate endmembers.
Some existing algorithms such as the multiple update rule [13] and projected gradient (PG) method [48,57,68] can solve the optimization problem in (12). In this paper, the PG method is adopted to update endmembers A and abundances S alternately in the framework of constrained NMF.
The gradients of f in (12) with respect to S and A are provided in Equations (14) and (15), respectively: Specifically, the current values of S t+1 or A t+1 are updated using their corresponding values at the last iteration t. Following the PG update rules, S t+1 and A t+1 are updated in Equations (16) and (17) to satisfy the ANC.
In Equations (16) and (17), P[·] represents the operator for projecting S and A to be nonnegative.
α t and β t are the updating steps which can be determined using the Armijo rule [48,57,68]. In simple terms, α t and β t decrease to give a better convergence when the relative error condition between two contiguous iterations in [68] is satisfied, or they will increase to accelerate the search. A t+1 is further substituted into the geometric projection scheme (6)- (11) to generate the new projections Y t+1 . The convergence analysis on the PG has been well analyzed in [68]. An experimental analysis is also provided for discussion of the convergence and complexity of the proposed algorithm in Section 4.1. Moreover, in order to satisfy the ASC, abundances should be normalized and two formulas in (18) [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25] are substituted for operations (14)-(17) instead, where δ is used for balancing the effect of the ASC. Endmembers and abundances with sufficient accuracy will be obtained when the maximum number of iterations is reached, or the error between two iterations is less than a given threshold τ.
The nonlinear coefficients of the GBM and PPNM are not estimated in the proposed algorithm. In the case of the PPNM, if two pixels x j and x j+1 have the same abundances s but different nonlinear scalar coefficients ξ j and ξ j+1 (see Table 1), they still have the same projection y after being projected. In the case of the GBM, although the nonlinear coefficients γ i,k,j affect every virtual endmember, their influence on the projection is still minor because most nonlinear components have been explained by the nonlinear vertex p. On the other hand, endmembers and abundances are actually the core goals of unmixing, which are directly estimated in this work. When accurate endmembers and abundances are obtained, the nonlinear coefficients will be easily calculated by linear programming [65]. According to the flowchart in Figure 7, the proposed BMM-based constrained NMF algorithm (BCNMF) for the unsupervised nonlinear spectral unmixing is summarized in Algorithm 1.

Algorithm 1: BMM-based constrained NMF (BCNMF)
Input: Hyperspectral data X ∈ R n×m and initial endmember matrix A 0 obtained by VCA. Output: Abundance matrix S ∈ R r×m and endmember matrix A ∈ R n×r .

Experimental Results
In this section, the unmixing performance of the proposed algorithm BCNMF is evaluated by carrying out five experiments with the synthetic data, one experiment with a virtual citrus orchard data, and two experiments with real hyperspectral images, respectively. Moreover, to demonstrate the robustness of BCNMF, it is compared with a group of traditional and state-of-the-art linear or nonlinear spectral unmixing algorithms. Accordingly, four supervised algorithms, including the famous linear algorithm FCLS [6], GBM-based semiNMF [38], PPNM-based GDA [32], and MLM [37] are compared for the abundance estimation by using true endmembers or the endmembers extracted by VCA [4]. Besides, four state-of-the-art unsupervised nonlinear spectral unmixing algorithms, including the Fan-NMF [30], distance-based nonlinear simplex projection unmixing (DNSPU) [46], RNMF [54], and bi-objective kernel NMF (Bio-KNMF) [57] have also been considered. Most of these algorithms have been briefly described in the introduction. DNSPU adopts the geodesic distance to exploit the data manifold and estimate both endmembers and abundances. Bio-KNMF combines the objective functions of kernel NMF and NMF to formulate a bi-objective problem to be solved for unmixing. In the following experiments when the true endmembers are unknown, the endmembers extracted by VCA and abundances estimated by FCLS are used as the initialization for the Fan-NMF, RNMF, Bio-KNMF, and the proposed algorithm.
According to the recommendatory parameter settings reported in the compared algorithms' corresponding references, we make their configurations as follows to produce the best results. The maximum numbers of iteration in Fan-NMF, RNMF, Bio-KNMF, and BCNMF are set to 300 and τ = 1e − 5. δ that controls the ASC in GBM-semiNMF, BCNMF, and Fan-NMF is set to 10. λ that balances the additional constraints in BCNMF and RNMF is equal to 0.1 (detailed discussions are given in Section 4.1.5). For the DNSPU, the geodesic distance is used and K = 20 for the nearest-neighbor graph [46]. A Gaussian kernel with parameter σ = 3 is applied in Bio-KNMF, and the weight α is carefully selected from {0, 0.1, . . . , 0.9, 1} through cross-validation to obtain the best performance. The experiments were performed on a computer with a 4-GHz Intel Core i7 CPU and 32 GB of memory.
In Equations (19) and (20), a i and a i represent a true endmember and an estimated endmember, respectively. s i and s i are the true abundance and estimated abundance, respectively. Since the FCLS, GBM-semiNMF, PPNMGDA, and MLM are all supervised unmixing algorithms, only six algorithms' MSADs of extracted endmembers are compared.
On the other hand, it is noted that the issue of unavailable true endmembers and abundances in real hyperspectral images always hinders convincible quantitative evaluation in practice. Many works had to use a metric-like reconstruction error (RE) or signal-to-reconstruction error (SRE) [69] as a reference criterion for evaluation. However, a small value of RE does not really mean that accurate endmembers and abundances are obtained due to the possible occurrence of over-fitting [34,38]. Therefore, we first use a recent physical-based simulated data of a virtual citrus orchard [33,34,70] for unmixing in Section 4.2. It can be considered as a quasi-real hyperspectral data for studying nonlinearity, and true endmembers and abundances are known. Then, in the case of two real hyperspectral images, the reconstructed linear mixture components (i.e., A S) instead of the reconstructed images are used to calculate the SRE LMM (dB) in Equation (21) to give a fair comparison. A large SRE LMM indicates that an algorithm may not reflect the sufficient nonlinear components.

Experiments with Synthetic Data
Three types of synthetic data based on the FM, GBM, and PPNM are generated. Nine material spectra are selected as the endmembers from the US Geological Survey (USGS) digital spectral library. Trees, grass, and several minerals are included (see Figure 9) which are common ground covers associated with the nonlinear mixing effect [26,27]. According to the Dirichlet distribution [4], 2000 pixels' abundances are randomly generated for each data type. Generally, the maximum abundance is set to 0.8 so that no pure pixels exist, and data is highly mixed. Moreover, data with different degrees of mixing is also provided in the third experiment to evaluate the sensitivity of algorithms. In terms of the GBM, the nonlinear parameter γ i,k,j is uniformly drawn in the set [0, 1], while, the parameter ξ j of the PPNM is uniformly drawn from (−0.3, 0.3) as reported in the reference [32]. Finally, the endmembers, abundances, and nonlinear parameters are substituted into three BMMs to produce the corresponding pixels, and additive white Gaussian noise is further added to the data.
As illustrated in the previous sections, the collinearity acts as an essential obstacle in nonlinear spectral unmixing, because the virtual endmembers are highly correlated to true endmembers. Therefore, before conducting the experiments, the true endmembers' minimum VIFs [51,52] are first adopted to compare the collinearity quantitatively when true and virtual endmembers are regarded as the explanatory variables. If a true endmember's VIF is smaller, it is hard to be explained by other variables, and the effect of collinearity on it is negligible [51][52][53]. In Table 2, as the number of endmembers increases, VIFs always increase; but when only true endmembers are considered as the explanatory variables, VIFs are very small. On the contrary, when the virtual endmembers are used, VIFs increase dramatically to very high values, and it is much easier for true endmembers to be explained by the other variables because of the serious collinearity, resulting in the larger errors for unmixing. Even if there are only three endmembers, the collinearity is prominent when considering the virtual endmembers. Accordingly, in the proposed algorithm, since the geometric projection enables the virtual endmembers to be ignored, a great improvement in accuracy can be observed in the results of subsequent experiments.
library. Trees, grass, and several minerals are included (see Figure 9) which are common ground covers associated with the nonlinear mixing effect [26,27]. According to the Dirichlet distribution [4], 2000 pixels' abundances are randomly generated for each data type. Generally, the maximum abundance is set to 0.8 so that no pure pixels exist, and data is highly mixed. Moreover, data with different degrees of mixing is also provided in the third experiment to evaluate the sensitivity of algorithms. In terms of the GBM, the nonlinear parameter , , i k j γ is uniformly drawn in the set [0, 1], while, the parameter j ξ of the PPNM is uniformly drawn from (−0.3, 0.3) as reported in the reference [32]. Finally, the endmembers, abundances, and nonlinear parameters are substituted into three BMMs to produce the corresponding pixels, and additive white Gaussian noise is further added to the data. As illustrated in the previous sections, the collinearity acts as an essential obstacle in nonlinear spectral unmixing, because the virtual endmembers are highly correlated to true endmembers. Therefore, before conducting the experiments, the true endmembers' minimum VIFs [51,52] are first adopted to compare the collinearity quantitatively when true and virtual endmembers are regarded as the explanatory variables. If a true endmember's VIF is smaller, it is hard to be explained by other variables, and the effect of collinearity on it is negligible [51][52][53]. In Table 2, as the number of endmembers increases, VIFs always increase; but when only true endmembers are considered as the explanatory variables, VIFs are very small. On the contrary, when the virtual endmembers are used, VIFs increase dramatically to very high values, and it is much easier for true endmembers to be explained by the other variables because of the serious collinearity, resulting in the larger errors for unmixing. Even if there are only three endmembers, the collinearity is prominent when considering the virtual endmembers. Accordingly, in the proposed algorithm, since the geometric projection enables the virtual endmembers to be ignored, a great improvement in accuracy can be observed in the results of subsequent experiments.
Using the synthetic data, five experiments have been designed for evaluating the compared algorithms. In the first experiment, the effect of collinearity on unmixing is first analyzed by using different numbers of known true endmembers. In this sense, the procedures of updating endmembers in five unsupervised nonlinear unmixing algorithms FanNMF, DNSPU, RNMF, Bio-KNMF, and the proposed BCNMF are omitted. Only the results of estimated abundances are compared. Then, algorithms' robustness to the number of endmembers is analyzed when endmembers are unknown. In the second experiment, the robustness of the algorithms to noise are  Using the synthetic data, five experiments have been designed for evaluating the compared algorithms. In the first experiment, the effect of collinearity on unmixing is first analyzed by using different numbers of known true endmembers. In this sense, the procedures of updating endmembers in five unsupervised nonlinear unmixing algorithms FanNMF, DNSPU, RNMF, Bio-KNMF, and the proposed BCNMF are omitted. Only the results of estimated abundances are compared. Then, algorithms' robustness to the number of endmembers is analyzed when endmembers are unknown. In the second experiment, the robustness of the algorithms to noise are compared by changing the Signal Noise Ratio (SNR = 10 log 10 (E[x T x]/E[ε T ε])) [2]. Next, the third experiment is carried out to evaluate the influence of the degree of mixing on the algorithms' unmixing performances. These experiments have been executed independently twenty times, and the results' averages and standard deviations are listed in the subsequent tables. The complexity and convergence of the proposed algorithm are analyzed, and the computational time is also compared in the fourth experiment. Finally, the impact of regularization parameter λ on BCNMF is discussed in the last experiment.

Robustness to the Collinearity and the Number of Endmembers
In order to illustrate the proposed BCNMF's advantage in addressing the collinearity, true endmembers are first adopted for each algorithm to compare the accuracy of the abundance estimation. The SNR is 40 dB in this experiment. The endmembers are not updated in five unsupervised algorithms. Particularly, geometric projection coordinates (i.e., Equation (10)) obtained by BCNMF are directly used as the estimated abundances without any iterations. Table 3 displays the corresponding results when the numbers of endmembers are changed. It can be observed that BCNMF always performs much better than the other three state-of-the-art unsupervised algorithms. Compared with the supervised algorithms such as PPNMGDA which resort to multiple iterations, BCNMF is just slightly worse when the number of endmembers is three or five. It seems that the projection bias turns to be remarkable in this case, because the de-biasing process is omitted in BCNMF (more accurate results obtained by using de-biasing are provided in [58]).
However, if there are more endmembers (seven and nine) when the collinearity is more serious (see Table 2), BCNMF has the best results compared with other algorithms. It can be concluded that when true endmembers are known (i.e., doing the supervised unmixing), the procedure of geometric projection in the proposed algorithms is still able to produce satisfactory estimated abundances without de-biasing. Moreover, the effect of collinearity is reduced through projecting BMM pixels into their approximate linear mixture components.
Next, data with the maximum abundance 0.8 are generated to evaluate the algorithms by changing the number of endmembers, and SNR is 40 dB. Table 4 shows the endmember extraction results. BCNMF performs the best, and it is clear that accurate endmembers can always be obtained because of the combination of geometric projection and constrained NMF. Geometric projection enables BCNMF to get over the collinearity. Using the linearly-mixed projections, minimum-distance-constrained NMF helps to find the endmembers and alleviate the local minima for highly-mixed data. VCA and DNSPU could not provide endmembers as accurate as other algorithms. This is because these two algorithms are based on the assumption that pure pixels exist in the data, which may be not suitable for the highly-mixed data. Moreover, in Table 5, the accuracies of estimated abundances are compared. Although algorithms like GBM-semiNMF, PPNMGDA and RNMF have shown competitive performances especially for their corresponding model-based data, BCNMF always provides the most accurate abundances. Comparing with the results in Table 3, it is observed that when endmembers are inaccurate, a large drop appears in the accuracies of estimated abundances produced by all other supervised or unsupervised algorithms. Nevertheless, this situation does not significantly impact the BCNMF's results. Experimental results imply that the number of endmembers does not affect the BCNMF much. This can be explained because the issue of collinearity seems to be addressed by the process of geometric projection, and local minima have been alleviated by adding useful constraints into the NMF framework.

Noise Robustness Analysis
In this experiment, three different BMM-based synthetic data are generated by five endmembers, and the SNR is set as 60 dB, 50 dB, 40 dB, 30 dB and 20 dB, respectively, to study the noise's impact on the proposed algorithm. In Tables 6 and 7, MSADs and RMSEs of the compared algorithms are given. The proposed algorithm BCNMF has shown the best unmixing performance no matter what the data is. As the SNR decreases, both the accuracies of estimated endmembers and abundances provided by most algorithms tend to be worse. From the results, one can see that RNMF can obtain accurate endmembers for three types of data because it is a model robust algorithm. DNSPU seems to be affected by noise at a high level, especially when SNR is equal to 20 dB. For this case it also needs the presence of pure pixels, and the calculation of geodesic distance is sensitive to noise. Bio-KNMF has a good accuracy of abundances, but the estimated endmembers are the worst, indicating that endmembers may be mapped into the high-dimensional feature space in an inappropriate way. In terms of specific model-based algorithms such as the GBM-semiNMF and PPNMGDA, more accurate results appear in the unmixing for the corresponding model-based data. Since the MLM is a general model including all degrees of multiple scatterings, it shows good performances in unmixing different BMM-based data as well. Compared with RNMF and BCNMF, Fan-NMF does not add extra constraints into the NMF, which often makes it perform worse and fall into local minima easily. In this experiment, it can be seen that BCNMF is superior for all BMM-based data, and has good robustness to noise.

Analysis of Sensitivity to the Degree of Mixing
In this experiment, the performance of the compared algorithms for the BMM-based data with different degrees of mixing (max abundance) is discussed, especially when data are highly mixed and when pure pixels exist. To produce the synthetic data, max abundance ranges from 0.7 to 1, the number of endmembers is five and SNR is 40 dB. From Table 8, it can be seen that as the max abundance decreases, an obvious drop appears in the accuracy of endmembers extracted by VCA. Moreover, Fan-NMF, RNMF, and Bio-KNMF also show a similar tendency, because they are initialized by VCA but fail to get much better results. On the other hand, although endmembers extracted by VCA are also used for the initiation of BCNMF, it has not been affected because the pixels' linearly-mixed projections motivate the update of endmembers to work in a reasonable way under the constrained NMF framework. The minimum distance constraint enables NMF to find accurate endmembers for data in different mixing degrees. Similar situations happen in the abundance estimation, and related results are shown in Table 9. When the degree of mixing is high, VCA could not find the correct endmembers, and RMSEs of estimated abundances decrease as well. In this experiment, BCNMF still performs the best, owing to the combination of geometric projection and constrained NMF, which has considerably decreased the negative effect of the collinearity and local minima on the unsupervised nonlinear spectral unmixing.

Complexity and Convergence Analysis
The complexity and convergence of the proposed algorithms are first analyzed in this experiment. During t iterations of BCNMF, three update steps for calculating the pixels' projections Y (5)-(11), endmembers A (17), and abundances S (16) cost the most time. To be specific, in the process of geometric projection, the computational costs of Equations (5), (9)-(11) are O(tr 2 n), O(t(r(r + 1)n 2 + (r + 1) 3 + nr 2 (r + 1))), O(trmn 2 ), and O(tr 2 mn), respectively. So, the computational complexity of projection is O(t(r(r + 1)n 2 + (r + 1) 3 + nr 2 (r + 2) + rmn 2 + r 2 mn)). On the other hand, under the minimum-distance-constrained NMF framework, the total computational cost of updating A and S based on the PG is O(2t(κr 2 (m + n) + rmn + rn)) where O(2trn) is the cost of calculating the distance constraint (13). κ represents the average number of the checked condition [68] for updating the step sizes α t and β t per iteration.
In order to further compare the computational time of the algorithms, the number of pixels changes from 1000 to 4000 when the number of endmembers is five and SNR is 40 dB in the following experiment. Table 10 displays the algorithms' time costs for unmixing three different BMMs data. It can be observed that as the number of pixels increases, more time is always needed for each algorithm. In all cases, FCLS has the best speed without considering the nonlinearity. But nonlinear unmixing methods such as PPNMGDA and Bio-KNMF need more time for calculation. Notably, the time cost of BCNMF is considerably low compared with other NMF-based methods and the algorithms which have good unmixing accuracies (e.g., PPNMGDA and MLM) in the previous experiments. It implies that BCNMF may be a fast and efficient nonlinear unmixing algorithm. An experiment is further carried out to illustrate the convergence of BCNMF. Five endmembers are used to generate three BMM-based data and the SNR is 40 dB. In Figure 10, the changing curves of MSADs and RMSEs obtained by BCNMF for the FM, GBM, and PPNM based data are depicted. Both the MSAD and RMSE have converged to the stable small values quickly. In Figure 10b, the initial RMSEs are small because the abundances are initialized by the geometric projection, which has the capability to project pixels onto their approximate linear mixture components (see details in Section 3). Moreover, since the step of geometric projection is in line with the least squares [7] and the convergence of PG to a stationary point for endmember-distance-constrained NMF is also satisfied, referring to [12,68], the proposed algorithm BCNMF can always converge to get satisfactory unmixing results after a sufficient number of iterations.

Parameter Sensitivity Analysis
This experiment is used to analyze the sensitivity of BCNMF's performance to variations of regularization parameter λ in Equation (12). A proper λ will help to find more accurate endmembers even if pixels are highly mixed. The number of endmembers is five and SNR is 40 dB.
The value of λ is well selected from   0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2 . Figure 11a,b shows the change of the accuracy of estimated endmembers and abundances, respectively. It can be observed that the overall best unmixing results are obtained when λ is around 0.1 for three BMMs. A larger λ may quickly decrease the accuracy. Although the RMSEs of abundances are still small when 0 λ  , i.e., EMD in Equation (12) is not considered, MSADs of estimated endmembers are worse than the case of 0.1 λ  . It is indicated that the use of minimum distance constraint in the NMF framework indeed helps to alleviate the local minima and especially improve the endmember extraction for highly mixed data.

Experiments with Virtual Orchard and Real Hyperspectral Images
A quasi-real hyperspectral data of virtual citrus orchard constituted by soil, weed, and trees (see Figure 12a) [33,34,70] is first used for evaluation. It is simulated by an extended physically based raytracing (PBRT) model [70] so that detailed light-rays' transmission paths are known, and multiple scattering effects can be modelled realistically. Illumination sources, sensor platforms, geometry descriptions and material optical properties introduced in Table 11 construct the observed scenes. Endmembers' spectrum in Figure 12b are calibrated through replacing the remaining ground covers

Parameter Sensitivity Analysis
This experiment is used to analyze the sensitivity of BCNMF's performance to variations of regularization parameter λ in Equation (12). A proper λ will help to find more accurate endmembers even if pixels are highly mixed. The number of endmembers is five and SNR is 40 dB. The value of λ is well selected from {0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2}. Figure 11a,b shows the change of the accuracy of estimated endmembers and abundances, respectively. It can be observed that the overall best unmixing results are obtained when λ is around 0.1 for three BMMs. A larger λ may quickly decrease the accuracy. Although the RMSEs of abundances are still small when λ = 0, i.e., EMD in Equation (12) is not considered, MSADs of estimated endmembers are worse than the case of λ = 0.1. It is indicated that the use of minimum distance constraint in the NMF framework indeed helps to alleviate the local minima and especially improve the endmember extraction for highly mixed data.

Parameter Sensitivity Analysis
This experiment is used to analyze the sensitivity of BCNMF's performance to variations of regularization parameter λ in Equation (12). A proper λ will help to find more accurate endmembers even if pixels are highly mixed. The number of endmembers is five and SNR is 40 dB.
The value of λ is well selected from   0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2 . Figure 11a,b shows the change of the accuracy of estimated endmembers and abundances, respectively. It can be observed that the overall best unmixing results are obtained when λ is around 0.1 for three BMMs. A larger λ may quickly decrease the accuracy. Although the RMSEs of abundances are still small when 0 λ  , i.e., EMD in Equation (12) is not considered, MSADs of estimated endmembers are worse than the case of 0.1 λ  . It is indicated that the use of minimum distance constraint in the NMF framework indeed helps to alleviate the local minima and especially improve the endmember extraction for highly mixed data.

Experiments with Virtual Orchard and Real Hyperspectral Images
A quasi-real hyperspectral data of virtual citrus orchard constituted by soil, weed, and trees (see Figure 12a) [33,34,70] is first used for evaluation. It is simulated by an extended physically based raytracing (PBRT) model [70] so that detailed light-rays' transmission paths are known, and multiple scattering effects can be modelled realistically. Illumination sources, sensor platforms, geometry descriptions and material optical properties introduced in Table 11 construct the observed scenes. Endmembers' spectrum in Figure 12b are calibrated through replacing the remaining ground covers

Experiments with Virtual Orchard and Real Hyperspectral Images
A quasi-real hyperspectral data of virtual citrus orchard constituted by soil, weed, and trees (see Figure 12a) [33,34,70] is first used for evaluation. It is simulated by an extended physically based ray-tracing (PBRT) model [70] so that detailed light-rays' transmission paths are known, and multiple scattering effects can be modelled realistically. Illumination sources, sensor platforms, geometry descriptions and material optical properties introduced in Table 11 construct the observed scenes. Endmembers' spectrum in Figure 12b are calibrated through replacing the remaining ground covers with perfectly absorbing background, and averaging those pixels with abundances greater than 0.95 after rendering. This dataset was well calibrated by in situ data, and successfully adopted for evaluating nonlinear unmixing methods recently [34]. The water-vapor absorption bands (99-110, 143-161) were removed in the experiment. with perfectly absorbing background, and averaging those pixels with abundances greater than 0.95 after rendering. This dataset was well calibrated by in situ data, and successfully adopted for evaluating nonlinear unmixing methods recently [34]. The water-vapor absorption bands (99-110, 143-161) were removed in the experiment.
(a) (b) Figure 12. Virtual orchard dataset [34]: (a) high-resolution image (tree, soil and weed); (b) endmembers' spectral curves. With the use of true endmembers and abundances, quantitative unmixing results are provided in Table 12. The proposed BCNMF under the assumptions of all three BMMs always obtained the best MSADs and RMSEs. In particular, BCNMF based on the PPNM performs much better than other methods. Moreover, it is noted that due to the big similarity of FM and GBM and the use of geometric projection, BCNMF based on these two models obtained the same results. On the other hand, VCA also extracted accurate endmembers because the spatial resolution is high and some pure pixels exist. Next, the performance of BCNMF is discussed using two well-chosen real hyperspectral images. In the observed areas corresponding to them, ground cover mainly consists of vegetation, soil, and so on, implying that the nonlinear mixing effect may be a non-negligible factor in spectral unmixing [26][27][28]33,34]. Therefore, they are expected to provide a good test for the nonlinear unmixing  With the use of true endmembers and abundances, quantitative unmixing results are provided in Table 12. The proposed BCNMF under the assumptions of all three BMMs always obtained the best MSADs and RMSEs. In particular, BCNMF based on the PPNM performs much better than other methods. Moreover, it is noted that due to the big similarity of FM and GBM and the use of geometric projection, BCNMF based on these two models obtained the same results. On the other hand, VCA also extracted accurate endmembers because the spatial resolution is high and some pure pixels exist. Next, the performance of BCNMF is discussed using two well-chosen real hyperspectral images. In the observed areas corresponding to them, ground cover mainly consists of vegetation, soil, and so on, implying that the nonlinear mixing effect may be a non-negligible factor in spectral unmixing [26][27][28]33,34]. Therefore, they are expected to provide a good test for the nonlinear unmixing algorithms. The first image was acquired over Moffett Field, CA, by the Airborne Visible Infrared Imaging Spectrometer (AVIRIS) in 1997. A region of interest (see Figure 13) with a size of 50×50 pixels is selected, and the low SNR bands (1-7, 108-113, 152-169, 219-224) are removed, leaving the remaining 187 bands for the experiment. In fact, this dataset has been widely used for the algorithm evaluation of both linear and nonlinear spectral unmixing in the past [31,32,49,54], the results in the previous works provide the comparison and partial ground truth for the later discussion. Three types of ground cover, including water, soil, and vegetation show a prominent presence in the sub-image of Figure 13.
Imaging Spectrometer (AVIRIS) in 1997. A region of interest (see Figure 13) with a size of 50×50 pixels is selected, and the low SNR bands (1-7, 108-113, 152-169, 219-224) are removed, leaving the remaining 187 bands for the experiment. In fact, this dataset has been widely used for the algorithm evaluation of both linear and nonlinear spectral unmixing in the past [31,32,49,54], the results in the previous works provide the comparison and partial ground truth for the later discussion. Three types of ground cover, including water, soil, and vegetation show a prominent presence in the sub-image of Figure 13.
Since the detailed ground truth is not available, the endmembers' spectral curves extracted by BCNMF under the FM, GBM, and PPNM are provided in Figure 14, and compare the estimated abundance maps of water, soil, and vegetation in Figure 15 according to the previous results [31,32,49,54]. It can be observed from Figure 14 that three endmembers have been determined well by BCNMF, and approach the real situations no matter what specific BMM is used. Moreover, abundance maps in Figure 15 also display the similarity of the three materials' distributions to the published results and ground truth. In order to compare the unmixing results of BCNMF with other algorithms' and analyze the contribution of nonlinearity, the maps of differences between the images reconstructed by the FCLS and other nonlinear spectral unmixing algorithms are further provided in Figure 16. Pixels that are located on the boundary between water and soil (or vegetation) can be easily observed in the corresponding results of BCNMF. Residual errors in these pixels are larger because of the nonlinear mixing effect that has been effectively addressed in the proposed algorithm. Other algorithms such as the PPNMGDA and RNMF also show good performances, but the Fan-NMF, DNSPU and Bio-KNMF seem to be less robust for this dataset. SRE LMM obtained by BCNMF with respect to this dataset is shown in Table 12. Similar to the other nonlinear unmixing methods such as PPNMGDA and MLM, obvious nonlinear components may be quantitatively reflected.  Since the detailed ground truth is not available, the endmembers' spectral curves extracted by BCNMF under the FM, GBM, and PPNM are provided in Figure 14, and compare the estimated abundance maps of water, soil, and vegetation in Figure 15 according to the previous results [31,32,49,54]. It can be observed from Figure 14 that three endmembers have been determined well by BCNMF, and approach the real situations no matter what specific BMM is used. Moreover, abundance maps in Figure 15 also display the similarity of the three materials' distributions to the published results and ground truth. In order to compare the unmixing results of BCNMF with other algorithms' and analyze the contribution of nonlinearity, the maps of differences between the images reconstructed by the FCLS and other nonlinear spectral unmixing algorithms are further provided in Figure 16. Pixels that are located on the boundary between water and soil (or vegetation) can be easily observed in the corresponding results of BCNMF. Residual errors in these pixels are larger because of the nonlinear mixing effect that has been effectively addressed in the proposed algorithm. Other algorithms such as the PPNMGDA and RNMF also show good performances, but the Fan-NMF, DNSPU and Bio-KNMF seem to be less robust for this dataset. SRE LMM obtained by BCNMF with respect to this dataset is shown in algorithms. The first image was acquired over Moffett Field, CA, by the Airborne Visible Infrared Imaging Spectrometer (AVIRIS) in 1997. A region of interest (see Figure 13) with a size of 50×50 pixels is selected, and the low SNR bands (1-7, 108-113, 152-169, 219-224) are removed, leaving the remaining 187 bands for the experiment. In fact, this dataset has been widely used for the algorithm evaluation of both linear and nonlinear spectral unmixing in the past [31,32,49,54], the results in the previous works provide the comparison and partial ground truth for the later discussion. Three types of ground cover, including water, soil, and vegetation show a prominent presence in the sub-image of Figure 13.
Since the detailed ground truth is not available, the endmembers' spectral curves extracted by BCNMF under the FM, GBM, and PPNM are provided in Figure 14, and compare the estimated abundance maps of water, soil, and vegetation in Figure 15 according to the previous results [31,32,49,54]. It can be observed from Figure 14 that three endmembers have been determined well by BCNMF, and approach the real situations no matter what specific BMM is used. Moreover, abundance maps in Figure 15 also display the similarity of the three materials' distributions to the published results and ground truth. In order to compare the unmixing results of BCNMF with other algorithms' and analyze the contribution of nonlinearity, the maps of differences between the images reconstructed by the FCLS and other nonlinear spectral unmixing algorithms are further provided in Figure 16. Pixels that are located on the boundary between water and soil (or vegetation) can be easily observed in the corresponding results of BCNMF. Residual errors in these pixels are larger because of the nonlinear mixing effect that has been effectively addressed in the proposed algorithm. Other algorithms such as the PPNMGDA and RNMF also show good performances, but the Fan-NMF, DNSPU and Bio-KNMF seem to be less robust for this dataset. SRE LMM obtained by BCNMF with respect to this dataset is shown in Table 12. Similar to the other nonlinear unmixing methods such as PPNMGDA and MLM, obvious nonlinear components may be quantitatively reflected.    The second image corresponds to a sub-region of the dataset collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor over the Mall in Washington DC. After removing the bands affected badly by noise and water absorption bands from the original dataset (210 bands), 100 × 100 pixels with 191 bands comprise the image for the experiment shown in Figure  17. According to the report in [71] and the number of endmembers estimated by HySime [72], five materials including the roofs, trees, water, roads, and grass are considered to be distributed in this area. Due to the limited space, only abundance maps estimated by BCNMF are shown in Figure 18, and Table 12 shows the corresponding SRE LMM of each algorithm. Five materials have been determined and can be easily distinguished from each other. It can be seen that trees and grass occupy the largest area in the image, and the nonlinear mixing effect is supposed to reach a high level between these two materials. Therefore, the differences between the trees' abundances estimated by the FCLS and other algorithms are further compared in Figure 19. The fact that residual errors of trees' abundances increase in the area covered by both trees and grass is reflected by BCNMF as well.    The second image corresponds to a sub-region of the dataset collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor over the Mall in Washington DC. After removing the bands affected badly by noise and water absorption bands from the original dataset (210 bands), 100 × 100 pixels with 191 bands comprise the image for the experiment shown in Figure  17. According to the report in [71] and the number of endmembers estimated by HySime [72], five materials including the roofs, trees, water, roads, and grass are considered to be distributed in this area. Due to the limited space, only abundance maps estimated by BCNMF are shown in Figure 18, and Table 12 shows the corresponding SRE LMM of each algorithm. Five materials have been determined and can be easily distinguished from each other. It can be seen that trees and grass occupy the largest area in the image, and the nonlinear mixing effect is supposed to reach a high level between these two materials. Therefore, the differences between the trees' abundances estimated by the FCLS and other algorithms are further compared in Figure 19. The fact that residual errors of trees' abundances increase in the area covered by both trees and grass is reflected by BCNMF as well.  The second image corresponds to a sub-region of the dataset collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor over the Mall in Washington DC. After removing the bands affected badly by noise and water absorption bands from the original dataset (210 bands), 100 × 100 pixels with 191 bands comprise the image for the experiment shown in Figure 17. According to the report in [71] and the number of endmembers estimated by HySime [72], five materials including the roofs, trees, water, roads, and grass are considered to be distributed in this area. Due to the limited space, only abundance maps estimated by BCNMF are shown in Figure 18, and Table 12 shows the corresponding SRE LMM of each algorithm. Five materials have been determined and can be easily distinguished from each other. It can be seen that trees and grass occupy the largest area in the image, and the nonlinear mixing effect is supposed to reach a high level between these two materials. Therefore, the differences between the trees' abundances estimated by the FCLS and other algorithms are further compared in Figure 19. The fact that residual errors of trees' abundances increase in the area covered by both trees and grass is reflected by BCNMF as well.   The second image corresponds to a sub-region of the dataset collected by the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensor over the Mall in Washington DC. After removing the bands affected badly by noise and water absorption bands from the original dataset (210 bands), 100 × 100 pixels with 191 bands comprise the image for the experiment shown in Figure  17. According to the report in [71] and the number of endmembers estimated by HySime [72], five materials including the roofs, trees, water, roads, and grass are considered to be distributed in this area. Due to the limited space, only abundance maps estimated by BCNMF are shown in Figure 18, and Table 12 shows the corresponding SRE LMM of each algorithm. Five materials have been determined and can be easily distinguished from each other. It can be seen that trees and grass occupy the largest area in the image, and the nonlinear mixing effect is supposed to reach a high level between these two materials. Therefore, the differences between the trees' abundances estimated by the FCLS and other algorithms are further compared in Figure 19. The fact that residual errors of trees' abundances increase in the area covered by both trees and grass is reflected by BCNMF as well.    Figure 19. The differences in trees' abundance maps between the fully constrained least squares (FCLS) and nonlinear spectral unmixing algorithms.

Discussion
In Section 4, both qualitative and quantitative evaluation of BCNMF and the compared algorithms are presented in the experiments with synthetic data, virtual orchard data and real hyperspectral remote sensing images. The comprehensive results can be very useful to validate BCNMF's superiority in unsupervised nonlinear unmixing.

Unmixing Accuracy Improvement by Addressing the Collinearity
Endmembers' VIFs in Table 2 partly reflect that virtual endmembers can greatly increase the collinearity's negative influence in nonlinear unmixing. However, this issue can be well addressed by the proposed BCNMF. It is mainly because that the procedure of geometric projection in Equations (5)-(11) can prevent virtual endmembers from participating directly in the abundance estimation. In this process, pixels' approximate linear components can be obtained, and then nonlinear unmixing is naturally transformed into the linear unmixing. Therefore, the impact of collinearity is significantly reduced and the unmixing accuracy is improved in the proposed method. Nevertheless, traditional algorithms such as Fan-NMF and PPNMGDA which rely on virtual endmembers for the parameter estimation will definitely suffer from the collinearity, resulting in larger unmixing errors.

Discussion
In Section 4, both qualitative and quantitative evaluation of BCNMF and the compared algorithms are presented in the experiments with synthetic data, virtual orchard data and real hyperspectral remote sensing images. The comprehensive results can be very useful to validate BCNMF's superiority in unsupervised nonlinear unmixing.

Unmixing Accuracy Improvement by Addressing the Collinearity
Endmembers' VIFs in Table 2 partly reflect that virtual endmembers can greatly increase the collinearity's negative influence in nonlinear unmixing. However, this issue can be well addressed by the proposed BCNMF. It is mainly because that the procedure of geometric projection in Equations (5)- (11) can prevent virtual endmembers from participating directly in the abundance estimation. In this process, pixels' approximate linear components can be obtained, and then nonlinear unmixing is naturally transformed into the linear unmixing. Therefore, the impact of collinearity is significantly reduced and the unmixing accuracy is improved in the proposed method. Nevertheless, traditional algorithms such as Fan-NMF and PPNMGDA which rely on virtual endmembers for the parameter estimation will definitely suffer from the collinearity, resulting in larger unmixing errors.
This viewpoint can be demonstrated by the results in Table 3 where true endmembers are assumed to be known in advance. In this ideal situation, BCNMF shows better or competitive Figure 19. The differences in trees' abundance maps between the fully constrained least squares (FCLS) and nonlinear spectral unmixing algorithms.

Discussion
In Section 4, both qualitative and quantitative evaluation of BCNMF and the compared algorithms are presented in the experiments with synthetic data, virtual orchard data and real hyperspectral remote sensing images. The comprehensive results can be very useful to validate BCNMF's superiority in unsupervised nonlinear unmixing.

Unmixing Accuracy Improvement by Addressing the Collinearity
Endmembers' VIFs in Table 2 partly reflect that virtual endmembers can greatly increase the collinearity's negative influence in nonlinear unmixing. However, this issue can be well addressed by the proposed BCNMF. It is mainly because that the procedure of geometric projection in Equations (5)- (11) can prevent virtual endmembers from participating directly in the abundance estimation. In this process, pixels' approximate linear components can be obtained, and then nonlinear unmixing is naturally transformed into the linear unmixing. Therefore, the impact of collinearity is significantly reduced and the unmixing accuracy is improved in the proposed method. Nevertheless, traditional algorithms such as Fan-NMF and PPNMGDA which rely on virtual endmembers for the parameter estimation will definitely suffer from the collinearity, resulting in larger unmixing errors.
This viewpoint can be demonstrated by the results in Table 3 where true endmembers are assumed to be known in advance. In this ideal situation, BCNMF shows better or competitive performance, even if the projection coordinates (without any iterations) are directly adopted as the abundances for comparison. We consider that the strong interference of virtual endmembers in unmixing is effectively avoided because of the geometric projection. Moreover, it also implies that the impact of projection bias seems to be weaker than the collinearity in most cases, partly leading to a reasonable explanation of removing the de-biasing process in BCNMF. However, in the case of other compared algorithms, the collinearity might affect seriously the abundance estimation, which cannot be actually addressed by multiple iterations. Since the collinearity problem can be overcome, BCNMF still keeps its outstanding unmixing performance in other experiments where more complex data (e.g., more endmembers and lower SNRs) are considered. Quantitative results including endmembers' MSADs and abundances' RMSEs in Tables 4-9 prove reasonably that BCNMF is a robust and stable unsupervised nonlinear unmixing algorithm.
Further, the proposed BCNMF is also validated in the experiments with a virtual orchard dataset and two real hyperspectral images. Since true endmembers and abundances in the real hyperspectral data are unknown, the virtual orchard physical-based dataset can be very useful for the quantitative analysis. In Table 12, BCNMF has smaller RMSEs and MSADs than the other algorithms based on the same BMMs, which indicates that it can obtain more accurate endmembers and abundances in this quasi-real situation. In addition, the quantitative SRE LMM with respect to the real data in Table 12 and Figures 14-19 further illustrate that BCNMF is able to reflect the nonlinear components in practice like other state-of-the-art methods (e.g., MLM). As stated earlier, we can also infer that the improvement of accuracy in BCNMF mainly comes from the operation of geometric projection which decreases the negative effect of collinearity.

Improvement of Endmember Extraction for Highly Mixed Nonlinear Data
Endmembers may be wrongly extracted by methods such as VCA especially when pixels are highly mixed. The accuracy of the used endmembers can also influence the geometric projection. Therefore, using the calculated pixels' approximate linear components, NMF is further exploited in BCNMF to update endmembers and abundances in an unsupervised way. In addition, a minimum-endmember-distance constraint is incorporated into the framework of NMF to produce better unmixing results. Tables 8 and 9 show the algorithms' sensitivities to the degree of mixing. BCNMF extracted the most accurate endmembers no matter when the pixels were highly mixed or not. Besides, more accurate abundances were also obtained by BCNMF. However, VCA failed to extract endmembers accurately, especially when data was highly mixed. As a result, supervised methods (i.e., PPNMGDA, GBM-semiNMF and MLM) using endmembers extracted by VCA could not obtain the abundances as accurate as the results they produced when true endmembers were used in Table 3. Moreover, other unsupervised NMF-based methods still could not outperform the proposed method. It is because that the minimum distance constraint had alleviated the issue of local minima, and enabled BCNMF to find accurate endmembers for data in different mixing degrees. The sensitivity analysis on regularization parameter λ in Figure 11 further proves the necessity of this constraint. When the EMD in Equation (12) is ignored (i.e., λ = 0), although RMSEs of the estimated abundances seem not to be influenced, MSADs of the estimated endmembers are always worse than the case λ = 0.1 for all three BMMs. Therefore, we can conclude that BCNMF could find more accurate endmembers compared with other supervised or unsupervised unmixing algorithms, resulting from the incorporation of minimum distance constraint in the NMF framework.
Moreover, Figure 10 proves that BCNMF has a good convergence for nonlinear unmixing. Table 10 further illustrates that the time cost of BCNMF is much lower than the algorithms like PPNMGDA and kernel-based Bio-KNMF. In comparison with the fast methods such as GBM-semiNMF and RNMF, BCNMF's computational speed can also be slightly better. We consider that the speed advantage of the proposed method should come from both the low computational complexities of geometric projection and constrained NMF.

Limitations
The combination of geometric projection and minimum distance constrained NMF enables BCNMF to alleviate the collinearity and unmix highly mixed nonlinear hyperspectral data. However, the following limitations should be overcome. (1) An effective scheme may be built to reduce the complexity of de-biasing so that projection bias can be further removed to get better unmixing results; (2) It is noted that we only exploited the geometric characteristics of BMM data without considering the spatial information of hyperspectral imagery. Therefore, a proper spatial regularizer should be adopted in the framework of BCNMF to improve unmixing performance.

Conclusions
This paper has presented an unsupervised nonlinear spectral unmixing algorithm by exploiting the common geometric property of BMMs. Through a process of geometric projection, pixels' approximate linear mixture components are obtained, and then put into the framework of CNMF for further unmixing. Using the given endmembers, pixels' projections are obtained by a signed distance measure for calculating pixels' barycentric coordinates on the constructed simplices. As a result, not only the negative effect of collinearity can be overcome, but unsupervised linear spectral unmixing algorithms such as CNMF can also be used to unmix the projections and update endmembers and abundances. Similar to the linear case, the objective function of standard NMF is properly constrained to mitigate the problem of local minima by the endmember distance constraint. After endmembers are updated, projections will be updated as well until the algorithm converges. Compared with the traditional and state-of-the-art algorithms in the experiments with both synthetic data and real hyperspectral images, the proposed algorithm has provided more accurate unmixing results.
In the future work, we will deal with the drawbacks of the proposed method in ignoring projection bias and spatial information of hyperspectral remote sensing imagery. Moreover, an implicit assumption of the BMMs at present is that the strength of nonlinearity in different spectral bands is the same, which is commonly not the case in practical applications. Therefore, a study on the band-wise nonlinear spectral unmixing algorithms will also be an interesting and meaningful piece of work.
Author Contributions: B.Y. conceived and designed the method and experiments, performed the experiments, analyzed the data and wrote the paper. B.W. reviewed the results, edited the paper and provided advices for the preparation and revision of the paper. Z.W. provided the mathematical expertise of this work.