Parameterized Nonlinear Least Squares for Unsupervised Nonlinear Spectral Unmixing

This paper presents a new parameterized nonlinear least squares (PNLS) algorithm for unsupervised nonlinear spectral unmixing (UNSU). The PNLS-based algorithms transform the original optimization problem with respect to the endmembers, abundances, and nonlinearity coefficients estimation into separate alternate parameterized nonlinear least squares problems. Owing to the Sigmoid parameterization, the PNLS-based algorithms are able to thoroughly relax the additional nonnegative constraint and the nonnegative constraint in the original optimization problems, which facilitates finding a solution to the optimization problems . Subsequently, we propose to solve the PNLS problems based on the Gauss–Newton method. Compared to the existing nonnegative matrix factorization (NMF)-based algorithms for UNSU, the well-designed PNLS-based algorithms have faster convergence speed and better unmixing accuracy. To verify the performance of the proposed algorithms, the PNLS-based algorithms and other state-of-the-art algorithms are applied to synthetic data generated by the Fan model and the generalized bilinear model (GBM), as well as real hyperspectral data. The results demonstrate the superiority of the PNLS-based algorithms.


Introduction
Due to the spatial resolution limitation of hyperspectral remote sensors as well as the diversity of surface features, mixed pixels are a common occurrence in hyperspectral images.Spectral unmixing is an essential technique for analyzing mixed pixels which extracts a collection of spectral signatures or endmembers and their corresponding fractions (abundances) contained in the mixed pixels.A large number of unmixing algorithms have been proposed based on the linear mixture model (LMM), which assumes that the spectrum of the mixed pixels is a linear combination of several endmembers.The linear unmixing process involves extracting the endmembers [1] and estimating their abundances [2] or obtaining endmembers and abundances simultaneously [3][4][5].
Nevertheless, the LMM cannot necessarily reflect the true composition of real-world scenarios such as planetary remote sensing, intimate mineral mixtures, vegetation canopies, or urban scenes [6].Therefore, kernels are popularly adopted to introduce nonlinearities in unmixing [7][8][9], which is independent of specific mixing models.However, the kernel-based methods largely depend on the choice of the kernel functions and parameters [6].On the other hand, physical models based on the the nature of environment have been proposed to overcome the limitation of the LMM, including the Hapke model proposed in [10] and the bilinear mixture models (BMMs) [6].In BMMs, the light probably undergoes secondary reflections or bilinear interactions before reaching the sensor.Different assumptions on the abundance constraints or bilinear parameters result in different expressions of the BMMs, such as the polynomial post-nonlinear model [11], the Nascimento model [12], the Fan model [13], and the generalized bilinear model (GBM) [14].Several supervised nonlinear unmixing algorithms based on BMMs has been developed [14][15][16].However, due to the lack of an endmember updating process, the unmixing performance of these supervised nonlinear unmixing algorithms is limited and greatly influenced by the accuracy of the endmember extraction methods [16], e.g., vertex component analysis (VCA) [1], pixel purity index (PPI) [17], simplex growing algorithm (SGA) [18] or N-FINDR algorithm [19].To alleviate the limitations of supervised nonlinear unmixing, the unsupervised nonlinear spectral unmixing (UNSU) provides a viable option and some attempts have been made to estimate simultaneously the endmembers and abundances (or nonlinearity coefficients) for nonlinear mixing models.For example, the polynomial post-nonlinear mixing model (PPNMM) has been solved with an unsupervised method using a Hamiltonian Monte Carlo Algorithm [20] and has been shown to result in a better unmixing performance than the supervised method [15].In [21], the authors provide an UNSU method for the Fan model using the NMF (Fan-NMF), which is proved to obtain better performance than the supervised or unsupervised LMM-based methods or the supervised GBM-based methods.In [22], we have also proposed an NMF-based UNSU method for the GBM (GBM-NMF).
In NMF-based methods, the physical constraints on the endmembers (nonnegativity) and abundances (nonnegativity and sum-to-one) are taken into consideration by means of a projected gradient (PG) update algorithm.The PG method is originally proposed by Lin [23] and provides a faster convergence than the multiplicative update rules designed by Lee and Seung [24].However, NMF-based UNSU methods suffer from two problems.First, the simple projection step in the PG of setting negative values to zero or small positive values to impose non-negativity makes it difficult to provide a theoretical analysis of its convergence [25].The projection step is likely to increase the objective function and may cause non-monotonic changes in the objective function, resulting in inaccurate factorization [26].Second, only the first derivatives are used for updating in the NMF-based UNSU methods, which is inefficient for solving nonlinear mixing models since the endmembers to be estimated have strong interactions due to the existence of virtual endmembers, which are generated by the Hadamard product mathematically.
To further alleviate the problems of the NMF-based UNSU methods, in this study, we propose two well-designed UNSU methods based on parameterized nonlinear least squares (PNLS) for the widely used GBM and Fan model.In the proposed methods, the optimization problem for the UNSU is transformed into two (for Fan model) or three (for GBM) alternate linear/nonlinear least squares (LS/NLS) problems.To avoid the simple projection step which ensures nonnegativity, the unknown endmembers, abundances, and nonlinearity coefficients (only for the GBM) are parameterized by a Sigmoid function.In this way, the LS/NLS problems with additional constraints are transformed into PNLS problems free of constraints.The Gauss-Newton method is then utilized to alternately solve the unconstrained PNLS problems rather than the PG method used in the NMF-based UNSU methods.The proposed PNLS-based UNSU algorithms can obtain the endmembers, abundances, and nonlinearity coefficients (only for the GBM) simultaneously.The experimental results of synthetic and real data have verified that the proposed algorithms achieve better unmixing performance than the NMF-based UNSU algorithms for the GBM and the Fan model.
The remainder of this paper is organized as follows: Section 2 provides a brief review of the GBM and Fan model; Section 3 details the proposed PNLS-based UNSU algorithms; Section 4 presents and analyses a series of experiments using both synthetic and real hyperspectral data; Section 5 concludes this paper.

Bilinear Mixing Models
This section provides a brief introduction to the general form of two widely used bilinear models including the GBM and the Fan model.

GBM
The GBM is derived from the combination of an LMM and an additional nonlinear effect term, which emphasizes the second-order scattering effect.Let X ∈ L×N be any hyperspectral image with L bands and N pixels, then the nth pixel x n in X can be represented using the GBM as follows [16]: with constraints where the first term on the right side of the Equation (1) indicates the linear mixing effect and the second term indicates the nonlinear mixing effect.m p denotes the pth endmember in the endmember matrix M ∈ L×P and A p,n denotes the corresponding abundance of m p in the pixel x n .denotes the Hadamard product between two vectors.B (p,q),n is the nonlinearity coefficient controlling the significance of the nonlinear effect.The subscript (p, q) means that the item is derived from or associated with A p,n and A q,n .ffl represents the additional noise.It is noticed that apart from the nonnegative constraint, an upper bound constraint is also imposed on B. They are collectively called the boundary constraint hereinafter unless otherwise specified.

Fan Model
The Fan model can be regarded as a special case of the GBM where B (p,q),n = A p,n A q,n instead of the constraint 0 ≤ B (p,q),n ≤ A p,n A q,n , i.e., the Fan model has the following expression: with constraints:

Proposed PNLS
In this section, we present the proposed PNLS for the GBM and the Fan model.As the derivations of the PNLS for both models are quite similar, we only show the detailed derivation of the PNLS for the GBM and the PNLS for the Fan model will be given directly.

Definition of the Alternate LS/NLS Problems
Based on the GBM in (1), the UNSU problem can be defined as three alternate LS/NLS optimization problems for the local optimal solution of the endmembers, abundances, and nonlinearity coefficients.

Constrained NLS for Endmembers Estimation
To define the endmembers estimation as an NLS problem, the pixel-wise or column-wise form defined in Equation (1) should be transformed into the band-wise or row-wise form as follows: with constraints: where x l , m l , z l are row vectors denote the lth row of X, M, Z respectively and Z = [m 1 m 2 , m 1 m 3 , ..., m P−1 m P ] is the virtual endmember matrix.For the endmembers estimation, the abundances and nonlinearity coefficients are assumed to be fixed.Thus the constraints with respect to the abundances and nonlinearity coefficients can be neglected and the following constrained NLS problem is obtained:

Constrained LS for Abundances Estimation
The endmembers M, virtual endmembers Z, and nonlinearity coefficients B are supposed to be fixed when updating the abundances A. To relax the abundances sum-to-one constraint (ASC), we use the method in [2] and add a pseudo-band into X, M, and Z, resulting in where δ is a parameter controlling the weight of the sum-to-one constraint.Therefore, the pixel-wise constrained LS problem for the abundances estimation can be expressed as where a n , b n , x n are column vectors denote the nth column of A, B, X respectively.

Constrained LS for Nonlinearity Coefficients Estimation
The expression of the LS for the nonlinearity coefficients B estimation is similar to that for the abundances estimation.By fixing M, Z, and A and neglecting their constraints, we obtain the following constrained LS problem:

Sigmoid Parameterization
Due to the existence of a nonnegative constraint and a boundary constraint, there is no analytical solution for the NLS problem in Equation (7) or LS problems in Equations (11) and (12).A conventional method to deal with this alternate optimization problem is the PG descent method, which has been used in some studies [21,22].However, as discussed above, the simple projection step in the PG method may prevent the convergence.Therefore, if the nonnegative constraint and boundary constraint can be naturally enforced or relaxed rather than using a rough projection, a better convergence can be expected.In this study, we propose to introduce the Sigmoid function [27] to parameterize the unknown variables in the alternate optimization procedures.Through the Sigmoid parameterization, the nonnegative constraint and boundary constraint are totally relaxed and the optimization problems shown in Equations ( 7), (11) and (12) are totally free of constraints, which facilitates the solution of those problems.
The Sigmoid function of a scalar c can be expressed as follows: The Sigmoid function fits well with our problems for two reasons: (1) the Sigmoid function is derivable with respect to any value of c, which makes the parameterization problems easy to solve.More, its derivative has a simple form: dg(c)/dc = g(c)(1 − g(c)).(2) The output of the Sigmoid function varies in the range of 0-1, which means that the function is bounded and has upper and lower bounds.This property agrees well with the constraints of our problems.Although some other functions (e.g., the exponential function or quadratic function) also have the ability to relax the nonnegative constraint, they have no upper bound, which may cause a 'numerical explosion' that affects the optimization process negatively.In fact, the pixel values are usually normalized to the range of 0-1 before unmixing.The parameterization without an upper bound may cause its output to increase continuously and even cause overflows during updating.
In this paper, the endmembers, abundances, and nonlinearity coefficients are parameterized by the following forms: where g(E), g(D), and g(F) denote the element-wise sigmoid function of E, D, and F. In particular, B is parameterized by the Hadamard product of Y and g(F), where Y (p,q),n = A p,n A q,n .Given the fixed Y, this parameterization ensures not only the nonnegative constraint but also the upper bound constraint.In this manner, the nonnegativity of M and A, as well as the nonnegativity and upper bound of B are essentially guaranteed.Therefore, the nonnegative constraint and boundary constraint can be totally relaxed.As a result, the NLS problem in Equation ( 7) is transformed into the following unconstrained PNLS problem: where e l is a row vector denotes the lth row of E.
As the virtual endmember Z = [m 1 m 2 , m 1 m 3 , ..., m P−1 m P ], it should be noted that Z is also parameterized by E.
The LS problem in Equation ( 11) is transformed into the following unconstrained PNLS problem: where d n is a column vector denotes the nth column of D.
Differ to the B in the LS problem in Equation ( 11), B in Equation ( 16) is also parameterized by D.
The LS problem in Equation ( 12) is transformed into the following unconstrained PNLS problem: where f n , y n are column vectors denote the nth column of F, Y respectively.

Gauss-Newton Based Optimization
The minimization problems in Equations ( 15)-( 17) can be solved using the alternate gradient descent method.But neither the second derivatives or its approximation is considered in this method.
For the PNLS problems in Equations ( 15)-( 17), there exist obvious nonlinearity caused by the bilinear model itself and the sigmoid parameterization, which makes it inefficient to optimize these problems by the gradient descent method.Furthermore, the convergence of the gradient descent method has to be ensured by a suitable step size and the commonly used line search method is very time-consuming.For above considerations, we propose to introduce the Gauss-Newton method [28] to solve the PNLS problems in Equation ( 15)-( 17) in an alternate manner.

Endmembers Updating Rule
Without loss of generality, we first denote r(e (t) l ) as the residual of the lth band with variable e l at the tth iteration with the following form: Based on the Taylor formula, the first order approximation of the Taylor expansion of the residual of the variable e l at the (t + 1)th iteration can be obtained: where l and J r (e l ) is the Jacobian matrix of r(e l ) whose entries are The complete matrix expression of J r (e l )) has the following form: where diag(g(e )) is a P × P diagonal matrix with the entries of vector g(e l )) aligned on the main diagonal, and V n is a P × P symmetric matrix determined by entries in b n with the following form: For example, when P = 4, Q = P(P − 1)/2 = 3, then At the (t + 1)th iteration, the task is to minimize r(e By substituting e (t+1) l using Equation ( 19), we can alternatively minimize the following function: Given that the variable e l is assumed to be fixed at the tth iteration, this is a typical linear least squares problem with respect to ∆.By letting the derivation equal to 0, the analytical solution of ∆ can be easily obtained: As l , the endmember updating rule can be explicitly obtained:

Abundances Updating Rule
According to the analysis above, the derivation of the updating rule for the abundances can be interpreted in a similar manner.Again, we denote r(d n ) as the residual of the nth pixel with variables d n at the tth iteration with the following form: Using a Taylor approximation and algebraic operation, the following updating rule is deduced: where

Nonlinearity Coefficients Updating Rule
Analogously, the residual of the nth pixel with variables f n at the tth iteration is denoted as r( f n ) and has the following form: The updating rule can be deduced: where the Jacobian matrix J r (f

Generalization to Fan Model
There is only a subtle difference between the GBM and the Fan model and only two variables need to be updated in Fan model, i.e., the endmembers and the abundances.The parameterization and derived updating rules are similar to those of the GBM.We reuse the expressions in Equations ( 26) and (28) as the updating rules for consistency and clarity: It is worth noting that the expression of the Jacobian matrix J r (d 34) differs from the one in Equation ( 28): Table 1 summaries the different specific symbol meanings when the same form of the updating rules are applied to different nonlinear mixing models.The specific forms of the variables V n , U n and W l in Table 1 are listed in Table 2.
Table 1.Specific symbol meanings for the GBM and Fan model in the updating rules.

GBM Fan Model
r(e Table 2.The specific forms of the variables V n , U n and W l in Table 1. The proposed PNLS for the UNSU is summarized in Algorithm 1 (GBM version) and Algorithm 2 (Fan version).We denote the GBM version as GBM-PNLS and the Fan version as Fan-PNLS.

Algorithm 1 GBM-PNLS for unsupervised nonlinear unmixing
Input: hyperspectral data matrix X, parameter δ and iteration number T. Output: endmember matrix M, abundance matrix A and nonlinearity coefficients matrix B.
1: Initialize M, A and B properly.Then E, D and F are initialized by the inverse parameterization functions, respectively.

4:
for each row of E do 5: Update e l according to (26).for each column of D do 8: Update d n according to (28).for each column of F do 11: Update f n according to (31). 12: end for 13: end while Algorithm 2 Fan-PNLS for unsupervised nonlinear unmixing Input: hyperspectral data matrix X, parameter δ and iteration number T. Output: endmember matrix M and abundance matrix A.
1: Initialize M and A properly.Then E and D are initialized by the inverse parameterization functions, respectively.

8:
for each column of D do 9: Update d n according to (34).The minimization problems for the GBM and Fan model are non-convex for all the variables.In addition, the Gauss-Newton method can only obtain local minimum.To speed up the convergence, the initialization for the proposed algorithm is crucial.We use the state-of-the-art simplex growing algorithm (SGA) [18] and fully constrained least squares (FCLS) method to initialize the endmembers M and abundances A respectively to increase the convergence speed.Comparison of different initialization methods for endmembers is also presents in Section 4. For the GBM, the nonlinearity coefficient matrix B is initialized with the Hadamard product operation of A.

Damping
One can observe that there exists an inversion operation of the square matrix, e.g., J T r (e (t) l ), in the updating rules.Once the square matrix is singular or close to singular, the updating results can be inaccurate.Therefore, a damping term is added when implementing the updating rules as follows: where I is the identity matrix and γ is the damping factor.In Levenberg's algorithm [29], the damping factor is adjusted at each iteration based on the reduction speed of the objective function.In this study, we find it is sufficient to fix γ at 0.01 to obtain a good convergence speed.

ASC Factor
The sum-to-one constraint weight δ influences the unmixing results.A large enough δ guarantees that the estimated abundances satisfy the physical meaning.However, a large δ will magnify the final reconstruction error and degrade the unmixing performance.In this study, according to the pre-experiment, δ is set to 1 to achieve a balance between the importance of the physical meaning and a low reconstruction error.

Stopping Criteria
We use two stopping criteria in the algorithm.One is the relative difference between two successive iterations and the other is the maximum iteration number T. When the cost function changes slowly enough as follows: or the maximum iteration number T is reached at 400, the algorithm terminates.

Experimental Results and Analysis
The performance of the proposed algorithms were evaluated with experiments using synthetic and real hyperspectral data.For comparison, some state-of-the-art algorithms were included.The GBM-based GBM-GDA [30] estimates the abundances and nonlinearity coefficients of the GBM.The LMM-based standard NMF [31], and the Fan model-based NMF via PG (Fan-NMF) [21] estimates endmembers and abundances simultaneously.We also added the results of our previous work [22], in which the GBM-NMF were developed to estimate all the variables in the GBM.All the tested algorithms are unsupervised except the GBM-GDA.All the algorithms were initialized with the SGA and FCLS and their combination SGA-FCLS was included in the experiment.The unmixing performance was evaluated using three metrics, including estimation accuracy of single endmember, global endmembers and global abundances.
For a single estimated endmember, the estimation accuracy was evaluated by the spectral angle distance (SAD) which measures the similarity between the true endmember m and its estimated endmember m as follows: The mean spectral angle distance (MSAD) was used to measure the similarity between the true endmember matrix M and its estimated endmember matrix M, which is defined as where m p , m p denote the pth endmember in M, M respectively.The root mean square error (RMSE) is also included [32].It evaluates the errors between all the entries of the true abundance matrix and the estimated abundance matrix and which is defined as follows: Generally, small values for the SAD, MSAD, and RMSE indicate a good performance.

Synthetic Experiments
The synthetic data used in the experiments are derived from several spectra (endmembers) in the USGS [33] library.The used spectra contain 224 spectral bands covering wavelengths from 0.38 to 2.5 µm with a spectral resolution of 10 nm.Six sample spectra are shown in Figure 1.In general, the distribution of the endmembers is generated involving these steps: (a) A hyperspectral image with z 2 × z 2 pixels (z can vary in the experiments) is divided into z × z sub-domains, where each sub-domain is initialized with one same material spectrum randomly selected from the given spectra.(b) A (z + 1) × (z + 1) low-pass filter in the spatial domain is applied to generated mixed pixels and make the abundance variation smooth.(c) Pixels whose abundances are larger than a preset maximum abundance are removed and replaced with pixels made up of all endmembers with uniformly distributed abundance values.(d) The nonlinear component is calculated in the form of the GBM model Equation ( 1) or the Fan model Equation ( 3).(e) Lastly, white Gaussian noise is also added, resulting in synthetic data with different signal-to-noise ratios (SNRs).The SNR is defined as: where E(•) is the expectation operator.Five synthetic experiments are conducted to evaluate the unmixing performance of the proposed algorithms.The first experiment focuses on the convergence property of the proposed algorithms.The second experiment evaluates the algorithms' robustness to different levels of noise, while the third experiment tests the algorithms on data with different endmember numbers.The fourth experiment evaluates the algorithms' robustness to data with various maximum abundances.The last experiment studies the influence of data size.

Convergence Test
The test is performed on the synthetic data with five endmembers.The SNR and maximum abundance are set to 30 dB and 0.8, respectively.The proposed GBM-PNLS and the NMF-based GBM-NMF are compared on the GBM data while the proposed Fan-PNLS and the Fan-NMF are compared on the Fan data.Figure 2 shows the convergence paths of the algorithms for the GBM data and the Fan data.One can observe that all the algorithms gradually converge to smaller values as the iteration number increases.However, the proposed PNLS-based algorithms provide faster convergence speed than the other algorithms for both the GBM data and the Fan data.A similar conclusion can be drawn from Figures 3 and 4. The PNLS-based algorithms converge quite quickly for the GBM data and the Fan data and thus the corresponding MSAD and RMSE values decrease rapidly.These results show that the proposed PNLS-based algorithms are more efficient than the NMF-based UNSU algorithms.

Comparison of Different Initialization Methods
In this experiment, we investigate the influence of different initialization strategies.Three methods for the endmember initialization, i.e., random initialization, VCA and SGA, are studied.Due to the randomness of random initialization and VCA, the standard deviations for the results with 20 independent runs are reported.Figures 5 and 6 present the experimental results.One can observe that the proposed PNLS-based algorithms obtain the best unmixing results for the GBM and Fan data no matter which initialization method is used.As expected, the VCA initialization generates results with smaller standard deviations compared with the random initialization.It also can be observed that SGA leads to MSAD and RMSE results slightly better than the average results of VCA's.In addition, the result of SGA for the same data is deterministic.Therefore, SGA is adopted for the initialization of endmemebers in the following experiments.

Robustness to Various Noise Levels
This experiment examines the algorithms' robustness to different synthetic images with different noise levels.The endmember number is fixed at 5 and the maximum abundance value is set to 0.8.By varying the SNRs from 20 dB to 40 dB and to infinity (noiseless), 6 synthetic images are generated for the GBM and the Fan model.Figures 7 and 8 show the MSAD and RMSE results for all the algorithms.As expected, the MSADs and RMSEs of all the algorithms degrade when the SNR increases.Over the entire range of the SNRs, the proposed PNLS-based algorithms always obtain the best unmixing results for the GBM and Fan data with the lowest MSAD and RMSE values.For the noise free case, Fan-PNLS performs even worse than other cases, which may be caused by the reason that the proposed algorithm falls into an unsatisfactory local minimum.The two NMF-based nonlinear unmixing methods appear to perform even worse than the standard NMF when it comes to the MSAD.The supervised GBM-based unmixing algorithm (GBM-GDA) also shows a poor performance, even worse than the standard NMF.One possible reason is that the inaccurate endmember estimation provided by the SGA affected the interactions between the endmembers, thereby magnifying the nonlinear mixing error [34].Actually, a prior knowledge of true endmember is the bias of supervised nonlinear unmixing [16].From this point of view, the limitations of the supervised nonlinear unmixing algorithms are obvious and the proposed algorithm can eliminate or at least mitigate this limitation.

Results for Different Endmember Numbers
The impact of different endmember numbers on the unmixing performance of the proposed algorithm is investigated in this experiment.Six synthetic images are generated with endmember numbers in the range of 3-8 for the GBM and the Fan model.The SNR is fixed to 30 dB and the maximum abundance is set to 0.8.When generating a new image with more endmember number, the spectra used in the previous image are preserved.A new spectrum selected from the USGS library is then added to the endmember set.By this way, we try to eliminate the impact caused by the variation of the endmember class rather than the endmember number.The results are shown in Figures 9 and 10, respectively.The former presents the MSAD results and the latter presents the RMSE results.We can observe that the proposed PNLS-based algorithms are superior to the other methods in most cases with low MSAD and RMSE values for the GBM and Fan data.In some cases, e.g., 8 endmembers, the proposed PNLS achives worse result than the Fan-PNLS and the standard NMF.This may be because the proposed algorithm falls into an unsatisfactory local minimum.It is interesting that the GBM-GDA achives poor results slightly better than the initialization method SGA-FCLS's.This suggests the limitation of the supervised nonlinear unmixing method again.We explore the impact of the mixing degree of the synthetic data in this experiment.The mixing degree can be directly controlled by the maximum abundance value of the data, i.e., larger maximum abundance values correspond to lower mixing degree and vice versa.In this test, the endmember number is fixed to 5 and the SNR is set to 30 dB.The maximum abundance varies from 0.6 to 1 for the evaluation.The experiment results in Figures 11 and 12 show that with the increasing of maximum abundance, the MSAD and RMSE values of all the algorithms decrease generally.This phenomenon mainly results from the fact that the initialization method SGA assumes the presence of pure pixels.More specifically, with increasing maximum abundance, the mixing degree of the data decreases, which leads to the performance improvement of the SGA.The proposed PNLS-based algorithm achives the smallest MSAD or RMSE values in all cases.Moreover, one can observe that the performance of the PNLS is more stable and consistent.The RMSE and MSAD values are comparatively small, even for the smallest maximum abundance value.In contrast, the performance of the GBM-NMF and Fan-NMF is not very satisfactory and decreases obviously when the mixed level increases.The results indicate that the proposed PNLS-based algorithms can handle data with different mixing degrees well, even for the highly mixed data.The endmember number is fixed to 5 and the SNR is set to 30 dB.The results in Figures 13 and 14 show that the impact of the data size for all the algorithms.The proposed PNLS-based algorithms always obtain the best estimation accuracy for the GBM data and the Fan data and the performance is quite stable.Meanwhile, the proposed algorithms show more obvious advantage when applied to larger scale of images.Additionly, the performance of the GBM-NMF is unexpectedly worse than that of the standard NMF.

Real Data Experiments
Jasper Ridge is a dataset widely used in hyperspectral processing, which consists of 512 × 614 pixels.The data includes 224 spectral bands covering the wavelength ranging from 380 nm to 2500 nm with about 10 nm spectral resolution.In this study, we only considered a cropped 100 × 100 sub-scene for simplicity consulting the experiment setup in [35].Figure 15 shows the 100th band of the scene.After removing the water vapor and atmospheric effects bands, 198 bands remain.According to [35], the endmember number of this sub-scence is set to four, including the substances Road, Soil, Water, and Tree.
Table 3 quantitatively presents the SAD results for the algorithms involving the endmember estimation for various substances.Compared to the other algorithms, GBM-PNLS and Fan-PNLS obtain better results since all four endmembers are well extracted and the smallest MSAD results are achieved.Figures 16 and 17 show the obtained endmember spectra for the GBM-PNLS and Fan-PNLS algorithms, as well as the reference spectra.The extracted endmembers and the reference endmembers are nearly identical.Since the ground truth of abundance map is not available, reference data have been often used as the benchmark to evaluate the accuracy of algorithms.It's worth noting that the accuracy of reference data is unknown actually [36].In this paper, we use the reference abundance maps obtained from [37] to further evaluate the algorithms' performance.The reference data have been commonly used in [35,38].Table 4 presents the RMSE performance of different algorithms.We can also find that the proposed metohds' results are generally better than those yielded by the other algorithms.It is interesting that the two PNLS-based algorithms perform quite similarly.This reflects the similar modeling capability of the GBM and Fan model.Figure 18 shows the corresponding abundance maps achieved by different algorithms, as well as the reference maps.Through visual comparison, we can find that the proposed proposed GBM-PNLS and Fan-PNLS achieve abundance maps that are more consistent with the reference maps compared with other methods.To further investigate the nonlinear behavior of proposed method, the maps of nonlinearity coefficients obtained by the proposed GBM-PNLS are provided in Figure 19.As shown in the figure, the bilinear mixing effect mainly exists in the transition regions of different substances, which accords with the physical truth well.Figure 19b shows that a large bilinear spectral mixing effect exists in the mixing regions of tree and soil since there exist 3-D multilayered structures and multiple scattering is a common phenomenon in these regions.It can also be observed that the bilinear mixing effect appears in the boundary of water and soil, as well as the boundary of road and soil, which is also consistent with physical model of GBM-based unmixing.The above results demonstrate that the proposed methods have potential in exploring underlying nonlinear effects and handling BMM-based spectral unmixing.

Conclusions
In this study, two PNLS-based UNSU algorithms are proposed for the GBM and Fan model.They are able to obtain the endmembers, abundances and nonlinearity coefficients simultaneously in a complete iteration.To better achieve the unsupervised unmixing task, the proposed algorithms integrate a Sigmoid parameterization so that the nonnegative constraint and nonlinearity coefficients boundary constraint can be relaxed.The optimization problem of the UNSU is then transformed into two or three alternate PNLS problems which can be solved by the Gauss-Newton method.Compared to existing NMF-based UNSU methods, the PNLS-based UNSU algorithms converge faster and the performance is improved to a certain extent.We evaluate the performance of the proposed algorithms on both synthetic and real data.The experimental results show that the proposed algorithms possess better unmixing performance than the other reference algorithms.
There are still some issues to resolve.For example, the computational complexity is still high, especially when the dataset is large.Considering the proposed algorithms are actually highly parallel, hopefully, the computational complexity can be reduced using high-performance computing such as graphic processing unit (GPU) acceleration, which will be explored in a further study.

Figure 9 .
Figure 9. MSAD results for various numbers of endmembers for: (a) GBM data, (b) Fan data.

Figure 15 .
Figure 15.The 100th band of the Jasper Ridge data.

Figure 16 .Figure 17 .
Figure 16.Comparison of the USGS library spectra (solid green line) with the endmembers extracted by GBM-PNLS (red dotted line) on the Jasper Ridge data: (a) Tree, (b) Water, (c) Soil, (d) Road.

Figure 18 .
Figure 18.Abundance maps of the Jasper Ridge data by the reference data, SGA-FCLS, Standard NMF, GBM-GDA, GBM-NMF, Fan-NMF, GBM-PNLS and Fan-PNLS, respectively, from top to bottom.Each column shows the corresponding abundance maps of a same endmember.
Figure 11.MSAD results for different mixing degrees for: (a) GBM data, (b) Fan data.In this test, five synthetic images with different sizes are generated for the GBM and the Fan model.The data size is set to 36 × 36, 64 × 64, 100 × 100, 144 × 144, and 196 × 196, respectively.

Table 3 .
SADs (×10 −2 ) of all the algorithms for the Jasper Ridge data.