Reducing the Effect of the Endmembers ’ Spectral Variability by Selecting the Optimal Spectral Bands

Variable environmental conditions cause different spectral responses of scene endmembers. Ignoring these variations affects the accuracy of fractional abundances obtained from linear spectral unmixing. On the other hand, the correlation between the bands of hyperspectral data is not considered by conventional methods developed for dealing with spectral variability. In this paper, a novel approach is proposed to simultaneously mitigate spectral variability and reduce correlation among different endmembers in hyperspectral datasets. The idea of the proposed method is to utilize the angular discrepancy of bands in the Prototype Space (PS), which is constructed using the endmembers of the image. Using the concepts of PS, in which each band is treated as a space point, we proposed a method to identify independent bands according to their angles. The proposed method comprised two main steps. In the first step, which aims to alleviate the spectral variability issue, image bands are prioritized based on their standard deviations computed over some sets of endmembers. Independent bands are then recognized in the prototype space, employing the angles between the prioritized bands. Finally, the unmixing process is done using the selected bands. In addition, the paper presents a technique to form a spectral library of endmembers’ variability (sets of endmembers). The proposed method extracts endmembers sets directly from the image data via a modified version of unsupervised spatial–spectral preprocessing. The performance of the proposed method was evaluated by five simulated images and three real hyperspectral datasets. The experiments show that the proposed method—using both groups of spectral variability reduction methods and independent band selection methods—produces better results compared to the conventional methods of each group. The improvement in the performance of the proposed method is observed in terms of more appropriate bands being selected and more accurate fractional abundance values being estimated.


Introduction
In the past decade, numerous methods have been introduced for unmixing hyperspectral imagery [1,2].Spectral mixture analysis (SMA) is one of most commonly-used methods, and is used in different applications.Basically, the spectra of mixed pixels are modeled using linear or non-linear mixture models.The spectral signature of each pixel is converted to a set of fractional abundances of its constituent spectra (endmembers) by these models [3].The answer to this question of which one (linear or non-linear models) is superior for unmixing the hyperspectral data is not clear, and depends on the type of the mixture of objects and their applications.However, the acceptable accuracy and the simplicity of linear mixture models entice more researchers to employ them [4].
If the multiple scattering among the endmembers is negligible and the mixture could be supposed macroscopic, a linear mixture model (LMM) can be written as Equation (1): where y(n) = [y 1 (n), y 2 (n), . . ., y B (n)] T ∈ R B is the vector of observations; B is the number of bands; n = 1, • • • , N is the index of pixels in the image; m i ∈ R B , i = 1, . . ., p is the spectral signature of endmembers; p is the number of endmembers; α i (n) is the abundance of the ith endmember in the nth pixel; M = m 1 , • • • , m p ∈ R B×p is the coefficient matrix of endmembers; T ∈ R p is the vector of abundance values in the nth pixel; and ν(n) ∈ R B represents noise.
The accuracy of the unmixing process highly depends on the completeness and goodness of the selected endmembers.Therefore, many endmember extraction algorithms have been developed in recent years [3,5].The accuracy of the fractional abundances obtained from SMA is affected by the residual spectral error caused by inaccurate atmospheric correction, an insufficient signal-to-noise ratio (SNR), and the noise caused by neglecting the non-linear effect of inputs.However, the most important source of error in SMA is due to ignoring the spectral variability (SV) of endmembers caused by variable illumination and environmental, atmospheric, and temporal conditions [4].These algorithms generally model the entire image using a constant spectral feature for each endmember.In fact, this is a simplification, because in many cases the spectrum of endmembers could change in different spatial and temporal conditions.
Generally, two types of SV can be distinguished among the samples from different classes: (1) the variability within the endmembers of a specific class (intra-class variability); and (2) the spectral similarity between the endmembers of different classes (inter-class variability) [4].By increasing the intra-class variability, the accuracy of sub-pixel fraction estimation decreases linearly [6].On the other hand, in some applications where the separation of similar phenomena is of interest, the spectral similarity among the different endmembers (e.g., crops and weeds in agricultural fields or spectral similarity among minerals) makes it difficult to separate these classes.The estimation of the fractional abundances using the linear mixture model could be achieved by different methods, such as least squares and sparse regression with different constraints [3,7].In the least squares-based spectral unmixing problem, the spectral similarity among the endmembers results in a high correlation between the columns of the coefficients matrix (M) in Equation (1).Consequently, the rank deficiency of the coefficients matrix leads to an unstable solution for the least squares problem and decreasing the accuracy of the estimation of the fractional abundances.Despite the serious effects in the LMMs and the destruction of the reliability of the results of spectral unmixing (SU), this issue is typically ignored [8].
According to [4], the efforts to decrease the effect of the SV can be classified into five general categories: (1) the use of multiple endmembers for each component in an iterative mixture analysis procedure; (2) the spectral weighting of bands; (3) the spectral transformations; (4) the use of radiative transfer models in a mixture analysis; and (5) the selection of a subset of stable spectral features.In addition, to significantly improve the accuracy of the estimation of fractional abundances, the last strategy effectively reduces the computational cost.
The non-orthogonality of the endmembers appears when a linear correlation exists between two endmembers or a multi-collinearity exists among some endmembers.By increasing the correlation among the endmembers, the LMM tends to be instable and extremely sensitive to the small variations of the input spectrum and noises.According to [8], the approaches to deal with the problem could be categorized as: (1) excluding the correlated endmember; (2) de-correlating the endmembers using the spectral transformations; (3) using iterative approaches to select the independent endmembers; and (4) the regularization of the SU equations.Regarding the redundancy of bands in the hyperspectral images, it is not unexpected to identify a subset of bands that decreases the correlation of endmembers.To deal with the problem, the correlation of endmembers can be evaluated using singular value decomposition (SVD) and the condition number of the coefficient matrix of the endmembers in the unmixing procedure.
This paper presents a novel and effective approach for managing the SV and decreasing spectral correlation among the endmembers based on the selection of the optimal bands in the Prototype Space (PS) [9].The proposed method consists of two main steps.Based on the spectral behavior of the endmembers' set, the image bands are firstly prioritized in such a way that they have the least sensitivity to the SV of the endmembers.Then, the optimal band selection is done based on this prioritization.Since the spectral correlation among the image bands is not considered in this process, in the second step the independent bands are selected using their angles in the PS.In this way, the spectral correlation among the endmembers is reduced as well.Besides, collecting a spectral library from the SV of endmembers is an expensive and time-consuming process.Therefore, these sets were directly extracted from the image in this paper.
The remaining parts of the paper are organized as follows: the theoretical background and the previous algorithms are introduced in Section 2; the proposed method is explained in Section 3; the experimental results and further discussion are provided in Section 4; and concluding remarks are found in Section 5.

Theoretical Background
The effect of the SV of endmembers in the SU and the approaches developed to deal with it have been taken into consideration in [4,10,11].Furthermore, the problems caused by the spectral correlation among the endmembers and its adverse effects on the reliability of the results of the SU have been investigated in [8].Given the centrality of the optimal band selection to enhancing the stability of the elected set against the spectral variation and also to decrease the spectral correlation among the endmembers in the proposed method, the algorithms that have dealt with the problem by the band selection approach are reviewed in this section.

Feature Selection Algorithms to Decrease the Spectral Variability Effect
The precise selection of bands that are stable against the SV (e.g., those bands that minimize the intra-class variance and maximize the inter-class dispersion) plays a significant role in the accuracy improvement of the estimation of fractional abundances.Previous studies on the variability of the optical properties of leaf, litter, and soil in semi-arid and arid areas had illustrated that the SWIR2 region from 2050-2500 nm is the least dependent on variations in structural and biochemical attributes [12].Therefore, this region was selected as a stable spectral region for the aforementioned materials to be used in the SMA for developing the AutoSWIR algorithm in [12].However, since the position and the spectral region of bands, as well as the number of stable spectral regions depend on the spatial, spectral, and temporal complexity as well as the mixture of endmembers that are present in the scene [4], this algorithm was not extendable for different ecosystems.
In [13], a more applicable spectral feature selection algorithm entitled the stable zone unmixing (SZU) was introduced.In this algorithm, the sensitive wavelengths to the SV are evaluated using the instability index (ISI).Then, a protocol was introduced to enhance the spectral subset selection by accounting for a tradeoff between the number of wavelengths used in the analysis (i.e., information) and the ISI (i.e., spectral variability).
The redundancy problem of the hyperspectral images and high correlation between their bands was not taken into account in either of the methods (AutoSWIR and SZU).A greater potential in computation efficiency and fraction estimate accuracy could only be provided if the independent bands were employed in the LMM [4].

Feature Selection Algorithms to Decrease the Spectral Correlation of Endmembers
The most common way to deal with the correlation of endmembers is to eliminate the collinear endmembers [8], which causes two disadvantages: (1) If two endmembers are highly correlated, which one should be excluded?(2) When an endmember is eliminated, it may contain some useful information that could result in destabilizing the LMM.Another solution is to combine the two correlated classes to form a new endmember.However, this will inevitably result in missed classes or similar problems to exclude the endmembers.There are a number of transformations (e.g., principal component analysis (PCA) [14] and the maximum noise fraction (MNF) [15]) which can be employed to reduce the correlation of endmembers by de-correlating the band-to-band correlation.The major drawback of these methods is encountering endmembers whose spectral response has no physical meaning [8].
Another obvious solution that comes to mind is to use a number of subsets of the spectral region.Regarding the high spectral resolution of the hyperspectral images, the spectral region could be decomposed into the visible, near-infrared, and shortwave infrared sections.Then, only those regions that contain the absorption features of the objects of interest could be employed in the appropriate applications.Of course, different regions of the electromagnetic wave spectrum have been theoretically considered by researchers in order to examine different phenomena (i.e., using the shortwave infrared region for analyzing minerals).However, upon decreasing the spectral region, the correlation among endmembers increases [8].
Recently, several endmember-extraction-based methods were applied to feature selection methods, which elect the distinctive spectral signatures.Some of these methods (e.g., geometrical feature selection (G-FS) [16] and linear prediction (LP) [17]) operate in the pixel space.The dimensionality of such a space is equal to the number of participating image pixels.Some others, such as prototype feature selection (PFS) and maximum tangent discrimination (MTD) [18] extract informative bands in an unsupervised manner via geometrical interpretation of distinctive bands in the Prototype Space (PS) [9].In this way, the axes of the PS are defined based on the spectrum of the extracted endmembers or the clusters' center of the existing classes in the scene.Therefore, the dimensionality of such a space is equal to the number of constructive components of the scene, and each band is represented as a point or vector in this space.Using these methods, a subset of independent bands that are distributed throughout the spectral region of the hyperspectral sensor could be provided by eliminating the dependent bands in a meaningful manner.

Quantitative Evaluation of the Endmembers' Correlation
In order to quantitatively estimate the correlation of endmembers constructing the coefficient matrix (M), two measures can be employed [8]: (1) deriving a measure from the coefficient matrix based on its singular values, and (2) extracting a measure from the correlation of the endmembers' spectra.The SVD of the endmembers matrix (M) to extract singular values (i.e., the square root of eigenvalues) is M = U ∑V T , where U and V are both square, unitary, and orthonormal matrices, and ∑ is a diagonal matrix with the singular values of M. The ratio of the largest singular value to the smallest one is called the condition number of the matrix.The ideal value of one for the condition number indicates that the matrix is fully orthogonal.By increasing the condition number of the endmember matrix, the correlation of endmembers increases, and in an extreme case, will cause the singularity of the matrix.If the correlation of endmembers exceeds 0.6, the condition number increases exponentially [8].In this way, the condition number is a good measure to evaluate the endmembers' correlation.
The correlation matrix of the endmembers partially shows the collinearity of each pair of endmembers.The average value of the upper or the lower triangle elements of the matrix provides a measure that indicates the average correlation of the endmembers constructing the coefficient matrix, and could be used as a measure of the overall correlation of endmembers.

The Proposed ISI-PS Method
This section explains the proposed ISI-PS method, which is an incorporation of the methods to decrease the SV effect and to select the independent bands of the hyperspectral images.These two factors directly play a positive role in enhancing the accuracy and the reliability of the fractional abundances computed.The proposed method has been developed by assuming the existence of some pure pixels in the image for each class.
In order to explain the relation of the different parts of the proposed method, its flowchart (Figure 1) and pseudo-code are presented, and detailed information is provided in the following sections.By assuming the existence of pure pixels in the hyperspectral images, a set of pure spectra is firstly provided for each class, which indicates the class spectral variability.These sets are employed to create the endmember of each class and statistical analysis in each band.In this way, the extraction of the information of the endmembers in the proposed method has been compiled based on the geometrical endmember extraction algorithms and by assuming the existence of pure pixels in the image.By means of intra-class and inter-class analysis using measures (e.g., ISI) that consider the SV, the persistent bands against the variability are prioritized.Thereafter, the PS is established based on the spectra of the pure representative of each class, and then bands are represented in this space.Finally, the correlation among the prioritized bands is measured by computing their angles in the PS, and then those that have a similar behavior are eliminated.Thus, the remaining bands are independent and persistent against the variability, which are employed to do the SU and estimate the fractional abundance of each endmember.

The Proposed ISI-PS Method
This section explains the proposed ISI-PS method, which is an incorporation of the methods to decrease the SV effect and to select the independent bands of the hyperspectral images.These two factors directly play a positive role in enhancing the accuracy and the reliability of the fractional abundances computed.The proposed method has been developed by assuming the existence of some pure pixels in the image for each class.
In order to explain the relation of the different parts of the proposed method, its flowchart (Figure 1) and pseudo-code are presented, and detailed information is provided in the following sections.By assuming the existence of pure pixels in the hyperspectral images, a set of pure spectra is firstly provided for each class, which indicates the class spectral variability.These sets are employed to create the endmember of each class and statistical analysis in each band.In this way, the extraction of the information of the endmembers in the proposed method has been compiled based on the geometrical endmember extraction algorithms and by assuming the existence of pure pixels in the image.By means of intra-class and inter-class analysis using measures (e.g., ISI) that consider the SV, the persistent bands against the variability are prioritized.Thereafter, the PS is established based on the spectra of the pure representative of each class, and then bands are represented in this space.Finally, the correlation among the prioritized bands is measured by computing their angles in the PS, and then those that have a similar behavior are eliminated.Thus, the remaining bands are independent and persistent against the variability, which are employed to do the SU and estimate the fractional abundance of each endmember.Pseudocode of the proposed method: (1) Estimating the number of classes of the image and establishing a spectral library of the SV of endmembers (i.e., the sets of endmembers).(2) Prioritizing the persistent bands against the SV using the SV index and some training data.where

{ }
where Ω is the set of prioritized persistent bands (B) and ith is the prioritizing index of bands.Pseudocode of the proposed method: (1) Estimating the number of classes of the image and establishing a spectral library of the SV of endmembers (i.e., the sets of endmembers).(2) Prioritizing the persistent bands against the SV using the SV index and some training data.
where Ω is the set of prioritized persistent bands (B) and ith is the prioritizing index of bands.
(3) Selecting the most different bands in the PS using the distance of the bands from the main diagonal of the space, Ω 0 = {B PS }. (4) Considering band B i and computing its angle from all members of set Ω i−1 in the PS.(5) If the angle of band B i from all of the previously selected bands was greater than a predefined threshold (T), then and band B i is eliminated.(6) Unmixing the hyperspectral image using the selected bands.

Establishing a Set of Spectral Variabilities for Each Endmember
The LMM is widely used to model the spectral composition of a spectrum.However, many reasons lead to the SV of endmembers, such as the change of environmental illumination, as well as atmospheric and temporal conditions [10].The methods for dealing with the endmembers' variability could be categorized into two classes [10]: (1) endmembers as sets, and (2) endmembers as statistical distributions.Usually, methods of the first category require a spectral library of the SV of endmembers to deal with the variability phenomena.Collecting a spectral library of the endmembers' variabilities is an expensive and time-consuming process.Therefore, automatic extraction of endmembers' sets from an image is greatly beneficial.In this regard, automated endmember bundles (AEB) [19] has established endmembers sets by executing standard endmember extraction algorithms such as N-FINDR [20], orthogonal subspace projection (OSP) [21], unsupervised fully constrained least squares (UFCLS) [22], iterative error analysis (IEA) [23] and vertex component analysis (VCA) [24] and clustering the resulted endmembers from different methods.Recently, the authors in [25] showed that VCA is essentially the same as simplex growing algorithm (SGA) [26] as long as their initial conditions are the same.So, other conventional endmember extraction algorithms such as SGA can be used in AEB.
Spectral features of those spectra that are located in each endmember set indicate the representative of that endmember.Obviously, it is due to the SV that these features are not exactly similar.Therefore, these representatives should have the general condition of an endmember, including: (1) they should lie next to the vertex of point clouds in the feature space; (2) they should situate in homogenous regions in the spatial domain; and (3) pixels of each class should have a similar spectral behavior.Recently, a module entitled spatial-spectral preprocessing (SSPP) was presented in [27], which could be used as a preprocessing function prior to endmember identification and SU.This module firstly computes the spatial homogeneity index for the pixels of the image, which is used to determine the homogenous regions of the image.Simultaneously, unsupervised clustering is employed to identify the spectral classes.Finally, by fusion of this spatial and spectral information, a subset of pixels that are spatially homogenous and spectrally pure is identified in each class, which could be used as the input of the endmember extraction algorithms.
In this way, the applied procedure to determine the endmembers has been shown in the dashed box of Figure 2 and runs as follows.Firstly, the virtual dimensionality (VD) of the image (p) is determined via Hysime [28].Thereafter, the dimension is reduced using a PCA or MNF transformation into (p-1) bands.In this reduced feature space, endmembers are found via the well-known pixel purity index (PPI) [29] technique with a threshold value equal to zero.Then, the obtained results from the PPI are clustered into p clusters.Instead of the common K-means clustering (which is suitable for classes with spherical distributions), the Fast Density Peak Detection (FDPC) method [30] is applied since herein the classes are seen to have a variety of different distributions.Since the endmembers are found, image homogeneous regions are also detected by a Gaussian filter, and according to Equations ( 6) and (7).The previously found endmembers are prioritized according to both their spectral purity and homogeneity indices.Next, the spatial and spectral maps are generated based on the first 20% of the purest pixels, as well as 30% of the most homogeneous ones.The ultimate representative endmembers of each class are finally selected from the overlap of these two maps.considered in the band prioritizing process to reduce the effect of the SV.These two issues are closely related to each other.Therefore, by eliminating those bands for which the endmembers' sets spectrum is similar, the less-correlated spectral features of these endmembers could be achieved.The spectrum of alunite and kaolinite minerals from the USGS spectral library is represented by the spectral response of the AVIRIS sensor in Figure 2. As can be seen, the bands in the blue, red, and yellow regions are redundant.Besides, these regions may be correlated with each other.

Reducing the Spectral Variability Effect by Selecting the Optimal Bands
In contrast with the conventional SMA methods which use the overall spectral region (i.e., all bands), SZU [13] has been designed based on the selection of persistent bands against the SV phenomena using ISI.The numerical value of this index is computed using Equation (2), which has been developed based on the Fisher separability function.This index is defined for each band based on the ratio of the intra-class dispersion (i.e., the total standard deviation of endmembers for each class) to the inter-class variability of endmembers (i.e., the average distance among the center of classes).The value of one indicates that the intra-class and inter-class variations are similar, and the smaller this value, the better the situation for that band to separate classes.
where p is the number of endmembers and σ i,λ and m i,λ are the standard deviation and the average of class i in the band λ, respectively.The image's bands could be prioritized with respect to the SV of their constituent spectra using this index.

Reducing the Correlation of Endmembers by Selecting the Independent Bands in the Prototype Space
Linear correlation of two or more endmembers always exists in the SU of hyperspectral images.However, little attention has been paid to this [8].One of the main objectives of this paper is to reduce these correlations without the elimination of the dependent endmembers.This is because-as was mentioned-it would be of interest to separate the different species in some applications.On the other hand, the correlation of the spectrum of endmembers' sets for adjacent bands has not been considered in the band prioritizing process to reduce the effect of the SV.These two issues are closely related to each other.Therefore, by eliminating those bands for which the endmembers' sets spectrum is similar, the less-correlated spectral features of these endmembers could be achieved.The spectrum of alunite and kaolinite minerals from the USGS spectral library is represented by the spectral response of the AVIRIS sensor in Figure 2. As can be seen, the bands in the blue, red, and yellow regions are redundant.Besides, these regions may be correlated with each other.
In order to estimate the correlation of bands, some methods have been proposed based on the divergence and correlation functions on the histogram of the image's bands [31,32].However, since the goal is to improve the condition of the coefficient matrix to reduce the endmembers' correlation, in this paper, the angle of bands in the prototype space has been used as a measure of the correlation of endmembers' sets in those two bands.In other words, by establishing the PS using the endmembers and representing the bands in this space, the dependent bands could be identified.The advantage of this method is that the correlation of bands is evaluated dealing with the endmembers' sets, because the axes of the PS are the endmembers.Therefore, if the combination of the endmembers is changed in the scene, the proposed method will select a new subset of bands that also have the minimum correlation.
Bands are categorized into three classes in the PS: (1) informative bands: the lager the distance of bands from the main diagonal of the space, the better the bands can separate the image' classes; (2) correlated bands: the bands that have a similar spectral response of endmembers are gathered close together in this space-this concept is beyond the correlation of the adjacent bands in the hyperspectral images, because it would have occurred for those bands that are not adjacent; (3) non-informative bands: those bands that are located close to the main diagonal of the space, which have the exact same response for different classes [9].
The spectrum (i.e., bands) of alunite and kaolinite minerals (Figure 2) are illustrated in the PS, which was constructed based on these two endmembers in Figure 3.In this figure, examples of highly correlated bands (using blue, red, and yellow colors, which are shown with magnification in the right view), informative bands (the two bands that are shown with blue squares and that have the largest distances from the main diagonal of the space, and those bands that are illustrated with red triangles in which the two spectra have an appropriate distance from each other) and uninformative bands (those bands that are shown with magenta color and that are located close to the main diagonal of the space) are illustrated.It is obvious that the angles of the correlated bands are close to each other, even if these regions with a similar spectrum are not adjacent.Therefore, the correlation of bands in dealing with the endmembers could be understood by extracting these angles.It is obvious that the angles of the correlated bands are close to each other, even if these regions with a similar spectrum are not adjacent.Therefore, the correlation of bands in dealing with the endmembers could be understood by extracting these angles.

Determining the Threshold Value to Identify the Independent Bands
A threshold value should be defined in order to identify and eliminate the correlated bands, and the decision to preserve or eliminate each band is made by the comparison of the angle between that band and the previously selected bands with the pre-defined threshold value.The angle between bands is computed in the prototype space.If this angle is less that the pre-defined threshold value, it means that the band evaluated is similar to a band in the set of previously selected bands.Otherwise, the band evaluated is added to the set of bands.
In order to determine the threshold value, in this paper, the independent bands are extracted by defining a range (e.g., from 0.5 to five degrees) and a step (e.g., 0.15 degrees) for the variation of this threshold.Then, an image is reconstructed from the selected bands.The root mean square error (RMSE) of discrepancies between the estimated fractional abundances from this image and the ground truth map is computed.Finally, the threshold value that leads to the minimum RMSE is selected as the optimal threshold value.However, the obtained bands using this threshold value are selected as the optimal bands.
The ground truth map is not available in most applications of spectral unmixing.In these cases, the map of the extracted pure pixels from the image in Section 3.1 of the proposed method could be used as the ground truth map.In other words, similar to the case that the ground truth map is available, the proper threshold value is determined by evaluating the RMSE obtained from the estimation of the fractional abundances at the position of the image's pure pixels.

Spectral Unmixing
Finally, in order to evaluate the fractional abundances, a valid and unique method is needed for comparing the performance of the selected bands using the different methods studied.In this study, the least squares method has been employed to solve the inverse problem.According to [33], which studied the different methods of the estimation of the fractional abundances, the fully constrained least squares (FCLS) is introduced as a proper method in this regard.In order to accurately estimate the fractional abundances of the endmembers, two constraints-namely, the abundance sum-to-one constraint (ASC) and the abundance non-negativity constraint (ANC)-have been applied to the linear mixture model.In order to use the FCLS method, the constituents of the imaging scene should be fully known.This issue is considered using the information obtained from the ground truth map or the endmembers extracted from the image in the supervised and unsupervised manners, respectively.

Results and Discussion
In this section, the performance of the proposed ISI-PS algorithm is evaluated using the both simulated and real hyperspectral datasets.In the first subsection, several simulated hyperspectral images were employed, which have been produced using some spectra of the USGS spectral library and different scenarios.The constituent spectra of imagery and their fractional abundances were exactly known for these datasets.Then, the effect of selecting the optimal bands using the proposed ISI-PS algorithm to reduce the SV and endmembers' correlation was evaluated on the AVIRIS hyperspectral images.Finally, the results of the proposed method were compared with the results of the SZU method [13], which only dealt with the SV, and the results of the MTD method [18], which only tried to select the independent bands in an unsupervised manner.

Simulated Dataset
In this research, several hyperspectral images with a dimension of 100 × 100 pixels and various spatial patterns were simulated using some spectra from the USGS spectral library.These spectra were selected from different combinations of minerals of hydro-thermal alteration zones in geological applications (Table 1).Several conditions were considered to simulate these images so that the resulting imagery reflected the real conditions as much as possible.In order to reconstruct the spatial patterns that appear in nature, the functions of the HYDRA software package [34] have been used, in which the two functions Legendre and Gaussian were employed to generate the fractional abundances.The Gaussian function could be performed using four different modes to generate the spatial patterns, and an example of each mode is illustrated in Figure 4. Several conditions were considered to simulate these images so that the resulting imagery reflected the real conditions as much as possible.In order to reconstruct the spatial patterns that appear in nature, the functions of the HYDRA software package [34] have been used, in which the two functions Legendre and Gaussian were employed to generate the fractional abundances.The Gaussian function could be performed using four different modes to generate the spatial patterns, and an example of each mode is illustrated in Figure 4.In this study, the effects of the endmembers' variability and the illumination fluctuation due to the topography on the spectrum of objects have been modeled according to [35].The SV was simply characterized by spectral shape invariance [36].In other words, while the spectral shapes of the endmembers were fairly consistent, their amplitudes varied considerably over the scene.Accordingly, the spectral variability of the ith endmember in each pixel can be modeled as Equation (3). where m is an endmember that is selected from the spectral library and 0 ≥ i ψ is a random factor that affects the spectral amplitude of this endmember equally in all bands.i ψ is the variable for each endmember in each class.This factor was generated by defining the range of the spectral variations (i.e., ( ) ) and using the normal distribution.In the experiments, a standard deviation (sd) of 0.05 was adopted for the spectral variations.i η is a random noise with a zero-mean, which was considered to model those variations that are not modeled by i ψ .
By substituting Equation (3) in the linear mixture model (Equation 1), an equation was achieved to model the spectral variability for each pixel (Equation 4).In this study, the effects of the endmembers' variability and the illumination fluctuation due to the topography on the spectrum of objects have been modeled according to [35].The SV was simply characterized by spectral shape invariance [36].In other words, while the spectral shapes of the endmembers were fairly consistent, their amplitudes varied considerably over the scene.Accordingly, the spectral variability of the ith endmember in each pixel can be modeled as Equation (3).
where m • i is an endmember that is selected from the spectral library and ψ i ≥ 0 is a random factor that affects the spectral amplitude of this endmember equally in all bands.ψ i is the variable for each endmember in each class.This factor was generated by defining the range of the spectral variations (i.e., (1 ± sd)) and using the normal distribution.In the experiments, a standard deviation (sd) of 0.05 was adopted for the spectral variations.η i is a random noise with a zero-mean, which was considered to model those variations that are not modeled by ψ i .By substituting Equation (3) in the linear mixture model (Equation 1), an equation was achieved to model the spectral variability for each pixel (Equation 4).
In order to apply these factors, a threshold was considered for the standard deviation of variations (sd).Then, the factors were randomly generated with a normal distribution in the range of 1 ± sd, which were multiplied by the amplitude of the original spectra.These factors were different for each pixel and properly model the effect of SV.The range of (0.05-0.15) was employed in experiments as the standard deviation of spectral variabilities.
The effect of the illumination fluctuation due to the topography was similar for all bands, and could therefore be considered as additive noise [35].In order to model this effect, the spectral features matrix was considered to be fixed, and the fraction of each pixel was multiplied by a factor (γ).These factors were randomly selected with a normal distribution in each pixel in the range of (0.95-1.05).In order to simulate the effects of the instrumental noises, Gaussian noise with the zero-mean and different ratios was added to the simulated scenes.Therefore, Equation ( 5) is the model that was considered in the experiments in this study, and still was linear.The illumination fluctuation, the SV, and the instrumental noise of images could be modeled using this equation.The flowchart of generating the simulated images is illustrated in Figure 5.
Remote Sens. 2017, 9, 884 11 of 29 where ( ) In order to apply these factors, a threshold was considered for the standard deviation of variations (sd).Then, the factors were randomly generated with a normal distribution in the range of sd ± 1 , which were multiplied by the amplitude of the original spectra.These factors were different for each pixel and properly model the effect of SV.The range of (0.05-0.15) was employed in experiments as the standard deviation of spectral variabilities.
The effect of the illumination fluctuation due to the topography was similar for all bands, and could therefore be considered as additive noise [35].In order to model this effect, the spectral features matrix was considered to be fixed, and the fraction of each pixel was multiplied by a factor (γ ).These factors were randomly selected with a normal distribution in each pixel in the range of (0.95-1.05).In order to simulate the effects of the instrumental noises, Gaussian noise with the zero-mean and different ratios was added to the simulated scenes.Therefore, Equation ( 5) is the model that was considered in the experiments in this study, and still was linear.The illumination fluctuation, the SV, and the instrumental noise of images could be modeled using this equation.The flowchart of generating the simulated images is illustrated in Figure 5.

LTRAS Dataset
The Russell Ranch Sustainable Agriculture Facility is a unique 300-acre facility near the UC Davis campus dedicated to investigating irrigated and dry-land agriculture in a Mediterranean climate.The goal of this facility-which is known as Long-Term Research in Agricultural Sustainability (LTRAS)-was used to investigate the impact of external factors such as crop rotation, farming systems (conventional, organic, and mixed), and the inputs of water, nitrogen, carbon, and other elements on agricultural sustainability [37].
Currently, the Century Experiment contains ten systems, which are two-year rotations and include corn/tomato, wheat/tomato, wheat/fallow, and wheat/legume rotations.Additionally, a perennial native grass system and a six-year alfalfa-corn-tomato rotation were initiated in 2012.The arrangement of each farm in this facility is illustrated in Figure 6a [38], along with the type of its irrigation and fertilization.

LTRAS Dataset
The Russell Ranch Sustainable Agriculture Facility is a unique 300-acre facility near the UC Davis campus dedicated to investigating irrigated and dry-land agriculture in a Mediterranean climate.The goal of this facility-which is known as Long-Term Research in Agricultural Sustainability (LTRAS)-was used to investigate the impact of external factors such as crop rotation, farming systems (conventional, organic, and mixed), and the inputs of water, nitrogen, carbon, and other elements on agricultural sustainability [37].
Currently, the Century Experiment contains ten systems, which are two-year rotations and include corn/tomato, wheat/tomato, wheat/fallow, and wheat/legume rotations.Additionally, a perennial native grass system and a six-year alfalfa-corn-tomato rotation were initiated in 2012.The arrangement of each farm in this facility is illustrated in Figure 6a [38], along with the type of its irrigation and fertilization.The hyperspectral data used in this study were captured by the AVIRIS sensor on 3 August 2013, from low altitude with a 3.2-m ground pixel size.The region of LTRAS comprises 199 lines by 217 samples, and was extracted from the original image that had been radiometrically and geometrically corrected.Its true and pseudo-color composites are illustrated in Figure 7.The dimension of each farm is approximately 64 × 62 m, for which regarding the date of imaging and the planting schedule in Figure 6a, its classes have been extracted as Figure 6b.Plots 4-5 and 8-9-which had been planted with corn-were excluded from the ground truth map due to harvesting; as were Plots 5-4, which had not been planted with tomato as per the schedule.Because of the imaging date, the plots of cover crop had no covers, and were therefore considered as bare earth.The hyperspectral data used in this study were captured by the AVIRIS sensor on 3 August 2013, from low altitude with a 3.2-m ground pixel size.The region of LTRAS comprises 199 lines by 217 samples, and was extracted from the original image that had been radiometrically and geometrically corrected.Its true and pseudo-color composites are illustrated in Figure 7.The dimension of each farm is approximately 64 × 62 m, for which regarding the date of imaging and the planting schedule in Figure 6a, its classes have been extracted as Figure 6b.Plots 4-5 and 8-9-which had been planted with corn-were excluded from the ground truth map due to harvesting; as were Plots 5-4, which had not been planted with tomato as per the schedule.Because of the imaging date, the plots of cover crop had no covers, and were therefore considered as bare earth.The impacts of different irrigation systems and fertilization, as well as the effect of crop rotation were considered in this facility.Therefore, this dataset was proper to evaluate the SV phenomena in vegetables.The SV of pixels for three classes of wheat, tomato, and corn are illustrated in Figure 8.

Salinas Dataset
This dataset was collected by the AVIRIS sensor on 9 October 1998, over Salinas Valley, Southern California, and is available as the at-sensor radiance unit.The scene comprises 512 lines by 217 samples, with 160 spectral bands (after discarding noisy and water absorption bands) in the wavelength range of 0.4-2.5 microns.Its nominal spectral and radiometric resolutions were 10 nanometers and 16 bits, respectively.This image was captured from low altitude with a 3.7-m ground pixel size.Its false color composite is illustrated in Figure 9a.The impacts of different irrigation systems and fertilization, as well as the effect of crop rotation were considered in this facility.Therefore, this dataset was proper to evaluate the SV phenomena in vegetables.The SV of pixels for three classes of wheat, tomato, and corn are illustrated in Figure 8.The impacts of different irrigation systems and fertilization, as well as the effect of crop rotation were considered in this facility.Therefore, this dataset was proper to evaluate the SV phenomena in vegetables.The SV of pixels for three classes of wheat, tomato, and corn are illustrated in Figure 8.

Salinas Dataset
This dataset was collected by the AVIRIS sensor on 9 October 1998, over Salinas Valley, Southern California, and is available as the at-sensor radiance unit.The scene comprises 512 lines by 217 samples, with 160 spectral bands (after discarding noisy and water absorption bands) in the wavelength range of 0.4-2.5 microns.Its nominal spectral and radiometric resolutions were 10 nanometers and 16 bits, respectively.This image was captured from low altitude with a 3.7-m ground pixel size.Its false color composite is illustrated in Figure 9a.

Salinas Dataset
This dataset was collected by the AVIRIS sensor on 9 October 1998, over Salinas Valley, Southern California, and is available as the at-sensor radiance unit.The scene comprises 512 lines by 217 samples, with 160 spectral bands (after discarding noisy and water absorption bands) in the wavelength range of 0.4-2.5 microns.Its nominal spectral and radiometric resolutions were 10 nanometers and 16 bits, respectively.This image was captured from low altitude with a 3.7-m ground pixel size.Its false color composite is illustrated in Figure 9a.This dataset was collected from an agricultural region, and its ground truth map has been gathered into 15 classes, as illustrated in Figure 9b.It includes vegetables, bare soils, and vineyard fields with sub-categories as follows.The sub-categories of broccoli and green weeds were distinguished, with one having smaller and fewer weeds, while two had taller and more weeds, with both categories mostly covering the soil.The romaine lettuce sub-classes have been defined based on the planting week and their growth rates, which have different covers on the soil.
The soil was categorized into three sub-classes: the fallow rough plow class had recently been turned with larger clumps and appeared to have more moisture, while the fallow class was plowed soil with smaller clumps, and the fallow smooth class had even smaller clumps.The stubble class comprised bare soil and straw, and could also be considered a sub-class of the soil group.In the vineyard group, the untrained vineyard and the untrained grapes sub-classes were actually similar to each other.In the untrained vineyard sub-class, vine had been grown on wooden and plastic posts, and their canopies had nearly covered the soil.The situation of the selected classes at the time of imaging is shown in Figure 10.This dataset was collected from an agricultural region, and its ground truth map has been gathered into 15 classes, as illustrated in Figure 9b.It includes vegetables, bare soils, and vineyard fields with sub-categories as follows.The sub-categories of broccoli and green weeds were distinguished, with one having smaller and fewer weeds, while two had taller and more weeds, with both categories mostly covering the soil.The romaine lettuce sub-classes have been defined based on the planting week and their growth rates, which have different covers on the soil.
The soil was categorized into three sub-classes: the fallow rough plow class had recently been turned with larger clumps and appeared to have more moisture, while the fallow class was plowed soil with smaller clumps, and the fallow smooth class had even smaller clumps.The stubble class comprised bare soil and straw, and could also be considered a sub-class of the soil group.In the vineyard group, the untrained vineyard and the untrained grapes sub-classes were actually similar to each other.In the untrained vineyard sub-class, vine had been grown on wooden and plastic posts, and their canopies had nearly covered the soil.The situation of the selected classes at the time of imaging is shown in Figure 10.This dataset was collected from an agricultural region, and its ground truth map has been gathered into 15 classes, as illustrated in Figure 9b.It includes vegetables, bare soils, and vineyard fields with sub-categories as follows.The sub-categories of broccoli and green weeds were distinguished, with one having smaller and fewer weeds, while two had taller and more weeds, with both categories mostly covering the soil.The romaine lettuce sub-classes have been defined based on the planting week and their growth rates, which have different covers on the soil.
The soil was categorized into three sub-classes: the fallow rough plow class had recently been turned with larger clumps and appeared to have more moisture, while the fallow class was plowed soil with smaller clumps, and the fallow smooth class had even smaller clumps.The stubble class comprised bare soil and straw, and could also be considered a sub-class of the soil group.In the vineyard group, the untrained vineyard and the untrained grapes sub-classes were actually similar to each other.In the untrained vineyard sub-class, vine had been grown on wooden and plastic posts, and their canopies had nearly covered the soil.The situation of the selected classes at the time of imaging is shown in Figure 10.

Indiana Indian Pines dataset
The third dataset used in the experiments was collected by the AVIRIS sensor over the Indian Pines Test Site in Northwestern Indiana in 1992.This image has a size of 145 × 145 pixels, and was acquired over a mixed agricultural/forest area, early in the growing season.The spatial resolution is approximately 20 m, and the radiometric resolution is 10 bits.The image comprised 220 spectral channels in the wavelength range from 0.4-2.5 micrometers, nominal spectral resolution of 10 nanometers.Bands 1-2, 100-114, 147-167, and 216-220 were removed from the dataset due to the noise and the water absorption phenomena, leaving a total of 177 radiance channels to be used in the experiments.This scene contained two-thirds agriculture and one-third forest or other natural perennial vegetation.For illustrative purposes, Figure 11a shows a false color composition of the AVIRIS Indian Pines scene, while Figure 11b shows the ground truth map available for the scene, with 16 classes.In our experiments, we considered a real situation in which most of the similar classes were included in the evaluations.Hence, 12 classes with an adequate number of labeled samples were selected for the experiments.

Indiana Indian Pines dataset
The third dataset used in the experiments was collected by the AVIRIS sensor over the Indian Pines Test Site in Northwestern Indiana in 1992.This image has a size of 145 × 145 pixels, and was acquired over a mixed agricultural/forest area, early in the growing season.The spatial resolution is approximately 20 m, and the radiometric resolution is 10 bits.The image comprised 220 spectral channels in the wavelength range from 0.4-2.5 micrometers, nominal spectral resolution of 10 nanometers.Bands 1-2, 100-114, 147-167, and 216-220 were removed from the dataset due to the noise and the water absorption phenomena, leaving a total of 177 radiance channels to be used in the experiments.This scene contained two-thirds agriculture and one-third forest or other natural perennial vegetation.For illustrative purposes, Figure 11a shows a false color composition of the AVIRIS Indian Pines scene, while Figure 11b shows the ground truth map available for the scene, with 16 classes.In our experiments, we considered a real situation in which most of the similar classes were included in the evaluations.Hence, 12 classes with an adequate number of labeled samples were selected for the experiments.

Experiments on the Simulated Dataset
The proposed ISI-PS algorithm was firstly performed on the simulated dataset, which was generated using the elements of Table 1.This data contained five datasets with different numbers and types of endmembers.In order to generate the fractional abundance for each dataset, a different pattern was employed according to Table 1.The main objective of this experiment was to evaluate the performance of the proposed method to deal with the endmembers' SV and decreasing the correlation of endmembers by selecting the optimal bands to generate accurate fractional abundances.Besides, the quality of endmembers' sets-which had been extracted in an unsupervised manner according to Figure 1-were compared with a spectral library that existed for the ground truth maps.
Several scenarios have been designed in order to evaluate the performance of the proposed method.In addition to the variety of spectral features and spatial patterns, different signal-to-noise ratios (i.e., 20:1, 25:1, and 30:1) have been employed to generate the simulated images.Equation ( 5) was used directly to simulate the first four datasets.In the case of the last dataset, in addition to Equation ( 5), two spectra from different species of one material were used to complicate the SV condition (Figure 12).
As was mentioned in the fifth step of the proposed method, a threshold value (T) was employed to identify the independent bands in the PS.In other words, if the angle of the candidate band from all previously selected bands was greater than the threshold value, this band had a distinct behavior dealing with other selected bands.

Experiments on the Simulated Dataset
The proposed ISI-PS algorithm was firstly performed on the simulated dataset, which was generated using the elements of Table 1.This data contained five datasets with different numbers and types of endmembers.In order to generate the fractional abundance for each dataset, a different pattern was employed according to Table 1.The main objective of this experiment was to evaluate the performance of the proposed method to deal with the endmembers' SV and decreasing the correlation of endmembers by selecting the optimal bands to generate accurate fractional abundances.Besides, the quality of endmembers' sets-which had been extracted in an unsupervised manner according to Figure 1-were compared with a spectral library that existed for the ground truth maps.
Several scenarios have been designed in order to evaluate the performance of the proposed method.In addition to the variety of spectral features and spatial patterns, different signal-to-noise ratios (i.e., 20:1, 25:1, and 30:1) have been employed to generate the simulated images.Equation ( 5) was used directly to simulate the first four datasets.In the case of the last dataset, in addition to Equation ( 5), two spectra from different species of one material were used to complicate the SV condition (Figure 12).The value of this threshold (T) was affected by the number, spectral similarity, and variety of endmembers in the scene.However, if a precise ground truth map were available, the root mean square error (RMSE) resulted from the comparison of this map, and the estimated fractional abundance could be used to properly evaluate this threshold value.In this regard, by increasing the threshold value in the range of 0.5-5 degrees with an increment of 0.25 degrees and evaluating the precision of the resulted fractional abundances, the threshold that led to the minimum error was selected.The results of different scenarios are provided in Table 2, along with the number of selected bands in each method.
If there were no in situ information for establishing a spectral library of the endmembers' variability, this information could be directly extracted from the image according to the early stages of the proposed algorithm.In this process-which was developed by applying some revisions to the SSPP algorithm-the virtual dimensionality (VD) of data was firstly estimated using the signal subspace identification algorithms (e.g., Hysime [28]) to be used for spectral dimension reduction by employing PCA or MNF transformations.Thereafter, the data used were also reduced in the spatial domain using spectral purity indices such as the PPI, and consequently, the number of candidate pure pixels was decreased.Using the mean spectrum of spectral clusters as the indicator of those classes without eliminating impure pixels led to the mean spectrum being affected by these pixels.In this case, the spectral changes among the clusters' pixels were not only due to the SV phenomena.By clustering those pixels that were probably pure, besides the tending of the class' mean towards the purity, the separability of classes would be properly shown.
Some other factors that challenged the performance of endmember extraction algorithms were sensitivity to noise, unusual pixels resulting from inaccurate atmospheric correction, and the image's hot spots.Therefore, extracting the homogeneous area of the image as a possible location for pure pixels could help to reduce the impact of these annoying phenomena.However, this could destroy the information of anomaly classes.
In order to identify the homogenous regions of the image used, a Gaussian filter with the functional form of ( ) was firstly applied to each band of the hyperspectral image (Equation 6).In this filter, the parameter σ controls the amount of spatial smoothing.

( ) ( ) ( )
, , , As was mentioned in the fifth step of the proposed method, a threshold value (T) was employed to identify the independent bands in the PS.In other words, if the angle of the candidate band from all previously selected bands was greater than the threshold value, this band had a distinct behavior dealing with other selected bands.
The value of this threshold (T) was affected by the number, spectral similarity, and variety of endmembers in the scene.However, if a precise ground truth map were available, the root mean square error (RMSE) resulted from the comparison of this map, and the estimated fractional abundance could be used to properly evaluate this threshold value.In this regard, by increasing the threshold value in the range of 0.5-5 degrees with an increment of 0.25 degrees and evaluating the precision of the resulted fractional abundances, the threshold that led to the minimum error was selected.The results of different scenarios are provided in Table 2, along with the number of selected bands in each method.
If there were no in situ information for establishing a spectral library of the endmembers' variability, this information could be directly extracted from the image according to the early stages of the proposed algorithm.In this process-which was developed by applying some revisions to the SSPP algorithm-the virtual dimensionality (VD) of data was firstly estimated using the signal subspace identification algorithms (e.g., Hysime [28]) to be used for spectral dimension reduction by employing PCA or MNF transformations.Thereafter, the data used were also reduced in the spatial domain using spectral purity indices such as the PPI, and consequently, the number of candidate pure pixels was decreased.Using the mean spectrum of spectral clusters as the indicator of those classes without eliminating impure pixels led to the mean spectrum being affected by these pixels.In this case, the spectral changes among the clusters' pixels were not only due to the SV phenomena.By clustering those pixels that were probably pure, besides the tending of the class' mean towards the purity, the separability of classes would be properly shown.
Some other factors that challenged the performance of endmember extraction algorithms were sensitivity to noise, unusual pixels resulting from inaccurate atmospheric correction, and the image's hot spots.Therefore, extracting the homogeneous area of the image as a possible location for pure pixels could help to reduce the impact of these annoying phenomena.However, this could destroy the information of anomaly classes.
In order to identify the homogenous regions of the image used, a Gaussian filter with the functional form of g(i , j ) = 1 2πσ 2 e was firstly applied to each band of the hyperspectral image (Equation ( 6)).In this filter, the parameter σ controls the amount of spatial smoothing.
where X F b (i, j) is the value of the pixel (i,j) in the band b of the filtered image.The value of a = (w − 1)/2 was determined regarding the filter dimension g w×w , and showed the spatial index of the filter.In order to generate the spatial homogeneity index (SHI), the RMSE of the discrepancies of the original image X and the filtered imaged X F was computed using Equation ( 7) [27].
In this way, a layer was obtained where the value of the pixel (i,j) indicates the homogeneity of that region of the image.The lower the pixel value, the more homogeneous that region will be.
Finally, pixels that were located in each cluster were sorted based on the spectral purity index, and the purest ones that were located in homogenous areas were selected as the indicators of those clusters.In the experiments, 20 percent of the pixels of each cluster and 30 percent of the homogenous pixels with the best scores were employed.The results obtained from the subsections of this process are illustrated in Figure 13 for the first dataset.is the value of the pixel (i,j) in the band b of the filtered image.The value of ( ) was determined regarding the filter dimension w w g × , and showed the spatial index of the filter.In order to generate the spatial homogeneity index (SHI), the RMSE of the discrepancies of the original image X and the filtered imaged F X was computed using Equation ( 7) [27]. ( In this way, a layer was obtained where the value of the pixel (i,j) indicates the homogeneity of that region of the image.The lower the pixel value, the more homogeneous that region will be.
Finally, pixels that were located in each cluster were sorted based on the spectral purity index, and the purest ones that were located in homogenous areas were selected as the indicators of those clusters.In the experiments, 20 percent of the pixels of each cluster and 30 percent of the homogenous pixels with the best scores were employed.The results obtained from the subsections of this process are illustrated in Figure 13 for the first dataset.The results obtained from the SU of the five simulated datasets are provided in Table 3.It is worth mentioning that the endmembers' sets were directly extracted from the image in these experiments.For comparison purposes, the data used in these experiments were quite similar to those that were used in the previous supervised experiments.The results obtained from the SU of the five simulated datasets are provided in Table 3.It is worth mentioning that the endmembers' sets were directly extracted from the image in these experiments.For comparison purposes, the data used in these experiments were quite similar to those that were used in the previous supervised experiments.
Table 2.The accuracy assessment results of the selected bands by maximum tangent discrimination (MTD), stable zone unmixing (SZU), and the proposed instability index (ISI)-prototype space (PS) algorithms, in comparison with using all bands to estimate the fractional abundances through a supervised spectral unmixing on the simulated datasets.According to Tables 2 and 3, the proposed ISI-PS algorithm always provided proper accuracy in the estimation of fractional abundances of endmembers by selecting the optimal bands.Moreover, the results obtained from the supervised and unsupervised experiments had good agreement with each other.Therefore, the spectral library of endmembers' variability was correctly extracted.The accuracies of the fractional abundances obtained from the MTD and ISI-PS methods were compatible.However, by comparing the position of selected bands in these two methods, it is obvious that in the MTD algorithm, the SV of endmembers was neglected, and the most separable bands were selected only by considering the spectral feature of each class.When the SV of the endmembers' sets was not to the extent that the spectrum of classes highly conflicted in the overlapping regions, the results of the two methods were close to each other.However, if the SV disrupted the separability of classes, the proposed ISI-PS method led to more accurate results by selecting the bands with the minimum spectral conflicts.

Experiments on Real Datasets
In order to evaluate the effect of SV on the results of unmixing, the ground truth map of the data used was needed, as well as the spectral library of the intra-class variations of each endmember.Providing the sub-pixel fractional abundance of the image's components was practically impossible, and collecting a spectral library from the variation of each component was an expensive and time-consuming process.However, in the case of the LTRAS dataset, for which only one crop was planted in each farm, the fraction of the components could be supposed to be 100 percent in the related plot; and observed spectral variations among the pixels of each plot could be seen as their SV due to the factors mentioned in Section 4.1.2.
However, it is worth mentioning that the change of the ratio of plants and background soil in each farm will lead to the change of the received spectra.This type of change was not considered as spectral variability.In this study, regarding the rather limited area and homogeneity of the farms, it was assumed that the mixture of materials was accrued with a constant ratio.
The extraction of the statistics of each class was performed using five percent of its pixels, which were selected randomly.This subset was used as the training dataset, and the remaining part was employed as the test dataset.Regarding the intra-class variations, the spectral features of the training pixels were employed for establishing a spectral library for each class.The mean spectrum of these sets was then used as the spectral indicator of the related classes.
As previously mentioned, the correlation of bands was not considered in the prioritization process of the SZU method.The selected bands using this method are illustrated by green bars in Figure 14.As can be seen, several bands were selected in the vicinity of each other in the event that the behavior of the endmembers was close together in these bands.In other words, if this redundancy could be reduced by a meaningful selection of the optimal bands in each region, more accurate and computationally-effective results could be achieved.
In the proposed ISI-PS algorithm, the angle of bands was used in the PS to deal with the bands' correlation.The selection of a proper threshold (T) to eliminate the correlated bands was affected by the number, spectral similarity, and variety of endmembers in the scene, as well as the number of resulting bands.If a ground truth map were available, the proper angle to eliminate the redundant bands could be estimated using the estimation accuracy of the ground truth map.In this experiment, regarding the homogeneity of farms, a ground truth map was generated using the map shown in Figure 6b.In this regard, the fraction of each endmember in the related class was considered as one, and for other classes, the fractions were considered to be zero.The fraction of classes in each pixel was then estimated involving the selected bands in each method, and their box plots are illustrated in Figure 15.In this case, the closer the resulting fraction to one and the less the standard deviation of fractions, the better the performance of the algorithm for dealing with the intra-class variations.It is worth mentioning that the most SV had occurred in the corn, wheat, and tomato classes.As can be seen, in comparison with the MTD method, the proposed ISI-PS algorithm tried to reduce the SV in these classes by decreasing the median and the standard deviation of the estimated fractions.However, in the other classes, the performances of the studied methods were close to each other.In this experiment, the threshold value of the correlated bands was considered as 1.25 degrees.Figure 16a shows the RMSE of the fractional abundance estimation of the LTRAS classes using the bands obtained from the MTD, the SZU, and the proposed ISI-PS algorithms, for threshold values of 0.25-5 degrees by an increment of 0.2 degrees.Index 6 was equivalent to the threshold value of 1.25 degrees, and caused the selection of 46 bands from the original 170 bands.The selected bands using the MTD, the SZU, and the ISI-PS algorithms are illustrated in Figure 14 using red, green, and dashed blue lines, respectively.It is worth mentioning that the proposed ISI-PS algorithm provided the most accurate fractions.
The singularity of the coefficient matrix, which was generated using the selected bands, was evaluated using: (1) the condition number of the endmembers matrix and (2) the average correlation of the endmembers' correlation matrix.By increasing the number of selected bands using each method, these two measures are illustrated in Figure 17, as well as for the full dimension of the data (i.e., all bands).In this experiment, the threshold value of the correlated bands was considered as 1.25 degrees.Figure 16a shows the RMSE of the fractional abundance estimation of the LTRAS classes using the bands obtained from the MTD, the SZU, and the proposed ISI-PS algorithms, for threshold values of 0.25-5 degrees by an increment of 0.2 degrees.Index 6 was equivalent to the threshold value of 1.25 degrees, and caused the selection of 46 bands from the original 170 bands.The selected bands using the MTD, the SZU, and the ISI-PS algorithms are illustrated in Figure 14 using red, green, and dashed blue lines, respectively.It is worth mentioning that the proposed ISI-PS algorithm provided the most accurate fractions.
The singularity of the coefficient matrix, which was generated using the selected bands, was evaluated using: (1) the condition number of the endmembers matrix and (2) the average correlation of the endmembers' correlation matrix.By increasing the number of selected bands using each method, these two measures are illustrated in Figure 17, as well as for the full dimension of the data (i.e., all bands).In this experiment, the threshold value of the correlated bands was considered as 1.25 degrees.Figure 16a shows the RMSE of the fractional abundance estimation of the LTRAS classes using the bands obtained from the MTD, the SZU, and the proposed ISI-PS algorithms, for threshold values of 0.25-5 degrees by an increment of 0.2 degrees.Index 6 was equivalent to the threshold value of 1.25 degrees, and caused the selection of 46 bands from the original 170 bands.The selected bands using the MTD, the SZU, and the ISI-PS algorithms are illustrated in Figure 14 using red, green, and dashed blue lines, respectively.It is worth mentioning that the proposed ISI-PS algorithm provided the most accurate fractions.
The singularity of the coefficient matrix, which was generated using the selected bands, was evaluated using: (1) the condition number of the endmembers matrix and (2) the average correlation of the endmembers' correlation matrix.By increasing the number of selected bands using each method, these two measures are illustrated in Figure 17, as well as for the full dimension of the data (i.e., all bands).Finally, to evaluate the role of the selected bands on the results, the original image was reconstructed using the same set of endmembers and fractional abundances obtained from each method.In other words, by considering the accuracy of endmembers and applying the LMM, each pixel (y) of the original image could be approximated using where p is the number of endmembers, i α is the estimated fractions for each endmember, and i m is the ith endmember.
Accordingly, the original and the reconstructed images using the LMM could be considered as , respectively, where N is the number of pixels.The reconstruction accuracy could be estimated using Equation (8).The average reconstruction error of each method is illustrated in Figure 16b, and the obtained results from the SU process are provided in Table 4.

( )
[ ] Finally, to evaluate the role of the selected bands on the results, the original image was reconstructed using the same set of endmembers and fractional abundances obtained from each method.In other words, by considering the accuracy of endmembers and applying the LMM, each pixel (y) of the original image could be approximated using where p is the number of endmembers, i α is the estimated fractions for each endmember, and i m is the ith endmember.
Accordingly, the original and the reconstructed images using the LMM could be considered as , respectively, where N is the number of pixels.The reconstruction accuracy could be estimated using Equation (8).The average reconstruction error of each method is illustrated in Figure 16b, and the obtained results from the SU process are provided in Table 4.

( )
[ ] Finally, to evaluate the role of the selected bands on the results, the original image was reconstructed using the same set of endmembers and fractional abundances obtained from each method.In other words, by considering the accuracy of endmembers and applying the LMM, each pixel (y) of the original image could be approximated using ŷ = ∑ p i=1 α i m i , where p is the number of endmembers, α i is the estimated fractions for each endmember, and m i is the ith endmember.Accordingly, the original and the reconstructed images using the LMM could be considered as I = (y k ) N k=1 and R = ( ŷk ) N k=1 , respectively, where N is the number of pixels.The reconstruction accuracy could be estimated using Equation (8).The average reconstruction error of each method is illustrated in Figure 16b, and the obtained results from the SU process are provided in Table 4.In the unsupervised manner, the dimension of the signal subspace was firstly estimated as nine using the HySime algorithm.Then, according to the proposed method, a spectral library was established from the spectral variability of each class, regardless of the ground truth map.In order to evaluate the spectral similarity of the endmembers obtained with the spectrum of each class from the ground truth map, the average spectra of the two sets was compared with the spectral information divergence (SID) [39] similarity measure.The results obtained are provided in Table 5.The lower the value of SID, the more similar the two spectra will be.According to the results, the corn class was split into three sub-classes.However, the spectrum of the other classes was estimated properly.
The estimated fractional abundances using the 61 selected bands by the proposed ISI-PS algorithm (Table 4) and the FCLS method are illustrated in Figure 18.As can been seen, the corn class was split into three sub-classes, as in Figure 18a,h,i.The fractional abundances of the bare soil and the wheat classes (i.e., Figure 18c,d, respectively) were partially overlaid, which was due the harvesting of wheat and the appearance of the background soil of the farms.It is worth mentioning that this issue had mostly occurred in farms with a lesser vegetation density due to the type of irrigation and fertilization.This could be obviously understood by the comparison of the obtained results and Figure 7.
Remote Sens. 2017, 9, 884 23 of 28 In the unsupervised manner, the dimension of the signal subspace was firstly estimated as nine using the HySime algorithm.Then, according to the proposed method, a spectral library was established from the spectral variability of each class, regardless of the ground truth map.In order to evaluate the spectral similarity of the endmembers obtained with the spectrum of each class from the ground truth map, the average spectra of the two sets was compared with the spectral information divergence (SID) [39] similarity measure.The results obtained are provided in Table 5.The lower the value of SID, the more similar the two spectra will be.According to the results, the corn class was split into three sub-classes.However, the spectrum of the other classes was estimated properly.
The estimated fractional abundances using the 61 selected bands by the proposed ISI-PS algorithm (Table 4) and the FCLS method are illustrated in Figure 18.As can been seen, the corn class was split into three sub-classes, as in Figure 18a,h,i.The fractional abundances of the bare soil and the wheat classes (i.e., Figure 18c,d, respectively) were partially overlaid, which was due the harvesting of wheat and the appearance of the background soil of the farms.It is worth mentioning that this issue had mostly occurred in farms with a lesser vegetation density due to the type of irrigation and fertilization.This could be obviously understood by the comparison of the obtained results and Figure 7.All of the pre-mentioned process for the LTRAS dataset was performed on the Salinas and Indiana Indian Pines datasets as well (Table 4).
In this table, in order to evaluate the accuracy of the proposed method when no in situ data were available, the results have been provided for the supervised and the unsupervised manners.In the unsupervised approach, the endmember sets were extracted from the image without any prior knowledge.In this regard, using the position of the pure pixels obtained, a fractional map with a 100 percent abundance was first generated for each endmember to compute the threshold value.Then, the fractional abundances of the endmembers were estimated using the extracted endmembers and the different bands obtained from different threshold values.Finally, the threshold value that led to the minimum RMSE of the estimation of the fractional abundances of endmembers was selected as the optimal threshold value.
In order to compare the results obtained from the supervised and the unsupervised approaches, the estimated fractional maps using the FCLS method over the Salinas dataset are illustrated in Figures 19 and 20.As can be seen, the results of the unsupervised approach were compatible with the supervised approach, and the proposed method was able to separate the similar spectral classes.However, due to the similar spectral behaviors in the endmember extraction step, the two classes grapes_untrained and vineyard_untrained were considered as unique classes.
In this section, a comparison is made between the computational times of different methods and is reported in Table 6.All of the methods were executed on a PC with an i7 5820k CPU and 32 GB of RAM.As can be seen, ISI-PS showed disadvantages from the computational time point of view.This was mainly due to the exhaustive search, which was applied to locate an optimum value for threshold T (see Section 3.4).We had chosen a rather vast domain of T for the sake of a richer evaluation.However, the processing time of ISI-PS can be highly improved by a more exact estimation of the search domain for T. In addition, more advanced search strategies-instead of the exhaustive search applied herein-can be of great help to mitigate the computational costs of ISI-PS, which is suggested for further study.

Conclusions
Band selection has been always a challenge in the processing of high-dimensional hyperspectral data.In this paper, a novel method was presented to select a subset of bands that led especially to improving the results of spectral unmixing.The proposed method-named ISI-PS-integrates two measures of band selection.Firstly, it is aimed at managing the spectral variability.To do so, the bands were prioritized in a way so as to have the least inter-class variability while at the same time achieving the highest possible between-class separation.On the other hand, the second phase takes into account the bands' dependency and makes an effort to detect and remove highly correlated bands.This phase was performed in the Prototype Space, which was formed by image endmembers.In the Prototype Space-in which the bands were treated as the space points-bands' dependencies were examined via their inter-angles.
As mentioned above, the second phase of the proposed method required the knowledge of image endmembers, which is itself a challenge in hyperspectral image processing.In this paper, as with the other contribution, an unsupervised automatic technique was proposed that can effectively extract the endmembers from the image itself and that needed no more input knowledge.
The proposed method was examined and validated on a variety of simulated and real datasets.To do so, the selected bands were used in the spectral unmixing, and the RMSE of the obtained fractional abundances was considered as the accuracy measure.The obtained results were all compatible with the in-situ observations and confirmed the effectiveness of the proposed method.In addition, the performance of the proposed method was compared with the SZU and the MTD algorithms, which proved the superiority of the proposed method.

Figure 1 .
Figure 1.Flowchart of the proposed method.

Figure 1 .
Figure 1.Flowchart of the proposed method.

Figure 2 .
Figure 2. The spectrum of alunite and kaolinite minerals and the highly-correlated regions among the adjacent bands.

Figure 2 .
Figure 2. The spectrum of alunite and kaolinite minerals and the highly-correlated regions among the adjacent bands.

Figure 3 .
Figure 3.The Prototype Space (PS) constructed using the two endmembers alunite and kaolinite.

Figure 3 .
Figure 3.The Prototype Space (PS) constructed using the two endmembers alunite and kaolinite.

Figure 4 .
Figure 4.An example of the fractional abundances that were generated using: (a) Spherical Gaussian fields; (b) Exponential Gaussian fields; (c) Rational Gaussian fields; and (d) Mattern Gaussian fields functions.

Figure 4 .
Figure 4.An example of the fractional abundances that were generated using: (a) Spherical Gaussian fields; (b) Exponential Gaussian fields; (c) Rational Gaussian fields; and (d) Mattern Gaussian fields functions.

Figure 5 .
Figure 5. Flowchart of generating the simulated images.

Figure 5 .
Figure 5. Flowchart of generating the simulated images.

Figure 7 .
Figure 7. (a) True color composite; (b) False color composite; and (c) Color composite of the first three components of the maximum noise fraction (MNF) transformation.

Figure 8 .
Figure 8.The spectral variability of pixels for three classes of: (a) Wheat; (b) Tomato; and (c) Corn.

Figure 7 .
Figure 7. (a) True color composite; (b) False color composite; and (c) Color composite of the first three components of the maximum noise fraction (MNF) transformation.

Figure 7 .
Figure 7. (a) True color composite; (b) False color composite; and (c) Color composite of the first three components of the maximum noise fraction (MNF) transformation.

Figure 8 .
Figure 8.The spectral variability of pixels for three classes of: (a) Wheat; (b) Tomato; and (c) Corn.

Figure 8 .
Figure 8.The spectral variability of pixels for three classes of: (a) Wheat; (b) Tomato; and (c) Corn.

Figure 9 .
Figure 9. Salinas dataset: (a) False color composite of the AVIRIS image; (b) Ground truth map.

Figure 10 .
Figure 10.Photograph of the selected classes in the region of the imaging.

Figure 9 .
Figure 9. Salinas dataset: (a) False color composite of the AVIRIS image; (b) Ground truth map.

Figure 9 .
Figure 9. Salinas dataset: (a) False color composite of the AVIRIS image; (b) Ground truth map.

Figure 10 .
Figure 10.Photograph of the selected classes in the region of the imaging.

Figure 10 .
Figure 10.Photograph of the selected classes in the region of the imaging.

Figure 11 .
Figure 11.Indiana Indian Pines dataset: (a) False color composite of the AVIRIS image; (b) Ground truth map.

Figure 11 .
Figure 11.Indiana Indian Pines dataset: (a) False color composite of the AVIRIS image; (b) Ground truth map.

Figure 12 .
Figure 12.Endmembers used to simulate the last dataset.

Figure 12 .
Figure 12.Endmembers used to simulate the last dataset.

Figure 13 .
Figure 13.The results obtained from the subsections of the process of extraction of the spectral library of each class from the image: (a) Output of the pixel purity index (PPI) for greater than zero pixels; (b) The result of band filtering; (c) The clustering of PPI's output; (d) Map of the homogeneity scores of pixels; (e) 20 percent of pixels of each cluster with the maximum value of PPI; (f) 30 percent of pixels of each cluster with the best homogeneity; (g) Fusion of maps (e,f) to extract the final pure pixels; (h) Ground truth map of the image's endmembers.

Figure 13 .
Figure 13.The results obtained from the subsections of the process of extraction of the spectral library of each class from the image: (a) Output of the pixel purity index (PPI) for greater than zero pixels; (b) The result of band filtering; (c) The clustering of PPI's output; (d) Map of the homogeneity scores of pixels; (e) 20 percent of pixels of each cluster with the maximum value of PPI; (f) 30 percent of pixels of each cluster with the best homogeneity; (g) Fusion of maps (e,f) to extract the final pure pixels; (h) Ground truth map of the image's endmembers.

Figure 14 .
Figure 14.Representation of the selected bands using the MTD, SZU, and the proposed ISI-PS methods.

Figure 15 .
Figure 15.The box plot of the resulting fractions for each class using the selected bands by the studied method, using: (1) all bands; (2) MTD bands; (3) SZU bands; and (4) ISI-PS bands.

Figure 14 . 29 Figure 14 .
Figure 14.Representation of the selected bands using the MTD, SZU, and the proposed ISI-PS methods.

Figure 15 .
Figure 15.The box plot of the resulting fractions for each class using the selected bands by the studied method, using: (1) all bands; (2) MTD bands; (3) SZU bands; and (4) ISI-PS bands.

Figure 15 .
Figure 15.The box plot of the resulting fractions for each class using the selected bands by the studied method, using: (1) all bands; (2) MTD bands; (3) SZU bands; and (4) ISI-PS bands.

Figure 16 .Figure 17 .
Figure 16.The RMSE of estimating: (a) The fractions; (b) The LTRAS image by means of the selected bands using the MTD, the SZU, and the ISI-PS algorithms for different threshold values.

Figure 16 .Figure 16 .Figure 17 .
Figure 16.The RMSE of estimating: (a) The fractions; (b) The LTRAS image by means of the selected bands using the MTD, the SZU, and the ISI-PS algorithms for different threshold values.

Figure 17 .
Figure 17.Plot of: (a) The condition number of the matrix; (b) The average correlation of the endmembers' correlation matrix for the different band selection methods by increasing the number of the selected bands.

Figure 19 .
Figure19.The fractional abundances estimated from the Salinas area using the fully constrained least squares (FCLS) method and the selected bands by the proposed ISI-PS algorithm in the supervised approach.Figure19.The fractional abundances estimated from the Salinas area using the fully constrained least squares (FCLS) method and the selected bands by the proposed ISI-PS algorithm in the supervised approach.

Figure 19 .
Figure19.The fractional abundances estimated from the Salinas area using the fully constrained least squares (FCLS) method and the selected bands by the proposed ISI-PS algorithm in the supervised approach.Figure19.The fractional abundances estimated from the Salinas area using the fully constrained least squares (FCLS) method and the selected bands by the proposed ISI-PS algorithm in the supervised approach.

Figure 20 .
Figure 20.The fractional abundances estimated from the Salinas area using the FCLS method and the selected bands by the proposed ISI-PS algorithm in the unsupervised approach.Figure20.The fractional abundances estimated from the Salinas area using the FCLS method and the selected bands by the proposed ISI-PS algorithm in the unsupervised approach.

Figure 20 .
Figure 20.The fractional abundances estimated from the Salinas area using the FCLS method and the selected bands by the proposed ISI-PS algorithm in the unsupervised approach.Figure20.The fractional abundances estimated from the Salinas area using the FCLS method and the selected bands by the proposed ISI-PS algorithm in the unsupervised approach.

Table 1 .
The constituent spectra of the simulated images.

Table 1 .
The constituent spectra of the simulated images.

Table 3 .
The accuracy assessment results of the selected bands by MTD, SZU, and the ISI-PS algorithms, in comparison with using all bands to estimate the fractional abundances through an unsupervised spectral unmixing on to the simulated datasets.S is the number the image's components; #F is the number of selected bands; SNR is the signal-to-noise ratio; Cond is the Condition number; Corr is the mean Correlation of endmembers; T is the angle threshold and bold number is the best result. #

Table 4 .
The accuracies obtained from the selected bands using the MTD, the SZU, the proposed ISI-PS methods, and the full dimension real hyperspectral datasets to estimate the fractional abundances.

Table 5 .
Similarity values among the extracted endmembers and the reference spectral signatures of each endmembers by spectral information divergence (SID).

Table 6 .
Computational times of unmixing using full bands and reduced bands from different band selection methods.