The Successive Projection Algorithm (SPA), an Algorithm with a Spatial Constraint for the Automatic Search of Endmembers in Hyperspectral Data.

Spectral mixing is a problem inherent to remote sensing data and results in few image pixel spectra representing "pure" targets. Linear spectral mixture analysis is designed to address this problem and it assumes that the pixel-to-pixel variability in a scene results from varying proportions of spectral endmembers. In this paper we present a different endmember-search algorithm called the Successive Projection Algorithm (SPA). SPA builds on convex geometry and orthogonal projection common to other endmember search algorithms by including a constraint on the spatial adjacency of endmember candidate pixels. Consequently it can reduce the susceptibility to outlier pixels and generates realistic endmembers. This is demonstrated using two case studies (AVIRIS Cuprite cube and Probe-1 imagery for Baffin Island) where image endmembers can be validated with ground truth data. The SPA algorithm extracts endmembers from hyperspectral data without having to reduce the data dimensionality. It uses the spectral angle (as in IEA) and the spatial adjacency of pixels in the image to constrain the selection of candidate pixels representing an endmember. We designed SPA based on the observation that many targets have spatial continuity (e.g. bedrock lithologies) in imagery and thus a spatial constraint would be beneficial in the endmember search. An additional product of the SPA is data describing the change of the simplex volume ratio between successive iterations during the endmember extraction. It illustrates the influence of a new endmember on the data structure, and provides information on the convergence of the algorithm. It can provide a general guideline to constrain the total number of endmembers in a search.


Introduction
Linear spectral mixture analysis (SMA) is based on the simple assumption that remotely sensed spectral measurements are mixed signatures that vary across the scene as the relative proportions of endmembers change. It is commonly used for the analysis of hyperspectral data [1][2][3][4][5][6][7][8], but to obtain accurate unmixing results the endmembers selected must be representative of surface components that occur in relatively pure form [9]. For this reason much literature has focused on the critical step of endmember extraction, with the aim of determining the "purest" spectral representation of materials present in the scene.
Spectral endmembers can be derived from the imagery (image endmembers) or from measurements in the laboratory or field (library endmembers). Library endmembers may not always be available, and if available, they are not necessarily acquired under the same conditions as airborne or satellite image data and may not be good representations of the image components. Thus there are advantages in being able to extract endmembers directly from imagery. The selection of image endmembers is typically achieved through the implicit (PPI, pixel purity index [10]) or explicit use of convex geometry [11]. A simplex is fit to the convex hull of the n-dimensional data cloud and the vertices of the simplex define the spectral properties of the endmembers. Based on this concept, a number of algorithms have been developed over the past decade to automatically find image endmembers. These include N-FINDR [12], iterative error analysis (IEA) [13], vertex component analysis (VCA) [14], MaxD (Maximum Distance) [15], Sequential Maximum Angle Convex Cone (SMACC) [16], iterated constrained endmembers (ICE) [17], simplex growing algorithm (SGA) [18], minimum volume constrained nonnegative matrix factorization (MVC-NMF) [19] and optical real-time adaptive spectral identification system (ORASIS) [20]. While these methods have proven effective under different situations, most use one pixel (the most extreme pixel) to represent one endmember, which results in the inclusion of outlier pixels (e.g. bad pixels) as endmembers. Indeed Howes et al. (2004) reported that convex-based endmember extraction methods were susceptible to outliers since even a single spurious pixel can significantly alter the endmember simplex [21]. Outliers may result, for example, from noise or atmospheric effects in the data.
In this paper we present a different endmember-search algorithm called the Successive Projection Algorithm (SPA). SPA builds on the convex geometry endmember search algorithms described above by including a constraint on the spatial adjacency of endmember candidate pixels, whereby this approach can reduce the susceptibility to outlier pixels and generates realistic endmembers. This is demonstrated using two case studies where image endmembers can be validated with ground truth data. The spatial constraint was introduced based on success we have had with the spatial-spectral endmember extraction algorithm (SSEE) that makes use of the spectral and spatial characteristics of image pixels during the search for image endmembers [22]. SSEE operates differently from SPA using a roving endmember search window that covers the entire input image and it was designed to find similar but distinct endmembers (e.g. spectrally similar but distinct rock units).
In Section 2, we present the concept of convex geometry and its relevance for endmember selection, followed by a summary of current convex-based endmember-search algorithms. In Section 3, we describe SPA and its functionality. Section 4 describes the characteristics of the two test datasets (AVIRIS Cuprite cube and Probe-1 imagery for Baffin Island) that are used to evaluate the SPA algorithm. The experimental results are presented in Section 5, followed by a discussion.

Spectral endmembers in convex geometry
Linear spectral mixture analysis (LSMA) assumes that the pixel-to-pixel variability in a scene results from varying abundances of spectral endmembers. It follows that the spectral response for each pixel is a linear combination of endmember spectra, weighted by their fractional abundances. Assuming that the number of endmembers and their spectral signatures are known, the fractional abundances of endmembers in a given pixel are typically determined from a least squares fit [23,24]. Let $p(i,j)$ denote the spectrum of the pixel at image coordinates (i, j). The foundation of LSMA can then be defined by the following formulation:

$$p(i,j) = \sum_{k=1}^{m} f_k(i,j)\, e_k + \varepsilon(i,j) \qquad (1)$$

$$\sum_{k=1}^{m} f_k(i,j) = 1, \qquad 0 \le f_k(i,j) \le 1 \qquad (2)$$

where m is the number of endmembers, $e_k$ is the kth endmember, $\varepsilon(i,j)$ is the approximation error term (residual), which could be due to noise in the data, to modeling error, or both, and $f_k(i,j)$ is the fractional abundance of the kth endmember in pixel (i, j).
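The linear mixing model and the least squares abundance estimate described above can be illustrated with a small numerical sketch. This is only an illustration under stated assumptions: the endmember spectra and abundances are made-up values, the function names `mix` and `unmix_least_squares` are hypothetical, and the fit is an unconstrained least squares solution (no sum-to-one or non-negativity constraint is enforced).

```python
import numpy as np

def mix(endmembers, abundances):
    """Noise-free linear mixture: each pixel is an abundance-weighted sum of endmembers."""
    return abundances @ endmembers

def unmix_least_squares(pixels, endmembers):
    """Unconstrained least squares abundance estimate.

    pixels:     (num_pixels, n_bands)
    endmembers: (m, n_bands)
    Returns an array of shape (num_pixels, m).
    """
    # Solve endmembers.T @ f = pixel for all pixels at once.
    solution, *_ = np.linalg.lstsq(endmembers.T, pixels.T, rcond=None)
    return solution.T

# Three hypothetical endmember spectra with four bands each.
E = np.array([[0.9, 0.8, 0.7, 0.6],
              [0.1, 0.2, 0.1, 0.2],
              [0.3, 0.6, 0.2, 0.5]])
# Hypothetical abundances for two pixels (each row sums to one).
true_f = np.array([[0.5, 0.3, 0.2],
                   [1.0, 0.0, 0.0]])

pixels = mix(E, true_f)
estimated_f = unmix_least_squares(pixels, E)
```

With noise-free data and linearly independent endmembers, the estimated abundances reproduce the values used to build the mixtures. With real data the residual term is non-zero and constrained solvers are usually preferred.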
Spectra can be represented as points in an n-dimensional scatter plot where n is the number of bands. If we omit the error term in (1), the possible linear mixtures computed from (1) and (2) form a simplex $C_m$ whose vertices are the endmembers [25]. According to Gritzmann and Klee (1994) [26], the volume of the simplex $C_m$ can be calculated from its vertices $[e_1, e_2, \ldots, e_m]$. Once the endmembers $(e_1, e_2, \ldots, e_m)$ are determined, their abundances can be estimated through the least squares method, which is equivalent to a projection onto the simplex [25].
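One standard way to compute the volume of a simplex directly in the n-dimensional band space, without reducing the data dimensionality, uses the Gram determinant of its edge vectors. The sketch below illustrates that approach; it is not necessarily the exact expression used in this paper, and the function name `simplex_volume` is hypothetical.

```python
import math
import numpy as np

def simplex_volume(vertices):
    """Volume of the simplex whose vertices are the rows of `vertices`.

    vertices: (m, n_bands) array with m <= n_bands + 1.
    Uses V = sqrt(det(W W^T)) / (m - 1)!, where the rows of W are the
    edge vectors from the first vertex to each of the others.
    """
    vertices = np.asarray(vertices, dtype=float)
    edges = vertices[1:] - vertices[0]
    gram = edges @ edges.T
    # Guard against a tiny negative determinant caused by round-off.
    det = max(float(np.linalg.det(gram)), 0.0)
    return math.sqrt(det) / math.factorial(len(vertices) - 1)

# A right triangle with unit legs, embedded in three dimensions.
triangle = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
area = simplex_volume(triangle)
```

Because the Gram matrix only has size (m - 1) by (m - 1), the calculation stays inexpensive even when the number of bands is large.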
Using this framework, if all data points (pixels) are examined in n-dimensional space, endmembers present in the scene should be found at the vertices of the simplex. The interior space of the simplex then represents feasible mixtures. Thus, the task of finding endmembers is actually the identification of the simplex vertices, which has been the foundation for the geometric interpretation of hyperspectral data and for endmember extraction algorithms based on convex geometry. The spectral endmembers are determined as the spectral points closest to the vertices of the simplex formed by the image data in n-dimensional space, and are thus, the spectrally purest points of the image data.
The set of endmembers determined from convex geometry has the following properties that are relevant to the SPA algorithm proposed in this paper:
Property 1: The endmembers represent the pixels that contain the largest data "volume" [12,26]. This property is used in the SPA algorithm to determine if it is converging.
Property 2: A vector (pixel) with maximum Euclidean norm (magnitude) must be located at one of the vertices of the simplex [25,27]. This is the main step used in SPA to identify pixels at the vertices of the simplex.
Property 3: For a given point in the simplex, the point at maximum distance from it must be a vertex of the simplex [27].
Property 4: The affine transformation (e.g. orthogonal projection) of a simplex is also a simplex, and endmembers are still located at the vertices of the new simplex after this transformation [14,26]. In SPA this allows the use of orthogonal subspace projections as the core mechanism for endmember extraction.

Endmember extraction algorithms based on convex geometry
Search algorithms based on convex geometry rely on the four properties listed above, but differ in their approach to locating the vertices of the simplex. Such methods include PPI, N-FINDR, IEA, VCA, MaxD, ORASIS, SMACC, ICE, MVC-NMF and SGA. N-FINDR finds the set of pixels that define the simplex with the maximum volume inscribed within the dataset. IEA uses a series of constrained unmixing operations and chooses as endmembers those pixels that minimize the residual error in the unmixing images. VCA and MaxD exploit the orthogonal projection approach to iteratively find the vertices of the simplex. SMACC is another algorithm for endmember extraction, which uses a convex cone model (also known as Residual Minimization) and constrained oblique projection to derive endmembers [16]. The patented ICE combines ideas from convex geometry and multivariate curve resolution techniques to find endmembers, trading off goodness-of-fit of the convex geometry model against the size of the simplex [17]. SGA finds a set of desired endmembers by growing a sequence of simplexes, improving on the commonly used N-FINDR algorithm. ORASIS performs the endmember selection using the learning vector quantization (LVQ) concept and a minimum volume transform (MVT). MVC-NMF integrates least squares analysis and the convex geometry model by incorporating a volume constraint into the nonnegative matrix factorization (NMF). With the exception of ICE, ORASIS and MVC-NMF, the methods mentioned above assume that vertices of the simplex (i.e. endmembers) can be represented by corresponding pixels in the scene. When no pixels in the scene match the vertices, the nearest pixels (i.e. mixtures) are selected. Whether this assumption is met is in part a function of the nature of the scene (spatial arrangement of targets) and the spatial resolution of the imagery.
In the case of SMACC, N-FINDR, VCA, SGA and MaxD, one pixel (the most extreme pixel) is selected to represent one endmember and thus these methods suffer from the inherent sensitivity of convex geometry to outlier pixels [21]. Although principal component analysis (PCA), minimum noise fraction transform (MNF) and singular value decomposition (SVD) can be applied to the data prior to the endmember extraction to minimize the impact of noise [14,28], these can also result in the loss of useful endmember pixels characterized by subtle spectral detail. In order to overcome the outlier problem, IEA selects multiple pixels with a small spectral angle to the most extreme pixel, and the average of these pixels is used as an endmember. The robustness of this approach to noise is reported in Plaza's work on simulated data [29,30]. However, in the case of isolated outlier pixels, the spectral angle between the outlier and other pixels can be large. In such a case IEA will still select the outlier pixel as an endmember, resulting in a false endmember. An additional risk in using the spectral angle to select pixels that represent a given endmember is that these pixels may represent spatially distinct targets in the scene that are characterized by a similar spectral shape but with distinct subtle spectral differences, as indicated by Turner II et al. (2004) [31]. In such a case the distinct character of each target would be lost due to spectral averaging and the lack of consideration of their spatial association. Thus one endmember would be identified rather than two.

Spectral similarity and spatial adjacency as selection criteria
In this study we propose a more robust approach that uses the spectral angle (as in IEA) and the spatial adjacency of pixels in the image to constrain the selection of candidate pixels representing an endmember. Two assumptions are made: 1) pixels that are spatially adjacent are more likely to have similar spectral properties and thus represent one endmember, and 2) the probability that two adjacent pixels are both spurious is low. These assumptions are certainly reasonable if the target application is geological mapping because mappable units (e.g. bedrock lithologies) typically have spectral properties with spatial continuity in hyperspectral imagery.
The next section (i.e. Section 3.2) describes how a vertex (i.e. an extreme pixel) is identified based on its spectral uniqueness in the simplex (the distinctness is measured in terms of the vector Euclidean norm or the distance of the pixel to the subspace defined by the previously selected endmembers). A meaningful endmember for this vertex is then the average of multiple candidate pixels that are spectrally distinct (i.e. they are located at or near one of the corners of the simplex) and are spatially adjacent. To find these candidate pixels we construct a pixel set, $P_{possible}$, consisting of r pixels (the value of r is user defined and here set to 10) that are closest to the vertex (this step is identical to IEA). Then a subset, $P_{candidate}$, is selected from these r pixels ($P_{possible}$) subject to the spatial adjacency condition (5) and the spectral angle condition (6). In condition (5), $x_i, y_i$ are pixel image coordinates and $t_{pixel}$ is a threshold value in pixels defining the acceptable spatial adjacency. The smallest possible value is 1 pixel to avoid the inclusion of mixed pixels. If the scene is not spatially complex, this value can be increased.
In condition (6), $\theta$ is the spectral angle between two spectra $a$ and $b$ and is calculated as

$$\theta(a, b) = \arccos\left(\frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}\right)$$

and $t_{\theta}$ is the threshold value for the spectral angle beyond which two spectra are not considered similar. The value of $t_{\theta}$ was set at 2.5 degrees based on experiments. The average vector of $P_{candidate}$ represents one endmember spectrum. When no pixels in $P_{possible}$ satisfy conditions (5) and (6), the most extreme pixel in $P_{possible}$ is selected as the endmember.
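The candidate selection described in this section can be sketched as follows. This is an illustrative reading rather than the authors' implementation, and it rests on several assumptions: the r candidate pixels are taken as the r highest values of a per-pixel extremeness array `scores`, spatial adjacency is tested as row and column offsets both within the spatial threshold, and each candidate is tried in turn as the seed, starting with the most extreme. The names `spectral_angle_deg`, `estimate_endmember`, `t_theta` and `t_pixel` are hypothetical.

```python
import numpy as np

def spectral_angle_deg(a, b):
    """Spectral angle between two spectra, in degrees."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def estimate_endmember(cube, scores, r=10, t_theta=2.5, t_pixel=1):
    """Average spatially adjacent, spectrally similar pixels near a vertex.

    cube:   (rows, cols, bands) image.
    scores: (rows, cols) extremeness of each pixel (assumed: larger = more extreme).
    Returns (endmember spectrum, list of (row, col) pixels that were averaged).
    """
    shape = scores.shape
    # P_possible: the r most extreme pixels, most extreme first.
    order = np.argsort(scores, axis=None)[::-1][:r]
    possible = [tuple(int(v) for v in np.unravel_index(i, shape)) for i in order]
    for seed in possible:
        members = [seed]
        for other in possible:
            if other == seed:
                continue
            # Assumed adjacency test: row and column offsets both within t_pixel.
            adjacent = max(abs(other[0] - seed[0]), abs(other[1] - seed[1])) <= t_pixel
            similar = spectral_angle_deg(cube[other], cube[seed]) <= t_theta
            if adjacent and similar:
                members.append(other)
        if len(members) > 1:
            spectra = np.array([cube[p] for p in members])
            return spectra.mean(axis=0), members
    # No seed has a qualifying neighbor: fall back to the most extreme pixel.
    return cube[possible[0]].copy(), [possible[0]]

# Toy scene: flat background, one isolated outlier, one two-pixel bright patch.
cube = np.full((5, 5, 3), 0.2)
cube[0, 0] = [5.0, 0.0, 0.0]   # isolated spurious pixel
cube[3, 3] = [1.0, 2.0, 1.0]
cube[3, 4] = [1.02, 2.0, 1.0]
scores = np.linalg.norm(cube, axis=2)
endmember, members = estimate_endmember(cube, scores, r=3)
```

In this toy scene the isolated outlier has the largest norm but no qualifying neighbor, so the endmember is formed from the two adjacent patch pixels instead.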

Description of the SPA algorithm
SPA starts by identifying the two most distinct endmembers, $e_1$ and $e_2$, typically representing the brightest and darkest pixels, respectively. It then iteratively finds the remaining endmembers, using orthogonal projections, until the number of endmembers defined by the user is obtained. Below is a step-by-step description of the SPA algorithm.

1) Step 1: Parameter setting
Values for the following three parameters must be set: the number of endmembers (m) to find, the spectral angle threshold ($t_{\theta}$) and the spatial threshold ($t_{pixel}$).

2) Step 2: Extraction of the first endmember ($e_1$)
The vector norms of all pixels in the image are calculated to locate the pixel that has the largest norm. According to Property 2, this pixel is at one of the simplex vertices and is typically the brightest pixel in the image cube. The first endmember $e_1$ is then estimated as described in Section 3.1.

3) Step 3: Extraction of the second endmember ($e_2$)
The distances between all pixels and $e_1$ are calculated and the pixel that has the largest distance is located. According to Property 3, this pixel will be at another vertex of the simplex, usually corresponding to the darkest object in the scene (e.g. water body or shade). The second endmember, $e_2$, can then be estimated according to Section 3.1.

4) Step 4: Orthogonal projection and extraction of a new endmember
A matrix $U = [e_1, e_2]$ is then constructed using the two previously defined endmembers, and all pixels are projected to the subspace, $S_{proj}$, orthogonal to the space spanned by $U$, using the projector

$$P_U^{\perp} = I - U U^{+} \qquad (9)$$

where $I$ is the identity matrix, and $U^{+}$ is the pseudo-inverse of $U$, denoted by

$$U^{+} = (U^{T} U)^{-1} U^{T} \qquad (10)$$

In the projected subspace ($S_{proj}$), the contribution to the mixtures from endmembers in $U$ is eliminated. According to Property 4, the projected data in the new space still conform to the convexity, that is to say the endmembers are still at the vertices of the simplex. The vector (pixel) with the maximum norm in the projected subspace ($S_{proj}$) will correspond to a new endmember (Property 2), in this case $e_3$, and this pixel is located at the apex of the simplex furthest away from the subspace spanned by the previously defined endmembers, $e_1$ and $e_2$.

5) Step 5: Complete the search of all endmembers
The endmember matrix, $U = [e_1, e_2, e_3]$, is then updated and Step 4 repeated to define a new endmember. This step is repeated until the predetermined number (m) of endmembers is reached. We calculate the change of the simplex volume with each subspace projection because it provides an insight on the convergence of the algorithm. The volume of the simplex can be calculated only when the simplex has more than 3 vertices. According to Property 1, the complete endmember set defines a simplex with the maximum volume, assuming that a simplex can fit the hyperspectral data perfectly. Thus, as a new endmember is estimated from the data, a new vertex is added to the simplex defined by the previous endmembers, and the volume of the simplex increases until the requested number of endmembers is reached. The volume increase is determined by the spectral contrast between the current endmember and the previously defined endmember set. Assuming data with m endmembers, the change is expressed as the volume ratio $v\_ratio_l$ between the simplexes of successive iterations (equation 11), where l is the number of endmembers selected so far. As endmembers are selected (i.e. as l approaches m), $v\_ratio_l$ decreases and theoretically converges to 1.0 if the simplex can fit the hyperspectral data very well [32]. The noise level and the complexity of the data will impact how quickly the volume ratio ($v\_ratio_l$) converges and whether it converges to 1.0.
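Steps 2 to 5 can be sketched in a few lines of NumPy. The sketch below is a simplified single-pixel variant under stated assumptions: it returns the index of the most extreme pixel at each iteration and omits the averaging of spatially adjacent, spectrally similar pixels described in Section 3.1, as well as the volume ratio calculation. The function name `find_vertices` and the synthetic test data are hypothetical.

```python
import numpy as np

def find_vertices(pixels, m):
    """Indices of m extreme pixels found by successive orthogonal projections.

    pixels: (num_pixels, n_bands) array of spectra; m >= 2.
    """
    # Step 2: the pixel with the largest Euclidean norm (Property 2).
    first = int(np.argmax(np.linalg.norm(pixels, axis=1)))
    # Step 3: the pixel farthest from the first vertex (Property 3).
    second = int(np.argmax(np.linalg.norm(pixels - pixels[first], axis=1)))
    chosen = [first, second]
    # Steps 4 and 5: project onto the orthogonal complement of the chosen
    # endmembers and take the pixel with the largest projected norm.
    while len(chosen) < m:
        U = pixels[chosen].T                                   # (n_bands, l)
        projector = np.eye(U.shape[0]) - U @ np.linalg.pinv(U)
        # The projector is symmetric, so each row below is a projected pixel.
        projected = pixels @ projector
        chosen.append(int(np.argmax(np.linalg.norm(projected, axis=1))))
    return chosen

# Synthetic test: 200 random mixtures of three hypothetical endmembers,
# followed by the three pure spectra themselves (indices 200, 201, 202).
rng = np.random.default_rng(0)
E = np.array([[0.90, 0.80, 0.85, 0.90, 0.80],
              [0.05, 0.10, 0.05, 0.10, 0.05],
              [0.20, 0.60, 0.30, 0.70, 0.40]])
abundances = rng.dirichlet(np.ones(3), size=200)
pixels = np.vstack([abundances @ E, E])
vertices = find_vertices(pixels, 3)
```

On noise-free data that contain pure pixels, the selected indices are those of the pure spectra. An isolated spurious pixel with a large norm would be selected just as readily, which is the weakness the spatial constraint is meant to address.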

Description of the test data
The SPA algorithm was evaluated using two hyperspectral datasets. The first one was collected over the Cuprite mining district, Nevada, in July 1995 with the Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) as part of an AVIRIS Group Shoot [2] and is available in the tutorial CD of the ENVI software. The second dataset was acquired by the Probe-1 airborne hyperspectral sensor flown over Baffin Island, Canada, in July 2000. Before describing the characteristics of these data, we emphasize that they differ in many ways, including in their spectral and spatial resolution, scene complexity and extent of vegetation cover. These differences allow a more thorough investigation of algorithm performance.

AVIRIS data for Cuprite
This hyperspectral cube has 400 × 350 pixels and 50 bands of short-wave infrared data (1.9 to 2.4 μm). The spatial and spectral resolutions are 20 m and 10 nm, respectively. The data were corrected to reflectance using the ATREM (ATmospheric REMoval) method [33], and residual noise was minimized using the EFFORT (Empirical Flat Field Optimized Reflectance Transform) procedure [34]. Cuprite is located in west-central Nevada where large areas of exposed Cambrian sediments and Tertiary volcanics were intensively altered in the mid- to late Miocene [35]. Imagery for this site has been extensively investigated and reported in the remote sensing literature because of minimal vegetation cover and the presence of large outcrops exposing a suite of spectrally distinct alteration minerals [35][36][37][38]. Kruse and Huntington (1996) analyzed the AVIRIS dataset of this study and used the Pixel Purity Index (PPI) to identify endmembers corresponding to seven alteration minerals (zeolite, alunite, buddingtonite, calcite, kaolinite, silica and muscovite/illite) [2]. The results were consistent with those found by Swayze et al. (1992) [39] using the Multiple Spectral Feature Mapping Algorithm (MSFMA), who validated the predictions using field samples examined with X-Ray Diffraction. Below we refer to the image endmembers defined by Kruse and Huntington (1996) [2] as PPI-endmembers as part of the validation of the SPA algorithm.
The AVIRIS dataset was used to: 1) determine whether SPA can extract the 7 mineral endmembers documented by previous authors; 2) determine whether SPA converges; and 3) assess the merits of the spatial constraint in SPA. SPA was applied to the Cuprite cube with the following parameters: the total number of endmembers for this scene was set to 19 and the threshold values for $t_{\theta}$ and $t_{pixel}$ were set to 2.5 degrees and 1 pixel, respectively. The choice of the total number of endmembers is based on a previous hyperspectral study over the same area by Plaza and Chang (2005), who utilized the concept of virtual dimensionality to determine the number of endmembers [30].

Probe-1 data for Baffin Island
The test data from the Baffin Island study area (Figure 1) cover part of the northeastern segment of the Paleoproterozoic Trans-Hudson Orogen [40], where the Lake Harbour Group comprises upper amphibolite to granulite grade metamorphosed granodiorite, monzonite, tonalite, syenite, peridotite, gabbro, carbonate, and clastic metasedimentary units (Figure 1) [41,42]. The latter include garnetiferous psammite, pelitic and semi-pelitic rocks. The calcareous rocks are commonly interlayered with siliciclastic rocks. Within the metasedimentary rocks orthoquartzite occurs as discrete layers and garnet-rich monzogranite outcrops as tabular bodies hundreds of meters thick. Vegetation cover is limited (~25%), comprising primarily moss and grass, with dwarf shrub willows. Rock encrusting lichens covering a few percent to almost 100 percent of the rock are common to the majority of rock units. The region also includes numerous small lakes and year-round snow cover in gullies and shaded areas.
The airborne hyperspectral data (~3.5 x 7 km) were acquired with the Probe-1 sensor, which comprises 128 channels from 446 to 2543 nm with an average band Full Width Half Maximum of ~15 nm and a Ground Instantaneous Field of View of ~7 m (Figure 2). A vicarious atmospheric correction of the data was performed by the Canada Centre for Remote Sensing using field spectra acquired at the Iqaluit airport concurrently with the overflight. Excluding bands with low signal due to water absorption near 1.4 µm and 1.9 µm, 101 bands were used for the test, to which no additional preprocessing (e.g. smoothing filter) was applied. Field sampling and collection of spectra took place along traverses oriented perpendicular to the dominant structural and stratigraphic trends (Figure 2). The spectra were acquired with a portable ASD® field spectrometer that has 2151 bands covering the 350 to 2500 nm spectral range. A total of 217 spectral measurements were acquired for 56 sites, some of which lie outside, but proximal to, the study area, and are representative of the geology shown in Figure 1. Multiple measurements were taken at each site for fresh, weathered, polished, and partial to fully lichen coated surfaces. We chose to evaluate the performance of SPA with this test data owing to 1) excellent bedrock exposure and limited continuous vegetation; 2) the spectral diversity of the rock units and the relevance of some units to mining exploration (gossan and peridotite); 3) the variable spatial distribution of the rock units, spanning large continuous exposures to small sporadic outcroppings; and 4) the availability of field spectra and spectra of rock samples for the validation of endmembers extracted from imagery. The extraction of geological endmembers from this imagery is more challenging than for the imagery of Cuprite. This can be attributed to the presence of snow, tundra vegetation and rock encrusting lichen, which lower the relative spectral contrast between geological endmembers.
We also compare the endmembers derived from SPA with those derived from IEA, given that IEA has been reported as the most robust convex-based algorithm [29,30]. For the analysis with SPA the threshold values of $t_{\theta}$ and $t_{pixel}$ were set to 2.5 degrees and 1 pixel, as was done for the Cuprite region. The number of endmembers was set to 30 for SPA and IEA. For IEA the spectral angle was set to 2.5 degrees.

AVIRIS data for Cuprite
Comparison with PPI endmembers validated in the literature
We first examine the SPA endmembers in the context of the seven mineral PPI endmembers (zeolite, alunite, buddingtonite, calcite, kaolinite, silica and muscovite/illite) previously reported by Kruse and Huntington (1996) [2]. Out of 19 endmembers derived from the SPA, 16 are for minerals and 3 for shade/shadow (Table 1). A comparison of the 16 mineral endmembers and "true" endmembers (PPI_endmembers, Section 4.1) is shown in Figure 3. Each SPA endmember was calculated from multiple pixels (2-9). For each of the seven minerals we found at least one SPA endmember with a good match in spectral shape to that of the field validated PPI endmembers. When SPA picks multiple endmembers for a given mineral these differ in spectral magnitude or in subtle variations in their spectral shape (Table 1 and Figure 3). For example, two SPA endmembers (SPA_12 and SPA_15) were selected for the mineral alunite_2 (Figure 3b). SPA and PPI_endmembers both result from the average of multiple candidate pixels located at vertices of the simplex. However, they differ in their list of candidate pixels because PPI does not take into account spatial information, and thus the averaging process generates different solutions, resulting in endmember spectra that are distinct in their detailed shape and amplitude. One endmember, SPA_9 (with an absorption feature at 2.27 µm, Figure 3j), could not be matched to a PPI mineral endmember and has yet to be properly labeled, though the observed feature is consistent with the mineral jarosite discussed in Clark et al. (2003) [43].

Merits of the spatial constraint
In Figure 4 we compare endmembers derived from SPA and from SMACC (implemented in ENVI). Both methods successfully extracted endmembers for the known mineral occurrences. Because SMACC uses individual pixels to form each endmember, the SMACC endmembers display greater residual calibration errors (Figure 4a). SMACC also extracted additional endmembers capturing noise (Figure 4b), a pitfall that was not observed for SPA. As a reminder, SPA begins by selecting multiple pixels as the possible candidate set before applying the spectral angle and spatial adjacency selection constraints. Consequently SPA finds the most extreme pixel and examines whether there are neighboring pixels that meet the spectral and spatial criteria within the candidate set. Otherwise, it repeats the search process with the second most extreme pixel. If none of the pixels within the possible candidate set can meet the spectral and spatial criteria, the most extreme pixel is considered an endmember.

Changes in simplex volume
To illustrate the convergence of SPA we show the changes in the simplex volume between successive iterations (i.e. the volume ratio, equation 11) as a function of the number of iterations (Figure 5). The first 6 iterations capture the most important changes in the simplex volume. With additional endmembers, the change in volume ratio decreases and converges at endmember 18, beyond which the volume ratio is less than 1.0. For this particular data set, the geological endmembers of interest are extracted before endmember 18 (Table 1), thus convergence of the volume ratio is in this case a good indicator of the number of useful endmembers in the scene.

Probe-1 data for Baffin Island
Comparison with spectra collected in the field
Out of thirty endmembers, twenty-one represent vegetation, water, snow and shadow (Table 2). Examples for snow, vegetation and lichen with closely matched ground-based reflectance spectra are illustrated in Figure 6. The remaining nine endmembers are geological and belong to six rock types: Fe-rich metasediments, clay-metasediments, marble, felsic rock (varnish/granite), peridotite and quartzite (Figures 7, 8; Table 2). Eight geological endmembers (SPA_3, SPA_6, SPA_12, SPA_16, SPA_22, SPA_23, SPA_28 and SPA_30) closely match field spectra (Figure 7). The SPA_12 endmember correlates well with field spectra of Fe-metasediment characterized by a strong absorption feature near 0.9 µm attributed to goethite or hematite [44] (Figure 7a). In the field, this endmember also corresponds to the occurrence of gossans. SPA_30 is characterized by a broad ferrous-iron absorption feature at 0.93 µm observed in spectra of pyroxene and hornblende [38] and a Fe,Mg-OH vibrational feature at 2.32 µm, and is a close match to the field spectrum of peridotite (Figure 7b). SPA_6 and SPA_23 match the field spectrum of marble based on a strong carbonate (CO3) feature near 2.30-2.35 µm, but they differ in overall spectral amplitude. For SPA_23 the carbonate feature is centered near 2.32 µm but for SPA_6 it lies near 2.34 µm (Figure 7c). SPA_3 is a close match to the field spectrum of granite (Figure 7d) and represents multiple felsic targets with common spectral properties (e.g. overall higher reflectance and lack of obvious spectral absorption features). SPA_16 is another metasediment but it displays a clay absorption feature at 2.2 µm and a weak iron feature at 0.9 µm (Figure 7e). Two endmembers that are spectrally similar but vary in amplitude, SPA_22 and SPA_28, are identified as quartzite based on a match with field spectra (Figure 7f). We failed to find field spectra that closely match the endmember SPA_11. The closest match is peridotite (Figure 8), but this endmember lacks the spectral absorption feature at 2.32 µm that is observable in the SPA_30 peridotite endmember and field spectra (Figure 7b). The accurate identification of this endmember requires additional fieldwork.

Comparison with IEA image endmembers
We also performed a comparison between endmembers of geological interest derived from IEA and SPA (Table 3). For IEA the total number of endmembers was set to 30 and the spectral angle to 2.5 degrees. Results from both methods are comparable with two exceptions. First, SPA misses an endmember (metasediment) identified by IEA that has a good match with field spectra (Figure 9). Second, SPA extracts a second endmember for peridotite (Figure 8). The endmember abundance maps (not shown) suggest that the two peridotite endmembers map distinct spatial areas.

Changes in simplex volume
The change in the volume ratio ($v\_ratio_l$) between successive iterations is shown in Figure 10. The curve converges at endmember 24, after which the volume ratio remains less than 1.0. However, we found that endmembers of geological interest were extracted after endmember 24. For example, peridotite, an important rock type for the mining exploration of nickel, is extracted as the 30th endmember. The majority of the snow, water, shade and vegetation endmembers were derived before this point, as shown in Table 2.

Table 2 (partial). SPA Endmember Label: 1, 4, 8, 9, 13, 14, 15, 17, 19, 20, 24, 26. *This endmember is labeled as peridotite based on Figure 8.

Figure 8. The circle marks the region where an absorption feature is present in the field spectrum of peridotite but absent from SPA_11.

Discussion and future work
The SPA algorithm extracts endmembers from hyperspectral data without having to reduce the data dimensionality. It uses the spectral angle (alike IEA) and the spatial adjacency of pixels in the image to constrain the selection of candidate pixels representing an endmember. We designed SPA based on the observation that many targets have spatial continuity (e.g. bedrock lithologies) in hyperspectral imagery and thus a spatial constraint would be beneficial in the endmember search. We assumed that pixels that are spatially adjacent are more likely to have similar spectral properties and thus represent one endmember, and, that the probability that two adjacent pixels are both spurious is low. Experiments on two datasets demonstrate that SPA can effectively extract endmembers while requiring minimal user interaction.
The procedure used to identify the simplex vertices in SPA is similar to that of advanced convex-based endmember selection methods such as MAX-D, VCA and SGA. However, of the convex-based endmember-search algorithms discussed in this paper, only SPA makes use of both the spectral angle and spatial adjacency to determine which pixels should form one endmember. Because each endmember is the average of multiple pixels, the SPA-derived endmember spectra appear less noisy (i.e. smoother), which helps to improve unmixing results [45]. Given that spatially adjacent pixels are unlikely to be simultaneously spurious, the use of spatial adjacency makes SPA endmembers less sensitive to isolated noisy pixels, an inherent problem for convex-based endmember-search methods [21]. Although we initially designed SPA for geological applications (e.g. spatial continuity of bedrock), it offers potential for a variety of applications where the premise of spatial adjacency applies. Ecological examples include tree crowns and plant communities.
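The combined use of the spectral angle and spatial adjacency can be illustrated with a minimal Python sketch. It is not the IDL implementation used in this work. The function names, the square search window and the array layout (lines, pixels, bands) are assumptions made for illustration, and the candidate pixel is assumed to have already been located by the convex-geometry step. The default thresholds are the values reported for our experiments (2.5 degrees and 1 pixel).

```python
import numpy as np

def spectral_angle_deg(a, b):
    """Angle in degrees between two non-zero spectra treated as vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against rounding slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def average_adjacent_spectra(cube, row, col, theta_t=2.5, t_pixel=1):
    """Average a candidate pixel with the neighbours that pass both tests.

    cube    : array of shape (lines, pixels, bands), hypothetical layout
    row,col : position of the candidate pixel
    theta_t : spectral angle threshold in degrees
    t_pixel : spatial threshold in pixels (square window, an assumption)
    Returns the mean spectrum and the number of pixels averaged.
    """
    n_lines, n_pixels, _ = cube.shape
    candidate = cube[row, col]
    members = []
    for r in range(max(0, row - t_pixel), min(n_lines, row + t_pixel + 1)):
        for c in range(max(0, col - t_pixel), min(n_pixels, col + t_pixel + 1)):
            # Keep pixels that are both adjacent and spectrally similar;
            # the candidate itself always qualifies (angle of zero).
            if spectral_angle_deg(candidate, cube[r, c]) <= theta_t:
                members.append(cube[r, c])
    return np.mean(members, axis=0), len(members)
```

In this sketch an isolated spurious pixel finds no spectrally similar neighbours, so the returned count is 1. A count like this could be used to flag candidates that lack spatial support, although how such candidates are treated is a design choice not fixed by the sketch.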
An additional product of SPA is data describing the change of the simplex volume ratio between successive iterations during the endmember extraction. It illustrates the influence of a new endmember on the data structure, and provides information on the convergence of the algorithm. Though the rate of convergence can vary with the complexity of the scene, the patterns are similar, showing the largest changes in volume ratio at the beginning of the endmember extraction process, followed by progressively smaller changes and convergence towards a plateau. If the endmember search terminates before the convergence point (where the volume ratio is close to 1.0), significant endmembers will be missed. However, as seen in the Baffin Island example, endmembers for targets of interest may also be found beyond the convergence point. Thus additional research is required to properly constrain the number of endmembers for a given search and application.
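As a rough illustration of how the volume ratio series might be screened for a convergence point, the following Python sketch returns the first iteration from which every ratio stays close to 1.0. The function name, the tolerance `tol` and the example values are hypothetical. This is one possible criterion, not the one used to read Figure 10.

```python
def convergence_index(volume_ratios, tol=0.05):
    """Return the 0-based index from which every ratio stays within
    tol of 1.0, or None if the series never settles.

    volume_ratios : sequence of simplex volume ratios, one per iteration
    tol           : hypothetical tolerance around 1.0
    """
    settled_from = None
    for i, ratio in enumerate(volume_ratios):
        if abs(ratio - 1.0) <= tol:
            # Start (or continue) a run of near-1.0 ratios.
            if settled_from is None:
                settled_from = i
        else:
            # A large change resets the run.
            settled_from = None
    return settled_from

# Made-up ratios for illustration only.
example = [3.2, 1.8, 1.02, 1.4, 1.03, 0.99, 1.01]
print(convergence_index(example))
```

Given the Baffin Island result, such an index is better treated as a lower bound on the number of endmembers to extract than as a stopping rule.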
Comparison of endmembers obtained from SPA and IEA showed that both algorithms generate similar results (Table 2). Both methods operate on reflectance data, unlike algorithms such as VCA, N-FINDR and ICE, which require a dimensionality reduction step. In the experiment with the Baffin Island data, SPA missed one rock type (metasediment), but obtained an additional rock endmember identified as a possible peridotite. Clearly, different endmember-search algorithms can yield different endmember sets, indicating that the use of multiple search algorithms may reduce the chance of missing endmembers of interest.
The computational load of endmember-search algorithms is an important issue for the automatic extraction of endmembers, given the increasing volumes of hyperspectral data available. We did not study the computational efficiency of SPA in detail, but because SPA is fundamentally similar to VCA and MAX-D, which are reported to be of high computational efficiency [14,27], we believe the efficiency of SPA should be comparable. It takes about 35 minutes for SPA to extract 30 endmembers from a data cube of 512 pixels, 512 lines and 101 bands using a PC with a Pentium IV CPU (1.0 GHz) and 512 MB of RAM. This test is based on the current IDL implementation of the SPA algorithm, which has not been optimized for computational efficiency.
There are a number of potential improvements to SPA that require further research, namely: 1) a means for the automatic determination of the spectral angle threshold (θ_t) and spatial threshold (t_pixel), and 2) a means to constrain the total number of endmembers in the scene. In our experiments we found that 2.5 degrees for θ_t and 1 pixel for t_pixel provide good endmember estimates. However, the selection of t_pixel and θ_t should be scene dependent because of the varying spatial and spectral complexity in different data. Currently, the choice of these two thresholds is still arbitrary and it would be valuable to develop a more robust way to define values for θ_t and t_pixel. Finally, we do not yet have a definite means of constraining the number of endmembers in a given search, though the simplex volume change during the search offers a qualitative assessment. The concept of virtual dimensionality (VD) has been proposed and proven useful for this purpose in recent years [18,46]. Using the VD concept within SPA may allow the relationship between VD and simplex volume to be studied in the future and may prove valuable for constraining the number of endmembers.